[jira] [Commented] (KYLIN-2328) Reduce the size of metadata uploaded to distributed cache

2016-12-28 Thread Shaofeng SHI (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15784573#comment-15784573
 ] 

Shaofeng SHI commented on KYLIN-2328:
-

+1, good points. When there are many segments, repeatedly submitting every 
segment's dictionaries with each job is expensive; this patch optimizes that 
very well.

> Reduce the size of metadata uploaded to distributed cache
> -
>
> Key: KYLIN-2328
> URL: https://issues.apache.org/jira/browse/KYLIN-2328
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine
>Affects Versions: all
>Reporter: Dayue Gao
>Assignee: Dayue Gao
> Fix For: v2.0.0
>
> Attachments: KYLIN-2328.patch
>
>
> Currently, each MR job uploads all the metadata belonging to a cube to 
> distributed cache. When the total size of metadata increases, the submission 
> time ("MapReduce Waiting" at Monitor UI) also increases and could become a 
> significant problem.
> We could actually optimize the amount of metadata uploaded according to the 
> type of job, for example
> * CuboidJob only needs dictionary of the building segment
> * CubeHFileJob doesn't need any dictionary
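
The idea in the description above can be sketched roughly as follows. This is 
only an illustration, not the code in KYLIN-2328.patch: the class name, the 
JobType enum, the resource paths and the dictsBySegment layout are all 
hypothetical, chosen to mirror the two examples given (CuboidJob and 
CubeHFileJob).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class MetadataSelector {
    // Hypothetical job types, named after the jobs mentioned in the issue.
    public enum JobType { CUBOID_JOB, CUBE_HFILE_JOB, OTHER }

    /**
     * Returns the metadata resource paths a job of the given type would ship
     * to the distributed cache. Paths and map layout are illustrative only.
     */
    public static List<String> select(JobType type, String cubeDescPath,
                                      Map<String, List<String>> dictsBySegment,
                                      String buildingSegment) {
        List<String> out = new ArrayList<>();
        out.add(cubeDescPath); // assumed to be needed by every job type
        switch (type) {
            case CUBOID_JOB:
                // Only the dictionaries of the segment being built.
                List<String> dicts = dictsBySegment.get(buildingSegment);
                out.addAll(dicts == null ? Collections.<String>emptyList() : dicts);
                break;
            case CUBE_HFILE_JOB:
                // No dictionary at all.
                break;
            default:
                // Fallback: every dictionary of every segment (the old behaviour).
                for (List<String> d : dictsBySegment.values()) {
                    out.addAll(d);
                }
        }
        return out;
    }
}
```

With many segments, the fallback branch grows with the total number of 
dictionaries, while the CUBOID_JOB branch stays bounded by a single segment.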



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KYLIN-2328) Reduce the size of metadata uploaded to distributed cache

2016-12-28 Thread Shaofeng SHI (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15784572#comment-15784572
 ] 

Shaofeng SHI commented on KYLIN-2328:
-

Hi Dayue, you're correct: KafkaFlatTableJob doesn't need cube metadata; it just 
persists the messages from Kafka to HDFS.






[jira] [Created] (KYLIN-2334) Investigate using HDFS as Kylin metadata store

2016-12-28 Thread Shaofeng SHI (JIRA)
Shaofeng SHI created KYLIN-2334:
---

 Summary: Investigate using HDFS as Kylin metadata store
 Key: KYLIN-2334
 URL: https://issues.apache.org/jira/browse/KYLIN-2334
 Project: Kylin
  Issue Type: Task
  Components: Metadata
Reporter: Shaofeng SHI
Assignee: Shaofeng SHI


Today Kylin's metadata is stored in HBase. Each time a job is submitted, the 
metadata files must be dumped to local disk and then shipped to Hadoop for 
parallel computing (to avoid massive access to HBase from mappers).

If HDFS can natively be used as the storage for Kylin metadata (with similar 
read/write performance), the process will be much simplified.





[jira] [Created] (KYLIN-2333) Kylin doesn't need 0-D cuboid, can remove that step

2016-12-28 Thread Shaofeng SHI (JIRA)
Shaofeng SHI created KYLIN-2333:
---

 Summary: Kylin doesn't need 0-D cuboid, can remove that step
 Key: KYLIN-2333
 URL: https://issues.apache.org/jira/browse/KYLIN-2333
 Project: Kylin
  Issue Type: Improvement
  Components: Job Engine
Reporter: Shaofeng SHI
Assignee: Dong Li


Small improvement: that step is redundant and can be removed to reduce the 
build time.





[jira] [Updated] (KYLIN-2333) Kylin doesn't need 0-D cuboid, can remove that step

2016-12-28 Thread Shaofeng SHI (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shaofeng SHI updated KYLIN-2333:

Fix Version/s: v2.0.0

> Kylin doesn't need 0-D cuboid, can remove that step
> ---
>
> Key: KYLIN-2333
> URL: https://issues.apache.org/jira/browse/KYLIN-2333
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine
>Reporter: Shaofeng SHI
>Assignee: Shaofeng SHI
> Fix For: v2.0.0
>
>
> Small improvement: that step is redundant and can be removed to reduce the 
> build time.





[jira] [Created] (KYLIN-2332) Refactor TupleConverter a bit

2016-12-28 Thread liyang (JIRA)
liyang created KYLIN-2332:
-

 Summary: Refactor TupleConverter a bit
 Key: KYLIN-2332
 URL: https://issues.apache.org/jira/browse/KYLIN-2332
 Project: Kylin
  Issue Type: Improvement
  Components: Storage - HBase
Reporter: liyang
Assignee: liyang








[jira] [Updated] (KYLIN-2328) Reduce the size of metadata uploaded to distributed cache

2016-12-28 Thread Dayue Gao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dayue Gao updated KYLIN-2328:
-
Attachment: KYLIN-2328.patch

Patch uploaded.

I have tested batch build and merge.

It seems that KafkaFlatTableJob doesn't require 
{{attachKylinPropsAndMetadata}}, but I'm not 100% sure and haven't tested the 
streaming case. [~Shaofengshi], could you please take a look and confirm that?






[jira] [Created] (KYLIN-2331) By layer Spark cubing

2016-12-28 Thread Shaofeng SHI (JIRA)
Shaofeng SHI created KYLIN-2331:
---

 Summary: By layer Spark cubing
 Key: KYLIN-2331
 URL: https://issues.apache.org/jira/browse/KYLIN-2331
 Project: Kylin
  Issue Type: New Feature
  Components: Job Engine
Reporter: Shaofeng SHI
Assignee: Shaofeng SHI


Use Apache Spark as the engine to build cubes, with the by-layer iterative 
algorithm.
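
For readers unfamiliar with by-layer cubing, here is a minimal single-process 
sketch of the idea: start from the base cuboid (all N dimensions) and derive 
each next layer by dropping one dimension from a cuboid of the previous layer 
and re-aggregating. It does not use Spark and is not Kylin code; the class 
name, the bitmask encoding of cuboids, the SUM-only measure and the minDims 
parameter are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ByLayerCubing {
    /** Aggregates a parent cuboid into a child by dropping the key column at dropIdx. */
    public static Map<List<String>, Long> dropDimension(Map<List<String>, Long> parent, int dropIdx) {
        Map<List<String>, Long> child = new HashMap<>();
        for (Map.Entry<List<String>, Long> e : parent.entrySet()) {
            List<String> key = new ArrayList<>(e.getKey());
            key.remove(dropIdx);                        // remove by position
            child.merge(key, e.getValue(), Long::sum);  // SUM is the only measure here
        }
        return child;
    }

    /**
     * Builds cuboids layer by layer. Cuboids are identified by a bitmask over
     * nDims dimensions; iteration stops once a layer has minDims dimensions.
     */
    public static Map<Integer, Map<List<String>, Long>> build(
            Map<List<String>, Long> base, int nDims, int minDims) {
        Map<Integer, Map<List<String>, Long>> all = new HashMap<>();
        int full = (1 << nDims) - 1;
        all.put(full, base);
        List<Integer> layer = Collections.singletonList(full);
        while (!layer.isEmpty() && Integer.bitCount(layer.get(0)) > minDims) {
            List<Integer> next = new ArrayList<>();
            for (int parent : layer) {
                for (int b = 0; b < nDims; b++) {
                    if ((parent & (1 << b)) == 0) continue;   // dimension not present
                    int child = parent & ~(1 << b);
                    if (all.containsKey(child)) continue;     // already derived from a sibling
                    // Position of dimension b inside the parent's key list.
                    int idx = Integer.bitCount(parent & ((1 << b) - 1));
                    all.put(child, dropDimension(all.get(parent), idx));
                    next.add(child);
                }
            }
            layer = next;
        }
        return all;
    }
}
```

In a distributed engine each iteration of the while loop would correspond to 
one job or stage over the previous layer's output; here it is just a loop over 
in-memory maps.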





[jira] [Resolved] (KYLIN-2330) CubeDesc returns redundant DerivedInfo

2016-12-28 Thread liyang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyang resolved KYLIN-2330.
---
   Resolution: Fixed
Fix Version/s: v2.0.0

> CubeDesc returns redundant DerivedInfo
> --
>
> Key: KYLIN-2330
> URL: https://issues.apache.org/jira/browse/KYLIN-2330
> Project: Kylin
>  Issue Type: Bug
>  Components: Metadata
>Reporter: liyang
>Assignee: liyang
> Fix For: v2.0.0
>
>






[jira] [Created] (KYLIN-2330) CubeDesc returns redundant DerivedInfo

2016-12-28 Thread liyang (JIRA)
liyang created KYLIN-2330:
-

 Summary: CubeDesc returns redundant DerivedInfo
 Key: KYLIN-2330
 URL: https://issues.apache.org/jira/browse/KYLIN-2330
 Project: Kylin
  Issue Type: Bug
  Components: Metadata
Reporter: liyang
Assignee: liyang








[jira] [Commented] (KYLIN-2287) Speed up model and cube list load in Web

2016-12-28 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15783104#comment-15783104
 ] 

kangkaisen commented on KYLIN-2287:
---

Hi Jason, I tried to fix this CSS issue by setting {{overflow: visible}} on 
{{cube_model_trees}}, which fixes the issue but makes {{list-group}} visible at 
the same time.

If you have a better idea, please tell me. Thanks.

> Speed up model and cube list load in Web
> 
>
> Key: KYLIN-2287
> URL: https://issues.apache.org/jira/browse/KYLIN-2287
> Project: Kylin
>  Issue Type: Improvement
>  Components: Web 
>Affects Versions: v1.6.0
>Reporter: kangkaisen
>Assignee: kangkaisen
>Priority: Critical
> Fix For: v2.0.0
>
> Attachments: KYLIN-2287 model tab css issue.png, KYLIN-2287.patch
>
>
> Currently, if a project has more than one hundred cubes and models, the 
> "Model" page takes a long time to load because it issues a lot of HTTP 
> requests. So we need to reduce and defer the HTTP requests when initially 
> loading the "Model" page.





[jira] [Commented] (KYLIN-2323) Refine Table load/unload error message

2016-12-28 Thread Billy Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15782981#comment-15782981
 ] 

Billy Liu commented on KYLIN-2323:
--

Thanks [~zhongjian], the patch has been updated. Please review it at 
https://github.com/apache/kylin/commit/5bd47a1a532ea7c8c72202e9c708611c08372fcb

> Refine Table load/unload error message
> --
>
> Key: KYLIN-2323
> URL: https://issues.apache.org/jira/browse/KYLIN-2323
> Project: Kylin
>  Issue Type: Improvement
>  Components: REST Service
>Affects Versions: v1.6.0
>Reporter: Billy Liu
>Assignee: Billy Liu
>Priority: Minor
> Attachments: KYLIN-2323.patch
>
>
> There is no exception handling in TableController, so most exceptions end up 
> in kylin.out rather than kylin.log. The TableController should provide more 
> useful messages and remain stable when an exception happens. 





[jira] [Created] (KYLIN-2329) Between 0.06 - 0.01 and 0.06 + 0.01, returns incorrect result

2016-12-28 Thread liyang (JIRA)
liyang created KYLIN-2329:
-

 Summary: Between 0.06 - 0.01 and 0.06 + 0.01, returns incorrect result
 Key: KYLIN-2329
 URL: https://issues.apache.org/jira/browse/KYLIN-2329
 Project: Kylin
  Issue Type: Bug
Reporter: liyang


A TPC-H query returns an incorrect result:

{code}
select
    sum(l_saleprice) as revenue
from
    v_lineitem
where
    l_shipdate >= '1993-01-01'
    and l_shipdate < '1994-01-01'
    and l_discount between 0.06 - 0.01 and 0.06 + 0.01
    and l_quantity < 25;
{code}

The result becomes correct if the condition is changed to:
{code}
and l_discount between 0.05 and 0.07
{code}
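
The issue does not state the root cause. One thing worth checking (an 
assumption, not a diagnosis) is whether the bounds 0.06 - 0.01 and 0.06 + 0.01 
are evaluated with binary floating point rather than exact decimals, since the 
computed bounds can then differ slightly from the literals 0.05 and 0.07. The 
sketch below contrasts the two evaluations in plain Java; the class and method 
names are hypothetical and unrelated to Kylin's query engine.

```java
import java.math.BigDecimal;

public class BetweenBounds {
    /** Inclusive range check using exact decimal arithmetic. */
    public static boolean betweenDecimal(String value, String base, String delta) {
        BigDecimal v = new BigDecimal(value);
        BigDecimal lo = new BigDecimal(base).subtract(new BigDecimal(delta));
        BigDecimal hi = new BigDecimal(base).add(new BigDecimal(delta));
        return v.compareTo(lo) >= 0 && v.compareTo(hi) <= 0;
    }

    /** Same check with binary doubles; the bounds may carry a rounding error. */
    public static boolean betweenDouble(double value, double base, double delta) {
        return value >= base - delta && value <= base + delta;
    }

    public static void main(String[] args) {
        // Print the computed bounds so they can be compared by eye.
        System.out.println("double lower bound:  " + (0.06 - 0.01));
        System.out.println("double upper bound:  " + (0.06 + 0.01));
        System.out.println("decimal lower bound: "
                + new BigDecimal("0.06").subtract(new BigDecimal("0.01")));
        System.out.println("decimal upper bound: "
                + new BigDecimal("0.06").add(new BigDecimal("0.01")));
        // Compare how the boundary value 0.07 is classified by each variant.
        System.out.println("0.07 in range (double):  " + betweenDouble(0.07, 0.06, 0.01));
        System.out.println("0.07 in range (decimal): " + betweenDecimal("0.07", "0.06", "0.01"));
    }
}
```

If the two variants disagree on boundary values such as 0.05 or 0.07, constant 
folding of the BETWEEN bounds would be a plausible place to look.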





[jira] [Commented] (KYLIN-2323) Refine Table load/unload error message

2016-12-28 Thread Zhong,Jason (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15782382#comment-15782382
 ] 

Zhong,Jason commented on KYLIN-2323:


Hi [~yimingliu], TableController.loadHiveTables returns only "result.loaded" 
and no "result.unloaded".

As a result, TableService.loadHiveTable in sourceMeta.js causes a null pointer 
exception, because result['result.unloaded'] is undefined.
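
One way to avoid this class of problem on the server side is to always return 
both keys, using an empty array when there is nothing to report. The snippet 
below is a hedged sketch only: the class and method names are hypothetical and 
it is not the actual TableController code; only the two key names come from 
the comment above.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

public class LoadTablesResult {
    /**
     * Builds a response map that always contains both "result.loaded" and
     * "result.unloaded", so a client can index either key safely.
     */
    public static Map<String, String[]> build(Collection<String> loaded,
                                              Collection<String> unloaded) {
        Map<String, String[]> result = new HashMap<>();
        result.put("result.loaded", toArray(loaded));
        result.put("result.unloaded", toArray(unloaded));
        return result;
    }

    // Null collections become empty arrays instead of missing keys.
    private static String[] toArray(Collection<String> names) {
        return names == null ? new String[0] : names.toArray(new String[0]);
    }
}
```

The client could additionally guard against a missing key, but keeping the 
response shape stable means older front-end code keeps working.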




