[jira] [Commented] (KYLIN-2328) Reduce the size of metadata uploaded to distributed cache
[ https://issues.apache.org/jira/browse/KYLIN-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15784573#comment-15784573 ]

Shaofeng SHI commented on KYLIN-2328:
-------------------------------------

+1, good points. When there are many segments, resubmitting every segment's dictionary with each job is expensive; this patch optimizes that very well.

> Reduce the size of metadata uploaded to distributed cache
> ---------------------------------------------------------
>
>                 Key: KYLIN-2328
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2328
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Job Engine
>    Affects Versions: all
>            Reporter: Dayue Gao
>            Assignee: Dayue Gao
>             Fix For: v2.0.0
>
>         Attachments: KYLIN-2328.patch
>
>
> Currently, each MR job uploads all the metadata belonging to a cube to
> distributed cache. When the total size of metadata increases, the submission
> time ("MapReduce Waiting" at Monitor UI) also increases and could become a
> significant problem.
> We could actually optimize the amount of metadata uploaded according to the
> type of job, for example
> * CuboidJob only needs dictionary of the building segment
> * CubeHFileJob doesn't need any dictionary

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
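
The per-job trimming proposed in KYLIN-2328 can be sketched roughly as follows. This is a minimal illustration in Python, not the contents of KYLIN-2328.patch: the function name `metadata_to_upload`, the resource path layout, and the sample cube are hypothetical. Only the two rules (CuboidJob needs the building segment's dictionaries, CubeHFileJob needs none) come from the issue text.

```python
# Hypothetical sketch: choose which metadata resources a job uploads.
# Path layout and names are illustrative assumptions, not Kylin's real ones.

# Which dictionaries each job type needs (the two rules from the issue text).
DICT_POLICY = {
    "CuboidJob": "building_segment",  # only the segment being built
    "CubeHFileJob": "none",           # no dictionaries at all
}


def metadata_to_upload(job_type, cube, building_segment):
    """Return the list of resource paths to ship to the distributed cache."""
    resources = [f"/cube/{cube['name']}.json"]  # cube-level metadata
    # Job types without a rule keep the old behavior: upload everything.
    policy = DICT_POLICY.get(job_type, "all")
    for segment, dict_paths in cube["segment_dicts"].items():
        wanted = policy == "all" or (
            policy == "building_segment" and segment == building_segment
        )
        if wanted:
            resources.extend(dict_paths)
    return resources


cube = {
    "name": "sales",
    "segment_dicts": {
        "seg_2016_11": ["/dict/sales/seg_2016_11/country.dict"],
        "seg_2016_12": ["/dict/sales/seg_2016_12/country.dict"],
    },
}
print(metadata_to_upload("CuboidJob", cube, "seg_2016_12"))
print(metadata_to_upload("CubeHFileJob", cube, "seg_2016_12"))
```

With a policy table like this, the upload size for a CuboidJob no longer grows with the number of existing segments, which is the cost the comment above refers to.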
[jira] [Commented] (KYLIN-2328) Reduce the size of metadata uploaded to distributed cache
[ https://issues.apache.org/jira/browse/KYLIN-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15784572#comment-15784572 ]

Shaofeng SHI commented on KYLIN-2328:
-------------------------------------

Hi Dayue, you're correct: KafkaFlatTableJob doesn't need cube metadata; it just persists the messages from Kafka to HDFS.
[jira] [Created] (KYLIN-2334) Investigate using HDFS as Kylin metadata store
Shaofeng SHI created KYLIN-2334:
-----------------------------------

             Summary: Investigate using HDFS as Kylin metadata store
                 Key: KYLIN-2334
                 URL: https://issues.apache.org/jira/browse/KYLIN-2334
             Project: Kylin
          Issue Type: Task
          Components: Metadata
            Reporter: Shaofeng SHI
            Assignee: Shaofeng SHI


Today Kylin's metadata is stored in HBase. Each time a job is submitted, the metadata files must be dumped to the local file system and then submitted to Hadoop for parallel computing (to avoid massive access to HBase from mappers).

If HDFS can be used natively as the storage for Kylin metadata (with similar read/write performance), the process will be much simpler.
[jira] [Created] (KYLIN-2333) Kylin doesn't need 0-D cuboid, can remove that step
Shaofeng SHI created KYLIN-2333:
-----------------------------------

             Summary: Kylin doesn't need 0-D cuboid, can remove that step
                 Key: KYLIN-2333
                 URL: https://issues.apache.org/jira/browse/KYLIN-2333
             Project: Kylin
          Issue Type: Improvement
          Components: Job Engine
            Reporter: Shaofeng SHI
            Assignee: Dong Li


Small improvement: that step is redundant and can be removed to reduce the build time.
[jira] [Updated] (KYLIN-2333) Kylin doesn't need 0-D cuboid, can remove that step
[ https://issues.apache.org/jira/browse/KYLIN-2333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shaofeng SHI updated KYLIN-2333:
--------------------------------
    Fix Version/s: v2.0.0

> Kylin doesn't need 0-D cuboid, can remove that step
> ---------------------------------------------------
>
>                 Key: KYLIN-2333
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2333
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Job Engine
>            Reporter: Shaofeng SHI
>            Assignee: Shaofeng SHI
>             Fix For: v2.0.0
>
>
> Small improvement: that step is redundant and can be removed to reduce the
> build time.
[jira] [Created] (KYLIN-2332) Refactor TupleConverter a bit
liyang created KYLIN-2332:
-----------------------------

             Summary: Refactor TupleConverter a bit
                 Key: KYLIN-2332
                 URL: https://issues.apache.org/jira/browse/KYLIN-2332
             Project: Kylin
          Issue Type: Improvement
          Components: Storage - HBase
            Reporter: liyang
            Assignee: liyang
[jira] [Updated] (KYLIN-2328) Reduce the size of metadata uploaded to distributed cache
[ https://issues.apache.org/jira/browse/KYLIN-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dayue Gao updated KYLIN-2328:
-----------------------------
    Attachment: KYLIN-2328.patch

Patch uploaded. Have tested batch build and merge.

It seems that KafkaFlatTableJob doesn't require {{attachKylinPropsAndMetadata}}, but I'm not 100% sure and haven't tested the streaming case. [~Shaofengshi], could you please take a look and confirm that?
[jira] [Created] (KYLIN-2331) By layer Spark cubing
Shaofeng SHI created KYLIN-2331:
-----------------------------------

             Summary: By layer Spark cubing
                 Key: KYLIN-2331
                 URL: https://issues.apache.org/jira/browse/KYLIN-2331
             Project: Kylin
          Issue Type: New Feature
          Components: Job Engine
            Reporter: Shaofeng SHI
            Assignee: Shaofeng SHI


Use Apache Spark as the engine to build cubes, with the by-layer iterative algorithm.
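
For readers unfamiliar with the by-layer iterative algorithm named in KYLIN-2331, here is a minimal, engine-agnostic sketch in plain Python. It is not Spark code and not Kylin's implementation: the function names, the row layout (dimension values followed by a single measure), and the use of SUM as the only aggregate are assumptions made for illustration. The idea shown is that the base cuboid is built from the source rows, and every lower layer is derived from a cuboid in the layer directly above it by dropping one dimension and re-aggregating.

```python
from collections import defaultdict
from itertools import combinations


def aggregate(parent, parent_dims, child_dims):
    """Derive a child cuboid from a parent by dropping dimensions and summing."""
    keep = [parent_dims.index(d) for d in child_dims]
    child = defaultdict(int)
    for key, measure in parent.items():
        child[tuple(key[i] for i in keep)] += measure
    return dict(child)


def build_by_layer(rows, dims):
    """Build every cuboid, one layer (dimension count) at a time."""
    # Layer n: the base cuboid, aggregated directly from the source rows.
    base = defaultdict(int)
    for *key, measure in rows:
        base[tuple(key)] += measure
    cuboids = {tuple(dims): dict(base)}
    # Each lower layer reads only cuboids from the layer directly above it.
    for size in range(len(dims) - 1, -1, -1):
        for child_dims in combinations(dims, size):
            # Any parent with exactly one extra dimension works; take the first.
            parent_dims = next(
                p for p in cuboids
                if len(p) == size + 1 and set(child_dims) <= set(p)
            )
            cuboids[child_dims] = aggregate(
                cuboids[parent_dims], parent_dims, child_dims
            )
    return cuboids


rows = [("US", "web", 10), ("US", "store", 5), ("CN", "web", 7)]
cubes = build_by_layer(rows, ["country", "channel"])
for dims_key, cuboid in cubes.items():
    print(dims_key, cuboid)
```

In a distributed engine each layer would typically be one job or stage over the previous layer's output; the loop over `size` above stands in for that sequence.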
[jira] [Resolved] (KYLIN-2330) CubeDesc returns redundant DerivedInfo
[ https://issues.apache.org/jira/browse/KYLIN-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liyang resolved KYLIN-2330.
---------------------------
       Resolution: Fixed
    Fix Version/s: v2.0.0

> CubeDesc returns redundant DerivedInfo
> --------------------------------------
>
>                 Key: KYLIN-2330
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2330
>             Project: Kylin
>          Issue Type: Bug
>          Components: Metadata
>            Reporter: liyang
>            Assignee: liyang
>             Fix For: v2.0.0
[jira] [Created] (KYLIN-2330) CubeDesc returns redundant DerivedInfo
liyang created KYLIN-2330:
-----------------------------

             Summary: CubeDesc returns redundant DerivedInfo
                 Key: KYLIN-2330
                 URL: https://issues.apache.org/jira/browse/KYLIN-2330
             Project: Kylin
          Issue Type: Bug
          Components: Metadata
            Reporter: liyang
            Assignee: liyang
[jira] [Commented] (KYLIN-2287) Speed up model and cube list load in Web
[ https://issues.apache.org/jira/browse/KYLIN-2287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15783104#comment-15783104 ]

kangkaisen commented on KYLIN-2287:
-----------------------------------

Hi, Jason. I tried to fix this CSS issue by setting {{overflow: visible}} on {{cube_model_trees}}. This fixes the issue but also makes {{list-group}} visible. If you have a better idea, please tell me. Thanks.

> Speed up model and cube list load in Web
> ----------------------------------------
>
>                 Key: KYLIN-2287
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2287
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Web
>    Affects Versions: v1.6.0
>            Reporter: kangkaisen
>            Assignee: kangkaisen
>            Priority: Critical
>             Fix For: v2.0.0
>
>         Attachments: KYLIN-2287 model tab css issue.png, KYLIN-2287.patch
>
>
> Currently, if a project has more than one hundred cubes and models, the
> "Model" page takes a long time to load because it issues a lot of HTTP
> requests. So we need to reduce and defer the HTTP requests when initially
> loading the "Model" page.
[jira] [Commented] (KYLIN-2323) Refine Table load/unload error message
[ https://issues.apache.org/jira/browse/KYLIN-2323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15782981#comment-15782981 ]

Billy Liu commented on KYLIN-2323:
----------------------------------

Thanks [~zhongjian], the new patch has been updated. Please review at https://github.com/apache/kylin/commit/5bd47a1a532ea7c8c72202e9c708611c08372fcb

> Refine Table load/unload error message
> --------------------------------------
>
>                 Key: KYLIN-2323
>                 URL: https://issues.apache.org/jira/browse/KYLIN-2323
>             Project: Kylin
>          Issue Type: Improvement
>          Components: REST Service
>    Affects Versions: v1.6.0
>            Reporter: Billy Liu
>            Assignee: Billy Liu
>            Priority: Minor
>         Attachments: KYLIN-2323.patch
>
>
> There is no exception handling in TableController, so most exceptions are
> not found in kylin.log but in kylin.out. The TableController should provide
> more useful messages and remain stable when an exception happens.
[jira] [Created] (KYLIN-2329) Between 0.06 - 0.01 and 0.06 + 0.01, returns incorrect result
liyang created KYLIN-2329:
-----------------------------

             Summary: Between 0.06 - 0.01 and 0.06 + 0.01, returns incorrect result
                 Key: KYLIN-2329
                 URL: https://issues.apache.org/jira/browse/KYLIN-2329
             Project: Kylin
          Issue Type: Bug
            Reporter: liyang


A TPC-H query returns an incorrect result:
{code}
select
    sum(l_saleprice) as revenue
from
    v_lineitem
where
    l_shipdate >= '1993-01-01'
    and l_shipdate < '1994-01-01'
    and l_discount between 0.06 - 0.01 and 0.06 + 0.01
    and l_quantity < 25;
{code}

The result becomes correct if the condition is changed to:
{code}
    and l_discount between 0.05 and 0.07
{code}
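
KYLIN-2329 does not state the root cause. One common way for the two forms of the predicate to diverge is that the constant expressions are evaluated in binary floating point instead of exact decimal arithmetic; whether that is what happens inside Kylin is an assumption here, not something the issue confirms. The following Python sketch only demonstrates the arithmetic effect itself.

```python
from decimal import Decimal

# Binary floating point: the computed bounds are not the literals 0.05 / 0.07.
lo_f = 0.06 - 0.01
hi_f = 0.06 + 0.01
print(lo_f == 0.05, hi_f == 0.07)   # both comparisons are False

# A row whose discount is the double 0.07 falls outside the computed range,
# because the computed upper bound is slightly below 0.07.
print(lo_f <= 0.07 <= hi_f)         # False

# Exact decimal arithmetic: the computed bounds equal the literals.
lo_d = Decimal("0.06") - Decimal("0.01")
hi_d = Decimal("0.06") + Decimal("0.01")
print(lo_d == Decimal("0.05"), hi_d == Decimal("0.07"))  # both True
print(lo_d <= Decimal("0.07") <= hi_d)                   # True
```

If the engine folds `0.06 + 0.01` as a double, boundary rows at exactly 0.07 could be dropped, which would make the sum differ from the `between 0.05 and 0.07` form.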
[jira] [Commented] (KYLIN-2323) Refine Table load/unload error message
[ https://issues.apache.org/jira/browse/KYLIN-2323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15782382#comment-15782382 ]

Zhong,Jason commented on KYLIN-2323:
------------------------------------

Hi [~yimingliu],

TableController.loadHiveTables now returns only "result.loaded" and no "result.unloaded". As a result, in sourceMeta.js, TableService.loadHiveTable causes a null pointer exception because result['result.unloaded'] is undefined.
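
The failure described in this comment comes from reading a response key that the server no longer sends. A generic defensive pattern, shown here in Python rather than in the JavaScript of sourceMeta.js, is to treat a missing list as empty. The function name `summarize_load_result` and the sample table name are hypothetical; only the keys "result.loaded" and "result.unloaded" come from the comment, and the actual fix in the patch may instead restore the key on the server side.

```python
def summarize_load_result(result):
    """Count loaded and unloaded tables, treating a missing key as empty."""
    loaded = result.get("result.loaded") or []
    unloaded = result.get("result.unloaded") or []
    return {"loaded": len(loaded), "unloaded": len(unloaded)}


# Response shape described in the comment: "result.unloaded" is absent.
response = {"result.loaded": ["DEFAULT.KYLIN_SALES"]}
print(summarize_load_result(response))
```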