[jira] [Commented] (KYLIN-4322) Cost–benefit of compression HBase result
[ https://issues.apache.org/jira/browse/KYLIN-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17027246#comment-17027246 ]

ASF subversion and git services commented on KYLIN-4322:

Commit 26cf1f8ed217c96329d8dcbd8a00ef1d67023fca in kylin's branch refs/heads/master from Kang
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=26cf1f8 ]

Revert "KYLIN-4322: set storage.hbase.endpoint-compress-result default value … (#1033)"

This reverts commit f41c6c8198e5cad295e9212c6a0047d83bd54ae2.

> Cost–benefit of compression HBase result
>
>                 Key: KYLIN-4322
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4322
>             Project: Kylin
>          Issue Type: Bug
>            Reporter: ZhouKang
>            Assignee: ZhouKang
>            Priority: Major
>             Fix For: v3.1.0
>
> kylin.storage.hbase.endpoint-compress-result is TRUE by default.
> In our production environment, when the HBase scan result is larger than
> 200M, it takes more than 10s to compress the data.
> We can see this in HBase's log:
> ||Size||avg rate||min rate||avg time||max time||
> |<1M|0.12|0.25|0.18ms|0.7s|
> |1M ~ 10M|0.39|0.97|0.2s|0.6s|
> |10M ~ 100M|0.47|0.81|2s|6.3s|
> |>100M|0.95|0.96|15.7s|24.8s|
> Notice:
> # rate: compressed data size / original data size
> # When the source data size is < 1M, the compressed data may be larger than
> the source data, so row 1 of the table only counts the cases where the
> compressed data is smaller than the source data.
> # In our environment, 65% of the compressed results (<1M) are larger than
> the source data.
> When the source data is smaller than 10M, the latency of data transmission
> is acceptable. When the data is larger than 100M, it takes a long time to
> compress.
>
> So I think kylin.storage.hbase.endpoint-compress-result should be FALSE by
> default.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
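The trade-off described in the issue can be checked with some rough arithmetic. The sketch below is only an illustration: the rates and average compression times come from the table in the description, but the representative payload sizes (5, 50 and 200 MB) and the 125 MB/s link speed are assumptions, and decompression time on the Kylin side is ignored.

```python
# Rough cost-benefit check for compressing a scan result before sending it.
# ASSUMPTIONS (not from the issue): the representative payload sizes and the
# 125 MB/s (about 1 Gbit/s) link speed. Decompression time is ignored.


def net_gain_seconds(size_mb, rate, compress_seconds, bandwidth_mb_per_s):
    """Transfer time saved by compression minus the time spent compressing.

    rate is compressed size / original size, as defined in the issue.
    A negative result means compression makes the response slower overall.
    """
    saved_mb = size_mb * (1.0 - rate)
    return saved_mb / bandwidth_mb_per_s - compress_seconds


def break_even_bandwidth(size_mb, rate, compress_seconds):
    """Link speed (MB/s) below which compression starts to pay off."""
    return size_mb * (1.0 - rate) / compress_seconds


# (size bucket, assumed size in MB, avg rate, avg compression time in seconds)
ROWS = [
    ("1M ~ 10M", 5, 0.39, 0.2),
    ("10M ~ 100M", 50, 0.47, 2.0),
    (">100M", 200, 0.95, 15.7),
]

ASSUMED_BANDWIDTH_MB_PER_S = 125.0

if __name__ == "__main__":
    for bucket, size_mb, rate, seconds in ROWS:
        gain = net_gain_seconds(size_mb, rate, seconds, ASSUMED_BANDWIDTH_MB_PER_S)
        limit = break_even_bandwidth(size_mb, rate, seconds)
        print(f"{bucket}: net gain {gain:.2f}s, pays off below {limit:.2f} MB/s")
```

Under these assumed inputs every bucket comes out negative. For a 200 MB result at rate 0.95, compression saves about 10 MB of transfer but costs 15.7s, so the link would have to be slower than roughly 0.64 MB/s before compressing helps.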
[jira] [Commented] (KYLIN-4322) Cost–benefit of compression HBase result
[ https://issues.apache.org/jira/browse/KYLIN-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17027245#comment-17027245 ]

ASF GitHub Bot commented on KYLIN-4322:
---

nichunen commented on pull request #1084: Revert "KYLIN-4322: set storage.hbase.endpoint-compress-result default value …"
URL: https://github.com/apache/kylin/pull/1084

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
[jira] [Commented] (KYLIN-4322) Cost–benefit of compression HBase result
[ https://issues.apache.org/jira/browse/KYLIN-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17027168#comment-17027168 ]

ZhouKang commented on KYLIN-4322:
-

Thank you [~shaofengshi]. I will do more work on testing and fixing this.
[jira] [Commented] (KYLIN-4322) Cost–benefit of compression HBase result
[ https://issues.apache.org/jira/browse/KYLIN-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17027152#comment-17027152 ]

ASF GitHub Bot commented on KYLIN-4322:
---

zhoukangcn commented on pull request #1084: Revert "KYLIN-4322: set storage.hbase.endpoint-compress-result default value …"
URL: https://github.com/apache/kylin/pull/1084

Reverts apache/kylin#1033
[jira] [Commented] (KYLIN-4322) Cost–benefit of compression HBase result
[ https://issues.apache.org/jira/browse/KYLIN-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014983#comment-17014983 ]

ASF GitHub Bot commented on KYLIN-4322:
---

nichunen commented on pull request #1033: KYLIN-4322: set storage.hbase.endpoint-compress-result default value …
URL: https://github.com/apache/kylin/pull/1033
[jira] [Commented] (KYLIN-4322) Cost–benefit of compression HBase result
[ https://issues.apache.org/jira/browse/KYLIN-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014984#comment-17014984 ]

ASF subversion and git services commented on KYLIN-4322:

Commit f41c6c8198e5cad295e9212c6a0047d83bd54ae2 in kylin's branch refs/heads/master from Kang
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=f41c6c8 ]

KYLIN-4322: set storage.hbase.endpoint-compress-result default value … (#1033)

* KYLIN-4322: set storage.hbase.endpoint-compress-result default value false
* KYLIN-4322: update UT
[jira] [Commented] (KYLIN-4322) Cost–benefit of compression HBase result
[ https://issues.apache.org/jira/browse/KYLIN-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014985#comment-17014985 ]

ASF subversion and git services commented on KYLIN-4322:

Commit f41c6c8198e5cad295e9212c6a0047d83bd54ae2 in kylin's branch refs/heads/master from Kang
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=f41c6c8 ]

KYLIN-4322: set storage.hbase.endpoint-compress-result default value … (#1033)

* KYLIN-4322: set storage.hbase.endpoint-compress-result default value false
* KYLIN-4322: update UT
[jira] [Commented] (KYLIN-4322) Cost–benefit of compression HBase result
[ https://issues.apache.org/jira/browse/KYLIN-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014986#comment-17014986 ]

ASF subversion and git services commented on KYLIN-4322:

Commit f41c6c8198e5cad295e9212c6a0047d83bd54ae2 in kylin's branch refs/heads/master from Kang
[ https://gitbox.apache.org/repos/asf?p=kylin.git;h=f41c6c8 ]

KYLIN-4322: set storage.hbase.endpoint-compress-result default value … (#1033)

* KYLIN-4322: set storage.hbase.endpoint-compress-result default value false
* KYLIN-4322: update UT
[jira] [Commented] (KYLIN-4322) Cost–benefit of compression HBase result
[ https://issues.apache.org/jira/browse/KYLIN-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17007182#comment-17007182 ]

Yuzhang QIU commented on KYLIN-4322:

I think this is the same as KYLIN-3512.
[jira] [Commented] (KYLIN-4322) Cost–benefit of compression HBase result
[ https://issues.apache.org/jira/browse/KYLIN-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005968#comment-17005968 ]

ASF GitHub Bot commented on KYLIN-4322:
---

zhoukangcn commented on pull request #1033: KYLIN-4322: set storage.hbase.endpoint-compress-result default value …
URL: https://github.com/apache/kylin/pull/1033

…false
see: https://issues.apache.org/jira/browse/KYLIN-4322

> Cost–benefit of compression HBase result
>
>                 Key: KYLIN-4322
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4322
>             Project: Kylin
>          Issue Type: Bug
>            Reporter: ZhouKang
>            Priority: Major
>
> kylin.storage.hbase.endpoint-compress-result is TRUE by default.
> In our production environment, when the HBase scan result is larger than
> 200M, it takes more than 10s to compress the data.
> We can see this in HBase's log:
> ||Size||avg rate||max rate||avg time||max time||
> |<1M|0.12|0.25|0.18ms|0.7s|
> |1M ~ 10M|0.39|0.97|0.2s|0.6s|
> |10M ~ 100M|0.47|0.81|2s|6.3s|
> |>100M|0.95|0.96|15.7s|24.8s|
> rate: compressed data size / original data size
> Please also note that when the source data size is less than 1M, 65% of the
> compressed results are larger than the source data.
> When the source data is smaller than 10M, the latency of data transmission
> is acceptable. When the data is larger than 100M, it takes a long time to
> compress.
>
> So I think kylin.storage.hbase.endpoint-compress-result should be FALSE by
> default.
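The observation that small results can come out larger after compression is easy to reproduce in isolation. The following is a minimal sketch, not Kylin's code path: it uses Python's zlib as a stand-in because this thread does not say which codec the HBase endpoint uses, and both sample payloads are made up for illustration.

```python
import random
import time
import zlib


def measure(payload):
    """Return (rate, seconds) where rate = compressed size / original size."""
    start = time.perf_counter()
    compressed = zlib.compress(payload)
    elapsed = time.perf_counter() - start
    return len(compressed) / len(payload), elapsed


def pseudo_random_bytes(n, seed=0):
    """Deterministic high-entropy bytes, standing in for incompressible data."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


# Hypothetical payloads: one highly repetitive, one close to incompressible.
repetitive = b"2020-01-01,region-a,42\n" * 1000
noisy = pseudo_random_bytes(1000)

if __name__ == "__main__":
    for name, payload in (("repetitive", repetitive), ("noisy", noisy)):
        rate, seconds = measure(payload)
        print(f"{name}: {len(payload)} bytes, rate {rate:.3f}, {seconds * 1000:.3f} ms")
```

A rate above 1.0 for the noisy payload means the compressed form is larger than the input, which is the situation the description reports for many results under 1M. Running the same measurement on real scan results would be needed to draw any conclusion about Kylin itself.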