[ 
https://issues.apache.org/jira/browse/KYLIN-1455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dayue Gao closed KYLIN-1455.
----------------------------
    Resolution: Won't Fix

> HBase ScanMetrics are not properly logged in query log
> ------------------------------------------------------
>
>                 Key: KYLIN-1455
>                 URL: https://issues.apache.org/jira/browse/KYLIN-1455
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Storage - HBase
>    Affects Versions: v1.2
>            Reporter: Dayue Gao
>            Assignee: Dayue Gao
>            Priority: Minor
>         Attachments: KYLIN-1455-1.x-staging.patch
>
>
> HBase's ScanMetrics provide users valuable information when troubleshooting 
> query performance issues. But I found it was not properly logged, sometimes 
> missing from the log, sometimes duplicated.
> Below is an example of duplicated scan log, this is due to 
> {{CubeSegmentTupleIterator#closeScanner()}} method is invoked two times, 
> first in hasNext(), second in close().
> {noformat}
> [http-bio-8080-exec-8]:[2016-02-26 
> 17:31:50,227][DEBUG][org.apache.kylin.storage.hbase.CubeSegmentTupleIterator.closeScanner(CubeSegmentTupleIterator.java:146)]
>  - Scan 
> {"loadColumnFamiliesOnDemand":null,"filter":"FuzzyRowFilter{fuzzyKeysData={\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x00\\x00\\x00\\x17\\x00\\x00\\x17\\x00:\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\x00\\x00\\x00\\xFF\\x00\\x00\\xFF\\x00}},
>  
> ","startRow":"\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x00\\x00\\x00\\x17\\x00\\x00\\x17\\x00","stopRow":"\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x08\\x8F\\xFF\\x17\\xFF\\xFF\\x17\\xFF\\x00","batch":-1,"cacheBlocks":true,"totalColumns":1,"maxResultSize":5242880,"families":{"F1":["M"]},"caching":1024,"maxVersions":1,"timeRange":[0,9223372036854775807]}
> [http-bio-8080-exec-8]:[2016-02-26 
> 17:31:50,229][DEBUG][org.apache.kylin.storage.hbase.CubeSegmentTupleIterator.closeScanner(CubeSegmentTupleIterator.java:150)]
>  - HBase Metrics: count=17357, ms=3194, bytes=905594, remote_bytes=905594, 
> regions=1, not_serving_region=0, rpc=19, rpc_retries=0, remote_rpc=19, 
> remote_rpc_retries=0
> [http-bio-8080-exec-8]:[2016-02-26 
> 17:32:58,016][DEBUG][org.apache.kylin.storage.hbase.CubeSegmentTupleIterator.closeScanner(CubeSegmentTupleIterator.java:146)]
>  - Scan 
> {"loadColumnFamiliesOnDemand":null,"filter":"FuzzyRowFilter{fuzzyKeysData={\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x00\\x00\\x00\\x17\\x00\\x00\\x17\\x00:\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\x00\\x00\\x00\\xFF\\x00\\x00\\xFF\\x00}},
>  
> ","startRow":"\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x00\\x00\\x00\\x17\\x00\\x00\\x17\\x00","stopRow":"\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x08\\x8F\\xFF\\x17\\xFF\\xFF\\x17\\xFF\\x00","batch":-1,"cacheBlocks":true,"totalColumns":1,"maxResultSize":5242880,"families":{"F1":["M"]},"caching":1024,"maxVersions":1,"timeRange":[0,9223372036854775807]}
> [http-bio-8080-exec-8]:[2016-02-26 
> 17:33:04,443][DEBUG][org.apache.kylin.storage.hbase.CubeSegmentTupleIterator.closeScanner(CubeSegmentTupleIterator.java:150)]
>  - HBase Metrics: count=17357, ms=3194, bytes=905594, remote_bytes=905594, 
> regions=1, not_serving_region=0, rpc=19, rpc_retries=0, remote_rpc=19, 
> remote_rpc_retries=0
> {noformat}
> And sometimes ScanMetrics is missing from the log, showed below. I think this 
> is due to {{CubeSegmentTupleIterator#closeScanner()}} trying to get 
> ScanMetrics before close the current ResultScanner. After looking into HBase 
> client source, I found that ScanMetrics will not be written out until the 
> scanner is closed or exhausted (no cache entries). So it'd be better to get 
> ScanMetrics after closing the scanner.
> {noformat}
> [http-bio-8080-exec-2]:[2016-02-26 
> 17:18:43,928][DEBUG][org.apache.kylin.storage.hbase.CubeSegmentTupleIterator.closeScanner(CubeSegmentTupleIterator.java:146)]
>  - Scan 
> {"loadColumnFamiliesOnDemand":null,"filter":"FuzzyRowFilter{fuzzyKeysData={\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x00\\x00\\x00\\x17\\x00\\x00\\x17\\x00:\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\xFF\\x00\\x00\\x00\\xFF\\x00\\x00\\xFF\\x00}},
>  
> ","startRow":"\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x01\\x1C\\x04\\x0Cx\\x03\\x08Y","stopRow":"\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\xC7\\x08\\x8F\\xFF\\x17\\xFF\\xFF\\x17\\xFF\\x00","batch":-1,"cacheBlocks":true,"totalColumns":1,"maxResultSize":5242880,"families":{"F1":["M"]},"caching":1024,"maxVersions":1,"timeRange":[0,9223372036854775807]}
> [http-bio-8080-exec-2]:[2016-02-26 
> 17:19:38,228][INFO][org.apache.kylin.rest.service.QueryService.logQuery(QueryService.java:242)]
>  -
> ==========================[QUERY]===============================
> {noformat}
> This should be easy to fix, I will submit a patch for this.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to