[ https://issues.apache.org/jira/browse/PHOENIX-2683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15147145#comment-15147145 ]

James Taylor commented on PHOENIX-2683:
---------------------------------------

Thanks, [~ankit.singhal]. Looks good. One minor thing we can clean up based on
the new guidepost format (unrelated to this change):
- There's no need to copy the row key or create an ImmutableBytesWritable here.
Instead, just modify addGuidePosts to take byte[] rowArray, int offset, and
int length parameters:
{code}
diff --git a/phoenix-core/src/main/java/org/apache/phoenix/schema/stats/StatisticsCollector.java b/phoenix-core/src/main/java/org/apache/phoenix/schema/stats/StatisticsCollector.java
index 3462f22..676ff77 100644
--- a/phoenix-core/src/main/java/org/apache/phoenix/schema/stats/StatisticsCollector.java
+++ b/phoenix-core/src/main/java/org/apache/phoenix/schema/stats/StatisticsCollector.java
@@ -190,8 +190,9 @@ public class StatisticsCollector {
             if (byteCount >= guidepostDepth) {
                 byte[] row = ByteUtil.copyKeyBytesIfNecessary(
                         new ImmutableBytesWritable(kv.getRowArray(), kv.getRowOffset(), kv.getRowLength()));
-                if (gps.getSecond().addGuidePosts(row, byteCount)) {
+                if (gps.getSecond().addGuidePosts(row, byteCount, gps.getSecond().getRowCount())) {
                     gps.setFirst(0l);
+                    gps.getSecond().resetRowCount();
                 }
             }
         }
{code}
- Also, a few tests around this would be good.
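
As a rough illustration of the first suggestion (a sketch only, not the actual Phoenix code): the class below is a hypothetical stand-in for the guidepost holder. The name GuidePostsSketch, its fields, and the exact addGuidePost signature are assumptions made for this example. It takes the row key as a (rowArray, offset, length) slice and copies the bytes only when a guidepost is actually recorded, so the caller needs neither an up-front copy nor an ImmutableBytesWritable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the guidepost holder; names and fields are
// assumptions for illustration, not Phoenix's actual API.
final class GuidePostsSketch {
    private final List<byte[]> keys = new ArrayList<>();
    private final List<Long> byteCounts = new ArrayList<>();
    private final List<Long> rowCounts = new ArrayList<>();

    /**
     * Records a guidepost from a slice of the caller's buffer.
     * Returns false, copying nothing, if the key does not sort strictly
     * after the last recorded guidepost.
     */
    boolean addGuidePost(byte[] rowArray, int offset, int length,
                         long byteCount, long rowCount) {
        if (!keys.isEmpty()) {
            byte[] last = keys.get(keys.size() - 1);
            if (compare(last, 0, last.length, rowArray, offset, length) >= 0) {
                return false;
            }
        }
        // The only copy happens here, once the guidepost is accepted.
        keys.add(Arrays.copyOfRange(rowArray, offset, offset + length));
        byteCounts.add(byteCount);
        rowCounts.add(rowCount);
        return true;
    }

    // Unsigned lexicographic comparison of two byte slices.
    static int compare(byte[] a, int aOff, int aLen,
                       byte[] b, int bOff, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int x = a[aOff + i] & 0xff;
            int y = b[bOff + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return aLen - bLen;
    }

    int size() { return keys.size(); }
    byte[] key(int i) { return keys.get(i); }
    long byteCount(int i) { return byteCounts.get(i); }
    long rowCount(int i) { return rowCounts.get(i); }
}
```

With something along these lines, the call site in the diff could pass kv.getRowArray(), kv.getRowOffset(), and kv.getRowLength() directly, and reset its byte and row counters only when the method returns true.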

> store rowCount and byteCount at guidePost level
> -----------------------------------------------
>
>                 Key: PHOENIX-2683
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2683
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Ankit Singhal
>            Assignee: Ankit Singhal
>            Priority: Minor
>             Fix For: 4.7.0
>
>         Attachments: PHOENIX-2683.patch
>
>
> The GUIDE_POSTS_WIDTH and GUIDE_POSTS_ROW_COUNT should contain the number
> of bytes and number of rows which were traversed since the last guidepost.
> So given some start key and stop key from a scan and knowledge that a given
> column family is used in a query, you should be able to run a query like
> this:
> SELECT SUM(GUIDE_POSTS_WIDTH) bytes_traversed,
>     SUM(GUIDE_POSTS_ROW_COUNT) rows_traversed
> FROM SYSTEM.STATS
> WHERE COLUMN_FAMILY = :1
> AND GUIDE_POST_KEY >= :2
> AND GUIDE_POST_KEY < :3
> where :1 is the column family, :2 is the start row of the scan, and :3 is
> the stop row of the scan. The result of the query should tell you the
> bytes_traversed and the rows_traversed with a granularity of the
> phoenix.stats.guidepost.width config parameter.
> Description is copied from dev mail thread.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
