[jira] [Commented] (PHOENIX-2417) Compress memory used by row key byte[] of guideposts

ASF GitHub Bot (JIRA) Sat, 16 Jan 2016 14:46:14 -0800

    [ https://issues.apache.org/jira/browse/PHOENIX-2417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15103474#comment-15103474 ]


ASF GitHub Bot commented on PHOENIX-2417:
-----------------------------------------

Github user ankitsinghal commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/147#discussion_r49938247
  
    --- Diff: phoenix-protocol/src/main/PTable.proto ---
    @@ -52,11 +52,12 @@ message PColumn {
     
     message PTableStats {
       required bytes key = 1;
    -  repeated bytes values = 2;
    +  optional bytes guidePosts = 2;
    --- End diff --
    
    Yes, that makes sense, since truncating system.stats is necessary. I'll try
    this tomorrow.
    
    Yes, as per your previous comments, I'll keep the "values" field at
    position 2. That will automatically ensure that an empty PGuidePosts is
    returned when a client older than 4.7 is used, right?
    This follows from the code below, from versions older than 4.7
    (PTableImpl.createFromProto()):
    GuidePostsInfo info =
                        new GuidePostsInfo(guidePostsByteCount, value, rowCount); // Prior to 4.7: empty "value" list.
    
    3) I have created a GuidePostsInfoWriter, and you can review the changes in
    this pull request now.
    
    



> Compress memory used by row key byte[] of guideposts
> ----------------------------------------------------
>
>                 Key: PHOENIX-2417
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2417
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: James Taylor
>            Assignee: Ankit Singhal
>             Fix For: 4.7.0
>
>         Attachments: PHOENIX-2417.patch, PHOENIX-2417_encoder.diff, 
> PHOENIX-2417_v2_wip.patch
>
>
> We've found that smaller guideposts are better in terms of minimizing any 
> increase in latency for point scans. However, this significantly increases the 
> amount of memory used when caching the guideposts on the client. Guideposts are 
> equidistant row keys in the form of raw byte[] which are likely to have a 
> large percentage of their leading bytes in common (as they're stored in 
> sorted order). We should use a simple compression technique to mitigate this. 
> I noticed that Apache Parquet has a run length encoding - perhaps we can use 
> that.
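
To illustrate the idea in the description, here is a minimal sketch of
prefix (front) coding for sorted row keys. It is not the encoding used in the
patch, and it is not Parquet's run length encoding; it is a different, simpler
technique chosen only because it shows directly how shared leading bytes can be
stored once. The class name PrefixEncodedKeys and its methods are hypothetical
and do not exist in Phoenix.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: not part of Phoenix or of the PHOENIX-2417 patch.
public class PrefixEncodedKeys {

    // Encodes sorted keys as: count, then per key
    // (length of prefix shared with the previous key, suffix length, suffix bytes).
    public static byte[] encode(List<byte[]> sortedKeys) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, sortedKeys.size());
        byte[] prev = new byte[0];
        for (byte[] key : sortedKeys) {
            int shared = commonPrefixLength(prev, key);
            writeVInt(out, shared);
            writeVInt(out, key.length - shared);
            out.write(key, shared, key.length - shared);
            prev = key;
        }
        return out.toByteArray();
    }

    // Rebuilds the full keys by copying the shared prefix from the previous key.
    public static List<byte[]> decode(byte[] encoded) {
        ByteBuffer in = ByteBuffer.wrap(encoded);
        int count = readVInt(in);
        List<byte[]> keys = new ArrayList<>(count);
        byte[] prev = new byte[0];
        for (int i = 0; i < count; i++) {
            int shared = readVInt(in);
            int suffixLength = readVInt(in);
            // copyOf keeps the first `shared` bytes of prev and zero-fills the rest.
            byte[] key = Arrays.copyOf(prev, shared + suffixLength);
            in.get(key, shared, suffixLength);
            keys.add(key);
            prev = key;
        }
        return keys;
    }

    private static int commonPrefixLength(byte[] a, byte[] b) {
        int max = Math.min(a.length, b.length);
        int i = 0;
        while (i < max && a[i] == b[i]) {
            i++;
        }
        return i;
    }

    // Variable-length int: 7 bits per byte, high bit set when more bytes follow.
    private static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    private static int readVInt(ByteBuffer in) {
        int value = 0;
        int shift = 0;
        byte b;
        do {
            b = in.get();
            value |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return value;
    }
}
```

The saving depends on how long the shared prefixes are relative to the
per-key length fields, so keys with little in common may not shrink at all.
A cache that holds the encoded form also has to decode keys sequentially,
which trades some CPU on lookup for the lower memory footprint.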



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
