[jira] [Commented] (PHOENIX-2417) Compress memory used by row key byte[] of guideposts

ASF GitHub Bot (JIRA) Wed, 20 Jan 2016 07:52:38 -0800

    [ https://issues.apache.org/jira/browse/PHOENIX-2417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108741#comment-15108741 ]


ASF GitHub Bot commented on PHOENIX-2417:
-----------------------------------------

Github user JamesRTaylor commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/147#discussion_r50269285
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/util/UpgradeUtil.java ---
    @@ -1215,4 +1218,67 @@ public static void addRowKeyOrderOptimizableCell(List<Mutation> tableMetadata, b
                     MetaDataEndpointImpl.ROW_KEY_ORDER_OPTIMIZABLE_BYTES, PBoolean.INSTANCE.toBytes(true));
             tableMetadata.add(put);
         }
    +
    +    public static boolean truncateStats(HTableInterface metaTable, HTableInterface statsTable)
    +            throws IOException, InterruptedException {
    +        List<Cell> columnCells = metaTable
    +                .get(new Get(SchemaUtil.getTableKey(null, PhoenixDatabaseMetaData.SYSTEM_SCHEMA_NAME,
    +                        PhoenixDatabaseMetaData.SYSTEM_CATALOG_TABLE)))
    +                .getColumnCells(PhoenixDatabaseMetaData.TABLE_FAMILY_BYTES, QueryConstants.EMPTY_COLUMN_BYTES);
    +        if (!columnCells.isEmpty()
    +                && columnCells.get(0).getTimestamp() < MetaDataProtocol.MIN_SYSTEM_TABLE_TIMESTAMP_4_7_0) {
    --- End diff --
    
    The if check isn't needed, because the client-side code that truncates the 
stats table has been removed; this is now the only place we do it. If stats 
building gets triggered *before* the client-side upgrade code has run (for 
example, through compaction), it will build using the new logic with the new 
schema. That should be fine, because the code uses straight HBase APIs, not 
Phoenix APIs. Since we always send back empty guideposts for the protobuf field 
that old clients will be looking at, those clients will just get empty stats 
until the client-side upgrade code runs.


> Compress memory used by row key byte[] of guideposts
> ----------------------------------------------------
>
>                 Key: PHOENIX-2417
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2417
>             Project: Phoenix
>          Issue Type: Sub-task
>            Reporter: James Taylor
>            Assignee: Ankit Singhal
>             Fix For: 4.7.0
>
>         Attachments: PHOENIX-2417.patch, PHOENIX-2417_encoder.diff, 
> PHOENIX-2417_rebased.patch, PHOENIX-2417_v2_wip.patch, StatsUpgrade_wip.patch
>
>
> We've found that smaller guideposts are better in terms of minimizing any 
> increase in latency for point scans. However, this significantly increases 
> the amount of memory used when caching the guideposts on the client. 
> Guideposts are equidistant row keys in the form of raw byte[] which are 
> likely to have a large percentage of their leading bytes in common (as 
> they're stored in sorted order). We should use a simple compression 
> technique to mitigate this. I noticed that Apache Parquet has a run length 
> encoding - perhaps we can use that.
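
To illustrate why sorted row keys with shared leading bytes compress well, here
is a minimal sketch of prefix (front) coding in Java. It is an assumption made
for illustration only: it is not the encoder from the attached patches, and it
uses prefix coding rather than the run length encoding mentioned above. The
names PrefixEncodedKeys, Entry, encode and decode are hypothetical. Each key is
stored as the number of leading bytes it shares with the previous key plus the
remaining suffix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Phoenix: prefix-codes a sorted list of row keys.
final class PrefixEncodedKeys {

    // One encoded key: count of leading bytes shared with the previous key, plus the rest.
    static final class Entry {
        final int sharedPrefixLength;
        final byte[] suffix;

        Entry(int sharedPrefixLength, byte[] suffix) {
            this.sharedPrefixLength = sharedPrefixLength;
            this.suffix = suffix;
        }
    }

    // Encodes keys in order; the first key shares nothing and is stored whole.
    static List<Entry> encode(List<byte[]> sortedKeys) {
        List<Entry> out = new ArrayList<>(sortedKeys.size());
        byte[] prev = new byte[0];
        for (byte[] key : sortedKeys) {
            int max = Math.min(prev.length, key.length);
            int shared = 0;
            while (shared < max && prev[shared] == key[shared]) {
                shared++;
            }
            out.add(new Entry(shared, Arrays.copyOfRange(key, shared, key.length)));
            prev = key;
        }
        return out;
    }

    // Rebuilds the original keys by copying the shared prefix from the previous key.
    static List<byte[]> decode(List<Entry> entries) {
        List<byte[]> out = new ArrayList<>(entries.size());
        byte[] prev = new byte[0];
        for (Entry e : entries) {
            byte[] key = new byte[e.sharedPrefixLength + e.suffix.length];
            System.arraycopy(prev, 0, key, 0, e.sharedPrefixLength);
            System.arraycopy(e.suffix, 0, key, e.sharedPrefixLength, e.suffix.length);
            out.add(key);
            prev = key;
        }
        return out;
    }
}
```

With keys such as "user0001" and "user0002", the second entry would store only
a shared length and a one-byte suffix. The trade-off in this sketch is that
decoding a key requires walking forward from the start of the list.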



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
