[ 
https://issues.apache.org/jira/browse/PHOENIX-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15830800#comment-15830800
 ] 

James Taylor commented on PHOENIX-3583:
---------------------------------------

Thanks for the explanation, [~elserj]. Now I understand what you're getting at. 
There's a small bit of code that decides whether to tack on the IndexMaintainer 
to the mutations themselves (as an attribute) or make a separate, single RPC 
per region server to cache them for usage when the mutations are processed:
{code}
    public static boolean useIndexMetadataCache(PhoenixConnection connection,
            List<? extends Mutation> mutations, int indexMetaDataByteLength) {
        ReadOnlyProps props = connection.getQueryServices().getProps();
        int threshold = props.getInt(INDEX_MUTATE_BATCH_SIZE_THRESHOLD_ATTRIB,
                QueryServicesOptions.DEFAULT_INDEX_MUTATE_BATCH_SIZE_THRESHOLD);
        return (indexMetaDataByteLength > ServerCacheClient.UUID_LENGTH &&
                mutations.size() > threshold);
    }
{code}
So the value of INDEX_MUTATE_BATCH_SIZE_THRESHOLD_ATTRIB determines the number 
of rows above which a separate RPC is made. The default is only 3 rows. Perhaps 
we should bump that up substantially if the RPCs are becoming a bottleneck? It 
would have the effect of making the payload larger (by numRowsInBatchToRS * 
sizeofIndexMaintainer). Unfortunately, there's no mechanism in HBase to attach 
an attribute once to the RPC to the RS, as opposed to repeating it on every 
mutation (HBASE-9291).
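
To make the trade-off concrete, here is a rough sketch of the decision and the 
resulting metadata bytes per batch. It is a simplified model, not Phoenix 
code: the class name, the method metadataBytesSent, and the constant 
ASSUMED_UUID_LENGTH (a stand-in for ServerCacheClient.UUID_LENGTH, assumed 
here to be 8 bytes) are all hypothetical, and it assumes each mutation carries 
only the cache id when the separate caching RPC is used.

```java
/**
 * Simplified, hypothetical model of the attribute-vs-cache trade-off.
 * It does not call Phoenix or HBase and ignores RPC framing overhead.
 */
final class IndexMetadataSketch {

    // Assumption: stand-in for ServerCacheClient.UUID_LENGTH.
    static final int ASSUMED_UUID_LENGTH = 8;

    /** Same comparison as the snippet above, with plain ints as inputs. */
    static boolean useIndexMetadataCache(int indexMetaDataByteLength,
            int mutationCount, int threshold) {
        return indexMetaDataByteLength > ASSUMED_UUID_LENGTH
                && mutationCount > threshold;
    }

    /** Approximate index metadata bytes sent to one region server per batch. */
    static long metadataBytesSent(int indexMetaDataByteLength,
            int mutationCount, int threshold) {
        if (useIndexMetadataCache(indexMetaDataByteLength, mutationCount, threshold)) {
            // One extra RPC carries the metadata once; each mutation then
            // carries only the cache id (assumed ASSUMED_UUID_LENGTH bytes).
            return (long) indexMetaDataByteLength
                    + (long) mutationCount * ASSUMED_UUID_LENGTH;
        }
        // No extra RPC: the metadata is repeated on every mutation, i.e.
        // numRowsInBatchToRS * sizeofIndexMaintainer.
        return (long) mutationCount * indexMetaDataByteLength;
    }
}
```

With a 500-byte IndexMaintainer and 100 rows in this model, a threshold of 3 
costs one extra RPC and about 1,300 bytes of metadata, while a threshold of 
1000 avoids the RPC but sends about 50,000 bytes.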

> Prepare IndexMaintainer on server itself
> ----------------------------------------
>
>                 Key: PHOENIX-3583
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3583
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Ankit Singhal
>            Assignee: Ankit Singhal
>         Attachments: PHOENIX-3583.patch
>
>
> -- reuse the cache of PTable and its lifecycle.
> -- With the new implementation, we will be doing an RPC to the meta table per 
> mini batch, which could be an overhead, but the same configuration 
> "updateCacheFrequency" can be used to control the frequency of touching the 
> SYSTEM.CATALOG endpoint for an updated PTable or index maintainers. 
> -- It is expected that 99% of the time the table is old and the RPC will 
> return an empty result (so it may be less costly), as opposed to the 
> current implementation where we have to send the index maintainer payload to 
> each region server per upsert batch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
