[ https://issues.apache.org/jira/browse/HBASE-9291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748396#comment-13748396 ]
Anoop Sam John commented on HBASE-9291:
---------------------------------------
bq.especially with Hash Joins, the information will potentially be huge
Yes, something like a call made once per RS, with the data cached there, will
be very much required. The smaller table's data, which we need to pass to and
store at the RS, can be really big.
> Enable client to setAttribute that is sent once to each region server
> ---------------------------------------------------------------------
>
> Key: HBASE-9291
> URL: https://issues.apache.org/jira/browse/HBASE-9291
> Project: HBase
> Issue Type: New Feature
> Components: IPC/RPC
> Reporter: James Taylor
>
> Currently a Scan and Mutation allow the client to set its own attributes that
> get passed through the RPC layer and are accessible from a coprocessor. This
> is very handy, but breaks down if the amount of information is large, since
> this information ends up being sent again and again to every region. Clients
> can work around this with an endpoint "pre" and "post" coprocessor invocation
> that:
> 1) sends the information and caches it on the region server in the "pre"
> invocation
> 2) invokes the Scan or sends the batch of Mutations, and then
> 3) removes it in the "post" invocation.
> In this case, the client is forced to identify all region servers (ideally,
> all region servers that will be involved in the Scan/Mutation), make extra
> RPC calls, manage the caching of the information on the region server,
> age out the information (in case the client dies before step (3) that clears
> the cached information), and deal with the possibility of a split
> occurring while this operation is in progress.
> Instead, it'd be much better if an attribute could be identified as a "region
> server" attribute in OperationWithAttributes and the HBase RPC layer would
> take care of doing the above.
> The use cases in Phoenix where the above is necessary include:
> 1) Hash joins, where the results of the smaller side of a join scan are
> packaged up and sent to each region server, and
> 2) Secondary indexing, where metadata describing a) which column
> family/column qualifier pairs and b) which parts of the row key contribute to
> which indexes is sent to each region server that will process a batched put.
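
The client-side workaround described in the issue (cache the payload in the
"pre" invocation, look it up during the Scan or Mutation, remove it in the
"post" invocation, and age it out if the client dies) could be sketched roughly
as below. This is only an illustration under stated assumptions: the class
ServerAttributeCache and all of its methods are hypothetical names, not part of
HBase or Phoenix, and the sketch does not address region splits or how the
endpoint coprocessor would be wired to it. The clock is passed in as a
parameter so that the expiry behaviour is easy to reason about.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical region-server-side cache for large client payloads.
 * Not an HBase API; a sketch of the pre/post workaround from the issue.
 */
final class ServerAttributeCache {

    /** A cached payload plus the time at which it may be discarded. */
    private static final class Entry {
        final byte[] value;
        final long expiresAtMillis;

        Entry(byte[] value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;

    ServerAttributeCache(long ttlMillis) {
        if (ttlMillis <= 0) {
            throw new IllegalArgumentException("ttlMillis must be positive");
        }
        this.ttlMillis = ttlMillis;
    }

    /** Step 1 ("pre"): store the payload once under a client-chosen id. */
    void put(String id, byte[] value, long nowMillis) {
        entries.put(id, new Entry(value.clone(), nowMillis + ttlMillis));
    }

    /**
     * Step 2: look up the payload by the small id that the Scan or Mutation
     * would carry as an ordinary attribute. Returns null if absent or expired.
     */
    byte[] get(String id, long nowMillis) {
        Entry e = entries.get(id);
        if (e == null) {
            return null;
        }
        if (nowMillis >= e.expiresAtMillis) {
            entries.remove(id, e); // drop the stale entry on access
            return null;
        }
        return e.value.clone();
    }

    /** Step 3 ("post"): explicit cleanup; returns true if an entry was removed. */
    boolean remove(String id) {
        return entries.remove(id) != null;
    }

    /** Age-out for clients that died before step 3; returns the number evicted. */
    int evictExpired(long nowMillis) {
        int evicted = 0;
        for (Map.Entry<String, Entry> me : entries.entrySet()) {
            Entry e = me.getValue();
            if (nowMillis >= e.expiresAtMillis && entries.remove(me.getKey(), e)) {
                evicted++;
            }
        }
        return evicted;
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        ServerAttributeCache cache = new ServerAttributeCache(60_000L);
        long now = System.currentTimeMillis();
        cache.put("hash-join-1", new byte[] {1, 2, 3}, now);       // "pre"
        System.out.println(cache.get("hash-join-1", now) != null); // during the scan
        System.out.println(cache.remove("hash-join-1"));           // "post"
    }
}
```

With a cache like this, each Scan or Mutation would only need to carry the
short id as its attribute, while the large payload travels once per region
server. The bookkeeping in this sketch (choosing ids, the extra RPCs, expiry)
is what the issue proposes moving into the HBase RPC layer.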