[GitHub] [hbase] virajjasani commented on a diff in pull request #4924: HBASE-27529 Attach WAL extended attributes to mutations at replication sink

GitBox Mon, 12 Dec 2022 18:28:43 -0800


virajjasani commented on code in PR #4924:
URL: https://github.com/apache/hbase/pull/4924#discussion_r1046599163



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java:
##########
@@ -265,6 +266,11 @@ public void replicateEntries(List<WALEntry> entries, final CellScanner cells,
               mutation.setClusterIds(clusterIds);
               mutation.setAttribute(ReplicationUtils.REPLICATION_ATTR_NAME,
                 HConstants.EMPTY_BYTE_ARRAY);
+              if (attributeList != null) {

Review Comment:
   > And if phoenix wants this feature, please describe more about at least one usage. Let's see how to better support it.
   
   Let me take a simple example. Phoenix supports multi-tenancy through 
tenant-specific connections, so the same table can be shared by multiple 
tenants. This requires the tenant id to be a rowkey prefix in the HBase rowkey. 
When a particular tenant opens a tenant connection and writes data, that data 
is not visible to other tenants.
   The tenant id is a very basic entity in Phoenix. Now let's say we want to 
introduce some level of caching on the Phoenix server side; the cache would 
contain the tenant id.
   
   On the HBase sink cluster, the mutation is already created for a specific 
table because the table name is available in the WALKey. For Phoenix, even with 
the table name and the full rowkey available on the mutation, it is very 
difficult to derive the tenant id from that rowkey. Hence, the Phoenix coproc 
needs the tenant id as an attribute on the mutation so that it can form a 
tenant-level connection, derive further attributes at the sink side, and so on.
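
   To make the idea concrete, here is a minimal sketch of the sink-side flow. 
It uses plain-Java stand-ins rather than real HBase or Phoenix classes: 
`SketchMutation`, `attachWalAttributes`, `tenantIdOf` and the attribute name 
`_TenantId` are all hypothetical, and only the `setAttribute`/`getAttribute` 
naming is borrowed from HBase mutations.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Plain-Java stand-ins; none of these are real HBase or Phoenix classes. */
final class SinkAttributeSketch {

  /** Hypothetical attribute name; a real coproc would define its own. */
  static final String TENANT_ID_ATTR = "_TenantId";

  /** Stand-in for a mutation: a rowkey plus named byte[] attributes. */
  static final class SketchMutation {
    private final byte[] row;
    private final Map<String, byte[]> attributes = new HashMap<>();

    SketchMutation(byte[] row) {
      this.row = row.clone();
    }

    byte[] getRow() {
      return row.clone();
    }

    void setAttribute(String name, byte[] value) {
      attributes.put(name, value);
    }

    byte[] getAttribute(String name) {
      return attributes.get(name);
    }
  }

  /** Sink side: copy the WAL extended attributes (possibly null) onto the mutation. */
  static void attachWalAttributes(SketchMutation mutation, Map<String, byte[]> walAttributes) {
    if (walAttributes == null) {
      return; // nothing was carried in the WAL entry
    }
    walAttributes.forEach(mutation::setAttribute);
  }

  /** Sink-side coproc: read the tenant id from the attribute instead of parsing the rowkey. */
  static Optional<String> tenantIdOf(SketchMutation mutation) {
    byte[] raw = mutation.getAttribute(TENANT_ID_ATTR);
    return raw == null
        ? Optional.empty()
        : Optional.of(new String(raw, StandardCharsets.UTF_8));
  }
}
```

   With this shape the coproc never has to interpret the rowkey layout: if the 
attribute was replicated it gets the tenant id directly, and if not it sees an 
empty `Optional` and can fall back to whatever it does today.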
   
   In addition to the tenant id, a few more important pieces of metadata are 
required by the sink-side coproc; without them it cannot know how the data is 
meant to be used.
   
   > How do you read these attributes at the source cluster's coproc?
   
   Taking the same example as above, the multi-tenant connection is already 
created by the client, so the tenant id is always available to every operation 
performed in the source cluster's coproc. Besides, some of the important 
attributes are already attached as mutation attributes on the source side. But 
when the mutations are generated on the sink side, HBase metadata is available 
while coproc-specific metadata (such as Phoenix's) is not. It would be great if 
we could attach these metadata attributes to the sink mutation; otherwise the 
sink coproc would not be able to derive them.
   Making this a configurable feature with proper documentation is also fine, 
if you don't mind. WDYT?
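
   If we go the configurable route, the gate could be as small as the sketch 
below. This is only an illustration: the key 
`example.replication.sink.attach.wal.attributes` is a made-up placeholder, not 
an existing HBase setting, and `java.util.Properties` stands in for the real 
configuration object. The default is off, so existing sinks would behave as 
before.

```java
import java.util.Properties;

/** Illustrative only; the key below is a placeholder, not a real HBase config. */
final class SinkAttributeToggleSketch {

  /** Hypothetical configuration key for the opt-in behaviour. */
  static final String ATTACH_WAL_ATTRS_KEY = "example.replication.sink.attach.wal.attributes";

  /** Returns true only when the operator has explicitly enabled the feature. */
  static boolean shouldAttachWalAttributes(Properties conf) {
    return Boolean.parseBoolean(conf.getProperty(ATTACH_WAL_ATTRS_KEY, "false"));
  }
}
```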



