The best way, IMO, would be to pass this information as attributes on the Put/Delete (any Mutation) and have the CP process it, convert it to Tags, and attach them to the Cells before they are written to the region. This is how cell-level ACLs, TTL, etc. work. I agree with Andy on keeping tags unexposed to clients.
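
To make the flow concrete, here is a rough sketch in plain Java. It is only a model, not HBase code: the Tag, Cell, and Mutation classes, the attribute name "mutation.source", and the tag type 100 are all hypothetical stand-ins I made up for illustration. In real HBase the client side would call Mutation.setAttribute(String, byte[]), and the conversion step would live in a coprocessor hook on the region server.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the attribute-to-tag flow. None of these types are
// HBase classes; they are hypothetical stand-ins so the example runs alone.
final class AttributeToTagSketch {

    static final String SOURCE_ATTR = "mutation.source"; // hypothetical attribute name
    static final byte SOURCE_TAG_TYPE = (byte) 100;      // hypothetical custom tag type

    // Stand-in for a cell tag: a type byte plus an opaque value.
    static final class Tag {
        final byte type;
        final byte[] value;
        Tag(byte type, byte[] value) { this.type = type; this.value = value; }
    }

    // Stand-in for a cell; only the row and its tags matter here.
    static final class Cell {
        final String row;
        final List<Tag> tags = new ArrayList<>();
        Cell(String row) { this.row = row; }
    }

    // Stand-in for a Put/Delete: cells plus client-supplied attributes.
    static final class Mutation {
        final List<Cell> cells = new ArrayList<>();
        final Map<String, byte[]> attributes = new HashMap<>();

        Mutation addCell(Cell cell) { cells.add(cell); return this; }

        Mutation setAttribute(String name, byte[] value) {
            attributes.put(name, value);
            return this;
        }
    }

    // Stand-in for the coprocessor step: read the attribute and attach it
    // as a tag to every cell of the mutation before the write.
    static void attachSourceTag(Mutation mutation) {
        byte[] source = mutation.attributes.get(SOURCE_ATTR);
        if (source == null) {
            return; // client sent no annotation; leave the cells untouched
        }
        for (Cell cell : mutation.cells) {
            cell.tags.add(new Tag(SOURCE_TAG_TYPE, source.clone()));
        }
    }

    public static void main(String[] args) {
        // Client side: annotate the mutation, never the cells.
        Mutation delete = new Mutation()
            .addCell(new Cell("row-1"))
            .setAttribute(SOURCE_ATTR, "bulk-delete-job".getBytes(StandardCharsets.UTF_8));

        // Server side: the hook turns the attribute into a tag.
        attachSourceTag(delete);

        for (Cell cell : delete.cells) {
            for (Tag tag : cell.tags) {
                System.out.println(cell.row + " tag type=" + tag.type
                    + " value=" + new String(tag.value, StandardCharsets.UTF_8));
            }
        }
    }
}
```

The point of the split is that the client only ever touches attributes, so tags stay a server-side detail and the same mechanism works for Puts as well as Deletes.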

Anoop

On Mon, Oct 19, 2020 at 11:45 PM Andrew Purtell <[email protected]> wrote:

> Thanks for the clarification.
>
> My opinion about client use of cell tags remains unchanged. Also, the offer
> to assist with any coprocessor side API issues with using tags on the
> server side.
>
> On Mon, Oct 19, 2020 at 11:07 AM Rushabh Shah <[email protected]> wrote:
>
>> Thank you Andrew and Geoffrey for the comments.
>>
>>> Why not add values to Deletes instead. Values (in the key value sense)
>>> are ignored for Deletes, so could serve as annotations as you like.
>>> Nobody has proposed changing value semantics on Puts or whatever.
>>
>> I think I didn't do a good job of explaining the use case and focused my
>> discussion/question on Deletes _only_.
>> The current requirement is to add tags to Deletes, but we want a solution
>> which can be extended to Puts also, since the annotation we want to add
>> (i.e. the source of the operation) is not limited to Deletes.
>> Here are the use cases we considered for adding tags to put mutations,
>> other than adding source-of-mutation information:
>> 1. Identifying whether the put came from the primary cluster or the
>> replicated cluster, so that we can make the backup tool smarter and not
>> back up the same put twice in the source and replicated clusters.
>> 2. We have a multi-tenancy concept in Phoenix. We want to track whether
>> the upsert (put operation in HBase) came from a Global or a Tenant
>> connection.
>>
>>> There should be no limitations on tag use to coprocessors. If there are
>>> API issues in that regard we can certainly improve the situation.
>>
>> I am writing a POC for this. Thank you for the suggestion, Andrew!
>>
>> Rushabh Shah
>>
>> On Mon, Oct 19, 2020 at 10:09 AM Andrew Purtell <[email protected]> wrote:
>>
>>> Just to be clear I think we are talking past each other somehow. You ask
>>> to add tags to Deletes. Why not add values to Deletes instead. Values (in
>>> the key value sense) are ignored for Deletes, so could serve as
>>> annotations as you like. Nobody has proposed changing value semantics on
>>> Puts or whatever.
>>>
>>> On Mon, Oct 19, 2020 at 10:07 AM Andrew Purtell <[email protected]> wrote:
>>>
>>>> Because tags are meant to be a server side internal feature. There is no
>>>> strong technical rationale to change here because values in Deletes can
>>>> serve just as well as tags. Unless there is something I am missing. If
>>>> there were it could be reasonable to reconsider. In the absence of an
>>>> actual need, it is not.
>>>>
>>>> On Mon, Oct 19, 2020 at 9:34 AM Geoffrey Jacoby <[email protected]> wrote:
>>>>
>>>>> I completely understand why HBase wouldn't want to expose tags that it
>>>>> uses for internal security purposes, like ACLs or visibility, to
>>>>> clients. However, making _all_ tags be off-limits seems to me to limit
>>>>> quite a few useful features.
>>>>>
>>>>> Overloading the delete marker's value solves one particular problem,
>>>>> but not the general case, because it can't be extended to Puts, which
>>>>> already use their value field for real data. The motivating example in
>>>>> HBASE-25118 is distinguishing a bulk delete from customer operations.
>>>>> But there are times we may want to distinguish an ETL or bulk write
>>>>> from customer operations.
>>>>>
>>>>> Let's say I have a batch job that does an ETL into a cluster at the
>>>>> same time the cluster is taking other writes. I want to be really sure
>>>>> that all my data got loaded properly, so I generate a checksum from the
>>>>> ETL dataset before I load it. After the ETL, I want to generate a
>>>>> checksum for the loaded data on the cluster and compare. So I need to
>>>>> write a Filter that distinguishes the loaded data from any other
>>>>> operations going on at the same time. (Let's assume I'm scanning raw
>>>>> and have major compaction disabled so nothing gets purged, and there's
>>>>> nothing distinguishing about the data itself)
>>>>>
>>>>> The simplest way to do this would be to have a (hopefully tiny)
>>>>> Cell-level annotation that identifies that it originally came from my
>>>>> ETL. That's exactly what the Tag array field would provide. Now, I
>>>>> could hack something into the Put value and change all my applications
>>>>> to ignore part of the value array, but that assumes that I have full
>>>>> control over the value's format (not true if I'm using, say, Phoenix).
>>>>> And like using the Delete value, that's just hacking my own proprietary
>>>>> "Tag" capability into HBase when a real one already exists.
>>>>>
>>>>> So I'm curious why, so long as HBase internal tags continue to be
>>>>> suppressed, is the Tag capability a bad thing to expose?
>>>>>
>>>>> Geoffrey
>>>>>
>>>>> On Fri, Oct 16, 2020 at 12:58 PM Andrew Purtell <[email protected]> wrote:
>>>>>
>>>>>> I responded on the JIRA.
>>>>>>
>>>>>> You would be far better served adapting values for your proposal
>>>>>> instead of tags. Tags are not a client side feature. Tags were and are
>>>>>> designed for server side use only, and are stripped from client
>>>>>> inbound and outbound RPCs.
>>>>>>
>>>>>> On Wed, Oct 14, 2020 at 9:40 AM Rushabh Shah <[email protected]> wrote:
>>>>>>
>>>>>>> Thank you Ram for your response!
>>>>>>>
>>>>>>>> For your case, is there a possibility to have your new feature as a
>>>>>>>> first class feature using Tags? Just asking?
>>>>>>>
>>>>>>> Could you elaborate on what you mean by first class feature?
>>>>>>>
>>>>>>> Rushabh Shah
>>>>>>> Software Engineering SMTS | Salesforce
>>>>>>>
>>>>>>> On Wed, Oct 14, 2020 at 9:35 AM ramkrishna vasudevan <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Rushabh
>>>>>>>>
>>>>>>>> If I remember correctly, the decision was not to expose tags for
>>>>>>>> clients directly. All the tags were used as internal to the cell
>>>>>>>> formation at the server side (e.g. ACL and Visibility labels).
>>>>>>>>
>>>>>>>> For your case, is there a possibility to have your new feature as a
>>>>>>>> first class feature using Tags? Just asking?
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Ram
>>>>>>>>
>>>>>>>> On Wed, Oct 14, 2020 at 8:17 PM Rushabh Shah <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi Everyone,
>>>>>>>>> I want to understand how to use the HBase Cell Tags feature. We
>>>>>>>>> have a use case to identify the source of deletes (not the same as
>>>>>>>>> the authenticated kerberos user). I have added more details about
>>>>>>>>> my use case in HBASE-25118
>>>>>>>>> <https://urldefense.com/v3/__https://issues.apache.org/jira/browse/HBASE-25118__;!!DCbAVzZNrAf4!V2iIQazj8qFPfcWmVLWpBOzzwGBvzI10YK12zylzivOx5CtFKzLg4GspEwIHxtiKewm5$>.
>>>>>>>>> At my day job we use Phoenix to interact with HBase and we are
>>>>>>>>> passing this information via Phoenix ConnectionProperties. We are
>>>>>>>>> exploring the Cell Tags feature to add this metadata to HBase Cells
>>>>>>>>> (only to Delete Markers as of now).
>>>>>>>>>
>>>>>>>>> Via HBASE-18995
>>>>>>>>> <https://urldefense.com/v3/__https://issues.apache.org/jira/browse/HBASE-18995__;!!DCbAVzZNrAf4!V2iIQazj8qFPfcWmVLWpBOzzwGBvzI10YK12zylzivOx5CtFKzLg4GspEwIHxh-WrVaS$>,
>>>>>>>>> we have moved all the createCell methods which use Tag(s) as an
>>>>>>>>> argument to the PrivateCellUtil class and made the
>>>>>>>>> InterfaceAudience of that class Private. I saw some discussion on
>>>>>>>>> that jira
>>>>>>>>> <https://urldefense.com/v3/__https://issues.apache.org/jira/browse/HBASE-18995?focusedCommentId=16219960&page=com.atlassian.jira.plugin.system.issuetabpanels*3Acomment-tabpanel*comment-16219960__;JSM!!DCbAVzZNrAf4!V2iIQazj8qFPfcWmVLWpBOzzwGBvzI10YK12zylzivOx5CtFKzLg4GspEwIHxgxlIPkT$>
>>>>>>>>> to expose some methods as LimitedPrivate accessible to CP, but it
>>>>>>>>> was decided to do that later. We only expose CellBuilderFactory
>>>>>>>>> <https://urldefense.com/v3/__https://github.com/apache/hbase/blob/master/hbase-common/src/main/java/org/apache/hadoop/hbase/CellBuilderFactory.java__;!!DCbAVzZNrAf4!V2iIQazj8qFPfcWmVLWpBOzzwGBvzI10YK12zylzivOx5CtFKzLg4GspEwIHxmk_kjqy$>
>>>>>>>>> which returns an instance of CellBuilder
>>>>>>>>> <https://urldefense.com/v3/__https://github.com/apache/hbase/blob/master/hbase-common/src/main/java/org/apache/hadoop/hbase/CellBuilder.java__;!!DCbAVzZNrAf4!V2iIQazj8qFPfcWmVLWpBOzzwGBvzI10YK12zylzivOx5CtFKzLg4GspEwIHxjaO1J6B$>
>>>>>>>>> which doesn't have a setTags method. Also the code is vastly
>>>>>>>>> different in branch-1.
>>>>>>>>>
>>>>>>>>> Could someone please educate me on how to populate tags from the
>>>>>>>>> client side (i.e. Phoenix) while creating a Delete object?
>>>>>>>>> Thank you!
>>>>>>
>>>>>> --
>>>>>> Best regards,
>>>>>> Andrew
>>>>>>
>>>>>> Words like orphans lost among the crosstalk, meaning torn from truth's
>>>>>> decrepit hands
>>>>>> - A23, Crosstalk
