Re: Help: setting hbase row timestamp in phoenix upserts ?

Pedro Boado Wed, 11 Jul 2018 07:57:22 -0700

I hadn't seen this Jira. Yes that is essentially it.

On Wed, 11 Jul 2018, 15:49 James Taylor, <[email protected]> wrote:


> I think the answer is PHOENIX-4552. There's an outline of the work involved
> on the JIRA. I think passing through data like that for hints would get
> unwieldy quickly.
>
> On Tue, Jul 10, 2018 at 1:31 PM, Pedro Boado <[email protected]>
> wrote:
>
> > Hi guys, just a refloat from the @user list.
> >
> > Would it be of interest to have this functionality for defining HBase
> > timestamps on a per-row basis as part of an UPSERT VALUES?
> >
> > For a table defined as
> > CREATE TABLE T0001 ( k VARCHAR PRIMARY KEY, v INTEGER)
> >
> > Allow a hint to extract and override the HBase put timestamp through a
> > "virtual" column?
> > UPSERT /*+ ROW_TIMESTAMP(ts) */ INTO T0001(k,v,ts) VALUES
> > ('a',1, 1531253959043)
> >
> > If the column existed and had an appropriate type, it would also be
> > populated with the same value.
> >
> > Thanks,
> > Pedro.
> >
> >
> > On Fri, 1 Dec 2017 at 07:15, James Taylor <[email protected]> wrote:
> >
> > > The only way I can think of accomplishing this is by using the raw HBase
> > > APIs to write the data but using our utilities to write it in a
> > > Phoenix-compatible manner. For example, you could run an UPSERT VALUES
> > > statement, use the PhoenixRuntime.getUncommittedDataIterator() method to
> > > get the Cells that would have been written, update the Cell timestamp as
> > > needed, and do an htable.batch() call to commit them.
> > >
> > > On Wed, Nov 29, 2017 at 11:46 AM Pedro Boado <[email protected]>
> > > wrote:
> > >
> > >> Hi,
> > >>
> > >> I'm looking for a little bit of help to shed some light on
> > >> ROW_TIMESTAMP.
> > >>
> > >> Some background on the problem (simplified): I'm working on a project
> > >> that needs to create an "enriched" replica of an RDBMS table based on
> > >> a stream of CDC changes off that table.
> > >>
> > >> Each CDC event contains the timestamp of the change plus all the column
> > >> values 'before' and 'after' the change, and each event is pushed to a
> > >> Kafka topic. Because of certain "non-negotiable" design decisions, Kafka
> > >> guarantees delivering each event at least once, but doesn't guarantee
> > >> ordering for changes to the same row in the source table.
> > >>
> > >> The final step of the kafka-based flow is sinking the information into
> > >> HBase/Phoenix.
> > >>
> > >> As I cannot get an in-order delivery guarantee from Kafka, I need to use
> > >> the CDC event timestamp to ensure that HBase keeps the latest change to
> > >> a row.
> > >>
> > >> This fits perfectly well with an HBase table design with VERSIONS=1 and
> > >> using the source event timestamp as the HBase row/cell timestamp.
> > >>
> > >> The thing is that I cannot find a way to define the timestamp of the
> > >> HBase cell from a Phoenix upsert.
> > >>
> > >> I came across the ROW_TIMESTAMP functionality, but I've just found (I'm
> > >> devastated now) that ROW_TIMESTAMP columns store the date in both
> > >> HBase's cell timestamp and the primary key, meaning that I cannot
> > >> leverage that functionality to keep only the latest change.
> > >>
> > >> Is there a way of defining HBase's row timestamp when doing the UPSERT,
> > >> even by setting it through some obscure hidden JDBC property?
> > >>
> > >> I want to avoid by all means doing a checkAndPut, as the volume of
> > >> changes is going to be quite big.
> > >>
> > >>
> > >>
> > >> --
> > >> Un saludo.
> > >> Pedro Boado.
> > >>
> > >
> >
> > --
> > Un saludo.
> > Pedro Boado.
> >
>
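
The ordering requirement in the original question (VERSIONS=1 plus the source event timestamp used as the cell timestamp) can be sketched with a toy in-memory model. This is not HBase or Phoenix code: the class LatestByTimestamp and its methods are hypothetical names chosen for illustration, and the model only mimics the rule that, with a single retained version, the write carrying the highest timestamp is the one that stays visible.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy single-version store; an illustrative assumption, not the HBase API. */
class LatestByTimestamp {

    /** A stored value together with the timestamp it was written with. */
    static final class Versioned {
        final long ts;
        final int value;

        Versioned(long ts, int value) {
            this.ts = ts;
            this.value = value;
        }
    }

    private final Map<String, Versioned> rows = new HashMap<>();

    /** Keeps the incoming write only if its timestamp is not older than the stored one. */
    void put(String key, int value, long ts) {
        rows.merge(key, new Versioned(ts, value),
                (old, incoming) -> incoming.ts >= old.ts ? incoming : old);
    }

    /** Returns the visible value for the key, or null if the key was never written. */
    Integer get(String key) {
        Versioned v = rows.get(key);
        return v == null ? null : v.value;
    }

    public static void main(String[] args) {
        LatestByTimestamp table = new LatestByTimestamp();
        table.put("a", 2, 1531253959043L); // newer change arrives first
        table.put("a", 1, 1531253950000L); // older change arrives late
        table.put("a", 2, 1531253959043L); // duplicate (at-least-once) delivery
        System.out.println(table.get("a")); // prints 2, the value of the newest event
    }
}
```

Because the stored value depends only on the timestamps and not on arrival order, out-of-order and duplicated events converge on the same row state without a read-before-write such as checkAndPut.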
