We have a scenario where we want to consume only delta updates from Hive
tables.
- Multiple producers are updating data in Hive table
- Multiple consumer reading data from the Hive table

Consumption pattern:
- Get all data that has been updated since last time I read.

Is there any mechanism in Hive 3.0 which can enable above consumption
pattern?

I see there is a construct of row__id(writeid, bucketid, rowid) in ACID
tables.
- Can row__id be used in this scenario?
- How is the "writeid" generated?
- Is there some meta information which captures the time when the rows were
actually visible for read?

Reply via email to