Delta: We are trying to keep two different databases in sync. In real time we insert data into both databases, which use totally different formats. At night we run a batch job that cross-checks them: if db2 (which is actually HBase) is missing a row or two, we insert it.
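
The nightly cross-check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: plain Python dicts stand in for the two stores (it does not use the HBase client API), rows are assumed to share a common row key across both databases, and the names `find_missing_keys`, `reconcile`, and `transform` are hypothetical.

```python
def find_missing_keys(source_rows, target_rows):
    """Return row keys present in the source store but absent from the target, sorted."""
    return sorted(k for k in source_rows if k not in target_rows)


def reconcile(source_rows, target_rows, transform=lambda row: row):
    """Insert every row missing from the target, converting it to the target format.

    Returns the list of row keys that were inserted.
    """
    missing = find_missing_keys(source_rows, target_rows)
    for key in missing:
        target_rows[key] = transform(source_rows[key])
    return missing


# Hypothetical sample data: db1 is the primary store, db2 stands in for the HBase table.
db1 = {"user1": {"name": "Ann"}, "user2": {"name": "Bob"}, "user3": {"name": "Cy"}}
db2 = {"user1": {"info:name": "Ann"}}

# The transform models the format difference (here: prefixing a column family).
inserted = reconcile(
    db1, db2, transform=lambda row: {"info:" + col: val for col, val in row.items()}
)
print(inserted)
```

In a real job the two dicts would be replaced by scans over each store, and the comparison would likely be done in sorted key order so that neither side has to fit in memory.
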
Data Matching: We need to do user verification, i.e. when a new user is inserted, we check his demographics and, based on that, conclude whether the user already exists.

-Jignesh

On Thu, Mar 21, 2013 at 12:20 PM, Andrew Purtell <[email protected]> wrote:

> I think you may need to provide just a bit more information about your
> use case. Could you define a bit more 'delta' and 'data matching'?
>
> In a sense, every bulk load is a delta: updates for insert into a
> larger table, representing a set of changes as a batch.
>
> We could consider the existing HBase mechanisms for handling
> multiversioning to be a simple "data matching functionality" via
> simple existence testing by coordinate, although I know that is not
> what you mean (but I don't know what you mean precisely).
>
> * - coordinate: { row, column, qualifier, timestamp }
>
> On 3/21/13, Jignesh Patel <[email protected]> wrote:
> > We have a requirement to support data matching while loading deltas to
> > HBase.
> > I see there is a utility to support bulk loading.
> > http://hbase.apache.org/book/arch.bulk.load.html
> >
> > But is there any way to support daily delta loading?
> > Is there any open sourced MDM software which can be integrated with
> > HBase?
> >
> > Does Hbase has any data matching functionality?
> >
> > -Jignesh
