Hi Dean,

There's no good "event log" capability right now, though a JIRA was
recently filed to discuss this.

One thing you could do is run periodic MR jobs that scan with a
timestamp range to find any edits made in the last hour, then update
the RDBMS directly from the map tasks of that job. It wouldn't be
real-time replication, but if stale reads from the RDBMS are fine, it
could probably work for you.
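
A rough sketch of how the scan windows for those periodic jobs could be
chained, in plain Java with no HBase dependency. The class name
ScanWindow and the safetyLagMillis parameter are made up for
illustration. The assumption is that each run hands its bounds to
something like Scan.setTimeRange(minStamp, maxStamp), which as far as I
know treats the range as half-open (min inclusive, max exclusive), so
starting each window where the previous one ended avoids both gaps and
double reads.

```java
/** Half-open [minStamp, maxStamp) window for one periodic sync run. */
final class ScanWindow {
    final long minStamp; // inclusive lower bound, epoch millis
    final long maxStamp; // exclusive upper bound, epoch millis

    ScanWindow(long minStamp, long maxStamp) {
        if (minStamp > maxStamp) {
            throw new IllegalArgumentException("minStamp must be <= maxStamp");
        }
        this.minStamp = minStamp;
        this.maxStamp = maxStamp;
    }

    /**
     * Builds the window for the next run. It starts exactly where the
     * previous run stopped and ends a little before "now".
     * safetyLagMillis is a hypothetical knob that leaves room for
     * clock skew and in-flight writes; it is an assumption, not an
     * HBase setting.
     */
    static ScanWindow next(long previousMaxStamp, long nowMillis, long safetyLagMillis) {
        // Never move the upper bound backwards past the previous run.
        long max = Math.max(previousMaxStamp, nowMillis - safetyLagMillis);
        return new ScanWindow(previousMaxStamp, max);
    }
}
```

You would need to persist the last maxStamp somewhere durable between
runs so a failed job can simply be retried with the same window.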

Keep in mind that most MR clusters can easily DoS most databases, so
you may want to do something to throttle the MR tasks.
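
One way to do that throttling, as a minimal sketch: pace the writes
inside each map task. WriteThrottle and its methods are hypothetical
names, not part of Hadoop or HBase. Note that this limits a single task
only; the load the database sees is roughly the per-task rate times the
number of map tasks running at once, so you would size it with that in
mind (or also cap the number of concurrent tasks).

```java
import java.util.concurrent.TimeUnit;

/** Spaces out writes from one task to at most writesPerSecond. */
final class WriteThrottle {
    private final long intervalNanos;
    // Earliest time the next write may start; MIN_VALUE means "no write yet".
    private long nextFreeNanos = Long.MIN_VALUE;

    WriteThrottle(double writesPerSecond) {
        if (writesPerSecond <= 0) {
            throw new IllegalArgumentException("writesPerSecond must be positive");
        }
        this.intervalNanos = (long) (1_000_000_000L / writesPerSecond);
    }

    /**
     * Reserves the next write slot and returns how many nanoseconds the
     * caller has to wait before using it (0 if it can go right away).
     */
    synchronized long reserve(long nowNanos) {
        long start = Math.max(nowNanos, nextFreeNanos);
        nextFreeNanos = start + intervalNanos;
        return start - nowNanos;
    }

    /** Blocks until the next write is allowed. Call before each RDBMS update. */
    void acquire() throws InterruptedException {
        long waitNanos = reserve(System.nanoTime());
        if (waitNanos > 0) {
            TimeUnit.NANOSECONDS.sleep(waitNanos);
        }
    }
}
```

In a mapper you would create one WriteThrottle per task and call
acquire() before each update sent to the database.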

-Todd

On Tue, Dec 7, 2010 at 1:55 PM, Hiller, Dean (Contractor)
<[email protected]> wrote:
>
> We are going to move 7 terabytes (set to grow to 35 when our SLA goes
> from 2 years to 10 years of storage) from an RDBMS to a Hadoop/HBase-type
> system, and I was wondering if anyone knew how to get events from
> HBase on persisted/modified entities so that changes can be replicated
> to our RDBMS easily. (I.e., we want to keep the RDBMS system intact for
> a week or so during the switch so we could switch back if any problems
> occurred with our HBase application.) It is purely a safety net
> when switching customers over.
>
>
>
> Thanks for any advice in this area,
>
> Dean
>



--
Todd Lipcon
Software Engineer, Cloudera
