Funny you should mention that one. I had that in my head but wasn't sure if it applied or not.

Thanks,
Dean

-----Original Message-----
From: Steven Noels [mailto:[email protected]]
Sent: Wednesday, December 08, 2010 6:42 AM
To: user
Subject: Re: slow move from rdbm to hadoop/hbase(is there replication strategies for this?)

On Tue, Dec 7, 2010 at 10:55 PM, Hiller, Dean (Contractor) <[email protected]> wrote:

> We are going to move 7 terabytes(set to grow to 35 when our SLA goes
> from 2 years to 10 years of storage) from an RDBMS to hadoop/hbase type
> system and I was wondering if anyone knew of how to get events from
> hbase on persisted/modified entities so that changes can be replicated
> to our RDBMS easily.

Again, I would suggest you to take a look at the RowLog library I mentioned in my previous post. We use it as a message queue to asynchronously feed SOLR indices, which somehow sounds as what you need during your transition week.

The Rowlog processor scans a WAL of row update entries at regular intervals, giving you a near-real-time up-to-dateness of your RDBMS replication queue. Nothing out of the box though, you'll have some code to write but at the very least the tricky bits are solved already for you.

As Todd is suggesting, you will need to be careful to not overload your RDBMS, but doing it using a tighter integrated mechanism than mass M/R might reduce that chance.

Kind regards,

Steven.
--
Steven Noels
http://outerthought.org/
Open Source Content Applications
Makers of Kauri, Daisy CMS and Lily

This message and any attachments are intended only for the use of the addressee and may contain information that is privileged and confidential. If the reader of the message is not the intended recipient or an authorized representative of the intended recipient, you are hereby notified that any dissemination of this communication is strictly prohibited. If you have received this communication in error, please notify us immediately by e-mail and delete the message and any attachments from your system.
