Re: Replicating data into HBase

Billy Pearson Fri, 17 Apr 2009 18:11:04 -0700

If your data is not too complex (multi-field values, etc.), you could try using the MySQL bin logs. Just use mysqlbinlog (http://dev.mysql.com/doc/refman/5.0/en/mysqlbinlog.html) to process the bin logs and generate a text version of them, then process that text with a map and reduce it into the table. This would not provide live data, but you could run a simple shell script to process the bin logs and then delete or move them. If you needed to sync up, you could call mysql to start a new bin log. The shell script could be run as a cron job, and it would pick up the latest bin log and start the job.


I would use the Linux command
find /binlog/location/*.bin -mmin +5
to find the logs that are ready to process.
That will give you all the bin logs that have not been modified in the last 5 minutes.
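
As a rough sketch of the cron-driven script described above (not a tested production job): the function names, directory arguments, and the BINLOG_DUMP_CMD override are all hypothetical, and only find, mysqlbinlog, and mv are real commands. It lists logs untouched for a number of minutes, converts each one to text, and moves it aside only if the conversion succeeded.

```shell
#!/bin/sh
# Sketch only: function names, directories and BINLOG_DUMP_CMD are
# assumptions made for illustration, not part of MySQL or HBase.

# Print the bin logs in directory $1 whose names match pattern $2 and
# that have not been modified in the last $3 minutes.
ready_binlogs() {
    find "$1" -maxdepth 1 -type f -name "$2" -mmin +"$3" | sort
}

# Convert each ready log to a text file in $2, then move the log to $3
# so that the next run does not pick it up again.
process_binlogs() {
    log_dir=$1
    out_dir=$2
    done_dir=$3
    pattern=${4:-*.bin}      # pattern from the find command above
    min_age=${5:-5}          # minutes without modification
    dump_cmd=${BINLOG_DUMP_CMD:-mysqlbinlog}

    mkdir -p "$out_dir" "$done_dir"
    ready_binlogs "$log_dir" "$pattern" "$min_age" | while IFS= read -r log; do
        name=$(basename "$log")
        # Only move the log away if the text conversion succeeded.
        if "$dump_cmd" "$log" > "$out_dir/$name.sql"; then
            mv "$log" "$done_dir/"
        else
            echo "failed to convert $log" >&2
        fi
    done
}
```

A cron entry would then run a script that ends with a call such as process_binlogs /binlog/location /some/text/dir /some/done/dir, and the text files become the input of the MapReduce job. Note that MySQL bin logs are commonly named like mysql-bin.000001, so the pattern argument may need to be something like 'mysql-bin.[0-9]*' rather than '*.bin'.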

If your insert/update queries are not too complex to process, it would be simple.
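
To illustrate the simple case, here is a minimal sketch of a filter that keeps only INSERT and UPDATE statements from the text version of a log. It assumes statement-based logging with each statement on a single line; multi-line statements and other event types are not handled, and the function name extract_writes is made up for this example.

```shell
#!/bin/sh
# Sketch only: assumes one SQL statement per line in the text dump.
# Reads the dump on stdin and prints the lines that start with
# INSERT or UPDATE (case-insensitive), dropping everything else.
extract_writes() {
    grep -iE '^[[:space:]]*(insert|update)[[:space:]]'
}
```

For example, extract_writes < dump.sql > writes.sql would leave a file with one write statement per line for the map step to parse.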


Billy

"Brian Forney" <[email protected]> wrote in message news:[email protected]...

Ryan,
Thanks. Yep, I've read the Bigtable paper (now and in 2006) and understand that HBase and Bigtable are essentially large maps and do not use the relational model.
Still interested in hearing if others have successfully done this. (I'm mostly looking for ways to speed up the implementation of a one-way replication: from a relational DB to HBase.)
Thanks,
Brian

On Apr 17, 2009, at 5:45 PM, Ryan Rawson wrote:
HBase is not a relational database, so many things that are in a SQL
database don't exist.

eg:
- sequences
- secondary declarative keys
- joins
- advanced query features such as order by, group by
- operators of any kind

Given conventions (eg: naming of index tables), it might be possible to
semi-automatically convert data, but it might not efficiently take advantage
of HBase's unique schema-less design.
I suggest you have a look at Google's Bigtable paper, as it has the same
underlying model that HBase does.

Good luck!
On Fri, Apr 17, 2009 at 3:30 PM, Brian Forney <[email protected]> wrote:
Hi all,
I'd like to replicate a large dataset from a relational database into HBase for better throughput of MapReduce jobs. Has anyone had success replicating
from a relational database (in my case SQL Server) to HBase?

Thanks,
Brian
