That script depends on pgq, which is a Postgres-specific event queue. It's handy for tracking table changes. If there is something similar for SQL Server, it might be helpful.
2009/4/18 stack <[email protected]>:
> You might take a look at Tim Sells' postgres to hbase uploader scripts here
> for ideas:
> http://svn.apache.org/viewvc/hadoop/hbase/trunk/src/examples/uploaders/
> St.Ack
>
> 2009/4/18 Billy Pearson <[email protected]>
>
>> If your data is not too complex (multiple fields, etc.), you could try
>> using the MySQL bin logs. Just use
>> mysqlbinlog http://dev.mysql.com/doc/refman/5.0/en/mysqlbinlog.html
>> to process the bin logs and generate a text version of them, then
>> process that text with a map and reduce it into the table. This would
>> not provide live data, but you could run a simple shell script to
>> process the bin logs and then delete or move them. If you needed to
>> sync up, you could call mysql to start a new bin log. The shell script
>> could be run as a cron job, and it would pick up the latest bin log
>> and start the job.
>>
>> I would use the linux command
>> find /binlog/location/*.bin -mmin +5
>> to find the logs that are ready to process.
>> That will give you all the bin logs that have not been modified in
>> 5 mins.
>>
>> If your insert/update queries are not too complex to process, it would
>> be simple.
>>
>> Billy
>>
>> "Brian Forney" <[email protected]> wrote in message
>> news:[email protected]...
>>
>>> Ryan,
>>>
>>> Thanks. Yep, I've read the Bigtable paper (now and in 2006) and
>>> understand that HBase and Bigtable are essentially large maps and do
>>> not use the relational model.
>>>
>>> Still interested in hearing if others have successfully done this.
>>> (I'm mostly looking for ways to speed up the implementation of a
>>> one-way replication: from a relational DB to HBase.)
>>>
>>> Thanks,
>>> Brian
>>>
>>> On Apr 17, 2009, at 5:45 PM, Ryan Rawson wrote:
>>>
>>>> HBase is not a relational database, so many things that are in a SQL
>>>> database don't exist.
>>>>
>>>> eg:
>>>> - sequences
>>>> - secondary declarative keys
>>>> - joins
>>>> - advanced query features such as order by, group by
>>>> - operators of any kind
>>>>
>>>> Given conventions (eg: naming of index tables), it might be possible
>>>> to semi-automatically convert data, but it might not efficiently take
>>>> advantage of HBase's unique schema-less design.
>>>>
>>>> I suggest you have a look at Google's Bigtable paper, as it has the
>>>> same underlying model that HBase does.
>>>>
>>>> Good luck!
>>>>
>>>> On Fri, Apr 17, 2009 at 3:30 PM, Brian Forney <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I'd like to replicate a large dataset from a relational database
>>>>> into HBase for better throughput of MapReduce jobs. Has anyone had
>>>>> success replicating from a relational database (in my case SQL
>>>>> Server) to HBase?
>>>>>
>>>>> Thanks,
>>>>> Brian
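
For what it's worth, the bin log selection step Billy describes in the quoted message could be sketched roughly as below. This is only an illustration, not anything from the uploader scripts: the function names `ready_binlogs` and `dump_binlog`, the `.sql` output suffix, and the output directory are all made up for the example. The only external tool it assumes is the `mysqlbinlog` command itself, called with a single log file argument. The age check approximates `find -mmin +5`.

```python
import subprocess
import time
from pathlib import Path


def ready_binlogs(directory, min_age_minutes=5, now=None):
    """Return the *.bin files in `directory` last modified before the cutoff.

    Approximates `find <directory>/*.bin -mmin +5`: a log that has not been
    written to for a few minutes is assumed to be closed and safe to process.
    """
    now = time.time() if now is None else now
    cutoff = now - min_age_minutes * 60
    return sorted(
        p for p in Path(directory).glob("*.bin")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def dump_binlog(path, out_dir):
    """Write a text version of one bin log by running the mysqlbinlog CLI.

    The output file name (<log name>.sql) is a hypothetical convention.
    """
    out_path = Path(out_dir) / (Path(path).name + ".sql")
    with open(out_path, "wb") as out:
        # check=True raises CalledProcessError if mysqlbinlog fails.
        subprocess.run(["mysqlbinlog", str(path)], stdout=out, check=True)
    return out_path
```

A cron-driven wrapper would then loop over `ready_binlogs("/binlog/location")`, call `dump_binlog` on each file, hand the text output to the MapReduce job, and move or delete the processed logs.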
