2008/7/26 stack <[EMAIL PROTECTED]>:
> Tim Sell wrote:
>>
>> It would be handy to be able to easily dump data from postgresql
>> straight to hbase. Then keep the data in hbase up to date.
>>
>> I've made a simple python tool called hbreplic (I'm very willing to
>> come up with an easier to type name).
>>
>
> How do you pronounce that?
>
>> It has two main purposes, bootstrap, where it copies columns from
>> postgresql tables to hbase.
>> And, play, where it processes incoming insert, update and delete
>> events on the postgresql tables and update hbase with them.
>>
>> The hbase table/family/column layout is whatever you want it to be.
>> The hbase row keys at the moment are taken from a specified postgresql
>> column (presumably the primary key, but not enforced), with an
>> optional prefix.
>>
>> It handles schema changes, in that it doesn't care what the table
>> looks like as long as the table has the columns that you specify in an
>> ini file.
>>
>> It makes use of PgQ which is part of skytools (a bunch of postgresql
>> database tools released by skype).
>> PgQ is a queuing management thing for events.
>>
>> It depends on python, skytools, and thrift.
>>
>> It's pretty rudimentary at the moment, but easy to use.
>> We'd like to open source it and make it better.
>>
>> Would people be interested in this?
>> Is there some kind of hbase contrib we could potentially add this to?
>>
>> On Monday we'll probably make the source available somewhere with
>> instructions.
>>
>
> It sounds excellent Tim. A nice contrib. If you want to add it, add it to
> a JIRA and I'll add it under hbase/contrib. Add a bit of doc. so browsers
> can figure what it is -- especially since current name gives no clue what it
> does (smile).
> St.Ack
>
I know, it's a horrible name, isn't it? It's also really annoying to type
at the command line. I trip over my fingers.
I'll brainstorm something pronounceable and add a JIRA on Monday :)

~Tim.
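
For anyone skimming the thread, here is a rough sketch of the two modes described in the quoted message (bootstrap and play), along with the row key rule (one postgresql column plus an optional prefix). This is not hbreplic's code: every function name, the `column_map` layout, and the `'I'`/`'U'`/`'D'` event codes are assumptions made up for illustration, a plain dict stands in for the hbase table, and plain tuples stand in for PgQ events.

```python
# Hypothetical sketch only; not hbreplic's actual code or API.
# A plain dict stands in for an hbase table: {row_key: {"family:column": value}}.
# Events are plain (op, row) tuples standing in for PgQ events.


def make_row_key(row, key_column, prefix=""):
    """Build the row key from one postgresql column plus an optional prefix."""
    return prefix + str(row[key_column])


def map_columns(row, column_map):
    """Keep only the configured columns, renamed to 'family:column'.

    Columns that are in the row but not in column_map are ignored, so the
    table may grow extra columns without breaking the copy.
    """
    return {hbase_col: str(row[pg_col]) for pg_col, hbase_col in column_map.items()}


def bootstrap(rows, table, key_column, column_map, prefix=""):
    """Copy the configured columns of every existing row into the table."""
    for row in rows:
        table[make_row_key(row, key_column, prefix)] = map_columns(row, column_map)


def play(events, table, key_column, column_map, prefix=""):
    """Apply insert ('I'), update ('U') and delete ('D') events in order."""
    for op, row in events:
        key = make_row_key(row, key_column, prefix)
        if op == "D":
            table.pop(key, None)  # deleting a missing row is a no-op
        elif op in ("I", "U"):
            table.setdefault(key, {}).update(map_columns(row, column_map))
        else:
            raise ValueError("unknown event type: %r" % (op,))
```

In this sketch, `column_map` plays the role of the ini file: it maps a postgresql column name to an hbase `family:column` name, for example `{"name": "info:name"}`. Bootstrap would be run once over the existing rows, and play would then be fed the queued events to keep the copy up to date.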
