Hi Reid,

Congratulations on getting it done, and thank you for reporting back with
the details of your success!

Thanks,
Nick

On Sat, Apr 27, 2019 at 8:17 PM Reid Chan <[email protected]> wrote:

> Many thanks to Anil, Nick, Varun.
>
> I successfully finished the table migration from Phoenix 4.7 to Phoenix
> 4.14 by overcoming two problems:
> a. A table in 4.7 without namespace mapping can be mapped to a
>    namespace-mapped table.
> b. A table in 4.7 can be read in 4.14.
>
> And here to share my experience:
>
> 1. In order to support `a`, I added a new feature in hbck. It modifies
> the metadata recorded in .tableinfo and .regioninfo of the table. Then a
> command like
> `hbck -upgradeNamespace(new feature) -fixMeta -fixAssignments`
> changes the table name from `default:NAMESPACE.TABLE` to
> `NAMESPACE:TABLE` from the HBase perspective.
>
> 2. Solving `b` was a pretty hard job, since I didn't have a good
> knowledge of the Phoenix source code. But after a lot of code tracing I
> finally found the cause, which is tricky. The reason a table from 4.7
> can't be read in 4.14 is `QualifierEncodingScheme`. 4.7 doesn't have
> that feature (introduced in 4.10), so the underlying data in the HFiles
> doesn't have encoded qualifiers either, while 4.14 enables this feature
> by default. The SCAN constructed in 4.14 therefore uses the encoded
> qualifier as the column condition, which reads nothing from HBase. So
> the solution is adding `COLUMN_ENCODED_BYTES=NONE` at the end of the
> original CREATE statement.
>
> If the HBase community allows, I'm willing to contribute my new feature
> to hbck, then we may have an official solution for table migration. (Or
> it may not count as a solution, but we can have a document, at least?)
>
> Let me list the procedure for table migration from a lower version
> (<4.10) to a higher version (>=4.10) of Phoenix:
> 1. Take a snapshot on the src cluster, export the snapshot to the dst
>    cluster, restore the snapshot on the dst cluster.
> 2. Run `-upgradeNamespace` on the dst cluster. (No need if you don't
>    want the namespace mapping feature.)
> 3. CREATE TABLE ***, COLUMN_ENCODED_BYTES=NONE;
> 4. Enjoy your queries.
>
> Cheers!
>
> --------------------------
>
> Best regards,
> R.C
>
> ________________________________________
> From: Varun Rao <[email protected]>
> Sent: 03 April 2019 00:31
> To: [email protected]
> Subject: Re: About mapping a phoenix table
>
> Hello Reid,
>
> I don't know if this fits your use case, but there is a way of copying
> data from a Phoenix table to another Phoenix table in another cluster if
> data is not present yet in either table.
>
> We can use the fact that Phoenix stores its metadata using HBase tables.
> Therefore, by enabling replication on the underlying source HBase table,
> adding the destination cluster as a peer in the source HBase cluster,
> and setting the source HBase table's replication scope to 1, any data
> flowing into the source Phoenix table will be copied to the destination
> Phoenix table.
>
> 1) Create Phoenix tables on the source and destination clusters
> 2) Set replication=true on the source HBase cluster through cm
> 3) Add a peer on the source cluster via hbase shell
>
> *add_peer '1', CLUSTER_KEY => "some_node:2181:/hbase", TABLE_CFS =>
> {"TABLE_NAME" => ["column_1", "column_2"]}*
>
> Here you can specify which columns you would like to copy over
>
> 4) Disable the source Phoenix table, set replication scope to 1,
> re-enable it
>
> *disable 'SOURCE_TABLE'*
>
> *alter 'BIGTABLE_PHOENIX', {NAME=>'w_a', REPLICATION_SCOPE=>'1'},
> {NAME=>'w_b', REPLICATION_SCOPE=>'1'}*
>
> *enable 'BIGTABLE_PHOENIX'*
>
> 5) Send data. In my test case I used psql to send 2 million records from
> a CSV to the Phoenix source table
>
> *phoenix-psql.py -t BIGTABLE_PHOENIX localhost:2181 wine_mag.csv*
>
> 6) You can now see the same data in the source and target clusters
>
> Thanks
> Yours Truly,
> Varun Rao
>
> On Tue, Apr 2, 2019 at 12:04 PM Nick Dimiduk <[email protected]> wrote:
>
> > Hi Reid,
> >
> > I'll throw my +1 onto Anil's Approach #1. I followed this path
> > recently to migrate all of our production data. Migrating Phoenix
> > metadata by creating tables manually on the destination is a little
> > clunky, but HBase Snapshots are quite easy to work with.
> >
> > Good luck,
> > Nick
> >
> > On Tue, Apr 2, 2019 at 5:26 AM anil gupta <[email protected]> wrote:
> >
> > > Hey Reid,
> > > AFAIK, there is no official Phoenix tool to copy a table between
> > > clusters. IMO, it would be great to have an official tool to copy
> > > tables.
> > > In our case, source and destination clusters are running Phoenix
> > > 4.7. IMO, a copy between 4.7 and 4.14 might have some version
> > > incompatibility. So, you might need to test the following in
> > > non-prod first.
> > >
> > > Approach 1: We usually move tables by taking a snapshot of the
> > > HBase table, exporting the snapshot to the remote cluster, creating
> > > the Phoenix table, deleting the underlying HBase table, and
> > > restoring the snapshot. Please keep in mind that you will need to
> > > do a similar exercise if your table has secondary indexes, since
> > > they are stored in another HBase table. Also, make sure that you
> > > don't have any live traffic to the Phoenix table in the destination
> > > cluster until the snapshot is restored and the data in the table is
> > > verified.
> > >
> > > Approach 2: Use the copyTable util of HBase. In this case, you will
> > > just need to create the Phoenix table on the remote cluster and
> > > then kick off HBase copy table. In this approach also, you will
> > > need to perform copyTable for each secondary index.
> > >
> > > We usually use Approach 1 because it's usually faster and doesn't
> > > put write load on the cluster.
> > >
> > > HTH,
> > > Anil Gupta
> > >
> > > > On Apr 2, 2019, at 4:32 AM, Reid Chan <[email protected]> wrote:
> > > >
> > > > Hi team,
> > > >
> > > > I'm trying to transport a Phoenix table between two clusters, by
> > > > copying all related HBase files on HDFS from cluster A to
> > > > cluster B.
> > > > But after I executed CreateTableStatement in Phoenix, Phoenix
> > > > failed to map those files into the table, and `select *` got
> > > > nothing.
> > > >
> > > > The questions are:
> > > > Is there a proper way or tool to do the table transportation?
> > > > If the answer is no, can the team provide some code pointers if I
> > > > want to implement it?
> > > > Or the reason why this is infeasible?
> > > >
> > > > FYI,
> > > > both HBase versions are 1.x but differ in minor version,
> > > > the Phoenix version gap is huge, 4.7.0 and 4.14.1.
> > > >
> > > > Any suggestions are appreciated!
> > > > Thanks
> > > >
> > > > --------------------------
> > > >
> > > > Best regards,
> > > > R.C
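
For reference, the migration procedure Reid lists in this thread (snapshot,
export, restore, optional namespace upgrade, re-create the Phoenix table
with column encoding disabled) can be sketched as a dry-run script. This is
a minimal sketch under stated assumptions, not a tested runbook: the table
name, snapshot name and destination HDFS URL are hypothetical placeholders,
`-upgradeNamespace` is Reid's own hbck addition and is not part of stock
HBase, and the CREATE TABLE statement is left elided exactly as in his
message. The script only prints the commands; it does not run them.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the migration steps from the thread: it only prints
# the commands, it does not execute anything against a cluster.
set -eu

# Hypothetical placeholders (not from the thread); override via environment.
TABLE="${TABLE:-NAMESPACE.TABLE}"
SNAPSHOT="${SNAPSHOT:-migration_snap}"
DST_HBASE_ROOT="${DST_HBASE_ROOT:-hdfs://dst-namenode:8020/hbase}"

print_plan() {
  # Step 1: snapshot on the source cluster, export it, restore on the destination.
  echo "[src] hbase shell: snapshot '${TABLE}', '${SNAPSHOT}'"
  echo "[src] hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot ${SNAPSHOT} -copy-to ${DST_HBASE_ROOT}"
  echo "[dst] hbase shell: restore_snapshot '${SNAPSHOT}'"
  # Step 2: only if namespace mapping is wanted. -upgradeNamespace is Reid's
  # custom hbck feature and does not exist in stock HBase.
  echo "[dst] hbase hbck -upgradeNamespace -fixMeta -fixAssignments"
  # Step 3: re-create the Phoenix table with the original DDL (elided in the
  # thread) plus COLUMN_ENCODED_BYTES=NONE, the fix Reid describes for
  # reading 4.7 data in 4.14.
  echo "[dst] sqlline: CREATE TABLE ***, COLUMN_ENCODED_BYTES=NONE;"
}

print_plan
```

As Anil notes above, the same snapshot/export/restore steps would have to
be repeated for each secondary index table, since those live in separate
HBase tables.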
