My mistake - I thought you were going to do an in-place upgrade. No, no post processing will be required if the data is UPSERT'ed through the standard Phoenix APIs.
I don't know how/why your row keys aren't the correct number of bytes. Maybe your row key is different than what you think?

Thanks,
James

On Fri, Dec 11, 2015 at 1:36 PM, Kristoffer Sjögren <[email protected]> wrote:
> My concern regarding the row key is that they are 23 bytes in reality
> instead of 26 bytes? Hmm. Guess I could write a value manually using
> SQL and inspect it afterwards using asynchbase to find where those 3
> missing bytes went.
>
> Not sure I understand the post procedure. I'm not trying to do an
> in-place upgrade. The table will be created from scratch in Phoenix
> 4.4.0 / HBase 1.1.2. The data will be read from the old installation
> using asynchbase and manually UPSERT'ed from that raw data using SQL.
>
> The reason I do it this way is to avoid such manual post procedures.
> So just to be clear, will this approach still require those post
> procedures?
>
> Thanks James.
>
> On Fri, Dec 11, 2015 at 5:28 PM, James Taylor <[email protected]> wrote:
> > Your analysis of the row key structure is correct. Those are all fixed
> > types (4 + 4 + 8 + 8 + 2 = 26 bytes for the key).
> >
> > If you're going from 0.94 to 0.98, there's stuff you need to do to get
> > your data into the new format. Best to ask about this on the HBase user
> > list or look it up in the reference docs.
> >
> > Once you get your data moved over, I'd recommend reissuing your DDL,
> > specifying a CURRENT_SCN at connection time of a timestamp in millis
> > prior to any of your cell timestamps (so that Phoenix doesn't set empty
> > key value markers for every row). If you're using any date/time types,
> > your DDL should specify them as their UNSIGNED equivalents. If you have
> > DESC row key declarations for variable-length types or BINARY, there are
> > some post-processing steps you'll need to do, but otherwise you should
> > be good to go (FWIW, we had an internal team successfully go through
> > this about 6 months back).
> >
> > Thanks,
> > James
> >
> > On Friday, December 11, 2015, Kristoffer Sjögren <[email protected]> wrote:
> >> My plan is to try to use asynchbase to read the raw data and then
> >> upsert it using Phoenix SQL.
> >>
> >> However, when I read the old table, the data types for the row key
> >> don't add up.
> >>
> >> CREATE TABLE T1 (C1 INTEGER NOT NULL, C2 INTEGER NOT NULL, C3 BIGINT
> >> NOT NULL, C4 BIGINT NOT NULL, C5 CHAR(2) NOT NULL, V BIGINT CONSTRAINT
> >> PK PRIMARY KEY (C1, C2, C3, C4, C5))
> >>
> >> That's 4 + 4 + 8 + 8 + 2 = 26 bytes for the key. But the actual key
> >> that I read from HBase is only 23 bytes:
> >>
> >> [0, -128, 0, 0, 0, -44, 4, 123, -32, -128, 0, 0, 10, -128, 0, 0, 0, 0,
> >> 0, 0, 0, 32, 32]
> >>
> >> Maybe the data type definitions as described on the Phoenix site have
> >> changed since version 2.2.3? Or some data type may be variable in
> >> size?
> >>
> >> On Thu, Dec 10, 2015 at 4:49 PM, Kristoffer Sjögren <[email protected]> wrote:
> >> > Hi
> >> >
> >> > We're in the process of upgrading from Phoenix 2.2.3 / HBase 0.96 to
> >> > Phoenix 4.4.0 / HBase 1.1.2 and wanted to know the simplest/easiest
> >> > way to copy data from the old table to the new one.
> >> >
> >> > The tables contain only a few hundred million rows, so it's OK to
> >> > export locally and then upsert.
> >> >
> >> > Cheers,
> >> > -Kristoffer
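[Editor's note] A quick way to sanity-check the 26-byte expectation against the observed 23-byte key is to decode the fixed-width columns by hand. This is a minimal sketch: the class and method names are illustrative, not Phoenix APIs, and the idea that byte 0 might be a prefix byte is pure speculation that the thread never established. The only Phoenix fact assumed is the documented encoding of signed INTEGER/BIGINT (big-endian, with the sign bit of the first byte flipped so keys sort correctly as unsigned bytes).

```java
// Sketch: check the DDL's fixed-width key layout against the observed bytes.
// Names here are illustrative, not Phoenix APIs.
public class RowKeyWidth {

    // Fixed widths of the PK columns in the T1 DDL quoted above.
    static int expectedKeyWidth() {
        return 4   // C1 INTEGER
             + 4   // C2 INTEGER
             + 8   // C3 BIGINT
             + 8   // C4 BIGINT
             + 2;  // C5 CHAR(2)
    }

    // Decode a Phoenix-encoded signed INTEGER: flip the sign bit of the
    // first byte back, then read big-endian.
    static int decodeInt(byte[] key, int offset) {
        int v = (key[offset] ^ 0x80) & 0xFF;
        for (int i = 1; i < 4; i++) {
            v = (v << 8) | (key[offset + i] & 0xFF);
        }
        return v;
    }

    public static void main(String[] args) {
        byte[] observed = {0, -128, 0, 0, 0, -44, 4, 123, -32, -128, 0, 0, 10,
                           -128, 0, 0, 0, 0, 0, 0, 0, 32, 32};
        System.out.println(expectedKeyWidth());      // 26
        System.out.println(observed.length);         // 23
        // Speculation only: if byte 0 were some prefix byte, the next four
        // bytes (-128, 0, 0, 0) would decode as C1 = 0:
        System.out.println(decodeInt(observed, 1));  // 0
    }
}
```

Stepping through each column's fixed width this way (the same pattern extends to 8-byte BIGINT) is one way to find where the 3 missing bytes went, which is essentially the write-then-inspect experiment Kristoffer proposes above.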
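[Editor's note] James's CURRENT_SCN suggestion corresponds to the "CurrentSCN" Phoenix JDBC connection property. Below is a hedged sketch of how the migration write path he and Kristoffer describe could be wired up; the ZooKeeper URL, the timestamp, and the asynchbase-fed loop are placeholders, and the live connection is left commented out since it needs a running cluster.

```java
import java.util.Properties;

// Sketch of the migration write path discussed in the thread: UPSERT through
// the standard Phoenix JDBC driver with CurrentSCN set at connection time.
// URL, timestamp, and source-row loop are placeholders.
public class UpsertSketch {

    // "CurrentSCN" is the connection property behind CURRENT_SCN. The value
    // must be earlier (in millis) than any cell timestamp in the migrated
    // data, so Phoenix doesn't stamp empty key value markers on every row.
    static Properties connectionProps(long scnMillis) {
        Properties props = new Properties();
        props.setProperty("CurrentSCN", Long.toString(scnMillis));
        return props;
    }

    // Parameterized UPSERT matching the T1 DDL quoted in the thread.
    static String upsertSql() {
        return "UPSERT INTO T1 (C1, C2, C3, C4, C5, V) VALUES (?, ?, ?, ?, ?, ?)";
    }

    public static void main(String[] args) {
        Properties props = connectionProps(1420070400000L); // placeholder SCN
        System.out.println(props.getProperty("CurrentSCN"));
        System.out.println(upsertSql());
        // Against a live cluster it would look roughly like (placeholder URL):
        //
        // try (Connection conn = DriverManager.getConnection(
        //          "jdbc:phoenix:zk-host:2181", props);
        //      PreparedStatement ps = conn.prepareStatement(upsertSql())) {
        //     for (Row row : rowsReadViaAsynchbase) {   // hypothetical source
        //         ps.setInt(1, row.c1());
        //         // ... bind C2..C5 and V ...
        //         ps.executeUpdate();
        //     }
        //     conn.commit();   // commit in batches for a few hundred M rows
        // }
    }
}
```

Because the rows are written through the standard Phoenix API, this is the path for which James says no post-processing is required.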
