So, more experimentation over the long weekend on this: if I load sample data into the new cluster's table manually through the shell, column filters work as expected.
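For reference, the manual test that does work on the new cluster looks roughly like this (the row key, timestamp, and value below are made up for illustration; the table and column names are the real ones from the broken scans):

hbase(main):001:0> put 'content', 'testrow1', 'x:twitter:username', 'Bilo Selhi'
hbase(main):002:0> scan 'content', {LIMIT=>10, COLUMNS=>'x:twitter:username'}
ROW                   COLUMN+CELL
 testrow1             column=x:twitter:username, timestamp=..., value=Bilo Selhi
1 row(s) in 0.0500 seconds

So cells written through the shell come back from the filtered scan, while the imported cells do not.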
Obviously not a solution to the problem. Anyone have any ideas or things I should be looking at? The regionserver logs show nothing unusual. Is there another export/import chain I could try?

Thanks,
Zack

On Sun, May 24, 2015, at 11:43 AM, [email protected] wrote:
> Hello all-
>
> I'm hoping someone can point me in the right direction, as I've exhausted
> all my knowledge and abilities on the topic...
>
> I've inherited an old, poorly configured, and brittle CDH4 cluster
> running HBase 0.92. I'm attempting to migrate the data to a new Ambari
> cluster running HBase 0.98, and to do it without changing anything on
> the old cluster, as I have a hard enough time keeping it running as is.
> Also, due to configuration issues with the old cluster (on AWS), a
> direct HBase-to-HBase table copy, or even an HDFS-to-HDFS copy, is out
> of the question at the moment.
>
> I was able to use the export task on the old cluster to dump the HBase
> tables to HDFS, which I then distcp'd via s3n up to S3, then back down
> to the new cluster, and then used the HBase importer. This appears to
> work fine...
>
> ... except that on the new cluster, table scans with column filters do
> not work.
>
> A sample row looks something like this:
>
> A:9223370612274019807:twtr:56935907581904486 column=x:twitter:username,
> timestamp=1424592575087, value=Bilo Selhi
>
> Unfortunately, even though I can see the column is properly defined, I
> cannot filter on it:
>
> hbase(main):015:0> scan 'content', {LIMIT=>10,
> COLUMNS=>'x:twitter:username'}
> ROW COLUMN+CELL
> 0 row(s) in 352.7990 seconds
>
> Any ideas what the heck is going on here?
>
> Here's the rough process I used for the export/import:
>
> Old cluster:
>
> $ hbase org.apache.hadoop.hbase.mapreduce.Driver export content hdfs:///hbase_content
> $ hadoop distcp -Dfs.s3n.awsAccessKeyId='xxxx' -Dfs.s3n.awsSecretAccessKey='xxxx' -i hdfs:///hbase_content s3n://hbase_content
>
> New cluster:
>
> $ hadoop distcp -Dfs.s3n.awsAccessKeyId='xxxx' -Dfs.s3n.awsSecretAccessKey='xxxx' -i s3n://hbase_content hdfs:///hbase_content
> $ hbase -Dhbase.import.version=0.94 org.apache.hadoop.hbase.mapreduce.Driver import content hdfs:///hbase_content
>
> Thanks!
> Z
