I have an HBase table I've defined as an external table in Hive, and I'm having trouble determining the proper escaping of newlines in the byte arrays.
The primary use case for this table is writing via the HBase client API, then reading via HiveQL SELECT queries against HiveServer2. I've found that if I leave the newlines alone (as just \n), a query with a WHERE clause produces extraneous rows with NULL values. Writing them to HBase as \\n makes the queries return the correct rows, but the newlines stay escaped in the query result. I expected to need to escape them, since I'm writing to HBase outside of Hive, but I also expected them to come back out of Hive without needing an extra un-escaping step.

I'm running Hive 0.10 from CDH4.2.1. The table structure looks like:

    CREATE EXTERNAL TABLE blog_post (
      id STRUCT<blog_name: STRING, post_id: STRING>,
      blog_name STRING,
      post_id STRING,
      body STRING
    )
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES (
      'hbase.columns.mapping' = ':key,post:blog_name,post:post_id,post:body',
      'hbase.table.default.storage.type' = 'binary'
    )
    TBLPROPERTIES (
      'hbase.table.name' = 'blog_post'
    );

Example query:

    SELECT * FROM blog_post WHERE blog_name = 'testblog';

Thanks,
Rob Roland
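
To make the \\n workaround described above concrete, here is a minimal Java sketch of the escape step applied before writing to HBase and the extra un-escape step needed after reading from Hive. The class and method names (NewlineEscaper, escape, unescape) are hypothetical and are not part of HBase or Hive. Escaping backslashes as well is an assumption added here so the round trip stays unambiguous; the post itself only mentions replacing \n with \\n.

```java
// Hypothetical helper for illustration only; not part of HBase or Hive.
final class NewlineEscaper {
    private NewlineEscaper() {}

    // Applied to a value before it is written to HBase.
    // Backslashes are escaped first (an assumption beyond the original post)
    // so a literal backslash-n in the text is not confused with a newline.
    static String escape(String raw) {
        return raw.replace("\\", "\\\\").replace("\n", "\\n");
    }

    // Applied to a value after it comes back from a Hive query.
    static String unescape(String escaped) {
        StringBuilder out = new StringBuilder(escaped.length());
        for (int i = 0; i < escaped.length(); i++) {
            char c = escaped.charAt(i);
            if (c == '\\' && i + 1 < escaped.length()) {
                // Backslash followed by 'n' is a newline; any other
                // escaped character is emitted as itself.
                char next = escaped.charAt(++i);
                out.append(next == 'n' ? '\n' : next);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this sketch, escape would run on the post body before it is handed to the HBase client, and unescape would run on the body column of each row returned by the SELECT. That second call is exactly the extra step the question hopes to avoid.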