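It looks like your table is using the text storage format. With TextInputFormat, binary data needs to be stored as base64, so those values are probably being interpreted as base64 strings. The trailing "=" characters are base64 padding, which is consistent with the values being decoded when read and re-encoded when written to the new table.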
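To illustrate, here is a minimal Python sketch of that decode-then-re-encode round trip, applied to the five input values from the original message. It is not Hive's code: `roundtrip` is a hypothetical helper, and padding the short values before decoding is an assumption about how a lenient base64 decoder would treat them.

```python
import base64


def roundtrip(text: str) -> str:
    """Decode `text` as base64, then re-encode the resulting bytes."""
    # Pad to a multiple of 4 so Python's decoder accepts the short values.
    # Assumption: the decoder on the Hive side is similarly lenient about
    # missing padding; this is an illustration, not Hive's implementation.
    padded = text + "=" * (-len(text) % 4)
    raw = base64.b64decode(padded)
    return base64.b64encode(raw).decode("ascii")


# The five rows from the input file in the original message.
for value in ["10000101", "121", "10", "1011", "Asfs"]:
    print(f"{value!r:>12} -> {roundtrip(value)!r}")
```

Values whose length is a multiple of four ("10000101", "1011", "Asfs") come back unchanged. "121" and "10" do not encode a whole number of bytes, so the leftover bits are dropped on decode and the re-encoded form comes back as "120=" and "1w==", which matches the contents of the bintarget file.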
________________________________
From: Ujjwal Wadhawan <uwadha...@gmail.com>
Sent: Monday, September 14, 2015 2:32 PM
To: user@hive.apache.org
Subject: binary column data consistency in hive table copy

Hi all,

I recently observed a behavior in Hive that I'd like to share and get input on.

Scenario: Say you have a Hive table with a binary column:

    create table binsource (bincol binary);

and some input data:

    $ cat /nis3/home/ujjwal2/test2/binin
    10000101
    121
    10
    1011
    Asfs

Let's load the data into the table:

    LOAD DATA LOCAL INPATH '/home/ujjwal2/test2/binin' OVERWRITE INTO TABLE binsource;

When I do a select * in the Hive CLI, I see the following characters (see image):
[http://puu.sh/k6HBw/877367d595.png]

The underlying HDFS file still has the actual input, though.

Now I make a copy of this table using the command "create table ujjwal2.bintarget as select * from ujjwal2.binsource;".
[http://puu.sh/k6HEj/b34a8bd4a0.png]

ISSUE: When I look at the underlying file created on HDFS for bintarget, I see some extra characters. In the many combinations I have tried, the extra characters are "=", "w" and "A".

    10000101
    120=
    1w==
    1011
    Asfs

Does anyone know what these characters signify?

Best,
Ujjwal