Thanks Nitin. A UDF is a good solution. I was wondering whether Hive has built-in support for this, since it is the default Flume format for the Flume Avro sink.
Thanks,
Deepak

On Wed, Nov 13, 2013 at 1:15 PM, Nitin Pawar <[email protected]> wrote:

> Sorry, hit send too soon.
>
> Correction: rather than just changing your table definition.
>
> On Wed, Nov 13, 2013 at 6:45 PM, Nitin Pawar <[email protected]> wrote:
>
>> I am not really sure there is a direct way to concat anything other than
>> strings in Hive unless you typecast them to string.
>>
>> So you may want to keep the datatype of the array elements as strings and
>> try. Otherwise you may want to build your own UDF to do it, which looks
>> like a more elegant way than just typecasting it.
>>
>> On Wed, Nov 13, 2013 at 5:18 PM, Deepak Subhramanian <
>> [email protected]> wrote:
>>
>>> Hi,
>>>
>>> Has anyone tried reading the default Avro output from Flume in Hive?
>>>
>>> I am using Flume to generate events in the default Flume Avro output
>>> format. Bytes in the Avro schema are stored as array<tinyint> in Hive
>>> when I use the AvroSerDe for Hive. How do I convert array<tinyint> to
>>> string to read the Flume body data? I am using Hive version 0.10.
>>>
>>> CREATE external TABLE flume_avro_test ROW FORMAT
>>> SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
>>> STORED AS
>>> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>>> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
>>> LOCATION '/testlogs/2013/11/08/17'
>>> TBLPROPERTIES
>>> ('avro.schema.literal'='{"type":"record","name":"Event","fields":[{"name":"headers","type":{"type":"map","values":"string"}},{"name":"body","type":"bytes"}]}');
>>>
>>> describe flume_avro_test;
>>> OK
>>> headers    map<string,string>    from deserializer
>>> body       array<tinyint>        from deserializer
>>>
>>> Thanks,
>>> Deepak Subhramanian
>>
>> --
>> Nitin Pawar
>
> --
> Nitin Pawar

--
Deepak Subhramanian
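
For reference, the UDF route suggested in the thread could look roughly like the sketch below. It only shows the conversion logic in plain Java so that it compiles on its own. The class name BytesToString is hypothetical, UTF-8 is an assumed encoding for the Flume body, and whether Hive 0.10 hands array<tinyint> to a simple UDF as List<Byte> is also an assumption (a GenericUDF may be needed instead).

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper, not part of Hive or Flume.
// To use it from Hive it would need to be public and extend
// org.apache.hadoop.hive.ql.exec.UDF; that wiring is left out here
// so the sketch has no Hive dependency.
class BytesToString {

    // Turns the array<tinyint> column value into a string.
    // Assumes the event body holds UTF-8 encoded text.
    public String evaluate(List<Byte> body) {
        if (body == null) {
            return null; // keep SQL NULL as NULL
        }
        byte[] raw = new byte[body.size()];
        for (int i = 0; i < raw.length; i++) {
            Byte b = body.get(i);
            raw[i] = (b == null) ? 0 : b; // null elements become a 0 byte
        }
        return new String(raw, StandardCharsets.UTF_8);
    }
}
```

Once packaged in a jar, such a class would typically be registered with ADD JAR and CREATE TEMPORARY FUNCTION and then called as, for example, SELECT bytes_to_string(body) FROM flume_avro_test; the function name here is made up for illustration.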
