Re: Writing to an ORCFile using MapReduce + HCatalog APIs

2014-04-07 Thread Eugene Koifman
If you are writing from Pig using HCatStorer, you don't need to create an HCatSchema. https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore#HCatalogLoadStore-Usage.1 has examples of how to do it. So if you create a Hive table that uses ORC you should be able to write your Pig cursor to

Indexes in hive

2014-04-07 Thread saquib khan
Dear Friends, I have created the indexes but I am not able to use them. Does Hive have an index optimizer? For a simple query in Oracle/Postgres, EXPLAIN gives the query plan with the indexes it uses, but in Hive the plan does not show me the indexes. How can we make sure that the indexes are being

Re: Writing to an ORCFile using MapReduce + HCatalog APIs

2014-04-07 Thread Abhishek Girish
Thanks Eugene. I had gone over the link previously. I am not looking for a way to do this via the Pig CLI. My data is going to be read by a *MapReduce job* (by Pig I meant custom Pig source code), and hence I need input readers and output writers. I need to be able to write using HCatOutputFormat. I

hive genericudf : possible to output column headers ?

2014-04-07 Thread Viral Bajaria
Hi, I have a use-case where I want a UDF to return a dynamic number of columns based on a parameter. Consider a UDF called return_cols which takes two params, fields and a comma-separated list of output fields, e.g. select return_cols(table-field, foo,bar) from table or select

get_json_object for nested field returning a String instead of an Array

2014-04-07 Thread Narayanan K
Hi all, I am using get_json_object to read a JSON text file. I have created the external table as below: CREATE EXTERNAL TABLE EXT_TABLE ( json string) PARTITIONED BY (dt string) LOCATION '/users/abc/'; The JSON data has some fields that are not simple fields but fields which are nested fields

Re: get_json_object for nested field returning a String instead of an Array

2014-04-07 Thread Peyman Mohajerian
perhaps: https://github.com/rcongiu/Hive-JSON-Serde On Mon, Apr 7, 2014 at 6:52 PM, Narayanan K knarayana...@gmail.com wrote: Hi all I am using get_json_object to read a json text file. I have created the external table as below : CREATE EXTERNAL TABLE EXT_TABLE ( json string)

Re: get_json_object for nested field returning a String instead of an Array

2014-04-07 Thread Narayanan K
Thanks Peyman. Actually the problem with Hive-JSON-Serde is that we need to provide the entire schema upfront while creating the table. My requirement is that we just project/aggregate on the fields using get_json_object after creating the external table without a schema. This way the external
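
The behaviour this thread is about can be illustrated outside Hive. Hive's get_json_object always returns a string, so a nested array or object comes back as JSON text rather than as a typed array. The sketch below is not Hive code: get_json_object_like is a hypothetical Python stand-in that handles only simple '$.a.b' paths, and the sample row and its field names are made up for illustration.

```python
import json


def get_json_object_like(json_text, path):
    """Hypothetical stand-in for get_json_object; supports only '$.a.b' paths."""
    try:
        value = json.loads(json_text)
    except ValueError:
        return None  # malformed JSON yields no value
    if not path.startswith("$"):
        return None
    # Walk the dotted keys that follow the leading '$'.
    for key in [k for k in path[1:].split(".") if k]:
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    if isinstance(value, str):
        return value
    # Nested arrays/objects (and numbers) are handed back as JSON text.
    return json.dumps(value, separators=(",", ":"))


# Made-up sample row; the field names are assumptions, not from the thread.
row = '{"id": 7, "tags": ["a", "b"], "user": {"name": "n1"}}'

tags_text = get_json_object_like(row, "$.tags")  # a string, not a list
tags = json.loads(tags_text)                     # second parse recovers the list
```

The point of the sketch is only that the caller has to parse the returned text a second time to get a real array; how best to do that inside Hive itself is what the thread goes on to discuss.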