Re: JSON data to HIVE table

2014-01-07 Thread Rok Kralj
Also, if you have large or dynamic schemas which are a pain to write by hand, you can use this simple tool: https://github.com/strelec/hive-serde-gen

2014/1/7 Roberto Congiu roberto.con...@openx.com: Also https://github.com/rcongiu/Hive-JSON-Serde ;)

On Mon, Jan 6, 2014 at 12:00 PM, Russell
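For readers following this thread, a minimal sketch of wiring up the Hive-JSON-Serde linked above (the JAR path, table name, and columns are assumptions for illustration, not from the thread):

```sql
-- Assumed local path to the built SerDe JAR.
ADD JAR /tmp/json-serde-with-dependencies.jar;

-- Expects one JSON object per line in the underlying files;
-- the column names here are hypothetical.
CREATE TABLE json_events (
  user_id STRING,
  action  STRING,
  ts      BIGINT
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe';
```

Each top-level JSON key maps to the column of the same name; keys missing from a record come back as NULL.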

partitioned by usage

2014-01-07 Thread Kishore kumar
Hi Experts, As per this link http://stackoverflow.com/questions/10276584/hive-table-partition-with-column-in-the-middle I understood that we can create a partitioned table when we already have a non-partitioned table. If that is correct, when would we use the PARTITIONED BY clause to create a new

Re: partitioned by usage

2014-01-07 Thread Nitin Pawar
Can you restate your question with an example?

On Tue, Jan 7, 2014 at 2:43 PM, Kishore kumar kish...@techdigita.in wrote: Hi Experts, As per this link http://stackoverflow.com/questions/10276584/hive-table-partition-with-column-in-the-middle I understood that we can create the

Re: any standalone utility/tool to read ORC footer/index/data ?

2014-01-07 Thread Nitin Pawar
As of now, none that I am aware of. You can look at the test cases in the Hive code that read ORC files and maybe use something similar.

On Tue, Jan 7, 2014 at 5:17 AM, Tongjie Chen tongjie.c...@gmail.com wrote: Hi all, Is there any utility/tool out there to inspect an ORC file? Thanks, Tongjie

Re: partitioned by usage

2014-01-07 Thread Kishore kumar
How do I create a partitioned table directly, without creating an intermediate table first?

On Tue, Jan 7, 2014 at 10:21 AM, Nitin Pawar nitinpawar...@gmail.com wrote: can you put your question in with an example? On Tue, Jan 7, 2014 at 2:43 PM, Kishore kumar kish...@techdigita.in wrote: Hi Experts, As

Re: partitioned by usage

2014-01-07 Thread Nitin Pawar
It's something like this:

CREATE TABLE xyz (a INT, b STRING) PARTITIONED BY (c STRING);
LOAD DATA LOCAL INPATH 'abc' INTO TABLE xyz PARTITION (c='abc');

Remember, if your data has multiple values in the partition column and you do not want to write MapReduce code or Pig scripts, then you will need a
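The truncated advice above most likely points at dynamic partitioning. A sketch of that alternative, assuming a hypothetical staging table that holds the unpartitioned data:

```sql
-- Dynamic partitioning lets Hive route rows to partitions by value;
-- 'staging' and its columns are assumptions for illustration.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

INSERT OVERWRITE TABLE xyz PARTITION (c)
SELECT a, b, c     -- the partition column must come last in the SELECT
FROM staging;
```

Hive creates one partition directory per distinct value of c, so no per-partition LOAD DATA statements are needed.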

How to generate json/complex object type from hive table

2014-01-07 Thread Bogala, Chandra Reddy
Hi, how do I generate JSON data from a table that's in Hive? For example, if I have data in the table format below and want to generate data in the JSON format below: I want to group by person name and fill the STRUCT and ARRAY for that person, so finally I should get one row per person. I tried
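A sketch of the grouping step being asked about (the table and column names are assumptions; `collect_list` requires Hive 0.13+, while older releases only offer the deduplicating `collect_set`):

```sql
-- One row per person: phone numbers gathered into an ARRAY,
-- the address packed into a STRUCT.
SELECT
  name,
  named_struct('city', max(city), 'zip', max(zip)) AS address,
  collect_list(phone)                              AS phones
FROM people
GROUP BY name;
```

The grouped result can then be serialized to JSON by writing it into a table that uses a JSON SerDe, or post-processed outside Hive.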

Re: any standalone utility/tool to read ORC footer/index/data ?

2014-01-07 Thread Prasanth Jayachandran
You can use the ORC file dump utility to analyze ORC files. Use the following command:

hive --orcfiledump <hdfs-location-of-orc-file>

Thanks, Prasanth Jayachandran

On Jan 7, 2014, at 3:53 PM, Nitin Pawar nitinpawar...@gmail.com wrote: as of now none that I am aware of. You can look

Re: any standalone utility/tool to read ORC footer/index/data ?

2014-01-07 Thread Nitin Pawar
Thanks Prasanth for rectifying my error.

On Tue, Jan 7, 2014 at 4:49 PM, Prasanth Jayachandran pjayachand...@hortonworks.com wrote: You can use the ORC file dump utility to analyze ORC files. Use the following command: hive --orcfiledump <hdfs-location-of-orc-file> Thanks Prasanth

Re: Help on loading data stream to hive table.

2014-01-07 Thread Alan Gates
I am not wise enough in the ways of Storm to tell you how you should partition data across bolts. However, there is no need in Hive for all data for a partition to be in the same file, only in the same directory. So if each bolt creates a file for each partition and then all those files are

Re: JSON data to HIVE table

2014-01-07 Thread Raj Hadoop
All, if I have to load JSON data to a Hive table (with the default record format when creating the table), is it a requirement to convert each JSON record into one line? How would I do this? Thanks, Raj

From: Rok Kralj rok.kr...@gmail.com To:

Re: Help on loading data stream to hive table.

2014-01-07 Thread Peyman Mohajerian
You may find summingbird relevant, I'm still investigating it: https://blog.twitter.com/2013/streaming-mapreduce-with-summingbird On Tue, Jan 7, 2014 at 11:39 AM, Alan Gates ga...@hortonworks.com wrote: I am not wise enough in the ways of Storm to tell you how you should partition data across

Re: working with HIVE VARIALBE: Pls suggest

2014-01-07 Thread Stephen Sprague
Wow, that's pretty clever! A+ for ingenuity! :) As a sidebar, I tend to use shell variables more often than not using this idiom:

#!/bin/bash
id=blah
hive <<SQL
select foo from bar where id='$id';
quit;
SQL

and if you wanted the output in a *shell* variable then: #!/bin/bash id=blah

Re: JSON data to HIVE table

2014-01-07 Thread Jay Vyas
One nice way to do this is with a purpose-built SerDe, such as the JsonSerde. A simpler scenario, where you have to load a multi-delimiter CSV file, can be addressed using the RegexSerde, which maps each regex capture group to a column. In your case, the JsonSerde could be used essentially to
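A minimal sketch of the RegexSerde approach mentioned above (the table, columns, and sample delimiter pattern are assumptions; on older Hive releases the class lives in hive-contrib as org.apache.hadoop.hive.contrib.serde2.RegexSerDe):

```sql
-- Each parenthesized group in input.regex becomes one column, in order.
-- This SerDe only supports STRING columns.
CREATE TABLE access_log (
  host    STRING,
  ts      STRING,
  request STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "([^|]*)\\|([^|]*)\\|(.*)"
);
```

Rows that fail to match the pattern are returned as all-NULL rather than raising an error, which makes bad input easy to spot with a quick query.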

Re: Help on loading data stream to hive table.

2014-01-07 Thread Chen Wang
Alan, the reason I am trying to write to the same file is that I don't want to persist each entry as a small file on HDFS; that would make Hive loading very inefficient, right? (Although I could do file merging in a separate job.) My current thought is that I could probably set up a timer (say 6 min)

RE: working with HIVE VARIALBE: Pls suggest

2014-01-07 Thread Sun, Rui
Cool! Thanks for sharing. From: Stephen Sprague [mailto:sprag...@gmail.com] Sent: Wednesday, January 08, 2014 5:12 AM To: user@hive.apache.org Subject: Re: working with HIVE VARIALBE: Pls suggest wow. that's pretty clever! a+ for ingenuity! :) As a side bar I tend to use shell variables more

Re: [DISCUSS] Proposed Changes to the Apache Hive Project Bylaws

2014-01-07 Thread Thejas Nair
After thinking some more about it, I am not sure we need a hard-and-fast rule of 24 hours before commit. I think we should let committers make the call on whether a change is trivial, safe, and non-controversial, and commit it in less than 24 hours in such cases. In the case of larger changes,