Re: Cannot build hive

2014-01-27 Thread Gunther Hagleitner
Thomas, You have to use -Phadoop-1 or -Phadoop-2 as an option when building hive. Thanks, Gunther. On Mon, Jan 27, 2014 at 11:48 PM, Thomas Larsson wrote: > Thanks Andrew, I'll try that. > > > On Tue, Jan 28, 2014 at 8:46 AM, Andrew Mains > wrote: > >> Hi Thomas, >> >> Check out the developer
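A minimal sketch of what Gunther's suggestion looks like on the command line, assuming the build is started from the project root as in the original question. The helper name hive_build_cmd is hypothetical; it only prints the Maven command so the profile choice can be checked before running it.

```shell
# Hypothetical dry-run helper: prints the Maven command for a given
# Hadoop profile instead of running it. Only the two profile names
# mentioned in the thread (hadoop-1, hadoop-2) are accepted.
hive_build_cmd() {
  profile="${1:-hadoop-1}"
  case "$profile" in
    hadoop-1|hadoop-2)
      echo "mvn clean install -P${profile}"
      ;;
    *)
      echo "unknown profile: ${profile}" >&2
      return 1
      ;;
  esac
}

# Print the command for a Hadoop 2 build.
hive_build_cmd hadoop-2
```

To start the build for real, paste the printed command into a shell at the project root.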

Re: Cannot build hive

2014-01-27 Thread Thomas Larsson
Yes, that works. In other words RTFM! On Tue, Jan 28, 2014 at 8:48 AM, Thomas Larsson wrote: > Thanks Andrew, I'll try that. > > > On Tue, Jan 28, 2014 at 8:46 AM, Andrew Mains > wrote: > >> Hi Thomas, >> >> Check out the developer guide and FAQ: https://cwiki.apache.org/ >> confluence/display/

Re: Cannot build hive

2014-01-27 Thread Thomas Larsson
Thanks Andrew, I'll try that. On Tue, Jan 28, 2014 at 8:46 AM, Andrew Mains wrote: > Hi Thomas, > > Check out the developer guide and FAQ: https://cwiki.apache.org/ > confluence/display/Hive/DeveloperGuide#DeveloperGuide- > CompilingandRunningHive, https://cwiki.apache.org/confluence/display/Hiv

Re: Cannot build hive

2014-01-27 Thread Andrew Mains
Hi Thomas, Check out the developer guide and FAQ: https://cwiki.apache.org/confluence/display/Hive/DeveloperGuide#DeveloperGuide-CompilingandRunningHive, https://cwiki.apache.org/confluence/display/Hive/HiveDeveloperFAQ . The instructions on the FAQ ought to work for the latest code (at least,

Cannot build hive

2014-01-27 Thread Thomas Larsson
Hello. I just checked out the Hive git repo and ran "mvn clean install" from the project root. It fails in the common module, where the class HiveStringUtils tries to use the classes org.apache.hadoop.fs.Path and org.apache.hadoop.io.Text, which cannot be found. Is there anything special I need to d

Re: HIVE+MAPREDUCE

2014-01-27 Thread Thejas Nair
You can use hcatalog to write into hive tables from mapreduce. See https://cwiki.apache.org/confluence/display/Hive/HCatalog+InputOutput . Another example is in https://gist.github.com/thejasmn/7607406 On Tue, Jan 21, 2014 at 12:21 AM, Ranjini Rathinam wrote: > Hi, > > Need to load the data into

Re: [ANNOUNCE] New Hive Committers - Sergey Shelukhin and Jason Dere

2014-01-27 Thread Hari Subramaniyan
Congrats Jason and Sergey! On Mon, Jan 27, 2014 at 11:41 AM, Jason Dere wrote: > Thanks everyone! > > Jason > > On Jan 27, 2014, at 11:01 AM, Eugene Koifman > wrote: > > Congratulations Sergey and Jason! > > > On Mon, Jan 27, 2014 at 10:58 AM, Thejas Nair wrote: > >> Congrats Jason and Sergey!

Re: RCFile vs SequenceFile vs text files

2014-01-27 Thread Thilina Gunarathne
Thanks Edward. I'm actually populating this table periodically from another temporary table and ORC sounds like a good fit. But unfortunately we are stuck with Hive 0.9. I wonder how easy or hard it is to use data stored as RCFile or ORC with Java MapReduce? thanks, Thilina On Mon, Jan 27, 2014 at 3

Re: RCFile vs SequenceFile vs text files

2014-01-27 Thread Edward Capriolo
The thing about ORC is that it is great for tables created from other tables (like the other columnar formats), but if you are logging directly to HDFS, a columnar format is not easy (or even possible) to write directly. Normally people store data in a very direct row-oriented form and then their first map

How to access a secure hive metastore from a mapreduce job.

2014-01-27 Thread Thomas Larsson
Hello, We have a hive metastore that is secured with kerberos. I need to access this from a mapreduce job but don't know how to authenticate. Due to our cluster setup I am currently looking at JDBC instead of HCatalog, but as far as I know, it is a bit of a hassle to do this. For example, how is t

Re: RCFile vs SequenceFile vs text files

2014-01-27 Thread Edward Capriolo
In general, use SequenceFiles with GZip or Snappy compression. On Mon, Jan 27, 2014 at 2:44 PM, Thilina Gunarathne wrote: > Thanks Eric and Sharath for the pointers to ORC. Unfortunately ORC would > not be an option for us as our cluster still runs Hive 0.9 and we won't be > migrating any tim

Re: RCFile vs SequenceFile vs text files

2014-01-27 Thread Thilina Gunarathne
Thanks Eric and Sharath for the pointers to ORC. Unfortunately ORC would not be an option for us as our cluster still runs Hive 0.9 and we won't be migrating any time soon. thanks, Thilina On Mon, Jan 27, 2014 at 2:35 PM, Sharath Punreddy wrote: > Quick insights: > > > http://hortonworks.com/bl

Re: [ANNOUNCE] New Hive Committers - Sergey Shelukhin and Jason Dere

2014-01-27 Thread Jason Dere
Thanks everyone! Jason On Jan 27, 2014, at 11:01 AM, Eugene Koifman wrote: > Congratulations Sergey and Jason! > > > On Mon, Jan 27, 2014 at 10:58 AM, Thejas Nair wrote: > Congrats Jason and Sergey! > Well deserved! > Looking forward to your help in getting the patch available counts > down

Re: RCFile vs SequenceFile vs text files

2014-01-27 Thread Sharath Punreddy
Quick insights: http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/ On Mon, Jan 27, 2014 at 1:29 PM, Eric Hanson (BIG DATA) < eric.n.han...@microsoft.com> wrote: > It sounds like ORC would be best. > > > > -Eric > > > > *From:* Thilina Gunarath

RE: RCFile vs SequenceFile vs text files

2014-01-27 Thread Eric Hanson (BIG DATA)
It sounds like ORC would be best. -Eric From: Thilina Gunarathne [mailto:cset...@gmail.com] Sent: Monday, January 27, 2014 11:05 AM To: user@hive.apache.org Subject: RCFile vs SequenceFile vs text files Dear all, We are trying to pick the right data storage format for the Hive ta

RCFile vs SequenceFile vs text files

2014-01-27 Thread Thilina Gunarathne
Dear all, We are trying to pick the right data storage format for the Hive table with the following requirement and would really appreciate any insights you can provide to help our decision. 1. ~50Billion records per month. ~14 columns per record and each record is ~100 bytes. Table is partitione
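As a rough back-of-the-envelope check on the first requirement, the stated figures (~50 billion records per month at ~100 bytes each) imply the raw monthly volume computed below. This is only a sketch: it ignores compression, replication and any format overhead, and the variable names are made up for illustration.

```shell
# Figures taken from the message: ~50 billion records/month, ~100 bytes each.
records_per_month=50000000000
bytes_per_record=100

# Raw, uncompressed volume per month (needs 64-bit shell arithmetic).
total_bytes=$((records_per_month * bytes_per_record))

# Convert to decimal terabytes (1 TB = 10^12 bytes).
total_tb=$((total_bytes / 1000000000000))

echo "${total_bytes} bytes/month (~${total_tb} TB, uncompressed)"
```

So the choice of storage format and compression codec is being made for a table that grows by roughly 5 TB of raw data per month.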

Re: [ANNOUNCE] New Hive Committers - Sergey Shelukhin and Jason Dere

2014-01-27 Thread Eugene Koifman
Congratulations Sergey and Jason! On Mon, Jan 27, 2014 at 10:58 AM, Thejas Nair wrote: > Congrats Jason and Sergey! > Well deserved! > Looking forward to your help in getting the patch available counts > down (it's at 225 now)! > > > On Mon, Jan 27, 2014 at 10:55 AM, Vaibhav Gumashta > wrote: >

Re: [ANNOUNCE] New Hive Committers - Sergey Shelukhin and Jason Dere

2014-01-27 Thread Thejas Nair
Congrats Jason and Sergey! Well deserved! Looking forward to your help in getting the patch available counts down (it's at 225 now)! On Mon, Jan 27, 2014 at 10:55 AM, Vaibhav Gumashta wrote: > Congrats Sergey and Jason! > > --Vaibhav > > > On Mon, Jan 27, 2014 at 10:47 AM, Vikram Dixit > wrote:

Re: [ANNOUNCE] New Hive Committers - Sergey Shelukhin and Jason Dere

2014-01-27 Thread Vaibhav Gumashta
Congrats Sergey and Jason! --Vaibhav On Mon, Jan 27, 2014 at 10:47 AM, Vikram Dixit wrote: > Congrats Sergey and Jason! > > Thanks > Vikram. > > On Jan 27, 2014, at 8:36 AM, Carl Steinbach wrote: > > > The Apache Hive PMC has voted to make Sergey Shelukhin and Jason Dere > > committers on the A

Re: [ANNOUNCE] New Hive Committers - Sergey Shelukhin and Jason Dere

2014-01-27 Thread Vikram Dixit
Congrats Sergey and Jason! Thanks Vikram. On Jan 27, 2014, at 8:36 AM, Carl Steinbach wrote: > The Apache Hive PMC has voted to make Sergey Shelukhin and Jason Dere > committers on the Apache Hive Project. > > Please join me in congratulating Sergey and Jason! > > Thanks. > > Carl -- CONFI

Re: [ANNOUNCE] New Hive Committers - Sergey Shelukhin and Jason Dere

2014-01-27 Thread Gunther Hagleitner
Congratulations Sergey and Jason! Thanks, Gunther. On Mon, Jan 27, 2014 at 10:20 AM, Prasanth Jayachandran < pjayachand...@hortonworks.com> wrote: > Congrats!! Sergey and Jason.. > Thanks > Prasanth Jayachandran > > On Jan 27, 2014, at 10:19 AM, Sergey Shelukhin > wrote: > > > Thanks guys! > >

Re: [ANNOUNCE] New Hive Committers - Sergey Shelukhin and Jason Dere

2014-01-27 Thread Prasanth Jayachandran
Congrats!! Sergey and Jason.. Thanks Prasanth Jayachandran On Jan 27, 2014, at 10:19 AM, Sergey Shelukhin wrote: > Thanks guys! > > > On Mon, Jan 27, 2014 at 9:24 AM, Jarek Jarcec Cecho wrote: > Congratulations Sergey and Jason, good job! > > Jarcec > > On Mon, Jan 27, 2014 at 08:36:37AM -0

Re: [ANNOUNCE] New Hive Committers - Sergey Shelukhin and Jason Dere

2014-01-27 Thread Sergey Shelukhin
Thanks guys! On Mon, Jan 27, 2014 at 9:24 AM, Jarek Jarcec Cecho wrote: > Congratulations Sergey and Jason, good job! > > Jarcec > > On Mon, Jan 27, 2014 at 08:36:37AM -0800, Carl Steinbach wrote: > > The Apache Hive PMC has voted to make Sergey Shelukhin and Jason Dere > > committers on the Apa

Re: [ANNOUNCE] New Hive Committers - Sergey Shelukhin and Jason Dere

2014-01-27 Thread Jarek Jarcec Cecho
Congratulations Sergey and Jason, good job! Jarcec On Mon, Jan 27, 2014 at 08:36:37AM -0800, Carl Steinbach wrote: > The Apache Hive PMC has voted to make Sergey Shelukhin and Jason Dere > committers on the Apache Hive Project. > > Please join me in congratulating Sergey and Jason! > > Thanks.

[ANNOUNCE] New Hive Committers - Sergey Shelukhin and Jason Dere

2014-01-27 Thread Carl Steinbach
The Apache Hive PMC has voted to make Sergey Shelukhin and Jason Dere committers on the Apache Hive Project. Please join me in congratulating Sergey and Jason! Thanks. Carl

Indexes, again

2014-01-27 Thread Peter Marron
Hi, I am using Hadoop 1.0.4 and Hive 0.11.0. I am trying to create my own indexes. Given the problems that I have had in the past, I thought it best to try to do things slowly. So I created my own class, derived from TableBasedIndexHandler. I copied all the methods from CompactIndexHandler

HIVE versus SQL DB

2014-01-27 Thread Felipe Gutierrez
Hi, I am on a project that has three databases with flat files. Our plan is to normalize these DBs into one. We will need to follow the data warehouse concept (ETL - Extract, Transform, Load). We are thinking of using Hadoop at the Transform step, because we need to relate data from the three

Re: DESCRIBE EXTENDED show numRows=0

2014-01-27 Thread Lefty Leverenz
Can the ANALYZE statement be used to gather statistics if hive.stats.autogather was 'false' when the data was loaded? (See the wiki's Statistics in Hive doc: Existing Tables.) -- Lefty On Sun, Jan 26, 2014 at 8
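For reference, the ANALYZE statement Lefty mentions has the general form sketched below. The table name my_table is a placeholder, not something from the thread, and whether this resolves the numRows=0 report is exactly the open question in the message. The snippet only builds and prints the statement; running it needs a working hive CLI, which is assumed rather than shown.

```shell
# Placeholder table name; substitute the table from your own setup.
TABLE="my_table"

# HiveQL statement that gathers table-level statistics for an existing table.
QUERY="ANALYZE TABLE ${TABLE} COMPUTE STATISTICS;"

# Print the statement for inspection.
echo "$QUERY"

# To execute it (assumes the hive CLI is on PATH), uncomment:
# hive -e "$QUERY"
```

After running it, DESCRIBE EXTENDED on the same table is the natural way to check whether numRows has been updated.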

How to create partitioned external table when using AvroSerDe?

2014-01-27 Thread George Agnelli
I have some avro data files partitioned by date such as: /items//mm/dd/part-r-0.avro I want to create a Hive table on this data, so I have a create table statement: CREATE EXTERNAL TABLE items PARTITIONED BY (year STRING, month STRING, day STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive

Re: Usage of TIMESTAMP

2014-01-27 Thread Petter von Dolwitz (Hem)
Hi Jason, thank you for sorting this out for me! /Petter 2014/1/24 Jason Dere > See HIVE-2558 - the comparison between timestamp and string was done by > converting both values to a number. > i think this behavior should be changed, as of Hive-0.12 > > Jason > > On Jan 23, 2014, at 6:20 AM, P