Re: inserting dynamic partitions - need more reducers

2015-03-12 Thread Alex Bohr
YES! That did it; I'll be adding that one to our global config. Good to see they defaulted it to false in 0.14. Thanks Prasanth. On Thu, Mar 12, 2015 at 9:29 PM, Prasanth Jayachandran <pjayachand...@hortonworks.com> wrote: > Hi > > Can you try with hive.optimize.sort.dynamic.partition set to
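For readers hitting the same slow-reducer symptom, the fix discussed in this thread can be applied per session or in hive-site.xml. A minimal sketch of the session form:

```sql
-- Disable the sorted dynamic-partition optimization for this session.
-- With it enabled, Hive sorts rows by partition key, which can funnel
-- each partition's data through very few reducers.
SET hive.optimize.sort.dynamic.partition=false;
```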

Re: inserting dynamic partitions - need more reducers

2015-03-12 Thread Prasanth Jayachandran
Hi, can you try with hive.optimize.sort.dynamic.partition set to false? Thanks Prasanth. On Thu, Mar 12, 2015 at 9:02 PM -0700, "Alex Bohr" <a...@gradientx.com> wrote: I'm inserting from an unpartitioned table with 6 hours of data into a table partitioned by hour. The source table

inserting dynamic partitions - need more reducers

2015-03-12 Thread Alex Bohr
I'm inserting from an unpartitioned table with 6 hours of data into a table partitioned by hour. The source table is 400M rows and 500GB, so it needs a lot of reducers working on the data; Hive chose 544, which sounds good. But 538 reducers did nothing and the other 6 have been working for over an h
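A dynamic-partition insert of the shape described above would typically look like the following sketch (table and column names are hypothetical, not from the thread):

```sql
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Hive derives the target partition for each row from the trailing
-- SELECT column (hr), so 6 hours of input become 6 partitions.
INSERT OVERWRITE TABLE events_by_hour PARTITION (hr)
SELECT user_id, event_type, event_time, hr
FROM events_staging;
```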

Question on hive query correlation optimization

2015-03-12 Thread canan chen
I use the following SQL with the MR engine and find that it invokes 3 MR jobs. But to my understanding, the join and group-by operators could be done in the same MR job since they use the same key. So I'm not sure why there are still 3 MR jobs; does anyone know? Thanks. select s2.name,count(1) as cn
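Hive does have a correlation optimizer (HIVE-2206) intended to merge MR jobs that shuffle on the same key, but it is off by default in this era of Hive. A hedged sketch with a hypothetical query of the same shape as the truncated one above:

```sql
-- Off by default; when enabled, Hive can merge a join and a
-- group-by that share the same shuffle key into one MR job.
SET hive.optimize.correlation=true;

SELECT s2.name, COUNT(1) AS cnt
FROM student s1
JOIN student s2 ON s1.name = s2.name   -- join key: name
GROUP BY s2.name;                      -- group-by key: name (same key)
```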

Re: when start hive could not generate log file

2015-03-12 Thread Jianfeng (Jeff) Zhang
By default, hive.log is located in /tmp/${user}/hive.log. Best Regards, Jeff Zhang. From: zhangjp <smart...@hotmail.com> Reply-To: "user@hive.apache.org" <user@hive.apache.org> Date: Wednesday, March 11, 2015 at 7:12 PM To: "user@hive.apache.org
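The location Jeff mentions is controlled by hive-log4j.properties on the Hive classpath; a sketch of the relevant properties (values shown are the usual defaults, adjust to taste):

```properties
# conf/hive-log4j.properties (Hive 0.x / log4j 1.x style)
hive.log.dir=/tmp/${user.name}
hive.log.file=hive.log
hive.root.logger=INFO,DRFA
```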

sqoop import to hive being killed by resource manager

2015-03-12 Thread Steve Howard
Hi All, We have not been able to get what is in the subject line to run. This is on hive 0.14. While pulling a billion row table from Oracle using 12 splits on the primary key, each job continually runs out of memory such as below... 15/03/13 00:22:23 INFO mapreduce.Job: Task Id : attempt_14260
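One common mitigation for per-mapper OOMs in this situation (an assumption on my part, not something proposed in the thread) is to raise the container memory and JVM heap for the Sqoop mappers and shrink the JDBC fetch size so fewer Oracle rows are buffered at once. A hypothetical invocation:

```shell
# Generic -D Hadoop options must come right after the tool name.
sqoop import \
  -Dmapreduce.map.memory.mb=4096 \
  -Dmapreduce.map.java.opts=-Xmx3686m \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --table BIG_TABLE \
  --split-by ID \
  --num-mappers 12 \
  --fetch-size 1000 \
  --hive-import --hive-table big_table
```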

Error creating a partitioned view

2015-03-12 Thread Buntu Dev
I've got a 'log' table which is currently partitioned by year, month and day. I'm looking to create a partitioned view on top of the 'log' table but am running into this error: hive> CREATE VIEW log_view PARTITIONED ON (pagename,year,month,day) AS SELECT pagename year,month,day,uid,properties FROM log
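Two things stand out in the statement above: the SELECT list is missing a comma after `pagename` (so `pagename year` aliases pagename as year), and a view's PARTITIONED ON columns must be the last columns in the SELECT list. A hedged correction, assuming that is the intent:

```sql
CREATE VIEW log_view
PARTITIONED ON (pagename, year, month, day) AS
SELECT uid, properties,
       pagename, year, month, day   -- partition columns last, comma fixed
FROM log;
```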

Re: Bucket pruning

2015-03-12 Thread Gopal Vijayaraghavan
Hi, no, and it's a shame because we're stuck on some compatibility details with this. The primary issue is the fact that the InputFormat is very generic and offers no way to communicate StorageDescriptor or bucketing. The split generation for something like SequenceFileInputFormat lives inside MapRedu

filter on bucketed column

2015-03-12 Thread cobby cohen
Bucketed columns seem great, but I don't understand why they are used just for optimizing joins and not for WHERE clauses (filters). I have a huge table (billions of records) which includes a field with medium cardinality (~100,000). Users usually filter on that field (at least). Using partit
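As Gopal notes elsewhere in this digest, Hive of this era does not prune buckets for a plain WHERE filter. One documented workaround is TABLESAMPLE, which reads only the named bucket; table and column names below are hypothetical:

```sql
-- Table bucketed into 128 buckets on user_id; this reads only bucket 1.
SELECT *
FROM huge_table TABLESAMPLE (BUCKET 1 OUT OF 128 ON user_id)
WHERE user_id = 'X';
```

The caveat is that the bucket number for a given value must be computed by the caller, as (hash(value) mod numBuckets) + 1, since the sample clause takes a literal bucket index rather than a value.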

Re: Any getting-started with UDAF development

2015-03-12 Thread Jason Dere
I think the Java code is likely still fine for the examples (though someone who actually knows about UDAFs might want to correct me here). If anything is out of date, it would be the build/test commands, which have switched from using ant to maven. Jason On Mar 11, 2015, at 4:01 AM, shahab mai

insert overwrite local question

2015-03-12 Thread Garry Chen
Hi, I am using hive-1.1.0. Is there a way to let the insert overwrite local statement write to a given file name like syz.txt instead of the default name 0_0? Thank you very much for your input. Garry
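To my knowledge there is no built-in option for naming the output file in this era of Hive; a common workaround (a sketch, with hypothetical paths, and assuming a single output part file) is to rename the part file afterwards, e.g. via the Hive CLI's shell escape:

```sql
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/syz_out'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
SELECT * FROM my_table;

-- Hive CLI shell escape: rename the part file to the desired name.
!mv /tmp/syz_out/000000_0 /tmp/syz_out/syz.txt;
```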

Hive map-join

2015-03-12 Thread Guillermo Ortiz
Hello, I'm executing a join of two tables: table1 is 130 GB and table2 is 1.5 GB. In HDFS, table1 is just one text file and table2 is ten files. I'd like to execute a map-join and load table2 in memory. use esp; set hive.auto.convert.join=true; #set hive.auto.convert.join.noconditionaltask = t
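The settings being assembled above, completed as a hedged sketch (the threshold value is my assumption; note that at ~1.5 GB, table2 is far above typical map-join thresholds, so the mapper heap must also be able to hold it):

```sql
SET hive.auto.convert.join=true;
SET hive.auto.convert.join.noconditionaltask=true;
-- Tables whose combined size is below this many bytes may be
-- broadcast for a map-join; a 1.5 GB table needs a threshold
-- above its size and sufficient mapper heap to match.
SET hive.auto.convert.join.noconditionaltask.size=2000000000;
```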

Re: is there a way to read Hive configurations from the REST APIs?

2015-03-12 Thread Lefty Leverenz
See these wikidocs for the *set* command: Commands, and Beeline Hive Commands. -- Lefty On Thu, Mar 12, 2015 at 2

Bucket pruning

2015-03-12 Thread Daniel Haviv
Hi, we created a bucketed table, and when we select in the following way: select * from testtble where bucket_col ='X'; we observe that all of the table is being read, not just the specific bucket. Does Hive support such a feature? Thanks, Daniel

RE: is there a way to read Hive configurations from the REST APIs?

2015-03-12 Thread Mich Talebzadeh
You can use the *set* command in Hive to read the current configuration values. You can also do the same through Beeline. HTH Mich Talebzadeh http://talebzadehmich.wordpress.com Publications due shortly: Creating in-memory Data Grid for Trading Systems with Oracle TimesTen and Coherence Cache NOTE
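Usage of the *set* command described above, which works the same way in the Hive CLI and Beeline:

```sql
-- Dump all variables and their values (including Hadoop defaults):
SET -v;
-- Read a single property:
SET hive.execution.engine;
-- Override a property for the current session:
SET hive.execution.engine=mr;
```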

is there a way to read Hive configurations from the REST APIs?

2015-03-12 Thread Xiaoyong Zhu
Hi experts, is there a way to read the Hive configurations from the REST APIs? The reason is that we are trying to get the Hive configurations in order to adjust some behaviors accordingly... Xiaoyong