Re: Skewed Tables

2014-04-28 Thread Lefty Leverenz
Prasanth, Hive's user docs are wiki-only at this point so there's no version control. We just add notes about which release introduced or changed something. For an example see the beginning of the Skewed

Python UDFS

2014-04-28 Thread Sreenath
Is adding too many files (Python UDFs) an overhead for Hive query execution? -- Sreenath S Kamath

Re: Hive 0.12 ORC Heap Issues on Write

2014-04-28 Thread John Omernik
Prasanth - This is easily the best and most complete explanation I've received to any online posted question ever. I know that sounds like an overstatement, but this answer is awesome. :) I really appreciate your insight on this. My only follow-up is asking how the memory.pool percentage

subscribe

2014-04-28 Thread Nivedhita Sathyamurthy

HIVE_PLAN file not found

2014-04-28 Thread Nivedhita Sathyamurthy
Hello, I am using hive 0.12 and when I try to run 'select * from table_name;' the query executes fine. But when I try to run any other query that starts a map reduce job, I get the following error: java.lang.RuntimeException: java.io.FileNotFoundException:

Re: subscribe

2014-04-28 Thread Abhishek Girish
Hi Nivedita, Please send an email to user-subscr...@hive.apache.org to subscribe. -Abhishek On Mon, Apr 28, 2014 at 8:15 AM, Nivedhita Sathyamurthy nivedhitasat...@gmail.com wrote:

Re: Problem adding jar using pyhs2

2014-04-28 Thread David Engel
Thanks for your response. We've essentially done your first suggestion in the past by copying or symlinking our jar into Hive's lib directory. It works, but we'd like a better way for different users to use different versions of our jar during development. Perhaps that's not possible,

set hive.cli.print.header=true

2014-04-28 Thread Kishore kumar
Hi, I am using Cloudera 4.5 with Cloudera Manager 4.8. I want to set hive.cli.print.header=true. Where should I specify it? Is it possible to set it in CM, or from the CLI, and if so in which location? Please help me. -- Thanks, *Kishore *

Cannot Upgrade a Hive UDF without cluster restart. UDF is possibly cached.

2014-04-28 Thread David Zaebst
Hi all, We have a few Hive UDFs where I work. These are deployed by a bootstrap script so that the JAR files are in Hive's CLASSPATH before the server starts. This works to load the UDF whenever a cluster is started and then the UDF can be loaded with the ADD JAR and CREATE TEMPORARY FUNCTION

Re: set hive.cli.print.header=true

2014-04-28 Thread Matouk IFTISSEN
Hi, edit a file .hiverc in /path_to_hive/conf/ and set in this file whatever you want: add jar /path_to/hive-contrib-*.jar; set hive.cli.print.header=true; -- print the field names set hive.cli.print.current.db=true; -- print the database name And then copy this file .hiverc to all hive nodes in

Re: analyze hive tables with null values in partition columns

2014-04-28 Thread Dileep Kumar
I have a table that has partition based on column ss_sold_date_sk which has null value partition as well. When I run the analyze ..compute stat it fails with attached exception. Is there a way to avoid this or bypass this exception, also what would be the impact on query performance of stat

Re: Problem adding jar using pyhs2

2014-04-28 Thread Brad Ruderman
Hi David- Can you test the code? It is working for me. Make sure your jar is in HDFS and you are using the FQDN for referencing it. import pyhs2 with pyhs2.connect(host='127.0.0.1', port=1, authMechanism='PLAIN', user='root',

Re: Hive 0.12 ORC Heap Issues on Write

2014-04-28 Thread Prasanth Jayachandran
Glad that presentation was useful to you :) hive.exec.orc.memory.pool is the fraction of memory that ORC writers are allowed to use. If your heap size is 1GB and hive.exec.orc.memory.pool is set to 0.5, then ORC writers can use a maximum of 500MB of memory. If there are more ORC writers and
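The arithmetic in the reply above can be sketched in a few lines of Python. This is only an illustration of the stated rule (heap size multiplied by the hive.exec.orc.memory.pool fraction), not Hive's actual implementation. The function name orc_writer_budget_mb is hypothetical, and the example treats 1GB as 1000MB to match the rounded figure in the message.

```python
def orc_writer_budget_mb(heap_mb, memory_pool_fraction):
    """Return the total memory (in MB) that ORC writers may use.

    Hypothetical helper for illustration only: it multiplies the heap
    size by the hive.exec.orc.memory.pool fraction, as the reply describes.
    """
    if not 0.0 <= memory_pool_fraction <= 1.0:
        raise ValueError("memory_pool_fraction must be between 0 and 1")
    return heap_mb * memory_pool_fraction


# Example from the message: 1GB heap (taken as 1000MB here), pool set to 0.5.
print(orc_writer_budget_mb(1000, 0.5))  # 500.0
```

How that budget is divided when several writers are open at once is not covered by this sketch.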

Re: set hive.cli.print.header=true

2014-04-28 Thread Adrian Hains
For what it's worth, you can also set this per-client by putting this .hiverc file in the user's home directory. E.g. I have a file /home/ahains/.hiverc that enables hive.cli.print.header. Cheers, -a On Mon, Apr 28, 2014 at 9:13 AM, Matouk IFTISSEN matouk.iftis...@ysance.com wrote: Hi, edit a

Re: Skewed Tables

2014-04-28 Thread Prasanth Jayachandran
Lefty, I have updated the Hive wiki in a few places to say we should use stored as directories for list bucketing features. There are two different optimizations that use the SKEWED BY keyword. One is skewed join optimization and the other is list bucketing optimization. I think we need to mention this

Issues running Tez Hadoop 2.2.0

2014-04-28 Thread Bryan Jeffrey
All, I attempted to set up Tez per the instructions on the following page: http://tez.incubator.apache.org/install.html After changing my mapreduce framework name to 'yarn-tez', I am getting the following error: [root@viper ~]# /opt/hadoop/latest-hadoop/bin/hadoop job -list DEPRECATED: Use of

Re: too many mappers

2014-04-28 Thread Surendra , Manchikanti
Hi, to decrease the number of mappers created, increase the mapred.max.split.size value. Regards, Surendra M -- Surendra Manchikanti On Thu, Apr 17, 2014 at 2:33 AM, Tatarinov, Igor itatari...@ebay.com wrote: For some reason, I can’t decrease the number of mappers in Hive (0.12) and
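As a rough illustration of why a larger mapred.max.split.size tends to mean fewer mappers, the Python sketch below estimates the split count as the input size divided by the maximum split size, rounded up. This is a simplification and an assumption on my part: the real count also depends on the input format, minimum split size, block boundaries, and the number of files. The function name estimate_mappers is hypothetical.

```python
def estimate_mappers(input_bytes, max_split_bytes):
    """Roughly estimate the mapper count for a given maximum split size.

    Simplified model (an assumption, not Hive's exact logic): one mapper
    per split, and each split holds at most max_split_bytes of input.
    """
    if input_bytes < 0 or max_split_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return max(1, -(-input_bytes // max_split_bytes))


MIB = 1024 ** 2
GIB = 1024 ** 3

# 10 GiB of input with 256 MiB splits versus 1 GiB splits.
print(estimate_mappers(10 * GIB, 256 * MIB))  # 40
print(estimate_mappers(10 * GIB, 1 * GIB))    # 10
```

Under this model, quadrupling the maximum split size cuts the estimated mapper count to a quarter.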