Re: installing hive-0.10.0 from source

2013-02-26 Thread Eric Chu
from source instruction or the release notes. Eric On Tue, Feb 26, 2013 at 11:38 AM, Eric Chu e...@rocketfuel.com wrote: Hi, I tried to build Hive 0.10.0 from source by doing the following: svn co http://svn.apache.org/repos/asf/hive/trunk hive sudo ant package It built fine and I got

java.lang.ClassNotFoundException: org.apache.hadoop.mapreduce.util.HostUtil for Hive 0.10.0

2013-02-26 Thread Eric Chu
(+hue-user since this issue prevents me from successfully installing Hue from source) Hi, I recently did the following with both the Hive-0.10 and Hive-0.9, and had a problem with 0.10 that I didn't see with 0.9 - Checked out the respective branch from github - Did an ant package -

No java compiler available exception for HWI

2013-03-29 Thread Eric Chu
Hi, I'm running Hive 0.10 and I want to support HWI (besides CLI and HUE). When I started HWI I didn't get any error. However, when I went to Hive Server Address:/hwi on my browser I saw the error below complaining about No Java compiler available. My JAVA_HOME is set to

unresolved dependency from ivy when building hive 0.10

2013-04-30 Thread Eric Chu
Hi, After upgrading Hive to 0.10 we often observe the following build error. We notice that when we run our dumpcache job (which includes deleting the ivy cache), the problem is *sometimes, but not always,* resolved. Does anyone know the cause of this problem or know of a better solution? Thanks

DataNucleus patches for Hive

2013-08-06 Thread Eric Chu
Hi, I'm a bit confused about what DataNucleus patches we should get for *Hive 0.11 with JDK 6*. It'd be great if people working on that could shed some light on the subject. Thanks in advance! After installing Hive 0.11 and applying the patch for HIVE-4619 (or else MR queries will result in

Re: DataNucleus patches for Hive

2013-08-20 Thread Eric Chu
*. To avoid that, you need to do ant very-clean before building hive. --Xuefu On Tue, Aug 6, 2013 at 4:39 PM, Eric Chu e...@rocketfuel.com wrote: Hi, I'm a bit confused about what DataNucleus patches we should get for *Hive 0.11 with JDK 6*. It'd be great if people working on that could

Insert into ORC partition from RCFile partition

2013-10-03 Thread Eric Chu
Hi, We're trying to convert our fact tables partitioned by date from RCFile to ORCFile. Since they are really big in size and we retain the last N days (partitions) of data, we don't want to re-process existing partitions. There are two approaches using Hive ALTER and INSERT commands that I'm

Re: query resulting in many small output files causes timeout error in Hue

2013-11-21 Thread Eric Chu
On Thu, Nov 21, 2013 at 10:55 AM, Tim timrobertson...@gmail.com wrote: Or setting reducers to 1 and doing a GROUP BY all columns forces a single file too. Tim, Sent from my iPhone (which makes terrible auto-correct spelling mistakes) On 21 Nov 2013, at 18:27, Eric Chu e

query resulting in many small output files causes timeout error in Hue

2013-11-21 Thread Eric Chu
Hi, We often have map-only queries that result in a large number of small output files (in the thousands). Although this doesn't affect CLI, when users try to view/download the query result in Hue, Hue would time out in trying to read all these small files. We tried to set the following

difference between partition by and distribute by in rank()

2014-07-11 Thread Eric Chu
Does anyone know what *rank() over(distribute by p_mfgr sort by p_name)* does exactly and how it's different from *rank() over(partition by p_mfgr order by p_name)*? Thanks, Eric

Re: difference between partition by and distribute by in rank()

2014-07-11 Thread Eric Chu
. Thanks Rekha From: Eric Chu e...@rocketfuel.com Reply-To: user@hive.apache.org user@hive.apache.org Date: Friday, July 11, 2014 at 1:38 PM To: hive-u...@hadoop.apache.org hive-u...@hadoop.apache.org Subject: difference between partition by and distribute by in rank() Does anyone know

tablesample query on bucketed ORC table needs execute access in Hive 0.13

2014-11-14 Thread Eric Chu
In Hive 0.13, when I do a tablesample query on an ORC table, such as select x from orc_table tablesample (bucket 32 out of 64) where date=1106; I get the following error saying I'm trying to run with EXECUTE access but the files have only READ access for non-owner. Why would a simple select