My main concern here was that HADOOP_HOME has been deprecated since Hadoop 0.23, so I was hoping it could actually function as documented.
FWIW, I found this bug [1] that addresses exactly this issue. The attached patch makes HADOOP_HOME optional and auto-detects hadoop from the path. This seems to have been fixed in 0.10.0.

[1] https://issues.apache.org/jira/browse/HIVE-2757

On Wed, Jul 18, 2012 at 12:50 PM, kulkarni.swar...@gmail.com <kulkarni.swar...@gmail.com> wrote:

> Hm. Yeah, I tried it out with a few versions, 0.7 -> 0.9, and it seems like they all do. Maybe we should just update the documentation then?
>
> On Wed, Jul 18, 2012 at 12:34 PM, Vinod Singh <vi...@vinodsingh.com> wrote:
>
>> We are using Hive 0.7.1, and there HADOOP_HOME must be exported so that it is available as an environment variable.
>>
>> Thanks,
>> Vinod
>>
>> On Wed, Jul 18, 2012 at 10:48 PM, Nitin Pawar <nitinpawar...@gmail.com> wrote:
>>
>>> From the hive trunk I can only see this. I am not 100% sure, but I remember always setting up HADOOP_HOME.
>>>
>>> http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java
>>>
>>> String hadoopExec = conf.getVar(HiveConf.ConfVars.HADOOPBIN);
>>>
>>> This change was introduced in 0.8. From
>>> http://svn.apache.org/repos/asf/hive/branches/branch-0.9/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
>>> (also in http://svn.apache.org/repos/asf/hive/branches/branch-0.8/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java):
>>>
>>> HADOOPBIN("hadoop.bin.path", System.getenv("HADOOP_HOME") + "/bin/hadoop"),
>>>
>>> On Wed, Jul 18, 2012 at 10:38 PM, kulkarni.swar...@gmail.com <kulkarni.swar...@gmail.com> wrote:
>>>
>>>> 0.9
>>>>
>>>> On Wed, Jul 18, 2012 at 12:04 PM, Nitin Pawar <nitinpawar...@gmail.com> wrote:
>>>>
>>>>> This also depends on what version of hive you are using.
>>>>>
>>>>> On Wed, Jul 18, 2012 at 10:33 PM, kulkarni.swar...@gmail.com <kulkarni.swar...@gmail.com> wrote:
>>>>>
>>>>>> Thanks for your reply, Nitin.
>>>>>>
>>>>>> Ok. So you mean we always need to set HADOOP_HOME, irrespective of whether "hadoop" is on the path or not. Correct?
>>>>>>
>>>>>> I'm a little confused, because that contradicts what's mentioned here [1].
>>>>>>
>>>>>> [1] https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-RunningHive
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> On Wed, Jul 18, 2012 at 11:59 AM, Nitin Pawar <nitinpawar...@gmail.com> wrote:
>>>>>>
>>>>>>> This is not a bug.
>>>>>>>
>>>>>>> Even if hadoop is on the path, hive does not use it. Hive internally uses HADOOP_HOME in the code base, so you will always need to set that for hive. HADOOP_HOME is deprecated for Hadoop clusters, but hive still needs it.
>>>>>>>
>>>>>>> Don't know if that answers your question.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Nitin
>>>>>>>
>>>>>>> On Wed, Jul 18, 2012 at 10:01 PM, kulkarni.swar...@gmail.com <kulkarni.swar...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> The hive documentation states that either HADOOP_HOME should be set or hadoop should be on the path. However, in some cases where HADOOP_HOME was not set but hadoop was on the path, I have seen this error pop up:
>>>>>>>>
>>>>>>>> java.io.IOException: Cannot run program "null/bin/hadoop" (in directory "/root/swarnim/hive-0.9.0-cern1-SNAPSHOT"): java.io.IOException: error=2, No such file or directory
>>>>>>>> at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
>>>>>>>> at java.lang.Runtime.exec(Runtime.java:593)
>>>>>>>> at java.lang.Runtime.exec(Runtime.java:431)
>>>>>>>> at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:268)
>>>>>>>> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134)
>>>>>>>> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
>>>>>>>> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1326)
>>>>>>>> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1118)
>>>>>>>> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:951)
>>>>>>>> at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
>>>>>>>> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
>>>>>>>> at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
>>>>>>>> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:689)
>>>>>>>> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:557)
>>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>>> at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
>>>>>>>>
>>>>>>>> Digging into the code in MapRedTask.java, I found the following (simplified):
>>>>>>>>
>>>>>>>> String hadoopExec = System.getenv("HADOOP_HOME") + "/bin/hadoop";
>>>>>>>> ...
>>>>>>>> Runtime.getRuntime().exec(hadoopExec, env, new File(workDir));
>>>>>>>>
>>>>>>>> Clearly, if HADOOP_HOME is not set, the command it tries to execute is "null/bin/hadoop", which is exactly the exception I am getting.
>>>>>>>>
>>>>>>>> Has anyone else run into this before? Is this a bug?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> --
>>>>>>>> Swarnim

--
Swarnim
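For anyone following along, the failure mode discussed in this thread, and the PATH-based fallback that the HIVE-2757 patch adds, can be sketched in a few lines. This is a minimal stand-alone illustration, not Hive's actual code; the class and method names are mine. It shows why an unset HADOOP_HOME produces the literal string "null/bin/hadoop" (Java string concatenation renders a null reference as "null"), and what a fallback that scans PATH for a hadoop executable could look like.

```java
import java.io.File;

public class HadoopBinResolver {

    /** Mirrors the pre-0.10 behavior: naive concatenation with the
     *  HADOOP_HOME value. When the env var is unset, getenv() returns
     *  null, and concatenating null yields the literal "null/bin/hadoop"
     *  seen in the stack trace above. */
    static String naiveResolve(String hadoopHome) {
        return hadoopHome + "/bin/hadoop";
    }

    /** Sketch of a HIVE-2757-style fix: prefer HADOOP_HOME when it is
     *  set, otherwise scan the PATH entries for a hadoop executable. */
    static String resolveWithFallback(String hadoopHome, String path) {
        if (hadoopHome != null) {
            return hadoopHome + "/bin/hadoop";
        }
        for (String dir : path.split(File.pathSeparator)) {
            File candidate = new File(dir, "hadoop");
            if (candidate.isFile() && candidate.canExecute()) {
                return candidate.getAbsolutePath();
            }
        }
        // Caller should fail with a clear error message here,
        // instead of exec-ing "null/bin/hadoop".
        return null;
    }

    public static void main(String[] args) {
        // Simulate HADOOP_HOME being unset:
        System.out.println(naiveResolve(null));
        // prints: null/bin/hadoop
    }
}
```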