Hi,
My Hadoop core-site.xml contains the following setting for tmp:
<property>
<name>hadoop.tmp.dir</name>
<value>/hadoop_data/hadoop_data/tmp</value>
</property>
My hive-default.xml contains the following settings for tmp:
<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/hive-${user.name}</value>
<description>Scratch space for Hive jobs</description>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/tmp/${user.name}</value>
<description>Local scratch space for Hive jobs</description>
</property>
Is this related to a configuration issue, or is it a bug?
Please help!
Regards
Arthur
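
For reference, here is a minimal sketch (plain Java, standard library only; the class and method names are made up for illustration) of how the path from the FileNotFoundException parses as a URI. It only shows that in hdfs://tmp/..._resources/... the "tmp" is read as the authority (the host), not as a directory, while the path printed by CREATE FUNCTION has no scheme at all. It does not show where in Hive or Hadoop the hdfs prefix gets added.

```java
import java.net.URI;

// Illustrative helper only; not part of Hive or Hadoop.
class UriParts {
    // Returns "scheme|authority|path", using "-" for a missing scheme or authority.
    static String describe(String s) {
        URI u = URI.create(s);
        String scheme = u.getScheme() == null ? "-" : u.getScheme();
        String authority = u.getAuthority() == null ? "-" : u.getAuthority();
        return scheme + "|" + authority + "|" + u.getPath();
    }

    public static void main(String[] args) {
        // Path reported in the FileNotFoundException: "tmp" lands in the authority slot.
        System.out.println(describe(
            "hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar"));
        // Path printed by CREATE FUNCTION: no scheme, no authority.
        System.out.println(describe(
            "/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar"));
        // Source JAR: here the authority is "hadoop".
        System.out.println(describe("hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar"));
    }
}
```

If the first line printed shows "tmp" as the authority, the failing lookup is being sent to a host named "tmp" rather than to the /tmp directory, which would fit the suspicion that a local path is being treated as an HDFS path.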
On 6 Jan, 2015, at 3:45 am, Jason Dere <[email protected]> wrote:
> During query compilation Hive needs to instantiate the UDF class and so the
> JAR needs to be resolvable by the class loader, thus the JAR is copied
> locally to a temp location for use.
> During map/reduce jobs the local jar (like all jars added with the ADD JAR
> command) should then be added to the distributed cache. It looks like this is
> where the issue is occurring, but based on the path in the error message I
> suspect that either Hive or Hadoop is mistaking what should be a local path
> for an HDFS path.
>
> On Jan 4, 2015, at 10:23 AM, [email protected]
> <[email protected]> wrote:
>
>> Hi,
>>
>> A question: Why does it need to copy the JAR file to the temp folder? Why
>> can't it use the file specified in USING JAR
>> 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar' directly?
>>
>> Regards
>> Arthur
>>
>>
>> On 4 Jan, 2015, at 7:48 am, [email protected]
>> <[email protected]> wrote:
>>
>>> Hi,
>>>
>>>
>>> A1: Are all of these commands (Step 1-5) from the same Hive CLI prompt?
>>> Yes
>>>
>>> A2: Would you be able to check if such a file exists with the same path,
>>> on the local file system?
>>> The file does not exist on the local file system.
>>>
>>>
>>> Is there a way to set another "tmp" folder for Hive? Or any suggestions
>>> to fix this issue?
>>>
>>> Thanks !!
>>>
>>> Arthur
>>>
>>>
>>>
>>> On 3 Jan, 2015, at 4:12 am, Jason Dere <[email protected]> wrote:
>>>
>>>> The point of USING JAR as part of the CREATE FUNCTION statement is to
>>>> avoid having to do ADD JAR/aux path stuff to get the UDF to work.
>>>>
>>>> Are all of these commands (Step 1-5) from the same Hive CLI prompt?
>>>>
>>>>>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate'
>>>>>> using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>>>>>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>> Added
>>>>>> /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>> to class path
>>>>>> Added resource:
>>>>>> /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>> OK
>>>>
>>>>
>>>> One note,
>>>> /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>> here should actually be on the local file system, not on HDFS where you
>>>> were checking in Step 5. During CREATE FUNCTION/query compilation, Hive
>>>> copies the source JAR
>>>> (hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar) to a temp
>>>> location on the local file system, where it's used by that Hive session.
>>>>
>>>> The location mentioned in the FileNotFoundException
>>>> (hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar)
>>>> has a different path than the local copy mentioned during CREATE FUNCTION
>>>> (/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar).
>>>> I'm not really sure why it is an HDFS path here either, but I'm not too
>>>> familiar with what goes on during the job submission process. But the fact
>>>> that this HDFS path has the same naming convention as the directory used
>>>> for downloading resources locally (***_resources) looks a little fishy to
>>>> me. Would you be able to check if such a file exists with the same path,
>>>> on the local file system?
>>>>
>>>> On Dec 31, 2014, at 5:22 AM, Nirmal Kumar <[email protected]>
>>>> wrote:
>>>>
>>>>> Important: HiveQL's ADD JAR operation does not work with HiveServer2
>>>>> and the Beeline client when Beeline runs on a different host. As an
>>>>> alternative to ADD JAR, Hive auxiliary path functionality should be used
>>>>> as described below.
>>>>>
>>>>> Refer:
>>>>> http://www.cloudera.com/content/cloudera/en/documentation/cloudera-manager/v4-8-0/Cloudera-Manager-Managing-Clusters/cmmc_hive_udf.html
>>>>>
>>>>>
>>>>> Thanks,
>>>>> -Nirmal
>>>>>
>>>>> From: [email protected] <[email protected]>
>>>>> Sent: Tuesday, December 30, 2014 9:54 PM
>>>>> To: vic0777
>>>>> Cc: [email protected]; [email protected]
>>>>> Subject: Re: CREATE FUNCTION: How to automatically load extra jar file?
>>>>>
>>>>> Thank you.
>>>>>
>>>>> Will this work for HiveServer2?
>>>>>
>>>>>
>>>>> Arthur
>>>>>
>>>>> On 30 Dec, 2014, at 2:24 pm, vic0777 <[email protected]> wrote:
>>>>>
>>>>>>
>>>>>> You can put it into $HOME/.hiverc like this: ADD JAR
>>>>>> full_path_of_the_jar. Then, the file is automatically loaded when Hive
>>>>>> is started.
>>>>>>
>>>>>> Wantao
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> At 2014-12-30 11:01:06, "[email protected]"
>>>>>> <[email protected]> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I am using Hive 0.13.1 on Hadoop 2.4.1 and need to automatically load an
>>>>>> extra JAR file into Hive for a UDF. Below are my steps to create the UDF
>>>>>> function. I have tried the following but still have had no luck.
>>>>>>
>>>>>> Please help!!
>>>>>>
>>>>>> Regards
>>>>>> Arthur
>>>>>>
>>>>>>
>>>>>> Step 1: (make sure the jar is in HDFS)
>>>>>> hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>>>>> -rw-r--r-- 3 hadoop hadoop 57388 2014-12-30
>>>>>> 10:02 hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>>
>>>>>> Step 2: (drop if function exists)
>>>>>> hive> drop function sysdate;
>>>>>>
>>>>>> OK
>>>>>> Time taken: 0.013 seconds
>>>>>>
>>>>>> Step 3: (create function using the jar in HDFS)
>>>>>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate'
>>>>>> using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>>>>>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>> Added
>>>>>> /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>> to class path
>>>>>> Added resource:
>>>>>> /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>> OK
>>>>>> Time taken: 0.034 seconds
>>>>>>
>>>>>> Step 4: (test)
>>>>>> hive> select sysdate();
>>>>>>
>>>>>>
>>>>>> Automatically selecting local only mode for query
>>>>>> Total jobs = 1
>>>>>> Launching Job 1 out of 1
>>>>>> Number of reduce tasks is set to 0 since there's no reduce operator
>>>>>> SLF4J: Class path contains multiple SLF4J bindings.
>>>>>> SLF4J: Found binding in
>>>>>> [jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>>> SLF4J: Found binding in
>>>>>> [jar:file:/hadoop/hbase-0.98.5-hadoop2/lib/phoenix-4.1.0-client-hadoop2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
>>>>>> explanation.
>>>>>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>>>>>> 14/12/30 10:17:06 WARN conf.Configuration:
>>>>>> file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an
>>>>>> attempt to override final parameter:
>>>>>> mapreduce.job.end-notification.max.retry.interval; Ignoring.
>>>>>> 14/12/30 10:17:06 WARN conf.Configuration:
>>>>>> file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an
>>>>>> attempt to override final parameter: yarn.nodemanager.loacl-dirs;
>>>>>> Ignoring.
>>>>>> 14/12/30 10:17:06 WARN conf.Configuration:
>>>>>> file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an
>>>>>> attempt to override final parameter:
>>>>>> mapreduce.job.end-notification.max.attempts; Ignoring.
>>>>>> Execution log at:
>>>>>> /tmp/hadoop/hadoop_20141230101717_282ec475-8621-40fa-8178-a7927d81540b.log
>>>>>> java.io.FileNotFoundException: File does not
>>>>>> exist:hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>>>>> at
>>>>>> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
>>>>>> at
>>>>>> org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
>>>>>> at
>>>>>> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>>>>>> at
>>>>>> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
>>>>>> at
>>>>>> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
>>>>>> at
>>>>>> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
>>>>>> at
>>>>>> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
>>>>>> at
>>>>>> org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
>>>>>> at
>>>>>> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
>>>>>> at
>>>>>> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
>>>>>> at
>>>>>> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
>>>>>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
>>>>>> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>> at
>>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>>>>>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
>>>>>> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
>>>>>> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>> at
>>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>>>>>> at
>>>>>> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
>>>>>> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
>>>>>> at
>>>>>> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
>>>>>> at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:740)
>>>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>> at
>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>>> at
>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>> at java.lang.reflect.Method.invoke(Method.java:606)
>>>>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>>>>>> Job Submission failed with exception 'java.io.FileNotFoundException(File
>>>>>> does not
>>>>>> exist:hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar)'
>>>>>> Execution failed with exit status: 1
>>>>>> Obtaining error information
>>>>>> Task failed!
>>>>>> Task ID:
>>>>>> Stage-1
>>>>>> Logs:
>>>>>> /tmp/hadoop/hive.log
>>>>>> FAILED: Execution Error, return code 1 from
>>>>>> org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>>>>
>>>>>>
>>>>>> Step 5: (check the file)
>>>>>> hive> dfs -ls
>>>>>> /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>>>>> ls:
>>>>>> `/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar':
>>>>>> No such file or directory
>>>>>> Command failed with exit code = 1
>>>>>> Query returned non-zero code: 1, cause: null
>>>>>>
>>>>>
>>>>> NOTE: This message may contain information that is confidential,
>>>>> proprietary, privileged or otherwise protected by law. The message is
>>>>> intended solely for the named addressee. If received in error, please
>>>>> destroy and notify the sender. Any use of this email is prohibited when
>>>>> received in error. Impetus does not represent, warrant and/or guarantee,
>>>>> that the integrity of this communication has been maintained nor that the
>>>>> communication is free of errors, virus, interception or interference.
>>>>
>>>>
>>>> CONFIDENTIALITY NOTICE
>>>> NOTICE: This message is intended for the use of the individual or entity
>>>> to which it is addressed and may contain information that is confidential,
>>>> privileged and exempt from disclosure under applicable law. If the reader
>>>> of this message is not the intended recipient, you are hereby notified
>>>> that any printing, copying, dissemination, distribution, disclosure or
>>>> forwarding of this communication is strictly prohibited. If you have
>>>> received this communication in error, please contact the sender
>>>> immediately and delete it from your system. Thank You.
>>>
>>
>
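
On the question earlier in this thread about why the JAR is copied to a local temp folder instead of being used from hdfs:// directly: below is a small sketch, assuming the JAR reaches the class path through something like java.net.URLClassLoader (an assumption, not verified against the Hive source). A plain JVM has no URL handler for the hdfs scheme, so the hdfs:// location cannot even be turned into a java.net.URL, while a local file path can. The class name, method name, and the example_resources directory are made up for the example.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Paths;

// Illustrative helper only; not part of Hive or Hadoop.
class JarUrlCheck {
    // True if this JVM can build a java.net.URL for the given absolute URI string.
    static boolean isLoadableUrl(String location) {
        try {
            URI.create(location).toURL();
            return true;
        } catch (MalformedURLException e) {
            // Thrown when no protocol handler is registered for the scheme.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // No hdfs handler in a plain JVM (Hadoop can register one, but that is not done here).
        System.out.println(isLoadableUrl("hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar"));

        // A local copy gives a file: URL that a URLClassLoader accepts.
        // The directory name below is hypothetical.
        URL local = Paths.get("/tmp/example_resources/nexr-hive-udf-0.2-SNAPSHOT.jar")
                         .toUri().toURL();
        try (URLClassLoader loader = new URLClassLoader(new URL[] { local })) {
            System.out.println(loader.getURLs()[0]);
        }
    }
}
```

This is only meant to illustrate why a local copy is the simple route for the class loader; it says nothing about how the JAR is later shipped to the distributed cache.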