Hi,
I am using Hive 0.13.1 on Hadoop 2.4.1, I need to automatically load an extra
JAR file to hive for UDF, below are my steps to create the UDF function. I have
tried the following but still no luck to get thru.
Please help!!
Regards
Arthur
Step 1: (make sure the jar in in HDFS)
hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
-rw-r--r-- 3 hadoop hadoop 57388 2014-12-30 10:02
hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
Step 2: (drop if function exists)
hive> drop function sysdate;
OK
Time taken: 0.013 seconds
Step 3: (create function using the jar in HDFS)
hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using
JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
Added
/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
to class path
Added resource:
/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
OK
Time taken: 0.034 seconds
Step 4: (test)
hive> select sysdate();
Automatically selecting local only mode for query
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
[jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/hadoop/hbase-0.98.5-hadoop2/lib/phoenix-4.1.0-client-hadoop2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/12/30 10:17:06 WARN conf.Configuration:
file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an
attempt to override final parameter:
mapreduce.job.end-notification.max.retry.interval; Ignoring.
14/12/30 10:17:06 WARN conf.Configuration:
file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an
attempt to override final parameter: yarn.nodemanager.loacl-dirs; Ignoring.
14/12/30 10:17:06 WARN conf.Configuration:
file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an
attempt to override final parameter:
mapreduce.job.end-notification.max.attempts; Ignoring.
Execution log at:
/tmp/hadoop/hadoop_20141230101717_282ec475-8621-40fa-8178-a7927d81540b.log
java.io.FileNotFoundException: File does not exist:
hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
at
org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
at
org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
at
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
at
org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
at
org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
at
org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
at
org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
at
org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
at
org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
at
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
at
org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
at
org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
at
org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:740)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Job Submission failed with exception 'java.io.FileNotFoundException(File does
not exist:
hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar)'
Execution failed with exit status: 1
Obtaining error information
Task failed!
Task ID:
Stage-1
Logs:
/tmp/hadoop/hive.log
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.mr.MapRedTask
Step 5: (check the file)
hive> dfs -ls
/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar;
ls:
`/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar':
No such file or directory
Command failed with exit code = 1
Query returned non-zero code: 1, cause: null