Smoke Test after 1 days 7 hours 5 minutes 19 seconds 70 msec, Failed with Error: GC overhead limit exceeded
Hi,

My Hive version is 0.13.1. I tried a smoke test; after 1 days 7 hours 5 minutes 19 seconds 70 msec, the job failed with error: Error: GC overhead limit exceeded

LOG:
2014-10-12 06:16:07,288 Stage-6 map = 100%, reduce = 50%, Cumulative CPU 425.35 sec
2014-10-12 06:16:12,431 Stage-6 map = 100%, reduce = 67%, Cumulative CPU 433.01 sec
2014-10-12 06:16:15,515 Stage-6 map = 100%, reduce = 100%, Cumulative CPU 447.59 sec
…
Hadoop job information for Stage-19: number of mappers: 3; number of reducers: 0
2014-10-12 06:16:30,643 Stage-19 map = 0%, reduce = 0%
2014-10-12 06:16:55,494 Stage-19 map = 33%, reduce = 0%, Cumulative CPU 153.83 sec
2014-10-12 06:16:56,520 Stage-19 map = 0%, reduce = 0%
2014-10-12 06:17:57,037 Stage-19 map = 0%, reduce = 0%
2014-10-12 06:18:27,720 Stage-19 map = 100%, reduce = 0%
MapReduce Total cumulative CPU time: 2 minutes 33 seconds 830 msec
Ended Job = job_1413024651684_0033 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1413024651684_0033_m_01 (and more) from job job_1413024651684_0033

Task with the most failures(4):
-
Task ID:
  task_1413024651684_0033_m_02
URL:
  http://m1:8088/taskdetails.jsp?jobid=job_1413024651684_0033&tipid=task_1413024651684_0033_m_02
-
Diagnostic Messages for this Task:
Error: GC overhead limit exceeded

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Job 0: Map: 5     Reduce: 1    Cumulative CPU: 10705.42 sec   HDFS Read: 829911667     HDFS Write: 693918010684  SUCCESS
Job 1: Map: 2684  Reduce: 721  Cumulative CPU: 100612.23 sec  HDFS Read: 720031197955  HDFS Write: 56301916      SUCCESS
Job 2: Map: 25    Reduce: 6    Cumulative CPU: 447.59 sec     HDFS Read: 5785850462    HDFS Write: 22244710      SUCCESS
Job 3: Map: 3     Cumulative CPU: 153.83 sec                  HDFS Read: 0             HDFS Write: 0             FAIL
Total MapReduce CPU Time Spent: 1 days 7 hours 5 minutes 19 seconds 70 msec

my smoke test SQL:

SELECT O_YEAR,
       SUM(CASE WHEN NATION = 'BRAZIL' THEN VOLUME ELSE 0 END) / SUM(VOLUME) AS MKT_SHARE
FROM (SELECT YEAR(cast(O_ORDERDATE as date)) AS O_YEAR,
             L_EXTENDEDPRICE * (1 - L_DISCOUNT) AS VOLUME,
             N2.N_NAME AS NATION
      FROM PART,
           SUPPLIER,
           LINEITEM,
           ORDERS,
           CUSTOMER,
           NATION N1,
           NATION N2,
           REGION
      WHERE P_PARTKEY = L_PARTKEY
        AND S_SUPPKEY = L_SUPPKEY
        AND L_ORDERKEY = O_ORDERKEY
        AND O_CUSTKEY = C_CUSTKEY
        AND C_NATIONKEY = N1.N_NATIONKEY
        AND N1.N_REGIONKEY = R_REGIONKEY
        AND R_NAME = 'AMERICA'
        AND S_NATIONKEY = N2.N_NATIONKEY
        AND cast(O_ORDERDATE as date) >= cast('1995-01-01' as date)
        AND cast(O_ORDERDATE as date) <= cast('1996-12-31' as date)
        AND P_TYPE = 'ECONOMY ANODIZED STEEL') AS ALL_NATIONS
GROUP BY O_YEAR
ORDER BY O_YEAR;

Please help.

Regards
Arthur
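A note not from the thread: "GC overhead limit exceeded" in a map task usually means the task JVM spent nearly all of its time garbage-collecting because the heap is too small for the data being processed. Before rewriting anything, a common first step is to re-run with larger task containers and heaps. The property names below are standard Hadoop 2.x settings that can be set per-session from Hive; the values are illustrative assumptions to be sized against the cluster's NodeManager limits:

-- illustrative values only (assumptions); size against your NodeManager limits
-- container size and JVM heap for map tasks (heap roughly 80% of the container):
set mapreduce.map.memory.mb=4096;
set mapreduce.map.java.opts=-Xmx3276m;
-- the same ratio for reducers:
set mapreduce.reduce.memory.mb=4096;
set mapreduce.reduce.java.opts=-Xmx3276m;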
Re: Smoke Test after 1 days 7 hours 5 minutes 19 seconds 70 msec, Failed with Error: GC overhead limit exceeded
Hi,

I have managed to resolve the issue by tuning the SQL.

Regards
Arthur

On 12 Oct, 2014, at 6:49 am, arthur.hk.c...@gmail.com wrote:
> Hi,
>
> My Hive version is 0.13.1, I tried a smoke test, after 1 days 7 hours 5 minutes 19 seconds 70 msec, the job failed with error: Error: GC overhead limit exceeded
>
> [...]
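Arthur does not say what the tuning was. One plausible rewrite, an assumption rather than his actual fix, is to replace the eight-way comma join with explicit JOIN ... ON clauses anchored on LINEITEM, so each join condition is attached to the pair of tables it relates instead of being recovered from one large WHERE clause:

SELECT O_YEAR,
       SUM(CASE WHEN NATION = 'BRAZIL' THEN VOLUME ELSE 0 END) / SUM(VOLUME) AS MKT_SHARE
FROM (SELECT YEAR(cast(O_ORDERDATE as date)) AS O_YEAR,
             L_EXTENDEDPRICE * (1 - L_DISCOUNT) AS VOLUME,
             N2.N_NAME AS NATION
      FROM LINEITEM
      JOIN PART      ON P_PARTKEY = L_PARTKEY
      JOIN SUPPLIER  ON S_SUPPKEY = L_SUPPKEY
      JOIN ORDERS    ON L_ORDERKEY = O_ORDERKEY
      JOIN CUSTOMER  ON O_CUSTKEY = C_CUSTKEY
      JOIN NATION N1 ON C_NATIONKEY = N1.N_NATIONKEY
      JOIN REGION    ON N1.N_REGIONKEY = R_REGIONKEY
      JOIN NATION N2 ON S_NATIONKEY = N2.N_NATIONKEY
      WHERE R_NAME = 'AMERICA'
        AND P_TYPE = 'ECONOMY ANODIZED STEEL'
        AND cast(O_ORDERDATE as date) >= cast('1995-01-01' as date)
        AND cast(O_ORDERDATE as date) <= cast('1996-12-31' as date)) AS ALL_NATIONS
GROUP BY O_YEAR
ORDER BY O_YEAR;

The small dimension tables (NATION, REGION) are also natural map-join candidates; on Hive 0.13, set hive.auto.convert.join=true; (if not already enabled) lets the optimizer broadcast them rather than shuffle them.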
java.io.FileNotFoundException: File does not exist (nexr-hive-udf-0.2-SNAPSHOT.jar)
Hi,

Please help! I am using hiveserver2 on HIVE 0.13 on Hadoop 2.4.1, also nexr-hive-udf-0.2-SNAPSHOT.jar.

I can run the query from the CLI, e.g.:

hive> SELECT add_months(sysdate(), +12) FROM DUAL;
Execution completed successfully
MapredLocal task succeeded
OK
2015-12-17
Time taken: 7.393 seconds, Fetched: 1 row(s)

hive-site.xml (added):

<property>
  <name>hive.aux.jars.path</name>
  <value>$HIVE_HOME/nexr-hive-udf-0.2-SNAPSHOT.jar,$HIVE_HOME/csv-serde-1.1.2-0.11.0-all.jar</value>
</property>

hive-env.sh (added):

export HIVE_AUX_JARS_PATH=$HIVE_HOME/lib/csv-serde-1.1.2-0.11.0-all.jar:$HIVE_HOME/lib/nexr-hive-udf-0.2-SNAPSHOT.jar

However, if it is accessed via hiveserver2, I got the following error. Please help.

Regards
Arthur

14/12/17 16:47:52 WARN conf.Configuration: file:/tmp/hive_2014-12-17_16-47-51_096_5821374687950910377-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
Execution log at: /tmp/hduser_20141217164747_80b15b85-7820-4e3a-88ea-afffa131ff5a.log
java.io.FileNotFoundException: File does not exist: hdfs://mycluster/hadoop_data/hadoop_data/tmp/mapred/staging/hduser1962118853/.staging/job_local1962118853_0001/libjars/nexr-hive-udf-0.2-SNAPSHOT.jar
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:93)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:420)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:740)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: hdfs://mycluster/hadoop_data/hadoop_data/tmp/mapred/staging/hduser1962118853/.staging/job_local1962118853_0001/libjars/nexr-hive-udf-0.2-SNAPSHOT.jar)'
Execution failed with exit status: 1
Obtaining error information
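One detail of the configuration above is worth flagging, as an observation rather than a confirmed fix: Hadoop's Configuration class does not expand shell variables, so $HIVE_HOME inside hive-site.xml is taken literally, and schemeless or relative entries in hive.aux.jars.path can later be resolved against the default (HDFS) file system. A hedged sketch using absolute file:// URIs, with an assumed install path:

<property>
  <name>hive.aux.jars.path</name>
  <!-- assumed install location; list each jar as an absolute file:// URI -->
  <value>file:///usr/local/hive/lib/nexr-hive-udf-0.2-SNAPSHOT.jar,file:///usr/local/hive/lib/csv-serde-1.1.2-0.11.0-all.jar</value>
</property>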
CREATE FUNCTION: How to automatically load extra jar file?
Hi,

I am using Hive 0.13.1 on Hadoop 2.4.1. I need to automatically load an extra JAR file into Hive for a UDF; below are my steps to create the UDF function. I have tried the following but still have no luck getting through. Please help!!

Regards
Arthur

Step 1: (make sure the jar is in HDFS)
hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
-rw-r--r-- 3 hadoop hadoop 57388 2014-12-30 10:02 hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar

Step 2: (drop the function if it exists)
hive> drop function sysdate;
OK
Time taken: 0.013 seconds

Step 3: (create the function using the jar in HDFS)
hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
OK
Time taken: 0.034 seconds

Step 4: (test)
hive> select sysdate();
Automatically selecting local only mode for query
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/hadoop/hbase-0.98.5-hadoop2/lib/phoenix-4.1.0-client-hadoop2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: yarn.nodemanager.loacl-dirs; Ignoring.
14/12/30 10:17:06 WARN conf.Configuration: file:/tmp/hadoop/hive_2014-12-30_10-17-04_514_2721050094719255719-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
Execution log at: /tmp/hadoop/hadoop_20141230101717_282ec475-8621-40fa-8178-a7927d81540b.log
java.io.FileNotFoundException: File does not exist: hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
    at org.apache.hadoop.
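Two standard Hive statements, not from the thread, can confirm that the registration half of this worked before the failing job submission is debugged:

-- a permanent UDF created with CREATE FUNCTION should appear in the function list:
hive> SHOW FUNCTIONS;
-- and this shows the class backing the function:
hive> DESCRIBE FUNCTION EXTENDED sysdate;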
Re: CREATE FUNCTION: How to automatically load extra jar file?
Thank you.

Will this work for hiveserver2 ?

Arthur

On 30 Dec, 2014, at 2:24 pm, vic0777 wrote:
>
> You can put it into $HOME/.hiverc like this: ADD JAR full_path_of_the_jar. Then, the file is automatically loaded when Hive is started.
>
> Wantao
>
> At 2014-12-30 11:01:06, "arthur.hk.c...@gmail.com" wrote:
> Hi,
>
> I am using Hive 0.13.1 on Hadoop 2.4.1, I need to automatically load an extra JAR file to hive for UDF, below are my steps to create the UDF function. I have tried the following but still no luck to get thru.
>
> Please help!!
>
> [...]
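A minimal sketch of the .hiverc approach Wantao describes; the jar locations are assumptions, use wherever the jars actually live on the client host:

-- $HOME/.hiverc, read by the Hive CLI at startup
ADD JAR /usr/local/hive/lib/nexr-hive-udf-0.2-SNAPSHOT.jar;
ADD JAR /usr/local/hive/lib/csv-serde-1.1.2-0.11.0-all.jar;

Note that .hiverc is a CLI-side mechanism, which is exactly why Arthur's follow-up question matters: a Beeline/JDBC session against HiveServer2 does not run the client's .hiverc (later Hive releases added a server-side init file, hive.server2.global.init.file.location, for a comparable effect).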
Re: CREATE FUNCTION: How to automatically load extra jar file?
Hi,

Thanks. Below are my steps: I did copy my JAR to HDFS and "CREATE FUNCTION using the JAR in HDFS", however during my smoke test I got FileNotFoundException.

>> java.io.FileNotFoundException: File does not exist: hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar

>> Step 1: (make sure the jar is in HDFS)
>> hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
>> -rw-r--r-- 3 hadoop hadoop 57388 2014-12-30 10:02 hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>
>> Step 2: (drop the function if it exists)
>> hive> drop function sysdate;
>> OK
>> Time taken: 0.013 seconds
>>
>> Step 3: (create the function using the jar in HDFS)
>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>> OK
>> Time taken: 0.034 seconds
>>
>> Step 4: (test)
>> hive> select sysdate();
>> Execution log at: /tmp/hadoop/hadoop_20141230101717_282ec475-8621-40fa-8178-a7927d81540b.log
>> java.io.FileNotFoundException: File does not exist: hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar

Please help!

Arthur

On 31 Dec, 2014, at 12:31 am, Nitin Pawar wrote:
> just copy-pasting Jason's reply to the other thread:
>
> If you have a recent version of Hive (0.13+), you could try registering your UDF as a "permanent" UDF which was added in HIVE-6047:
>
> 1) Copy your JAR somewhere on HDFS, say hdfs:///home/nirmal/udf/hiveUDF-1.0-SNAPSHOT.jar.
> 2) In Hive, run CREATE FUNCTION zeroifnull AS 'com.test.udf.ZeroIfNullUDF' USING JAR 'hdfs:///home/nirmal/udf/hiveUDF-1.0-SNAPSHOT.jar';
>
> The function definition should be saved in the metastore and Hive should remember to pull the JAR from the location you specified in the CREATE FUNCTION call.
>
> On Tue, Dec 30, 2014 at 9:54 PM, arthur.hk.c...@gmail.com wrote:
> Thank you.
> Will this work for hiveserver2 ?
> [...]
Re: CREATE FUNCTION: How to automatically load extra jar file?
Hi,

I have already placed it in another folder, not the /tmp/ one:

>>> hive> dfs -ls hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar;
>>> -rw-r--r-- 3 hadoop hadoop 57388 2014-12-30 10:02 hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar

However, Hive copies it to a /tmp/ folder during its "CREATE FUNCTION USING JAR":

>>> Step 3: (create the function using the jar in HDFS)
>>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>> OK
>>> Time taken: 0.034 seconds

Any ideas how to stop Hive from using the /tmp/ folder?

Arthur

On 31 Dec, 2014, at 2:27 pm, Nitin Pawar wrote:
> If you put a file inside /tmp there is no guarantee it will live there forever; that depends on your cluster configuration.
>
> You may want to put it in a place where all users can access it, for example by making a dedicated folder and giving it read permission.
>
> On Wed, Dec 31, 2014 at 11:40 AM, arthur.hk.c...@gmail.com wrote:
> Hi,
>
> Thanks. Below are my steps: I did copy my JAR to HDFS and "CREATE FUNCTION using the JAR in HDFS", however during my smoke test I got FileNotFoundException.
> [...]
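On the /tmp question itself: the _resources directory in the log matches the pattern of Hive's hive.downloaded.resources.dir setting, which controls where USING JAR resources are localized. It is documented for later Hive releases, so whether 0.13.1 honors it is an assumption worth verifying before relying on it:

<property>
  <name>hive.downloaded.resources.dir</name>
  <!-- assumption: any local directory writable by the Hive process -->
  <value>/var/hive/resources/${hive.session.id}_resources</value>
</property>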
Re: CREATE FUNCTION: How to automatically load extra jar file?
Hi,

A1: Are all of these commands (Step 1-5) from the same Hive CLI prompt?
Yes

A2: Would you be able to check if such a file exists with the same path, on the local file system?
The file does not exist on the local file system.

Is there a way to set another "tmp" folder for HIVE? Or any suggestions to fix this issue?

Thanks !!

Arthur

On 3 Jan, 2015, at 4:12 am, Jason Dere wrote:
> The point of USING JAR as part of the CREATE FUNCTION statement is to try to avoid having to do ADD JAR/aux path stuff to get the UDF to work.
>
> Are all of these commands (Step 1-5) from the same Hive CLI prompt?
>
>>> hive> CREATE FUNCTION sysdate AS 'com.nexr.platform.hive.udf.UDFSysDate' using JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar';
>>> converting to local hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar
>>> Added /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar to class path
>>> Added resource: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar
>>> OK
>
> One note: /tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar here should actually be on the local file system, not on HDFS where you were checking in Step 5. During CREATE FUNCTION/query compilation, Hive will make a copy of the source JAR (hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar) to a temp location on the local file system, where it is used by that Hive session.
>
> The location mentioned in the FileNotFoundException (hdfs://tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar) has a different path than the local copy mentioned during CREATE FUNCTION (/tmp/69700312-684c-45d3-b27a-0732bb268ddc_resources/nexr-hive-udf-0.2-SNAPSHOT.jar). I'm not really sure why it is an HDFS path here either, but I'm not too familiar with what goes on during the job submission process. But the fact that this HDFS path has the same naming convention as the directory used for downloading resources locally (***_resources) looks a little fishy to me. Would you be able to check if such a file exists with the same path, on the local file system?
>
> On Dec 31, 2014, at 5:22 AM, Nirmal Kumar wrote:
>
>> Important: HiveQL's ADD JAR operation does not work with HiveServer2 and the Beeline client when Beeline runs on a different host. As an alternative to ADD JAR, Hive auxiliary path functionality should be used as described below.
>>
>> Refer: http://www.cloudera.com/content/cloudera/en/documentation/cloudera-manager/v4-8-0/Cloudera-Manager-Managing-Clusters/cmmc_hive_udf.html
>>
>> Thanks,
>> -Nirmal
>>
>> From: arthur.hk.c...@gmail.com
>> Sent: Tuesday, December 30, 2014 9:54 PM
>> To: vic0777
>> Cc: arthur.hk.c...@gmail.com; user@hive.apache.org
>> Subject: Re: CREATE FUNCTION: How to automatically load extra jar file?
>>
>> Thank you.
>> Will this work for hiveserver2 ?
>> [...]
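Jason's local-vs-HDFS check can be run from the same CLI session: in the Hive CLI a leading ! executes a local shell command, while dfs talks to HDFS. The paths below are the ones from the traces:

-- local file system:
hive> !ls /tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/;
-- HDFS:
hive> dfs -ls /tmp/5c658d17-dbeb-4b84-ae8d-ba936404c8bc_resources/;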
Re: CREATE FUNCTION: How to automatically load extra jar file?
Hi,

A question: why does it need to copy the jar file to the temp folder? Why couldn't it use the file defined in USING JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar' directly?

Regards
Arthur

On 4 Jan, 2015, at 7:48 am, arthur.hk.c...@gmail.com wrote:
> Hi,
>
> A1: Are all of these commands (Step 1-5) from the same Hive CLI prompt?
> Yes
>
> A2: Would you be able to check if such a file exists with the same path, on the local file system?
> The file does not exist on the local file system.
>
> Is there a way to set another "tmp" folder for HIVE? Or any suggestions to fix this issue?
>
> Thanks !!
>
> Arthur
> [...]
Re: CREATE FUNCTION: How to automatically load extra jar file?
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
    at com.sun.proxy.$Proxy11.getAllDatabases(Unknown Source)
    at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1098)
    at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionNames(FunctionRegistry.java:671)
    at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionNames(FunctionRegistry.java:662)
    at org.apache.hadoop.hive.cli.CliDriver.getCommandCompletor(CliDriver.java:540)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:758)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Regards
Arthur

On 7 Jan, 2015, at 7:22 am, Jason Dere wrote:
> Does your hive.log contain any lines with "adding libjars:"?
>
> May also search for any lines containing "_resources"; would like to see the result of both searches.
>
> For example, mine is showing the following line:
> 2015-01-06 14:53:28,115 INFO mr.ExecDriver (ExecDriver.java:execute(307)) - adding libjars: file:///tmp/d0ed1585-d9e6-4944-b985-225351574de0_resources/spatial-sdk-hive-1.0.3-SNAPSHOT.jar,file:///tmp/d0ed1585-d9e6-4944-b985-225351574de0_resources/esri-geometry-api.jar
>
> I wonder if your libjars setting for the map/reduce job is somehow getting sent without the "file:///", which might be causing hadoop to interpret the path as an HDFS path rather than a local path.
>
> On Jan 6, 2015, at 1:11 AM, Arthur.hk.chan wrote:
>
>> Hi,
>>
>> my hadoop's core-site.xml contains the following about tmp:
>>
>> <property>
>>   <name>hadoop.tmp.dir</name>
>>   <value>/hadoop_data/hadoop_data/tmp</value>
>> </property>
>>
>> my hive-default.xml contains the following about tmp:
>>
>> <property>
>>   <name>hive.exec.scratchdir</name>
>>   <value>/tmp/hive-${user.name}</value>
>>   <description>Scratch space for Hive jobs</description>
>> </property>
>>
>> <property>
>>   <name>hive.exec.local.scratchdir</name>
>>   <value>/tmp/${user.name}</value>
>>   <description>Local scratch space for Hive jobs</description>
>> </property>
>>
>> Could this be related to a configuration issue, or is it a bug?
>>
>> Please help!
>>
>> Regards
>> Arthur
>>
>> On 6 Jan, 2015, at 3:45 am, Jason Dere wrote:
>>
>>> During query compilation Hive needs to instantiate the UDF class, and so the JAR needs to be resolvable by the class loader; thus the JAR is copied locally to a temp location for use.
>>> During map/reduce jobs the local jar (like all jars added with the ADD JAR command) should then be added to the distributed cache. It looks like this is where the issue is occurring, but based on the path in the error message I suspect that either Hive or Hadoop is mistaking what should be a local path for an HDFS path.
>>>
>>> On Jan 4, 2015, at 10:23 AM, arthur.hk.c...@gmail.com wrote:
>>>
>>>> Hi,
>>>>
>>>> A question: why does it need to copy the jar file to the temp folder? Why couldn't it use the file defined in USING JAR 'hdfs://hadoop/hive/nexr-hive-udf-0.2-SNAPSHOT.jar' directly?
>>>> [...]
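Jason's two searches can be run directly against the client-side log. The log path below is an assumption; hive.log normally lands under the directory given by the hive.log.dir logging property, often /tmp/<user>/ by default:

grep "adding libjars:" /tmp/hduser/hive.log
grep "_resources" /tmp/hduser/hive.log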
Re: CREATE FUNCTION: How to automatically load extra jar file?
ternal(JDOQLQuery.java:370)
    at org.datanucleus.store.query.Query.executeQuery(Query.java:1744)
    at org.datanucleus.store.query.Query.executeWithArray(Query.java:1672)
    at org.datanucleus.store.query.Query.execute(Query.java:1654)
    at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:221)
    at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.<init>(MetaStoreDirectSql.java:121)
    at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:252)
    at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:223)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:58)
    at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:67)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:497)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:475)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:523)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:397)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:356)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:54)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:59)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.newHMSHandler(HiveMetaStore.java:4944)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:171)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1410)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:62)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
    at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2453)
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2465)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:340)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

What could be wrong?

Regards
Arthur

On 11 Jan, 2015, at 5:18 pm, arthur.hk.c...@gmail.com wrote:
> Hi,
>
> 2015-01-04 08:57:12,154 ERROR [main]: DataNucleus.Datastore (Log4JLogger.java:error(115)) - An exception was thrown while adding/validating class(es) : Specified key was too long; max key length is 767 bytes
> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was too long; max key length is 767 bytes
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     at com.mysql.jdbc.Util.handleNewInstance(Util.java:408)
>     at com.mysql.jdbc.Util.getInstance(Util.java:383)
>     at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1062)
>     at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4226)
>     at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4158)
>     at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2615)
>     at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2776)
>     at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2834)
>     at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2783)
>     at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:908)
>     at com.mysql.jdbc.StatementImpl.execute(Statement
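An aside on "Specified key was too long; max key length is 767 bytes" (a well-known symptom, though not confirmed as the cause of this thread's jar problem): it appears when the MySQL-backed metastore schema uses a multi-byte character set such as utf8, so DataNucleus's varchar index keys overflow InnoDB's 767-byte limit. The usual remedy is to keep the metastore database, and any tables already created in it, in latin1; the database name "hive" below is an assumption:

mysql> ALTER DATABASE hive CHARACTER SET latin1 COLLATE latin1_swedish_ci;
mysql> SELECT table_name, table_collation
    ->   FROM information_schema.tables
    ->  WHERE table_schema = 'hive';

The second query matters because, as the next message shows, the database-level charset can already report latin1 while individual tables created earlier remain utf8.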
Re: CREATE FUNCTION: How to automatically load extra jar file?
eflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: hdfs://mycluster/tmp/abce45b1-6041-40b6-83ed-8c6491216360_resources/nexr.jar)'
Execution failed with exit status: 1
Obtaining error information
Task failed!

5) I cannot find "abce45b1-6041-40b6-83ed-8c6491216360_resources/nexr.jar" at "hdfs://mycluster/tmp/abce45b1-6041-40b6-83ed-8c6491216360_resources/nexr.jar", and it is not in the local /tmp/ folder either.

I think it should be the case here that the "libjars setting for the map/reduce job is somehow getting sent without the 'file:///', which might be causing hadoop to interpret the path as a HDFS path rather than a local path."

Is there a way to verify my libjars setting for the map/reduce job?

Please help!

Regards
Arthur

On 11 Jan, 2015, at 5:35 pm, arthur.hk.c...@gmail.com wrote:
> Hi,
>
> mysql> show variables like "character_set_database";
> +------------------------+--------+
> | Variable_name          | Value  |
> +------------------------+--------+
> | character_set_database | latin1 |
> +------------------------+--------+
> 1 row in set (0.00 sec)
>
> mysql> show variables like "collation_database";
> +--------------------+-------------------+
> | Variable_name      | Value             |
> +--------------------+-------------------+
> | collation_database | latin1_swedish_ci |
> +--------------------+-------------------+
> 1 row in set (0.00 sec)
>
> 2015-01-11 17:21:07,835 ERROR [main]: DataNucleus.Datastore (Log4JLogger.java:error(115)) - An exception was thrown while adding/validating class(es) : Specified key was too long; max key length is 767 bytes
> com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was too long; max key length is 767 bytes
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     at com.mysql.jdbc.Util.handleNewInstance(Util.java:408)
>     at com.mysql.jdbc.Util.getInstance(Util.java:383)
>     at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1062)
>     at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4226)
>     at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4158)
>     at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2615)
>     at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2776)
>     at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2834)
>     at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2783)
>     at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:908)
>     at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:788)
>     at com.jolbox.bonecp.StatementHandle.execute(StatementHandle.java:254)
>     at org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatement(AbstractTable.java:760)
>     at org.datanucleus.store.rdbms.table.TableImpl.createIndices(TableImpl.java:648)
>     at org.datanucleus.store.rdbms.table.TableImpl.validateIndices(TableImpl.java:593)
>     at org.datanucleus.store.rdbms.table.TableImpl.validateConstraints(TableImpl.java:390)
>     at org.datanucleus.store.rdbms.table.ClassTable.validateConstraints(ClassTable.java:3463)
>     at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.performTablesValidation(RDBMSStoreManager.java:3464)
>     at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.addClassTablesAndValidate(RDBMSStoreManager.java:3190)
>     at org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2841)
>     at org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:122)
>     at org.datanucleus.store.rdbms.RDBMSStoreManager.addClasses(RDBMSStoreManager.java:1605)
>     at org.datanucleus.store.AbstractStoreManager.addClass(AbstractStoreManager.java:954)
>     at org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:679)
>     at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getStatementForCandidates(RDBMSQueryUtils.java:408)
>     at
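On Arthur's question about verifying the libjars setting: inside a Hive session, set with a property name and no value prints the current value, which gives a quick view of what the client intends to ship with the job (hive.added.jars.path is the conf key behind ADD JAR):

hive> set hive.aux.jars.path;
hive> set hive.added.jars.path;

If either prints a bare local path without a file:// scheme, that would line up with Jason's theory that job submission is treating a local jar path as an HDFS one.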