On Fri, Apr 5, 2013 at 6:01 PM, David Medinets <[email protected]> wrote:
> I ran into this issue. Look in your log files for a directory not found
> exception which is not bubbled up to the bash shell.
Could the following issue be the problem? https://issues.apache.org/jira/browse/ACCUMULO-1171 David, for the issue you ran into. If you know of a situation where bulk import errors are not propagating back to client, can you open a ticket? > > On Apr 5, 2013 11:37 AM, "Aji Janis" <[email protected]> wrote: >> >> I agree with you that changing HADOOP_CLASSPATH like you said should be >> done. I couldn't quite do that just yet (people have jobs running and don't >> want to risk it). >> >> However, I did a work around. (I am going off the theory that my >> Hadoop_classpath is bad so it can't accept all the libraries I am passing to >> it so I decided to package all the libraries I needed into a jar. >> http://blog.cloudera.com/blog/2011/01/how-to-include-third-party-libraries-in-your-map-reduce-job/) >> I downloaded the source code and made a shaded (uber) jar to include all the >> libraries I needed. Then I submitted the hadoop job with my uber jar like >> any other map reduce job. My mappers and reducers finish the job but I got >> an exception for waitForTableOperation. I think this proves my theory of bad >> classpath but clearly I have more issues to deal with. If you have any >> suggestions on how to even debug that would be awesome! >> >> My console output(removed a lot of server specific stuff for security) is >> below. I modified BulkIngestExample.java to add some print statements. >> Modified lines shown below also. 
>> >> >> [user@nodebulk]$ /opt/hadoop/bin/hadoop jar uber-BulkIngestExample.jar >> instance zookeepers user password table inputdir tmp/bulk >> >> 3/04/05 11:20:52 INFO input.FileInputFormat: Total input paths to process >> : 1 >> 13/04/05 11:20:53 INFO mapred.JobClient: Running job: >> job_201304021611_0045 >> 13/04/05 11:20:54 INFO mapred.JobClient: map 0% reduce 0% >> 13/04/05 11:21:10 INFO mapred.JobClient: map 100% reduce 0% >> 13/04/05 11:21:25 INFO mapred.JobClient: map 100% reduce 50% >> 13/04/05 11:21:26 INFO mapred.JobClient: map 100% reduce 100% >> 13/04/05 11:21:31 INFO mapred.JobClient: Job complete: >> job_201304021611_0045 >> 13/04/05 11:21:31 INFO mapred.JobClient: Counters: 25 >> 13/04/05 11:21:31 INFO mapred.JobClient: Job Counters >> 13/04/05 11:21:31 INFO mapred.JobClient: Launched reduce tasks=2 >> 13/04/05 11:21:31 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=15842 >> 13/04/05 11:21:31 INFO mapred.JobClient: Total time spent by all >> reduces waiting after reserving slots (ms)=0 >> 13/04/05 11:21:31 INFO mapred.JobClient: Total time spent by all maps >> waiting after reserving slots (ms)=0 >> 13/04/05 11:21:31 INFO mapred.JobClient: Rack-local map tasks=1 >> 13/04/05 11:21:31 INFO mapred.JobClient: Launched map tasks=1 >> 13/04/05 11:21:31 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=25891 >> 13/04/05 11:21:31 INFO mapred.JobClient: File Output Format Counters >> 13/04/05 11:21:31 INFO mapred.JobClient: Bytes Written=496 >> 13/04/05 11:21:31 INFO mapred.JobClient: FileSystemCounters >> 13/04/05 11:21:31 INFO mapred.JobClient: FILE_BYTES_READ=312 >> 13/04/05 11:21:31 INFO mapred.JobClient: HDFS_BYTES_READ=421 >> 13/04/05 11:21:31 INFO mapred.JobClient: FILE_BYTES_WRITTEN=68990 >> 13/04/05 11:21:31 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=496 >> 13/04/05 11:21:31 INFO mapred.JobClient: File Input Format Counters >> 13/04/05 11:21:31 INFO mapred.JobClient: Bytes Read=280 >> 13/04/05 11:21:31 INFO mapred.JobClient: Map-Reduce Framework >> 13/04/05 
11:21:31 INFO mapred.JobClient: Reduce input groups=10 >> 13/04/05 11:21:31 INFO mapred.JobClient: Map output materialized >> bytes=312 >> 13/04/05 11:21:31 INFO mapred.JobClient: Combine output records=0 >> 13/04/05 11:21:31 INFO mapred.JobClient: Map input records=10 >> 13/04/05 11:21:31 INFO mapred.JobClient: Reduce shuffle bytes=186 >> 13/04/05 11:21:31 INFO mapred.JobClient: Reduce output records=10 >> 13/04/05 11:21:31 INFO mapred.JobClient: Spilled Records=20 >> 13/04/05 11:21:31 INFO mapred.JobClient: Map output bytes=280 >> 13/04/05 11:21:31 INFO mapred.JobClient: Combine input records=0 >> 13/04/05 11:21:31 INFO mapred.JobClient: Map output records=10 >> 13/04/05 11:21:31 INFO mapred.JobClient: SPLIT_RAW_BYTES=141 >> 13/04/05 11:21:31 INFO mapred.JobClient: Reduce input records=10 >> >> Here is the exception caught: >> org.apache.accumulo.core.client.AccumuloException: Internal error >> processing waitForTableOperation >> >> E.getMessage returns: >> Internal error processing waitForTableOperation >> Exception in thread "main" java.lang.RuntimeException: >> org.apache.accumulo.core.client.AccumuloException: Internal error processing >> waitForTableOperation >> at >> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample.run(BulkIngestExample.java:151) >> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) >> at >> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample.main(BulkIngestExample.java:166) >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at >> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) >> at >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> at java.lang.reflect.Method.invoke(Method.java:601) >> at org.apache.hadoop.util.RunJar.main(RunJar.java:156) >> Caused by: org.apache.accumulo.core.client.AccumuloException: Internal >> error processing waitForTableOperation >> at >> 
org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:290) >> at >> org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:258) >> at >> org.apache.accumulo.core.client.admin.TableOperationsImpl.importDirectory(TableOperationsImpl.java:945) >> at >> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample.run(BulkIngestExample.java:146) >> ... 7 more >> Caused by: org.apache.thrift.TApplicationException: Internal error >> processing waitForTableOperation >> at >> org.apache.thrift.TApplicationException.read(TApplicationException.java:108) >> at >> org.apache.accumulo.core.master.thrift.MasterClientService$Client.recv_waitForTableOperation(MasterClientService.java:684) >> at >> org.apache.accumulo.core.master.thrift.MasterClientService$Client.waitForTableOperation(MasterClientService.java:665) >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> at >> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) >> at >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> at java.lang.reflect.Method.invoke(Method.java:601) >> at >> org.apache.accumulo.cloudtrace.instrument.thrift.TraceWrap$2.invoke(TraceWrap.java:84) >> at $Proxy5.waitForTableOperation(Unknown Source) >> at >> org.apache.accumulo.core.client.admin.TableOperationsImpl.waitForTableOperation(TableOperationsImpl.java:230) >> at >> org.apache.accumulo.core.client.admin.TableOperationsImpl.doTableOperation(TableOperationsImpl.java:272) >> ... 
10 more >> [user@nodebulk]$ >> >> >> Modification in BulkIngestExample >> >> line 146 connector.tableOperations().importDirectory(tableName, >> workDir + "/files", workDir + "/failures", false); >> >> } catch (Exception e) { >> System.out.println("\nHere is the exception caught:\n"+ e); >> System.out.println("\nE.getMessage returns:\n"+ e.getMessage()); >> line 151 throw new RuntimeException(e); >> } finally { >> if (out != null) >> out.close(); >> line 166 int res = ToolRunner.run(CachedConfiguration.getInstance(), >> new BulkIngestExample(), args); >> >> >> On Thu, Apr 4, 2013 at 3:51 PM, Billie Rinaldi <[email protected]> wrote: >>> >>> On Thu, Apr 4, 2013 at 12:26 PM, Aji Janis <[email protected]> wrote: >>>> >>>> I haven't tried the classpath option yet, but I executed the below >>>> command as hadoop user ... this seemed to be the command that accumulo was >>>> trying to execute anyway and I am not sure but I would think this should >>>> have avoided the custom classpath issue... Right/Wrong? >>> >>> >>> No, the jar needs to be both in the libjars and on the classpath. There >>> are classes that need to be accessed on the local machine in the process of >>> submitting the MapReduce job, and this only can see the classpath, not the >>> libjars. >>> >>> The HADOOP_CLASSPATH you have is unusual. More often, HADOOP_CLASSPATH >>> is not set at all in hadoop-env.sh, but if it is it should generally be of >>> the form newstuff:$HADOOP_CLASSPATH to avoid this issue. >>> >>> You will have to restart Hadoop after making the change to hadoop-env.sh. 
>>> >>> Billie >>> >>> >>>> >>>> >>>> >>>> Got the same error: >>>> [hadoop@node]$ /opt/hadoop/bin/hadoop jar >>>> /opt/accumulo/lib/examples-simple-1.4.2.jar >>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample >>>> -libjars >>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar" >>>> >>>> Exception in thread "main" java.lang.NoClassDefFoundError: >>>> org/apache/accumulo/core/client/Instance >>>> at java.lang.Class.forName0(Native Method) >>>> at java.lang.Class.forName(Class.java:264) >>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:149) >>>> Caused by: java.lang.ClassNotFoundException: >>>> org.apache.accumulo.core.client.Instance >>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:366) >>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:355) >>>> at java.security.AccessController.doPrivileged(Native Method) >>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:354) >>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:423) >>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:356) >>>> ... 
3 more >>>> >>>> >>>> >>>> On Thu, Apr 4, 2013 at 2:51 PM, Billie Rinaldi <[email protected]> >>>> wrote: >>>>> >>>>> On Thu, Apr 4, 2013 at 11:41 AM, Aji Janis <[email protected]> wrote: >>>>>> >>>>>> [accumulo@node accumulo]$ cat /opt/hadoop/conf/hadoop-env.sh | grep >>>>>> HADOOP_CLASSPATH >>>>>> export HADOOP_CLASSPATH=./:/conf:/build/*: >>>>> >>>>> >>>>> To preserve custom HADOOP_CLASSPATHs, this line should be: >>>>> export HADOOP_CLASSPATH=./:/conf:/build/*:$HADOOP_CLASSPATH >>>>> >>>>> Billie >>>>> >>>>> >>>>>> >>>>>> >>>>>> looks like it is overwriting everything. Isn't this the default >>>>>> behavior? Is you hadoop-env.sh missing that line? >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Thu, Apr 4, 2013 at 2:25 PM, Billie Rinaldi <[email protected]> >>>>>> wrote: >>>>>>> >>>>>>> On Thu, Apr 4, 2013 at 10:27 AM, Aji Janis <[email protected]> wrote: >>>>>>>> >>>>>>>> I thought about the permissions issue too. All the accumulo stuff is >>>>>>>> under accumulo user so I started running the commands as accumulo ... >>>>>>>> only >>>>>>>> to get the same result. >>>>>>>> -The errors happen right away >>>>>>>> -the box has both accumulo and hadoop on it >>>>>>>> -the jar contains the instance class. But note that the instance >>>>>>>> class is part of accumulo-core and not examples-simple-1.4.2.jar .... >>>>>>>> (can >>>>>>>> this be the issue?) >>>>>>> >>>>>>> >>>>>>> No, that isn't the issue. tool.sh is finding the accumulo-core jar >>>>>>> and putting it on the HADOOP_CLASSPATH and in the libjars. >>>>>>> >>>>>>> I wonder if your hadoop environment is set up to override the >>>>>>> HADOOP_CLASSPATH. Check in your hadoop-env.sh to see if >>>>>>> HADOOP_CLASSPATH is >>>>>>> set there. >>>>>>> >>>>>>> The reason your commands of the form "tool.sh lib/*jar" aren't >>>>>>> working is that the regex is finding multiple jars and putting them all >>>>>>> on >>>>>>> the command line. 
tool.sh expects at most one jar followed by a class >>>>>>> name, >>>>>>> so whatever jar comes second when the regex is expanded is being >>>>>>> interpreted >>>>>>> as a class name. >>>>>>> >>>>>>> Billie >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Commands I ran: >>>>>>>> >>>>>>>> [accumulo@node accumulo]$ whoami >>>>>>>> accumulo >>>>>>>> [accumulo@node accumulo]$ ls -l >>>>>>>> total 184 >>>>>>>> drwxr-xr-x 2 accumulo accumulo 4096 Apr 4 10:25 bin >>>>>>>> -rwxr-xr-x 1 accumulo accumulo 24263 Oct 22 15:30 CHANGES >>>>>>>> drwxr-xr-x 3 accumulo accumulo 4096 Apr 3 10:17 conf >>>>>>>> drwxr-xr-x 2 accumulo accumulo 4096 Jan 15 13:35 contrib >>>>>>>> -rwxr-xr-x 1 accumulo accumulo 695 Nov 18 2011 DISCLAIMER >>>>>>>> drwxr-xr-x 5 accumulo accumulo 4096 Jan 15 13:35 docs >>>>>>>> drwxr-xr-x 4 accumulo accumulo 4096 Jan 15 13:35 lib >>>>>>>> -rwxr-xr-x 1 accumulo accumulo 56494 Mar 21 2012 LICENSE >>>>>>>> drwxr-xr-x 2 accumulo accumulo 12288 Apr 3 14:43 logs >>>>>>>> -rwxr-xr-x 1 accumulo accumulo 2085 Mar 21 2012 NOTICE >>>>>>>> -rwxr-xr-x 1 accumulo accumulo 27814 Oct 17 08:32 pom.xml >>>>>>>> -rwxr-xr-x 1 accumulo accumulo 12449 Oct 17 08:32 README >>>>>>>> drwxr-xr-x 9 accumulo accumulo 4096 Nov 8 13:40 src >>>>>>>> drwxr-xr-x 5 accumulo accumulo 4096 Nov 8 13:40 test >>>>>>>> drwxr-xr-x 2 accumulo accumulo 4096 Apr 4 09:09 walogs >>>>>>>> [accumulo@node accumulo]$ ls bin/ >>>>>>>> accumulo check-slaves etc_initd_accumulo start-all.sh >>>>>>>> start-server.sh stop-here.sh tdown.sh tup.sh >>>>>>>> catapultsetup.acc config.sh LogForwarder.sh start-here.sh >>>>>>>> stop-all.sh stop-server.sh tool.sh upgrade.sh >>>>>>>> [accumulo@node accumulo]$ ls lib/ >>>>>>>> accumulo-core-1.4.2.jar accumulo-start-1.4.2.jar >>>>>>>> commons-collections-3.2.jar commons-logging-1.0.4.jar >>>>>>>> jline-0.9.94.jar >>>>>>>> accumulo-core-1.4.2-javadoc.jar accumulo-start-1.4.2-javadoc.jar >>>>>>>> commons-configuration-1.5.jar commons-logging-api-1.0.4.jar >>>>>>>> libthrift-0.6.1.jar 
>>>>>>>> accumulo-core-1.4.2-sources.jar accumulo-start-1.4.2-sources.jar >>>>>>>> commons-io-1.4.jar examples-simple-1.4.2.jar >>>>>>>> log4j-1.2.16.jar >>>>>>>> accumulo-server-1.4.2.jar cloudtrace-1.4.2.jar >>>>>>>> commons-jci-core-1.0.jar examples-simple-1.4.2-javadoc.jar >>>>>>>> native >>>>>>>> accumulo-server-1.4.2-javadoc.jar cloudtrace-1.4.2-javadoc.jar >>>>>>>> commons-jci-fam-1.0.jar examples-simple-1.4.2-sources.jar >>>>>>>> wikisearch-ingest-1.4.2-javadoc.jar >>>>>>>> accumulo-server-1.4.2-sources.jar cloudtrace-1.4.2-sources.jar >>>>>>>> commons-lang-2.4.jar ext >>>>>>>> wikisearch-query-1.4.2-javadoc.jar >>>>>>>> >>>>>>>> [accumulo@node accumulo]$ jar -tf >>>>>>>> /opt/accumulo/lib/accumulo-core-1.4.2.jar | grep >>>>>>>> org/apache/accumulo/core/client/Instance >>>>>>>> org/apache/accumulo/core/client/Instance.class >>>>>>>> >>>>>>>> [accumulo@node accumulo]$ jar -tf >>>>>>>> /opt/accumulo/lib/examples-simple-1.4.2.jar | grep >>>>>>>> org/apache/accumulo/core/client/Instance >>>>>>>> >>>>>>>> [accumulo@node accumulo]$ ./bin/tool.sh lib/*[^cs].jar >>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample >>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork >>>>>>>> USERJARS= >>>>>>>> CLASSNAME=lib/accumulo-server-1.4.2.jar >>>>>>>> >>>>>>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar: >>>>>>>> exec /opt/hadoop/bin/hadoop jar lib/accumulo-core-1.4.2.jar >>>>>>>> lib/accumulo-server-1.4.2.jar -libjars >>>>>>>> 
"/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar" >>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException: >>>>>>>> lib.accumulo-server-1.4.2.jar >>>>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:366) >>>>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:355) >>>>>>>> at java.security.AccessController.doPrivileged(Native >>>>>>>> Method) >>>>>>>> at >>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354) >>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:423) >>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:356) >>>>>>>> at java.lang.Class.forName0(Native Method) >>>>>>>> at java.lang.Class.forName(Class.java:264) >>>>>>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:149) >>>>>>>> >>>>>>>> [accumulo@node accumulo]$ ./bin/tool.sh lib/*.jar >>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample >>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork >>>>>>>> USERJARS= >>>>>>>> CLASSNAME=lib/accumulo-core-1.4.2-javadoc.jar >>>>>>>> >>>>>>>> 
HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar: >>>>>>>> exec /opt/hadoop/bin/hadoop jar lib/accumulo-core-1.4.2.jar >>>>>>>> lib/accumulo-core-1.4.2-javadoc.jar -libjars >>>>>>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar" >>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException: >>>>>>>> lib.accumulo-core-1.4.2-javadoc.jar >>>>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:366) >>>>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:355) >>>>>>>> at java.security.AccessController.doPrivileged(Native >>>>>>>> Method) >>>>>>>> at >>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354) >>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:423) >>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:356) >>>>>>>> at java.lang.Class.forName0(Native Method) >>>>>>>> at java.lang.Class.forName(Class.java:264) >>>>>>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:149) >>>>>>>> >>>>>>>> [accumulo@node accumulo]$ ./bin/tool.sh lib/*[^c].jar >>>>>>>> 
org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample >>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork >>>>>>>> USERJARS= >>>>>>>> CLASSNAME=lib/accumulo-core-1.4.2-sources.jar >>>>>>>> >>>>>>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar: >>>>>>>> exec /opt/hadoop/bin/hadoop jar lib/accumulo-core-1.4.2.jar >>>>>>>> lib/accumulo-core-1.4.2-sources.jar -libjars >>>>>>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar" >>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException: >>>>>>>> lib.accumulo-core-1.4.2-sources.jar >>>>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:366) >>>>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:355) >>>>>>>> at java.security.AccessController.doPrivileged(Native >>>>>>>> Method) >>>>>>>> at >>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354) >>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:423) >>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:356) >>>>>>>> at java.lang.Class.forName0(Native Method) >>>>>>>> at 
java.lang.Class.forName(Class.java:264) >>>>>>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:149) >>>>>>>> >>>>>>>> [accumulo@node accumulo]$ ./bin/tool.sh >>>>>>>> lib/examples-simple-*[^c].jar >>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample >>>>>>>> default >>>>>>>> node14.catapult.dev.boozallenet.com:2181 root password test_aj >>>>>>>> /user/559599/input tmp/ajbulktest >>>>>>>> USERJARS= >>>>>>>> CLASSNAME=lib/examples-simple-1.4.2-sources.jar >>>>>>>> >>>>>>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar: >>>>>>>> exec /opt/hadoop/bin/hadoop jar lib/examples-simple-1.4.2.jar >>>>>>>> lib/examples-simple-1.4.2-sources.jar -libjars >>>>>>>> "/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar" >>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException: >>>>>>>> lib.examples-simple-1.4.2-sources.jar >>>>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:366) >>>>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:355) >>>>>>>> at java.security.AccessController.doPrivileged(Native >>>>>>>> Method) 
>>>>>>>> at >>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354) >>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:423) >>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:356) >>>>>>>> at java.lang.Class.forName0(Native Method) >>>>>>>> at java.lang.Class.forName(Class.java:264) >>>>>>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:149) >>>>>>>> [accumulo@node accumulo]$ >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Apr 4, 2013 at 11:55 AM, Billie Rinaldi <[email protected]> >>>>>>>> wrote: >>>>>>>>> >>>>>>>>> On Thu, Apr 4, 2013 at 7:46 AM, Aji Janis <[email protected]> >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> Billie, I checked the values in tool.sh they match. I uncommented >>>>>>>>>> the echo statements and reran the cmd here is what I have: >>>>>>>>>> >>>>>>>>>> $ ./bin/tool.sh ./lib/examples-simple-1.4.2.jar >>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample >>>>>>>>>> instance zookeeper usr pswd table inputdir tmp/bulk >>>>>>>>>> >>>>>>>>>> USERJARS= >>>>>>>>>> >>>>>>>>>> CLASSNAME=org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample >>>>>>>>>> >>>>>>>>>> HADOOP_CLASSPATH=/opt/accumulo/lib/libthrift-0.6.1.jar:/opt/accumulo/lib/accumulo-core-1.4.2.jar:/opt/zookeeper/zookeeper-3.3.3.jar:/opt/accumulo/lib/cloudtrace-1.4.2.jar:/opt/accumulo/lib/commons-collections-3.2.jar:/opt/accumulo/lib/commons-configuration-1.5.jar:/opt/accumulo/lib/commons-io-1.4.jar:/opt/accumulo/lib/commons-jci-core-1.0.jar:/opt/accumulo/lib/commons-jci-fam-1.0.jar:/opt/accumulo/lib/commons-lang-2.4.jar:/opt/accumulo/lib/commons-logging-1.0.4.jar:/opt/accumulo/lib/commons-logging-api-1.0.4.jar: >>>>>>>>>> exec /opt/hadoop/bin/hadoop jar ./lib/examples-simple-1.4.2.jar >>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample >>>>>>>>>> -libjars >>>>>>>>>> 
"/opt/accumulo/lib/libthrift-0.6.1.jar,/opt/accumulo/lib/accumulo-core-1.4.2.jar,/opt/zookeeper/zookeeper-3.3.3.jar,/opt/accumulo/lib/cloudtrace-1.4.2.jar,/opt/accumulo/lib/commons-collections-3.2.jar,/opt/accumulo/lib/commons-configuration-1.5.jar,/opt/accumulo/lib/commons-io-1.4.jar,/opt/accumulo/lib/commons-jci-core-1.0.jar,/opt/accumulo/lib/commons-jci-fam-1.0.jar,/opt/accumulo/lib/commons-lang-2.4.jar,/opt/accumulo/lib/commons-logging-1.0.4.jar,/opt/accumulo/lib/commons-logging-api-1.0.4.jar" >>>>>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError: >>>>>>>>>> org/apache/accumulo/core/client/Instance >>>>>>>>>> at java.lang.Class.forName0(Native Method) >>>>>>>>>> at java.lang.Class.forName(Class.java:264) >>>>>>>>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:149) >>>>>>>>>> Caused by: java.lang.ClassNotFoundException: >>>>>>>>>> org.apache.accumulo.core.client.Instance >>>>>>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:366) >>>>>>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:355) >>>>>>>>>> at java.security.AccessController.doPrivileged(Native >>>>>>>>>> Method) >>>>>>>>>> at >>>>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354) >>>>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:423) >>>>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:356) >>>>>>>>>> ... 3 more >>>>>>>>>> >>>>>>>>> >>>>>>>>> The command looks right. Instance should be packaged in the >>>>>>>>> accumulo core jar. To verify that, you could run: >>>>>>>>> jar tf /opt/accumulo/lib/accumulo-core-1.4.2.jar | grep >>>>>>>>> org/apache/accumulo/core/client/Instance >>>>>>>>> >>>>>>>>> I'm not sure what's going on here. If that error is happening >>>>>>>>> right away, it seems like it can't load the jar on the local machine. 
>>>>>>>>> If >>>>>>>>> you're running multiple machines, and if the error were happening >>>>>>>>> later >>>>>>>>> during the MapReduce, I would suggest that you make sure accumulo is >>>>>>>>> present >>>>>>>>> on all the machines. >>>>>>>>> >>>>>>>>> You asked about the user; is the owner of the jars different than >>>>>>>>> the user you're running as? In that case, it could be a permissions >>>>>>>>> issue. >>>>>>>>> Could the permissions be set so that you can list that directory but >>>>>>>>> not >>>>>>>>> read the jar? >>>>>>>>> >>>>>>>>> Billie >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> org/apache/accumulo/core/client/Instance is located in the src/... >>>>>>>>>> folder which I am not is what is packaged in the >>>>>>>>>> examples-simple-[^c].jar ? >>>>>>>>>> Sorry folks for the constant emails... just trying to get this to >>>>>>>>>> work but I really appreciate the help. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Thu, Apr 4, 2013 at 10:18 AM, John Vines <[email protected]> >>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> If you run tool.sh with sh -x, it will step through the script so >>>>>>>>>>> you can see what jars it is picking up and perhaps why it's missing >>>>>>>>>>> them for >>>>>>>>>>> you. >>>>>>>>>>> >>>>>>>>>>> Sent from my phone, please pardon the typos and brevity. >>>>>>>>>>> >>>>>>>>>>> On Apr 4, 2013 10:15 AM, "Aji Janis" <[email protected]> wrote: >>>>>>>>>>>> >>>>>>>>>>>> What user are you running the commands as ? >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Apr 4, 2013 at 9:59 AM, Aji Janis <[email protected]> >>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> Where did you put all your java files? 
>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Apr 4, 2013 at 9:55 AM, Eric Newton >>>>>>>>>>>>> <[email protected]> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> I was able to run the example, as written in >>>>>>>>>>>>>> docs/examples/README.bulkIngest substituting my >>>>>>>>>>>>>> instance/zookeeper/user/password information: >>>>>>>>>>>>>> >>>>>>>>>>>>>> $ pwd >>>>>>>>>>>>>> /home/ecn/workspace/1.4.3 >>>>>>>>>>>>>> $ ls >>>>>>>>>>>>>> bin conf docs LICENSE NOTICE README src test >>>>>>>>>>>>>> CHANGES contrib lib logs pom.xml target walogs >>>>>>>>>>>>>> >>>>>>>>>>>>>> $ ./bin/accumulo >>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.SetupTable >>>>>>>>>>>>>> test localhost >>>>>>>>>>>>>> root secret test_bulk row_00000333 row_00000666 >>>>>>>>>>>>>> >>>>>>>>>>>>>> $ ./bin/accumulo >>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.GenerateTestData >>>>>>>>>>>>>> 0 1000 >>>>>>>>>>>>>> bulk/test_1.txt >>>>>>>>>>>>>> >>>>>>>>>>>>>> $ ./bin/tool.sh lib/examples-simple-*[^cs].jar >>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample >>>>>>>>>>>>>> test >>>>>>>>>>>>>> localhost root secret test_bulk bulk tmp/bulkWork >>>>>>>>>>>>>> >>>>>>>>>>>>>> $./bin/accumulo >>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.VerifyIngest >>>>>>>>>>>>>> test >>>>>>>>>>>>>> localhost root secret test_bulk 0 1000 >>>>>>>>>>>>>> >>>>>>>>>>>>>> -Eric >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Thu, Apr 4, 2013 at 9:33 AM, Aji Janis <[email protected]> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I am not sure its just a regular expression issue. Below is >>>>>>>>>>>>>>> my console output. Not sure why this ClassDefFoundError occurs. >>>>>>>>>>>>>>> Has anyone >>>>>>>>>>>>>>> tried to do it successfully? Can you please tell me your env >>>>>>>>>>>>>>> set up if you >>>>>>>>>>>>>>> did. 
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> [user@mynode bulk]$ pwd >>>>>>>>>>>>>>> /home/user/bulk >>>>>>>>>>>>>>> [user@mynode bulk]$ ls >>>>>>>>>>>>>>> BulkIngestExample.java GenerateTestData.java >>>>>>>>>>>>>>> SetupTable.java test_1.txt VerifyIngest.java >>>>>>>>>>>>>>> [user@mynode bulk]$ >>>>>>>>>>>>>>> [user@mynode bulk]$ /opt/accumulo/bin/tool.sh >>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-1.4.2.jar >>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample >>>>>>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork >>>>>>>>>>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError: >>>>>>>>>>>>>>> org/apache/accumulo/core/client/Instance >>>>>>>>>>>>>>> at java.lang.Class.forName0(Native Method) >>>>>>>>>>>>>>> at java.lang.Class.forName(Class.java:264) >>>>>>>>>>>>>>> at >>>>>>>>>>>>>>> org.apache.hadoop.util.RunJar.main(RunJar.java:149) >>>>>>>>>>>>>>> Caused by: java.lang.ClassNotFoundException: >>>>>>>>>>>>>>> org.apache.accumulo.core.client.Instance >>>>>>>>>>>>>>> at >>>>>>>>>>>>>>> java.net.URLClassLoader$1.run(URLClassLoader.java:366) >>>>>>>>>>>>>>> at >>>>>>>>>>>>>>> java.net.URLClassLoader$1.run(URLClassLoader.java:355) >>>>>>>>>>>>>>> at java.security.AccessController.doPrivileged(Native >>>>>>>>>>>>>>> Method) >>>>>>>>>>>>>>> at >>>>>>>>>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354) >>>>>>>>>>>>>>> at >>>>>>>>>>>>>>> java.lang.ClassLoader.loadClass(ClassLoader.java:423) >>>>>>>>>>>>>>> at >>>>>>>>>>>>>>> java.lang.ClassLoader.loadClass(ClassLoader.java:356) >>>>>>>>>>>>>>> ... 
3 more >>>>>>>>>>>>>>> [user@mynode bulk]$ /opt/accumulo/bin/tool.sh >>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^cs].jar >>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample >>>>>>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork >>>>>>>>>>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError: >>>>>>>>>>>>>>> org/apache/accumulo/core/client/Instance >>>>>>>>>>>>>>> at java.lang.Class.forName0(Native Method) >>>>>>>>>>>>>>> at java.lang.Class.forName(Class.java:264) >>>>>>>>>>>>>>> at >>>>>>>>>>>>>>> org.apache.hadoop.util.RunJar.main(RunJar.java:149) >>>>>>>>>>>>>>> Caused by: java.lang.ClassNotFoundException: >>>>>>>>>>>>>>> org.apache.accumulo.core.client.Instance >>>>>>>>>>>>>>> at >>>>>>>>>>>>>>> java.net.URLClassLoader$1.run(URLClassLoader.java:366) >>>>>>>>>>>>>>> at >>>>>>>>>>>>>>> java.net.URLClassLoader$1.run(URLClassLoader.java:355) >>>>>>>>>>>>>>> at java.security.AccessController.doPrivileged(Native >>>>>>>>>>>>>>> Method) >>>>>>>>>>>>>>> at >>>>>>>>>>>>>>> java.net.URLClassLoader.findClass(URLClassLoader.java:354) >>>>>>>>>>>>>>> at >>>>>>>>>>>>>>> java.lang.ClassLoader.loadClass(ClassLoader.java:423) >>>>>>>>>>>>>>> at >>>>>>>>>>>>>>> java.lang.ClassLoader.loadClass(ClassLoader.java:356) >>>>>>>>>>>>>>> ... 
3 more >>>>>>>>>>>>>>> [user@mynode bulk]$ /opt/accumulo/bin/tool.sh >>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^c].jar >>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample >>>>>>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork >>>>>>>>>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException: >>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-1/4/2-sources/jar >>>>>>>>>>>>>>> at java.lang.Class.forName0(Native Method) >>>>>>>>>>>>>>> at java.lang.Class.forName(Class.java:264) >>>>>>>>>>>>>>> at >>>>>>>>>>>>>>> org.apache.hadoop.util.RunJar.main(RunJar.java:149) >>>>>>>>>>>>>>> [user@mynode bulk]$ >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Wed, Apr 3, 2013 at 4:57 PM, Billie Rinaldi >>>>>>>>>>>>>>> <[email protected]> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Wed, Apr 3, 2013 at 1:16 PM, Christopher >>>>>>>>>>>>>>>> <[email protected]> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Try with -libjars: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> tool.sh automatically adds libjars. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The problem is the regular expression for the >>>>>>>>>>>>>>>> examples-simple jar. It's trying to exclude the javadoc jar >>>>>>>>>>>>>>>> with ^c, but it >>>>>>>>>>>>>>>> isn't excluding the sources jar. 
>>>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^cs].jar may work, or you >>>>>>>>>>>>>>>> can just >>>>>>>>>>>>>>>> specify the jar exactly, >>>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-1.4.2.jar >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> /opt/accumulo/bin/tool.sh >>>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^cs].jar >>>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample >>>>>>>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Billie >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> /opt/accumulo/bin/tool.sh >>>>>>>>>>>>>>>>> /opt/accumulo/lib/examples-simple-*[^c].jar >>>>>>>>>>>>>>>>> -libjars /opt/accumulo/lib/examples-simple-*[^c].jar >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample >>>>>>>>>>>>>>>>> myinstance zookeepers user pswd tableName inputDir >>>>>>>>>>>>>>>>> tmp/bulkWork >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>> Christopher L Tubbs II >>>>>>>>>>>>>>>>> http://gravatar.com/ctubbsii >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Wed, Apr 3, 2013 at 4:11 PM, Aji Janis >>>>>>>>>>>>>>>>> <[email protected]> wrote: >>>>>>>>>>>>>>>>> > I am trying to run the BulkIngest example (on 1.4.2 >>>>>>>>>>>>>>>>> > accumulo) and I am not >>>>>>>>>>>>>>>>> > able to run the following steps. 
Here is the error I get: >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > [user@mynode bulk]$ /opt/accumulo/bin/tool.sh >>>>>>>>>>>>>>>>> > /opt/accumulo/lib/examples-simple-*[^c].jar >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample >>>>>>>>>>>>>>>>> > myinstance zookeepers user pswd tableName inputDir >>>>>>>>>>>>>>>>> > tmp/bulkWork >>>>>>>>>>>>>>>>> > Exception in thread "main" >>>>>>>>>>>>>>>>> > java.lang.ClassNotFoundException: >>>>>>>>>>>>>>>>> > /opt/accumulo/lib/examples-simple-1/4/2-sources/jar >>>>>>>>>>>>>>>>> > at java.lang.Class.forName0(Native Method) >>>>>>>>>>>>>>>>> > at java.lang.Class.forName(Class.java:264) >>>>>>>>>>>>>>>>> > at >>>>>>>>>>>>>>>>> > org.apache.hadoop.util.RunJar.main(RunJar.java:149) >>>>>>>>>>>>>>>>> > [user@mynode bulk]$ >>>>>>>>>>>>>>>>> > [user@mynode bulk]$ >>>>>>>>>>>>>>>>> > [user@mynode bulk]$ >>>>>>>>>>>>>>>>> > [user@mynode bulk]$ ls /opt/accumulo/lib/ >>>>>>>>>>>>>>>>> > accumulo-core-1.4.2.jar >>>>>>>>>>>>>>>>> > accumulo-start-1.4.2.jar >>>>>>>>>>>>>>>>> > commons-collections-3.2.jar >>>>>>>>>>>>>>>>> > commons-logging-1.0.4.jar >>>>>>>>>>>>>>>>> > jline-0.9.94.jar >>>>>>>>>>>>>>>>> > accumulo-core-1.4.2-javadoc.jar >>>>>>>>>>>>>>>>> > accumulo-start-1.4.2-javadoc.jar >>>>>>>>>>>>>>>>> > commons-configuration-1.5.jar >>>>>>>>>>>>>>>>> > commons-logging-api-1.0.4.jar >>>>>>>>>>>>>>>>> > libthrift-0.6.1.jar >>>>>>>>>>>>>>>>> > accumulo-core-1.4.2-sources.jar >>>>>>>>>>>>>>>>> > accumulo-start-1.4.2-sources.jar >>>>>>>>>>>>>>>>> > commons-io-1.4.jar >>>>>>>>>>>>>>>>> > examples-simple-1.4.2.jar >>>>>>>>>>>>>>>>> > log4j-1.2.16.jar >>>>>>>>>>>>>>>>> > accumulo-server-1.4.2.jar >>>>>>>>>>>>>>>>> > cloudtrace-1.4.2.jar >>>>>>>>>>>>>>>>> > commons-jci-core-1.0.jar >>>>>>>>>>>>>>>>> > examples-simple-1.4.2-javadoc.jar >>>>>>>>>>>>>>>>> > native >>>>>>>>>>>>>>>>> > accumulo-server-1.4.2-javadoc.jar >>>>>>>>>>>>>>>>> > cloudtrace-1.4.2-javadoc.jar >>>>>>>>>>>>>>>>> > 
commons-jci-fam-1.0.jar >>>>>>>>>>>>>>>>> > examples-simple-1.4.2-sources.jar >>>>>>>>>>>>>>>>> > wikisearch-ingest-1.4.2-javadoc.jar >>>>>>>>>>>>>>>>> > accumulo-server-1.4.2-sources.jar >>>>>>>>>>>>>>>>> > cloudtrace-1.4.2-sources.jar >>>>>>>>>>>>>>>>> > commons-lang-2.4.jar >>>>>>>>>>>>>>>>> > ext >>>>>>>>>>>>>>>>> > wikisearch-query-1.4.2-javadoc.jar >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > [user@mynode bulk]$ >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > Clearly, the libraries and source file exist so I am not >>>>>>>>>>>>>>>>> > sure what's going >>>>>>>>>>>>>>>>> > on. I tried putting in >>>>>>>>>>>>>>>>> > /opt/accumulo/lib/examples-simple-1.4.2-sources.jar >>>>>>>>>>>>>>>>> > instead, but then it complains BulkIngestExample >>>>>>>>>>>>>>>>> > ClassNotFound. >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > Suggestions? >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > On Wed, Apr 3, 2013 at 2:36 PM, Eric Newton >>>>>>>>>>>>>>>>> > <[email protected]> wrote: >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> You will have to write your own InputFormat class which >>>>>>>>>>>>>>>>> >> will parse your >>>>>>>>>>>>>>>>> >> file and pass records to your reducer. >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> -Eric >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> On Wed, Apr 3, 2013 at 2:29 PM, Aji Janis >>>>>>>>>>>>>>>>> >> <[email protected]> wrote: >>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>> >>> Looking at the BulkIngestExample, it uses >>>>>>>>>>>>>>>>> >>> GenerateTestData and creates a >>>>>>>>>>>>>>>>> >>> .txt file which contains Key: Value pairs, and correct me >>>>>>>>>>>>>>>>> >>> if I am wrong but >>>>>>>>>>>>>>>>> >>> each new line is a new row, right? >>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>> >>> I need to know how to have family and qualifiers also.
>>>>>>>>>>>>>>>>> >>> In other words, >>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>> >>> 1) Do I set up a .txt file that can be converted into >>>>>>>>>>>>>>>>> >>> an Accumulo RF File >>>>>>>>>>>>>>>>> >>> using AccumuloFileOutputFormat which can then be >>>>>>>>>>>>>>>>> >>> imported into my table? >>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>> >>> 2) if yes, what is the format of the .txt file. >>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>> >>> On Wed, Apr 3, 2013 at 2:19 PM, Eric Newton >>>>>>>>>>>>>>>>> >>> <[email protected]> >>>>>>>>>>>>>>>>> >>> wrote: >>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>> >>>> Your data needs to be in the RFile format, and more >>>>>>>>>>>>>>>>> >>>> importantly it needs >>>>>>>>>>>>>>>>> >>>> to be sorted. >>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>> >>>> It's handy to use a Map/Reduce job to convert/sort >>>>>>>>>>>>>>>>> >>>> your data. See the >>>>>>>>>>>>>>>>> >>>> BulkIngestExample. >>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>> >>>> -Eric >>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>> >>>> On Wed, Apr 3, 2013 at 2:15 PM, Aji Janis >>>>>>>>>>>>>>>>> >>>> <[email protected]> wrote: >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> I have some data in a text file in the following >>>>>>>>>>>>>>>>> >>>>> format. >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> rowid1 columnFamily1 colQualifier1 value >>>>>>>>>>>>>>>>> >>>>> rowid1 columnFamily1 colQualifier2 value >>>>>>>>>>>>>>>>> >>>>> rowid1 columnFamily2 colQualifier1 value >>>>>>>>>>>>>>>>> >>>>> rowid2 columnFamily1 colQualifier1 value >>>>>>>>>>>>>>>>> >>>>> rowid3 columnFamily1 colQualifier1 value >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> I want to import this data into a table in accumulo. >>>>>>>>>>>>>>>>> >>>>> My end goal is to >>>>>>>>>>>>>>>>> >>>>> understand how to use the BulkImport feature in >>>>>>>>>>>>>>>>> >>>>> accumulo. 
I tried to login >>>>>>>>>>>>>>>>> >>>>> to the accumulo shell as root and then run: >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> #table mytable >>>>>>>>>>>>>>>>> >>>>> #importdirectory /home/inputDir /home/failureDir true >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> but it didn't work. My data file was saved as >>>>>>>>>>>>>>>>> >>>>> data.txt in >>>>>>>>>>>>>>>>> >>>>> /home/inputDir. I tried to create the dir/file >>>>>>>>>>>>>>>>> >>>>> structure in hdfs and linux >>>>>>>>>>>>>>>>> >>>>> but neither worked. When trying locally, it keeps >>>>>>>>>>>>>>>>> >>>>> complaining about >>>>>>>>>>>>>>>>> >>>>> failureDir not existing. >>>>>>>>>>>>>>>>> >>>>> ... >>>>>>>>>>>>>>>>> >>>>> java.io.FileNotFoundException: File does not exist: >>>>>>>>>>>>>>>>> >>>>> failures >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> When trying with files on hdfs, I get no error on the >>>>>>>>>>>>>>>>> >>>>> console but the >>>>>>>>>>>>>>>>> >>>>> logger had the following messages: >>>>>>>>>>>>>>>>> >>>>> ... >>>>>>>>>>>>>>>>> >>>>> [tableOps.BulkImport] WARN : >>>>>>>>>>>>>>>>> >>>>> hdfs://node....//inputDir/data.txt does >>>>>>>>>>>>>>>>> >>>>> not have a valid extension, ignoring >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> or, >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> [tableOps.BulkImport] WARN : >>>>>>>>>>>>>>>>> >>>>> hdfs://node....//inputDir/data.txt is not >>>>>>>>>>>>>>>>> >>>>> a map file, ignoring >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> Suggestions? Am I not setting up the job right? Thank >>>>>>>>>>>>>>>>> >>>>> you for help in >>>>>>>>>>>>>>>>> >>>>> advance. 
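
A small illustration of the "needs to be sorted" point, using the four-column layout from the message above. This is only a sketch of the ordering (row, then column family, then column qualifier, compared byte-wise); it does not produce an RFile, which in this thread is the job of the MapReduce step in BulkIngestExample. The variable names and the single-space field separator are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Sample lines in the "rowid family qualifier value" layout quoted above,
# deliberately shuffled so the effect of the sort is visible.
input='rowid3 columnFamily1 colQualifier1 value
rowid1 columnFamily2 colQualifier1 value
rowid1 columnFamily1 colQualifier2 value
rowid2 columnFamily1 colQualifier1 value
rowid1 columnFamily1 colQualifier1 value'

# LC_ALL=C forces byte-wise comparison; -t ' ' splits fields on single spaces;
# the three -k options order by row, then family, then qualifier.
sorted=$(printf '%s\n' "$input" | LC_ALL=C sort -t ' ' -k1,1 -k2,2 -k3,3)

printf '%s\n' "$sorted"
```

The sorted text is still plain text, so importdirectory would still ignore it; the sketch only shows the key order that the converted files are expected to follow.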
>>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> On Wed, Apr 3, 2013 at 2:04 PM, Aji Janis >>>>>>>>>>>>>>>>> >>>>> <[email protected]> wrote: >>>>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>>>> >>>>>> I have some data in a text file in the following >>>>>>>>>>>>>>>>> >>>>>> format: >>>>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>>>> >>>>>> rowid1 columnFamily colQualifier value >>>>>>>>>>>>>>>>> >>>>>> rowid1 columnFamily colQualifier value >>>>>>>>>>>>>>>>> >>>>>> rowid1 columnFamily colQualifier value
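
For anyone reading this thread later, the difference between the two globs discussed above can be checked in a scratch directory. The sketch below assumes bash (which accepts ^ as negation inside brackets) and recreates, as empty files, the three examples-simple jars from the ls listing; the temporary directory and variable names are made up for the example. With [^c] the sources jar still matches next to the real jar, which would explain why RunJar ended up treating the sources jar path as the class name.

```shell
#!/usr/bin/env bash
# Recreate the three examples-simple jars from the listing as empty files.
tmp=$(mktemp -d)
touch "$tmp/examples-simple-1.4.2.jar" \
      "$tmp/examples-simple-1.4.2-javadoc.jar" \
      "$tmp/examples-simple-1.4.2-sources.jar"

# [^c] only rejects names whose last character before ".jar" is "c",
# so the javadoc jar is skipped but the sources jar still matches.
loose=$(cd "$tmp" && echo examples-simple-*[^c].jar)

# [^cs] also rejects a trailing "s", leaving just the runnable jar.
strict=$(cd "$tmp" && echo examples-simple-*[^cs].jar)

echo "[^c]  expands to: $loose"
echo "[^cs] expands to: $strict"

rm -rf "$tmp"
```

Naming the jar explicitly, as suggested above, avoids depending on the glob at all.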
