On Wed, Apr 3, 2013 at 1:16 PM, Christopher <[email protected]> wrote:
> Try with -libjars:

tool.sh automatically adds libjars.

The problem is the regular expression for the examples-simple jar. It's
trying to exclude the javadoc jar with ^c, but it isn't excluding the
sources jar. /opt/accumulo/lib/examples-simple-*[^cs].jar may work, or you
can just specify the jar exactly,
/opt/accumulo/lib/examples-simple-1.4.2.jar

/opt/accumulo/bin/tool.sh /opt/accumulo/lib/examples-simple-*[^cs].jar org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample myinstance zookeepers user pswd tableName inputDir tmp/bulkWork

Billie

>
> /opt/accumulo/bin/tool.sh /opt/accumulo/lib/examples-simple-*[^c].jar
> -libjars /opt/accumulo/lib/examples-simple-*[^c].jar
> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>
> --
> Christopher L Tubbs II
> http://gravatar.com/ctubbsii
>
>
> On Wed, Apr 3, 2013 at 4:11 PM, Aji Janis <[email protected]> wrote:
> > I am trying to run the BulkIngest example (on 1.4.2 accumulo) and I am not
> > able to run the following steps.
> > Here is the error I get:
> >
> > [user@mynode bulk]$ /opt/accumulo/bin/tool.sh
> > /opt/accumulo/lib/examples-simple-*[^c].jar
> > org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
> > myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
> > Exception in thread "main" java.lang.ClassNotFoundException:
> > /opt/accumulo/lib/examples-simple-1/4/2-sources/jar
> >         at java.lang.Class.forName0(Native Method)
> >         at java.lang.Class.forName(Class.java:264)
> >         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
> > [user@mynode bulk]$
> > [user@mynode bulk]$
> > [user@mynode bulk]$
> > [user@mynode bulk]$ ls /opt/accumulo/lib/
> > accumulo-core-1.4.2.jar
> > accumulo-start-1.4.2.jar
> > commons-collections-3.2.jar
> > commons-logging-1.0.4.jar
> > jline-0.9.94.jar
> > accumulo-core-1.4.2-javadoc.jar
> > accumulo-start-1.4.2-javadoc.jar
> > commons-configuration-1.5.jar
> > commons-logging-api-1.0.4.jar
> > libthrift-0.6.1.jar
> > accumulo-core-1.4.2-sources.jar
> > accumulo-start-1.4.2-sources.jar
> > commons-io-1.4.jar
> > examples-simple-1.4.2.jar
> > log4j-1.2.16.jar
> > accumulo-server-1.4.2.jar
> > cloudtrace-1.4.2.jar
> > commons-jci-core-1.0.jar
> > examples-simple-1.4.2-javadoc.jar
> > native
> > accumulo-server-1.4.2-javadoc.jar
> > cloudtrace-1.4.2-javadoc.jar
> > commons-jci-fam-1.0.jar
> > examples-simple-1.4.2-sources.jar
> > wikisearch-ingest-1.4.2-javadoc.jar
> > accumulo-server-1.4.2-sources.jar
> > cloudtrace-1.4.2-sources.jar
> > commons-lang-2.4.jar
> > ext
> > wikisearch-query-1.4.2-javadoc.jar
> >
> > [user@mynode bulk]$
> >
> >
> > Clearly, the libraries and source file exist so I am not sure what's going
> > on. I tried putting in /opt/accumulo/lib/examples-simple-1.4.2-sources.jar
> > instead, then it complains BulkIngestExample ClassNotFound.
> >
> > Suggestions?
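
For anyone following along, here is a rough sketch of why the [^c] pattern still lets the sources jar through. It is only an illustration: the shell pattern is a glob, and the code below stands in a Java regex for it (with `*` rewritten as `.*`), which is an assumption made for the demo and not how tool.sh or the shell works internally. The class name JarGlobCheck is made up; the file names are the three examples-simple jars from the ls output above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class JarGlobCheck {
    // The three examples-simple jars shown in the ls listing.
    static final List<String> JARS = Arrays.asList(
        "examples-simple-1.4.2.jar",
        "examples-simple-1.4.2-javadoc.jar",
        "examples-simple-1.4.2-sources.jar");

    // Regex stand-ins for the two shell patterns (glob '*' written as '.*').
    static final Pattern EXCLUDE_C =
        Pattern.compile("examples-simple-.*[^c]\\.jar");
    static final Pattern EXCLUDE_CS =
        Pattern.compile("examples-simple-.*[^cs]\\.jar");

    // Returns the jar names that the whole pattern matches, in list order.
    static List<String> matching(Pattern pattern) {
        List<String> out = new ArrayList<>();
        for (String jar : JARS) {
            if (pattern.matcher(jar).matches()) {
                out.add(jar);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // [^c] rejects only names ending in "c.jar" (javadoc), so two jars remain.
        System.out.println("[^c]  -> " + matching(EXCLUDE_C));
        // [^cs] also rejects names ending in "s.jar" (sources), so one jar remains.
        System.out.println("[^cs] -> " + matching(EXCLUDE_CS));
    }
}
```

With [^c], two names match, so the command line gets an extra jar path and one of them ends up where the class name is expected, which appears to be what the ClassNotFoundException above is reporting. With [^cs], only examples-simple-1.4.2.jar is left.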
> >
> >
> > On Wed, Apr 3, 2013 at 2:36 PM, Eric Newton <[email protected]> wrote:
> >>
> >> You will have to write your own InputFormat class which will parse your
> >> file and pass records to your reducer.
> >>
> >> -Eric
> >>
> >>
> >> On Wed, Apr 3, 2013 at 2:29 PM, Aji Janis <[email protected]> wrote:
> >>>
> >>> Looking at the BulkIngestExample, it uses GenerateTestData and creates a
> >>> .txt file which contains Key: Value pairs and, correct me if I am wrong,
> >>> each new line is a new row, right?
> >>>
> >>> I need to know how to have family and qualifiers also. In other words,
> >>>
> >>> 1) Do I set up a .txt file that can be converted into an Accumulo RF File
> >>> using AccumuloFileOutputFormat which can then be imported into my table?
> >>>
> >>> 2) If yes, what is the format of the .txt file?
> >>>
> >>>
> >>>
> >>>
> >>> On Wed, Apr 3, 2013 at 2:19 PM, Eric Newton <[email protected]> wrote:
> >>>>
> >>>> Your data needs to be in the RFile format, and more importantly it needs
> >>>> to be sorted.
> >>>>
> >>>> It's handy to use a Map/Reduce job to convert/sort your data. See the
> >>>> BulkIngestExample.
> >>>>
> >>>> -Eric
> >>>>
> >>>>
> >>>> On Wed, Apr 3, 2013 at 2:15 PM, Aji Janis <[email protected]> wrote:
> >>>>>
> >>>>> I have some data in a text file in the following format.
> >>>>>
> >>>>> rowid1 columnFamily1 colQualifier1 value
> >>>>> rowid1 columnFamily1 colQualifier2 value
> >>>>> rowid1 columnFamily2 colQualifier1 value
> >>>>> rowid2 columnFamily1 colQualifier1 value
> >>>>> rowid3 columnFamily1 colQualifier1 value
> >>>>>
> >>>>> I want to import this data into a table in accumulo. My end goal is to
> >>>>> understand how to use the BulkImport feature in accumulo. I tried to
> >>>>> log in to the accumulo shell as root and then run:
> >>>>>
> >>>>> #table mytable
> >>>>> #importdirectory /home/inputDir /home/failureDir true
> >>>>>
> >>>>> but it didn't work.
> >>>>> My data file was saved as data.txt in /home/inputDir. I tried to
> >>>>> create the dir/file structure in hdfs and linux but neither worked.
> >>>>> When trying locally, it keeps complaining about failureDir not
> >>>>> existing.
> >>>>> ...
> >>>>> java.io.FileNotFoundException: File does not exist: failures
> >>>>>
> >>>>> When trying with files on hdfs, I get no error on the console but the
> >>>>> logger had the following messages:
> >>>>> ...
> >>>>> [tableOps.BulkImport] WARN : hdfs://node....//inputDir/data.txt does not have a valid extension, ignoring
> >>>>>
> >>>>> or,
> >>>>>
> >>>>> [tableOps.BulkImport] WARN : hdfs://node....//inputDir/data.txt is not a map file, ignoring
> >>>>>
> >>>>>
> >>>>> Suggestions? Am I not setting up the job right? Thank you for help in
> >>>>> advance.
> >>>>>
> >>>>>
> >>>>> On Wed, Apr 3, 2013 at 2:04 PM, Aji Janis <[email protected]> wrote:
> >>>>>>
> >>>>>> I have some data in a text file in the following format:
> >>>>>>
> >>>>>> rowid1 columnFamily colQualifier value
> >>>>>> rowid1 columnFamily colQualifier value
> >>>>>> rowid1 columnFamily colQualifier value
> >>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
>
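
As a footnote to the original question at the bottom of the thread, the "parse and sort" step that has to happen before a bulk import can be sketched roughly as follows. This is a plain-Java illustration only, with no Accumulo or Hadoop classes: the names SortedEntries and Entry are made up, the input is assumed to be four whitespace-separated fields per line as in the sample data, and ordering by String comparison is a simplification of how Accumulo actually orders keys. In a real job the sorted entries would still need to be written out as RFiles (for example through AccumuloFileOutputFormat, as mentioned above), which is not shown here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SortedEntries {
    // One parsed input line: row id, column family, column qualifier, value.
    static final class Entry {
        final String row;
        final String family;
        final String qualifier;
        final String value;

        Entry(String row, String family, String qualifier, String value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    // Orders entries by row, then family, then qualifier (simplified ordering).
    static final Comparator<Entry> ORDER = (a, b) -> {
        int c = a.row.compareTo(b.row);
        if (c == 0) c = a.family.compareTo(b.family);
        if (c == 0) c = a.qualifier.compareTo(b.qualifier);
        return c;
    };

    // Splits "rowid family qualifier value" on whitespace; the value keeps any
    // further whitespace because the split is limited to four fields.
    static Entry parse(String line) {
        String[] f = line.trim().split("\\s+", 4);
        if (f.length != 4) {
            throw new IllegalArgumentException("expected 4 fields: " + line);
        }
        return new Entry(f[0], f[1], f[2], f[3]);
    }

    // Parses every non-blank line and returns the entries in sorted order.
    static List<Entry> parseAndSort(List<String> lines) {
        List<Entry> out = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                out.add(parse(line));
            }
        }
        out.sort(ORDER);
        return out;
    }
}
```

Feeding it the sample lines in any order should give them back grouped by row id, then by family and qualifier within each row, which is the kind of ordering the bulk-import files are expected to have.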
