Re: importdirectory in accumulo

Billie Rinaldi Wed, 03 Apr 2013 13:57:42 -0700

On Wed, Apr 3, 2013 at 1:16 PM, Christopher <[email protected]> wrote:


> Try with -libjars:
>

tool.sh automatically adds libjars.

The problem is the glob pattern for the examples-simple jar.  It's
trying to exclude the javadoc jar with [^c], but that doesn't exclude the
sources jar, so the pattern expands to two jars.
/opt/accumulo/lib/examples-simple-*[^cs].jar may work, or you
can just specify the jar exactly:
/opt/accumulo/lib/examples-simple-1.4.2.jar

/opt/accumulo/bin/tool.sh /opt/accumulo/lib/examples-simple-*[^cs].jar
org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
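
To check what each bracket expression actually matches without touching
the real install, here is a minimal sketch. It assumes bash, and the
scratch directory and variable names are made up for illustration; the
three file names are copied from the ls output in the original message.

```shell
#!/usr/bin/env bash
# Scratch directory holding empty stand-ins for the three examples-simple
# jars from the listing. Nothing under /opt/accumulo is touched.
demo_dir=$(mktemp -d)
touch "$demo_dir/examples-simple-1.4.2.jar" \
      "$demo_dir/examples-simple-1.4.2-javadoc.jar" \
      "$demo_dir/examples-simple-1.4.2-sources.jar"
cd "$demo_dir" || exit 1

# [^c] only rejects names whose last character before ".jar" is "c",
# so the sources jar (which ends in "s") still matches.
old_matches=$(echo examples-simple-*[^c].jar)

# [^cs] rejects names ending in "c" (javadoc) and in "s" (sources).
new_matches=$(echo examples-simple-*[^cs].jar)

echo "old pattern: $old_matches"
echo "new pattern: $new_matches"
```

The old pattern should print two file names and the new one only the
plain jar. With two paths on the command line, the second one most
likely lands where the class name is expected, which would explain the
ClassNotFoundException naming the sources jar. [!cs] is the POSIX
spelling of the same negation if your shell does not accept ^.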

Billie



>
> /opt/accumulo/bin/tool.sh /opt/accumulo/lib/examples-simple-*[^c].jar
> -libjars  /opt/accumulo/lib/examples-simple-*[^c].jar
> org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
> myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
>
> --
> Christopher L Tubbs II
> http://gravatar.com/ctubbsii
>
>
> On Wed, Apr 3, 2013 at 4:11 PM, Aji Janis <[email protected]> wrote:
> > I am trying to run the BulkIngest example (on Accumulo 1.4.2) and I am
> > not able to run the following steps. Here is the error I get:
> >
> > [user@mynode bulk]$ /opt/accumulo/bin/tool.sh
> > /opt/accumulo/lib/examples-simple-*[^c].jar
> > org.apache.accumulo.examples.simple.mapreduce.bulk.BulkIngestExample
> > myinstance zookeepers user pswd tableName inputDir tmp/bulkWork
> > Exception in thread "main" java.lang.ClassNotFoundException:
> > /opt/accumulo/lib/examples-simple-1/4/2-sources/jar
> >         at java.lang.Class.forName0(Native Method)
> >         at java.lang.Class.forName(Class.java:264)
> >         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
> > [user@mynode bulk]$
> > [user@mynode bulk]$
> > [user@mynode bulk]$
> > [user@mynode bulk]$ ls /opt/accumulo/lib/
> > accumulo-core-1.4.2.jar
> > accumulo-start-1.4.2.jar
> > commons-collections-3.2.jar
> > commons-logging-1.0.4.jar
> > jline-0.9.94.jar
> > accumulo-core-1.4.2-javadoc.jar
> > accumulo-start-1.4.2-javadoc.jar
> > commons-configuration-1.5.jar
> > commons-logging-api-1.0.4.jar
> > libthrift-0.6.1.jar
> > accumulo-core-1.4.2-sources.jar
> > accumulo-start-1.4.2-sources.jar
> > commons-io-1.4.jar
> > examples-simple-1.4.2.jar
> > log4j-1.2.16.jar
> > accumulo-server-1.4.2.jar
> > cloudtrace-1.4.2.jar
> > commons-jci-core-1.0.jar
> > examples-simple-1.4.2-javadoc.jar
> > native
> > accumulo-server-1.4.2-javadoc.jar
> > cloudtrace-1.4.2-javadoc.jar
> > commons-jci-fam-1.0.jar
> > examples-simple-1.4.2-sources.jar
> > wikisearch-ingest-1.4.2-javadoc.jar
> > accumulo-server-1.4.2-sources.jar
> > cloudtrace-1.4.2-sources.jar
> > commons-lang-2.4.jar
> >  ext
> > wikisearch-query-1.4.2-javadoc.jar
> >
> > [user@mynode bulk]$
> >
> >
> > Clearly, the libraries and source file exist, so I am not sure what's
> > going on. I tried putting in
> > /opt/accumulo/lib/examples-simple-1.4.2-sources.jar
> > instead, but then it complains BulkIngestExample ClassNotFound.
> >
> > Suggestions?
> >
> >
> > On Wed, Apr 3, 2013 at 2:36 PM, Eric Newton <[email protected]>
> wrote:
> >>
> >> You will have to write your own InputFormat class which will parse your
> >> file and pass records to your reducer.
> >>
> >> -Eric
> >>
> >>
> >> On Wed, Apr 3, 2013 at 2:29 PM, Aji Janis <[email protected]> wrote:
> >>>
> >>> Looking at the BulkIngestExample, it uses GenerateTestData and creates
> >>> a .txt file which contains Key: Value pairs. Correct me if I am wrong,
> >>> but each new line is a new row, right?
> >>>
> >>> I need to know how to have family and qualifiers also. In other words,
> >>>
> >>> 1) Do I set up a .txt file that can be converted into an Accumulo RFile
> >>> using AccumuloFileOutputFormat, which can then be imported into my
> >>> table?
> >>>
> >>> 2) If yes, what is the format of the .txt file?
> >>>
> >>>
> >>>
> >>>
> >>> On Wed, Apr 3, 2013 at 2:19 PM, Eric Newton <[email protected]>
> >>> wrote:
> >>>>
> >>>> Your data needs to be in the RFile format, and more importantly it
> >>>> needs to be sorted.
> >>>>
> >>>> It's handy to use a Map/Reduce job to convert/sort your data.  See the
> >>>> BulkIngestExample.
> >>>>
> >>>> -Eric
> >>>>
> >>>>
> >>>> On Wed, Apr 3, 2013 at 2:15 PM, Aji Janis <[email protected]> wrote:
> >>>>>
> >>>>> I have some data in a text file in the following format.
> >>>>>
> >>>>> rowid1 columnFamily1 colQualifier1 value
> >>>>> rowid1 columnFamily1 colQualifier2 value
> >>>>> rowid1 columnFamily2 colQualifier1 value
> >>>>> rowid2 columnFamily1 colQualifier1 value
> >>>>> rowid3 columnFamily1 colQualifier1 value
> >>>>>
> >>>>> I want to import this data into a table in accumulo. My end goal is
> >>>>> to understand how to use the BulkImport feature in accumulo. I tried
> >>>>> to log in to the accumulo shell as root and then run:
> >>>>>
> >>>>> #table mytable
> >>>>> #importdirectory /home/inputDir /home/failureDir true
> >>>>>
> >>>>> but it didn't work. My data file was saved as data.txt in
> >>>>> /home/inputDir. I tried to create the dir/file structure in hdfs and
> >>>>> linux but neither worked. When trying locally, it keeps complaining
> >>>>> about failureDir not existing.
> >>>>> ...
> >>>>> java.io.FileNotFoundException: File does not exist: failures
> >>>>>
> >>>>> When trying with files on hdfs, I get no error on the console but the
> >>>>> logger had the following messages:
> >>>>> ...
> >>>>> [tableOps.BulkImport] WARN : hdfs://node....//inputDir/data.txt does
> >>>>> not have a valid extension, ignoring
> >>>>>
> >>>>> or,
> >>>>>
> >>>>> [tableOps.BulkImport] WARN : hdfs://node....//inputDir/data.txt is
> >>>>> not a map file, ignoring
> >>>>>
> >>>>>
> >>>>> Suggestions? Am I not setting up the job right? Thank you for help in
> >>>>> advance.
> >>>>>
> >>>>>
> >>>>> On Wed, Apr 3, 2013 at 2:04 PM, Aji Janis <[email protected]> wrote:
> >>>>>>
> >>>>>> I have some data in a text file in the following format:
> >>>>>>
> >>>>>> rowid1 columnFamily colQualifier value
> >>>>>> rowid1 columnFamily colQualifier value
> >>>>>> rowid1 columnFamily colQualifier value
> >>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
>
