Thanks for the help. I do see that stuff is getting into Accumulo but, in my inexperienced opinion, it looks like the map method is never getting called in the job. I'm not sure if this is supposed to happen after the currentJob.waitForCompletion(true) call in the InjectorJob class, but I never see any of the output lines like:
INFO mapred.JobClient: map 30% reduce 0%

Here is my gora.properties file. Pretty much standard:

gora.datastore.default=org.apache.gora.accumulo.store.AccumuloStore
gora.datastore.accumulo.mock=false
gora.datastore.accumulo.instance=dev1o
gora.datastore.accumulo.zookeepers=dev1o,dev5o,dev6o
gora.datastore.accumulo.user=root
gora.datastore.accumulo.password=secret

I can use the root:secret credentials to access the Accumulo data and I can connect to the "webpage" table and scan it (granted there is no information in it). But the table is created by the Nutch Hadoop job. When I was running the "gora.datastore.accumulo.mock=true" setting, I would definitely see information in the webpage table in Accumulo.

Not sure if any of this helps... Just reporting what I'm seeing right now.

On Fri, Nov 15, 2013 at 3:20 PM, Lewis John Mcgibbney <[email protected]> wrote:

> Hi Jon,
>
> As you've guessed by now this is not so much a Nutch specific problem.
> I'm CC'ing user@gora in here as well.
>
> On Fri, Nov 15, 2013 at 8:05 PM, <[email protected]> wrote:
> >
> > I was wrong. So changing the gora.datastore.accumulo.user property caused
> > the inject to finish and on the command line, it looked like it was
> > successful:
> >
> > 13/11/15 17:09:30 INFO crawl.InjectorJob: InjectorJob: total number of urls rejected by filters: 0
> > 13/11/15 17:09:30 INFO crawl.InjectorJob: InjectorJob: total number of urls injected after normalization and filtering: 35
> >
> > But when I tried to run the bin/nutch generate command, it gives me the
> > error:
> >
> > 13/11/15 17:12:36 ERROR store.AccumuloStore:
> > org.apache.accumulo.core.client.AccumuloSecurityException: Error
> > BAD_CREDENTIALS - Username or Password is Invalid
> >
> > Still trying to figure this out...
> >
>
> I've some suggestions...
> * Please see this post...
> in particular the gora.properties spec, which is
> suggested as follows
>
> gora.datastore.default=org.apache.gora.accumulo.store.AccumuloStore
> gora.datastore.accumulo.mock=false
> gora.datastore.accumulo.instance=inst
> gora.datastore.accumulo.zookeepers=localhost
> gora.datastore.accumulo.user=root
> gora.datastore.accumulo.password=secret
> gora.datastore.accumulo.zookeepers=127.0.0.1:2181
>
> Also please note that it may be necessary to create the user (or in the
> example above use root) as is stated in the accumulo docs.
> <quote>
>
> 1. Install and run Accumulo via the instructions found in
>    $ACCUMULO_HOME/README. Remember the instance name. It will be referred to
>    as "instance" throughout the examples. A comma-separated list of zookeeper
>    servers will be referred to as "zookeepers".
> 2. Create an Accumulo user (see the user manual
>    <http://accumulo.apache.org/1.4/user_manual/Accumulo_Shell.html#User_Administration>),
>    or use the root user. The Accumulo user name will be referred to as
>    "username" with password "password" throughout the examples. This user will
>    need to have the ability to create table.
>
> </quote>
>
> Sounds like some config problem on the Accumulo side of things.
>
> hth
>
> Lewis
>

--
Jon Uhal
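
One thing that may be worth checking in a setup like this is which values actually take effect once gora.properties is parsed. The suggested spec quoted above lists gora.datastore.accumulo.zookeepers twice, and java.util.Properties keeps only the last value for a repeated key. The following is a minimal sketch, assuming the file is read through java.util.Properties; the class name GoraPropsCheck and the method parse are hypothetical and not part of Gora, Nutch, or Accumulo.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper, not part of Gora or Nutch.
public class GoraPropsCheck {

    // Parse properties text with java.util.Properties.
    // For a key that appears more than once, the last value wins.
    static Properties parse(String text) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(text));
        return props;
    }

    public static void main(String[] args) throws IOException {
        // Same keys as the suggested spec in the thread (password left out on purpose).
        String text = String.join("\n",
            "gora.datastore.accumulo.instance=inst",
            "gora.datastore.accumulo.zookeepers=localhost",
            "gora.datastore.accumulo.user=root",
            "gora.datastore.accumulo.zookeepers=127.0.0.1:2181");

        Properties props = parse(text);

        // Print the effective Accumulo settings, skipping any password entry.
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith("gora.datastore.accumulo.") && !key.endsWith(".password")) {
                System.out.println(key + "=" + props.getProperty(key));
            }
        }
    }
}
```

Pointing parse at the contents of the real gora.properties on the job's classpath would show the instance, zookeepers, and user that the job is likely to pick up, which can then be compared against what works in the Accumulo shell.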

