I was wrong. Changing the gora.datastore.accumulo.user property let the inject finish, and on the command line it looked successful:
13/11/15 17:09:30 INFO crawl.InjectorJob: InjectorJob: total number of urls rejected by filters: 0
13/11/15 17:09:30 INFO crawl.InjectorJob: InjectorJob: total number of urls injected after normalization and filtering: 35

But when I tried to run the bin/nutch generate command, it gave me this error:

13/11/15 17:12:36 ERROR store.AccumuloStore: org.apache.accumulo.core.client.AccumuloSecurityException: Error BAD_CREDENTIALS - Username or Password is Invalid

Still trying to figure this out...

On Fri, Nov 15, 2013 at 11:52 AM, Jon Uhal <[email protected]> wrote:

> So I think I figured this out. I believe this had to do with the Nutch
> conf/gora.properties settings for Accumulo. The default user was set to:
>
> gora.datastore.accumulo.user=root
>
> and after trying to clean up ZooKeeper, I was running into issues trying
> to remove /accumulo from ZooKeeper. It looked like a permissions issue and
> I ran across this:
>
>
> http://mail-archives.apache.org/mod_mbox/accumulo-user/201309.mbox/%3CCAGUtCHqY9eKM-modotn8YRmGR6Aus=oQkT9ys-=+v7-=oof...@mail.gmail.com%3E
>
> I didn't realize there might be an accumulo user that was accessing
> ZooKeeper. I updated Nutch's gora.properties file to have:
>
> gora.datastore.accumulo.user=accumulo
>
> and things look like they are working.
>
> I'm not sure if this is the only change that caused things to start
> working, but it looks like things are getting injected successfully.
>
>
> On Thu, Nov 14, 2013 at 4:33 PM, Lewis John Mcgibbney <
> [email protected]> wrote:
>
>> Hi Jon,
>>
>> Glad to hear that you're making some more progress!
>>
>> On Thu, Nov 14, 2013 at 8:45 PM, <[email protected]>
>> wrote:
>>
>> >
>> > So I think it has to do with Accumulo somehow. I reverted the
>> > conf/gora.properties setting for mock from false to:
>> >
>> > gora.datastore.accumulo.mock=true
>> >
>> > and re-building and re-running the runtime deploy job completed
>> > successfully. Trying to see if I can track down the issue.
>> >
>> >
>> >
>> I am not sure about this approach. Have you tried editing the
>> gora.datastore.accumulo.zookeepers=localhost property to the IP for the
>> Zookeeper(s) server? I am not certain that simulating a mock datastore is
>> the way to go here.
>> AccumuloStore contains the following code
>>
>>     try {
>>       if (mock == null || !mock.equals("true")) {
>>         String instance = DataStoreFactory.findProperty(properties,
>>             this, INSTANCE_NAME_PROPERTY, null);
>>         String zookeepers =
>>             DataStoreFactory.findProperty(properties, this,
>>             ZOOKEEPERS_NAME_PROPERTY, null);
>>         conn = new ZooKeeperInstance(instance,
>>             zookeepers).getConnector(user, password);
>>         authInfo = new AuthInfo(user,
>>             ByteBuffer.wrap(password.getBytes()),
>>             conn.getInstance().getInstanceID());
>>       } else {
>>         conn = new MockInstance().getConnector(user, password);
>>       }
>>
>> This to me indicates that if you want to create the persistent data store
>> then you would edit the mock property to boolean false, which will take you
>> into the if block. Then you are just searching for configuration
>> properties for the Accumulo server instance, zookeeper server instance and
>> username and password from gora.properties
>> hth, please let us know how you get on... and also how the AccumuloStore
>> is working. AFAIK it is one of the lesser used data stores so we are always
>> keen to hear of user experiences, etc.
>> Thanks
>> Lewis
>>
>
>
>
> --
> Jon Uhal
>

--
Jon Uhal
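
For anyone following along, the branch Lewis quotes can be mirrored in a small standalone check of conf/gora.properties. This is only a rough sketch, not part of Nutch or Gora: the class name GoraAccumuloConfigCheck and its methods are made up for illustration, and the instance and password property names are assumed to follow the same gora.datastore.accumulo. prefix as the mock, zookeepers, and user properties mentioned in the thread. It reports only which settings are absent; it cannot tell whether the user and password are actually valid for the Accumulo instance, which is what BAD_CREDENTIALS is about.

```java
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of Nutch or Gora.
public class GoraAccumuloConfigCheck {
    static final String PREFIX = "gora.datastore.accumulo.";

    // Mirrors the quoted condition: only the literal string "true" selects MockInstance.
    static boolean usesMock(Properties props) {
        String mock = props.getProperty(PREFIX + "mock");
        return mock != null && mock.equals("true");
    }

    // Lists the properties the selected branch would need but that are absent or blank.
    static List<String> missingKeys(Properties props) {
        List<String> required = new ArrayList<>();
        if (!usesMock(props)) {
            // Only the ZooKeeperInstance branch reads these two.
            // "instance" is an assumed property name.
            required.add(PREFIX + "instance");
            required.add(PREFIX + "zookeepers");
        }
        // Both branches pass user and password to getConnector.
        // "password" is an assumed property name.
        required.add(PREFIX + "user");
        required.add(PREFIX + "password");

        List<String> missing = new ArrayList<>();
        for (String key : required) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the path used in the thread; pass another path as the first argument.
        String path = args.length > 0 ? args[0] : "conf/gora.properties";
        Properties props = new Properties();
        try (Reader in = new FileReader(path)) {
            props.load(in);
        }
        System.out.println("mock datastore: " + usesMock(props));
        System.out.println("missing properties: " + missingKeys(props));
    }
}
```

Run it from the Nutch directory with no arguments to check conf/gora.properties, or pass a different path as the first argument.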

