The exception message itself describes the problem:

java.lang.RuntimeException: java.lang.IllegalArgumentException: Illegal first character <46> at 0. User-space table names can only start with 'word characters': i.e. [a-zA-Z_0-9]: ./crawl/_webpage
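
As a rough illustration of the rule in that message, here is a small shell sketch that checks a crawl ID before it is handed to nutch. The function name is_valid_crawl_id is made up for this example and is not part of Nutch. It assumes every character of the ID must come from [a-zA-Z_0-9], which is stricter than the message itself (the message only talks about the first character).

```shell
#!/bin/sh
# Use the C locale so the ranges below mean plain ASCII letters and digits.
LC_ALL=C
export LC_ALL

# Hypothetical helper (not part of Nutch): succeeds only if the crawl ID is
# non-empty and made up entirely of characters from [a-zA-Z_0-9].
# Assumption: the whole ID is checked, not just its first character.
is_valid_crawl_id() {
    case "$1" in
        ''|*[!a-zA-Z_0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Example: check the ID given as first argument (defaults to "crawl").
crawl_id="${1:-crawl}"
if is_valid_crawl_id "$crawl_id"; then
    echo "ok: $crawl_id"
else
    echo "invalid crawlId: $crawl_id" >&2
fi
```

A wrapper script could use the same check to skip the ./bin/nutch inject call when the ID would produce an illegal table name.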

The crawlId passed must follow the regex [a-zA-Z_0-9]. The one you passed contains a dot and slashes; character <46> in the message is the '.' at the start of ./crawl/. Judging by the table name in the message (./crawl/_webpage), the crawl ID is used as a prefix of the table name, so it has to be a legal start for one.

$ ./bin/nutch inject urls/ -crawlId ./crawl/

Try this:

$ ./bin/nutch inject urls/ -crawlId crawl

On Fri, May 17, 2013 at 12:47 PM, <[email protected]> wrote:

> What if you do bin/nutch inject urls/ ?
>
> -----Original Message-----
> From: Christopher Gross <[email protected]>
> To: user <[email protected]>
> Sent: Fri, May 17, 2013 11:26 am
> Subject: error crawling
>
> I'm having trouble getting my nutch working. I had it on another server
> and it was working fine. I migrated it to a new server, and I've been
> getting nothing but problems. My old script wasn't working right (getting
> a lot of "skipping" on the parser saying that the crawl id was null [a
> separate point of frustration]), so now I'm trying the 'newer' crawl
> script. This one is worse, since I can't even get the inject to work.
>
> urls contains a "seed.txt" file that worked previously and contains a bunch
> of urls. crawl is empty.
>
> from my $NUTCH_HOME directory:
>
> $ ./bin/nutch inject urls/ -crawlId ./crawl/
> InjectorJob: starting
> InjectorJob: urlDir: urls
> InjectorJob: org.apache.gora.util.GoraException: java.lang.RuntimeException: java.lang.IllegalArgumentException: Illegal first character <46> at 0. User-space table names can only start with 'word characters': i.e. [a-zA-Z_0-9]: ./crawl/_webpage
>     at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:167)
>     at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
>     at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:75)
>     at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:214)
>     at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:228)
>     at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:248)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:258)
> Caused by: java.lang.RuntimeException: java.lang.IllegalArgumentException: Illegal first character <46> at 0. User-space table names can only start with 'word characters': i.e. [a-zA-Z_0-9]: ./crawl/_webpage
>     at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:125)
>     at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
>     at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
>     ... 7 more
> Caused by: java.lang.IllegalArgumentException: Illegal first character <46> at 0. User-space table names can only start with 'word characters': i.e. [a-zA-Z_0-9]: ./crawl/_webpage
>     at org.apache.hadoop.hbase.HTableDescriptor.isLegalTableName(HTableDescriptor.java:280)
>     at org.apache.hadoop.hbase.HTableDescriptor.<init>(HTableDescriptor.java:172)
>     at org.apache.hadoop.hbase.HTableDescriptor.<init>(HTableDescriptor.java:158)
>     at org.apache.gora.hbase.store.HBaseMapping$HBaseMappingBuilder.build(HBaseMapping.java:171)
>     at org.apache.gora.hbase.store.HBaseStore.readMapping(HBaseStore.java:592)
>     at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:111)
>     ... 9 more
>
> Where is the "_webpage" coming from? Am I just missing something?
>
> Any help/ideas/references would be appreciated.
>
> Thanks!
>
> -- Chris

