The exception message describes the problem:

java.lang.RuntimeException: java.lang.IllegalArgumentException: Illegal first
character <46> at 0.
User-space table names can only start with 'word characters': i.e.
[a-zA-Z_0-9]: ./crawl/_webpage

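The crawlId appears to be used as a prefix of the table name (hence
"./crawl/_webpage" in the message), so it must consist only of word
characters, i.e. match the regex [a-zA-Z_0-9]. The one you passed contains
a dot and slashes; character <46> is the leading '.' at position 0.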
$ ./bin/nutch inject urls/ -crawlId ./crawl/

Try this:
$ ./bin/nutch inject urls/ -crawlId crawl
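
If you want to catch this before Nutch does, a small check in your crawl
script could look like the sketch below. The function name
is_valid_crawl_id is made up for illustration and is not part of Nutch. It
is also deliberately conservative: it accepts only the word characters
named in the error message, which may be stricter than what HBase actually
allows after the first character.

```shell
# Hypothetical helper (not part of Nutch): succeeds only if the argument
# is non-empty and consists solely of the characters [a-zA-Z0-9_].
is_valid_crawl_id() {
  case "$1" in
    '' | *[!a-zA-Z0-9_]*) return 1 ;;  # empty, or contains some other character
    *) return 0 ;;
  esac
}

# Possible use in a crawl script (commented out so the sketch runs anywhere):
# crawl_id=crawl
# if is_valid_crawl_id "$crawl_id"; then
#   ./bin/nutch inject urls/ -crawlId "$crawl_id"
# else
#   echo "invalid crawlId: $crawl_id" >&2
# fi
```

With this check, "crawl" passes and "./crawl/" is rejected.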



On Fri, May 17, 2013 at 12:47 PM, <[email protected]> wrote:

> What if you do bin/nutch inject urls/ ?
>
> -----Original Message-----
> From: Christopher Gross <[email protected]>
> To: user <[email protected]>
> Sent: Fri, May 17, 2013 11:26 am
> Subject: error crawling
>
>
> I'm having trouble getting my nutch working.  I had it on another server
> and it was working fine.  I migrated it to a new server, and I've been
> getting nothing but problems.  My old script wasn't working right (getting
> a lot of "skipping" on the parser saying that the crawl id was null [a
> separate point of frustration]), so now I'm trying the 'newer' crawl
> script.  This one is worse, since I can't even get the inject to work.
>
> urls contains a "seed.txt" file that worked previously and contains a bunch
> of urls.  crawl is empty.
>
> from my $NUTCH_HOME directory:
>
> $ ./bin/nutch inject urls/ -crawlId ./crawl/
> InjectorJob: starting
> InjectorJob: urlDir: urls
> InjectorJob: org.apache.gora.util.GoraException: java.lang.RuntimeException: java.lang.IllegalArgumentException: Illegal first character <46> at 0. User-space table names can only start with 'word characters': i.e. [a-zA-Z_0-9]: ./crawl/_webpage
>         at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:167)
>         at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
>         at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:75)
>         at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:214)
>         at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:228)
>         at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:248)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:258)
> Caused by: java.lang.RuntimeException: java.lang.IllegalArgumentException: Illegal first character <46> at 0. User-space table names can only start with 'word characters': i.e. [a-zA-Z_0-9]: ./crawl/_webpage
>         at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:125)
>         at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
>         at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
>         ... 7 more
> Caused by: java.lang.IllegalArgumentException: Illegal first character <46> at 0. User-space table names can only start with 'word characters': i.e. [a-zA-Z_0-9]: ./crawl/_webpage
>         at org.apache.hadoop.hbase.HTableDescriptor.isLegalTableName(HTableDescriptor.java:280)
>         at org.apache.hadoop.hbase.HTableDescriptor.<init>(HTableDescriptor.java:172)
>         at org.apache.hadoop.hbase.HTableDescriptor.<init>(HTableDescriptor.java:158)
>         at org.apache.gora.hbase.store.HBaseMapping$HBaseMappingBuilder.build(HBaseMapping.java:171)
>         at org.apache.gora.hbase.store.HBaseStore.readMapping(HBaseStore.java:592)
>         at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:111)
>         ... 9 more
>
> Where is the "_webpage" coming from?  Am I just missing something?
>
> Any help/ideas/references would be appreciated.
>
> Thanks!
>
> -- Chris
>
>
>
