Alright Lewis, leaps and bounds over yesterday's progress. I've abandoned the idea of using HBase for now and set up a MySQL database instead. The crawl jobs launch successfully, but the run now fails at IndexerJob.createIndexJob().

*gora.properties*
gora.sqlstore.jdbc.driver=com.mysql.jdbc.Driver
gora.sqlstore.jdbc.url=jdbc:mysql://10.100.220.220:3306/nutch?createDatabaseIfNotExist=true
gora.sqlstore.jdbc.user=root
gora.sqlstore.jdbc.password=pw

*nutch-site.xml*
<configuration>
  <property>
    <name>http.agent.name</name>
    <value>The Nutchess</value>
  </property>
  <property>
    <name>parser.character.encoding.default</name>
    <value>utf-8</value>
    <description>The character encoding to fall back to when no other information is available</description>
  </property>
  <property>
    <name>storage.data.store.class</name>
    <value>org.apache.gora.sql.store.SqlStore</value>
    <description>The Gora DataStore class for storing and retrieving data. Currently the following stores are available: .. </description>
  </property>
</configuration>

[root@hdpjt01 build]# sudo -u mapred hadoop jar apache-nutch-2.1.job org.apache.nutch.crawl.Crawler urls -solr http://10.100.220.220:8983/solr/ -depth 3 -topN 5
12/10/23 10:31:17 INFO input.FileInputFormat: Total input paths to process : 1
12/10/23 10:31:17 WARN snappy.LoadSnappy: Snappy native library is available
12/10/23 10:31:17 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/10/23 10:31:17 INFO snappy.LoadSnappy: Snappy native library loaded
12/10/23 10:31:18 INFO mapred.JobClient: Running job: job_201210221719_0006
12/10/23 10:31:19 INFO mapred.JobClient: map 0% reduce 0%
12/10/23 10:31:28 INFO mapred.JobClient: map 100% reduce 0%
12/10/23 10:31:29 INFO mapred.JobClient: Job complete: job_201210221719_0006

*... Several more jobs ...*

12/10/23 10:36:02 INFO mapred.JobClient: Job complete: job_201210221719_0018
12/10/23 10:36:02 INFO mapred.JobClient: Counters: 24
12/10/23 10:36:02 INFO mapred.JobClient: Job Counters
12/10/23 10:36:02 INFO mapred.JobClient: Launched reduce tasks=1
12/10/23 10:36:02 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=7562
12/10/23 10:36:02 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
12/10/23 10:36:02 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
12/10/23 10:36:02 INFO mapred.JobClient: Launched map tasks=1
12/10/23 10:36:02 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=9683
12/10/23 10:36:02 INFO mapred.JobClient: FileSystemCounters
12/10/23 10:36:02 INFO mapred.JobClient: FILE_BYTES_READ=280978
12/10/23 10:36:02 INFO mapred.JobClient: HDFS_BYTES_READ=1078
12/10/23 10:36:02 INFO mapred.JobClient: FILE_BYTES_WRITTEN=738512
12/10/23 10:36:02 INFO mapred.JobClient: Map-Reduce Framework
12/10/23 10:36:02 INFO mapred.JobClient: Map input records=688
12/10/23 10:36:02 INFO mapred.JobClient: Reduce shuffle bytes=280978
12/10/23 10:36:02 INFO mapred.JobClient: Spilled Records=3078
12/10/23 10:36:02 INFO mapred.JobClient: Map output bytes=277753
12/10/23 10:36:02 INFO mapred.JobClient: CPU time spent (ms)=9830
12/10/23 10:36:02 INFO mapred.JobClient: Total committed heap usage (bytes)=891486208
12/10/23 10:36:02 INFO mapred.JobClient: Combine input records=0
12/10/23 10:36:02 INFO mapred.JobClient: SPLIT_RAW_BYTES=1078
12/10/23 10:36:02 INFO mapred.JobClient: Reduce input records=1539
12/10/23 10:36:02 INFO mapred.JobClient: Reduce input groups=820
12/10/23 10:36:02 INFO mapred.JobClient: Combine output records=0
12/10/23 10:36:02 INFO mapred.JobClient: Physical memory (bytes) snapshot=949526528
12/10/23 10:36:02 INFO mapred.JobClient: Reduce output records=820
12/10/23 10:36:02 INFO mapred.JobClient: Virtual memory (bytes) snapshot=3199545344
12/10/23 10:36:02 INFO mapred.JobClient: Map output records=1539
Exception in thread "main" java.lang.NullPointerException
    at java.util.Hashtable.put(Hashtable.java:394)
    at java.util.Properties.setProperty(Properties.java:143)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:460)
    at org.apache.nutch.indexer.IndexerJob.createIndexJob(IndexerJob.java:128)
    at org.apache.nutch.indexer.solr.SolrIndexerJob.run(SolrIndexerJob.java:44)
    at org.apache.nutch.crawl.Crawler.runTool(Crawler.java:68)
    at org.apache.nutch.crawl.Crawler.run(Crawler.java:192)
    at org.apache.nutch.crawl.Crawler.run(Crawler.java:250)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.crawl.Crawler.main(Crawler.java:257)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:197)

--
View this message in context: http://lucene.472066.n3.nabble.com/nutch-hadoop-solr-tp4014761p4015379.html
Sent from the Nutch - User mailing list archive at Nabble.com.
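
A note on reading the trace: it bottoms out in Hashtable.put, which rejects null keys and values, so it looks like Configuration.set was handed a null at IndexerJob.java:128. The trace alone does not show which property that was. Below is a minimal Java sketch of the same failure mode using only java.util.Properties; the class name, method name, and "example.key" are made up for illustration and are not taken from Nutch or Hadoop.

```java
import java.util.Properties;

// Hypothetical demo class, not part of Nutch or Hadoop.
class NullPropertyDemo {

    // Returns true if Properties.setProperty throws a NullPointerException
    // for the given key/value pair, false if the pair is stored normally.
    static boolean setPropertyThrowsNpe(String key, String value) {
        Properties props = new Properties();
        try {
            props.setProperty(key, value);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // A null value is rejected, matching the top frames of the trace.
        System.out.println(setPropertyThrowsNpe("example.key", null));    // prints "true"
        // A non-null value is stored without error.
        System.out.println(setPropertyThrowsNpe("example.key", "value")); // prints "false"
    }
}
```

If that reading is right, the thing to check would be whichever value is passed to Configuration.set at that line, since a missing or unset argument would surface exactly this way.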

