I've found that the solrindex process chokes on the custom metadata fields I added to my Nutch configuration using the urlmeta plugin. A sample from the lengthy error messages:

java.lang.Exception: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://localhost:8983/solr/phfaws: ERROR: [doc=http://academics.wellesley.edu/lts/archives/3/3L_Astronomy.html] unknown field 'icosreposurl'
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)

As mentioned in my previous message, I've copied my Nutch schema.xml into my Solr core's conf folder, but since my Solr instance did not already have a schema.xml file, I'm not convinced it's being read. How do I set up Solr to accept these new fields?

Chip

________________________________________
From: Chip Calhoun [[email protected]]
Sent: Friday, February 03, 2017 11:45 AM
To: [email protected]
Subject: Failing to index from Nutch 1.12 to Solr 5.5.3

I'm switching to more recent versions of Nutch and Solr, after years of using Nutch 1.4 and Solr 3.3.0. I get no results when I index into Solr, and I can't tell where this breaks down. I use these commands:

    cd /opt/apache-nutch-1.12/runtime/local
    export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.121.x86_64
    export NUTCH_CONF_DIR=/opt/apache-nutch-1.12/runtime/local/conf/phfaws
    bin/crawl urls/phfaws crawl/phfaws 1
    bin/nutch solrindex http://localhost:8983/solr/phfaws/ crawl/phfaws/crawldb -linkdb crawl/phfaws/linkdb crawl/phfaws/segments/*

I believe that Nutch is crawling properly, but I do find that the crawl folders end up about 25% as large as what I produced with Nutch 1.4.

I suspect that the problem is with the Nutch/Solr integration. My Solr core didn't create a schema.xml, instead having a managed schema. I've copied my Nutch local conf's schema.xml into Solr, but I haven't seen anything saying that I'm supposed to do more with it.

Chip Calhoun
Digital Archivist
Niels Bohr Library & Archives
American Institute of Physics
One Physics Ellipse
College Park, MD 20740
301-209-3180
https://www.aip.org/history-programs/niels-bohr-library

