Not sure what was going on, so I deleted the core and the underlying folders under solr/server/nutch, then bounced the Solr service. The schema browser in the Solr interface now shows the schema as expected.

If I put a single document in (i.e. a single URL in the seed list, then run inject, generate, fetch, parse, update and solrindex) then all is well. The schema browser in the Solr interface shows the digest field as string.

If I then run "bin/crawl ..." it adds some more documents (as expected) but ultimately dies with the ClassCastException, as if I have a bad document in the index?

But the Solr schema browser still shows the digest field as string (as before), and my documents are listed (via the Solr query web interface) as having string digests too!
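
To double-check what Solr itself reports for the field, rather than relying on the schema browser, something like the following could be used. It is only a rough sketch: it assumes the Solr Schema API is reachable, and the helper names (field_url, summarize_field, fetch_field), the base URL and the core name "nutch" are illustrative assumptions, not anything taken from the Nutch or Solr configuration.

```python
import json
import urllib.request


def field_url(base_url, core, field):
    # Schema API endpoint for a single field; showDefaults=true asks Solr to
    # include properties inherited from the field type (e.g. multiValued).
    return f"{base_url}/{core}/schema/fields/{field}?showDefaults=true&wt=json"


def summarize_field(response):
    # Pull out the two properties relevant here: the field type and whether
    # the field is multivalued. A missing multiValued key is treated as False.
    field = response["field"]
    return {
        "type": field.get("type"),
        "multiValued": bool(field.get("multiValued", False)),
    }


def fetch_field(base_url, core, field):
    # Query a running Solr instance (base URL and core name are assumptions).
    with urllib.request.urlopen(field_url(base_url, core, field)) as resp:
        return summarize_field(json.load(resp))
```

Calling fetch_field("http://localhost:8983/solr", "nutch", "digest") against a running instance would then show whether digest has ended up multivalued, which is what Markus suspected.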

Tom


On 14/10/16 15:57, Tom Chiverton wrote:
I don't understand what you mean here. I am not a Solr expert; I've used it a bit in the past, but not with Nutch.

Is there a schema I should be feeding it?

Tom


On 14/10/16 15:50, Markus Jelsma wrote:
Solr supports schemaless mode, which may be what you are running. Perhaps it made your digest field multivalued. I'd suggest using Solr's classic schema factory and a fixed schema.


