WikipediaTokenizer documentation - never mind

Ken Krugler Thu, 03 Oct 2013 19:44:29 -0700

Hi all,

Sorry for the noise. I finally realized that the script I was running uses 
some Java code (EnwikiContentSource, from the Lucene benchmark module) to 
explicitly set up the fields and then push the results to Solr, so the field 
mapping isn't coming from WikipediaTokenizer at all.
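
For anyone who lands on this thread later, here is a rough sketch of what that 
kind of explicit mapping can look like. It is not the EnwikiContentSource code. 
The class name WikiPageFields, the path table, and the "username" field name 
are made up for illustration ("date" and "body" are the field names from my 
example schema). It only uses the JDK's StAX parser and stops short of sending 
anything to Solr.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: maps element paths inside one <page> to field names.
class WikiPageFields {

    // Assumed mapping; "username" is an invented field name for illustration.
    private static final Map<String, String> PATH_TO_FIELD = new LinkedHashMap<>();
    static {
        PATH_TO_FIELD.put("page/revision/timestamp", "date");
        PATH_TO_FIELD.put("page/revision/text", "body");
        PATH_TO_FIELD.put("page/revision/contributor/username", "username");
    }

    // Parses a single <page> element and returns fieldName -> text content.
    static Map<String, String> extract(String pageXml) {
        Map<String, String> fields = new LinkedHashMap<>();
        Deque<String> path = new ArrayDeque<>();   // current element path
        StringBuilder text = new StringBuilder();  // text of the current element
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(pageXml));
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        path.addLast(reader.getLocalName());
                        text.setLength(0);
                        break;
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.CDATA:
                        text.append(reader.getText());
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        // Keep the text only if this path is in the mapping table.
                        String field = PATH_TO_FIELD.get(String.join("/", path));
                        if (field != null) {
                            fields.put(field, text.toString().trim());
                        }
                        path.removeLast();
                        text.setLength(0);
                        break;
                    default:
                        break;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse page XML", e);
        }
        return fields;
    }

    public static void main(String[] args) {
        // Made-up sample page, not real dump data.
        String page = "<page><title>Example</title><revision>"
                + "<timestamp>2013-01-01T00:00:00Z</timestamp>"
                + "<contributor><username>ExampleUser</username></contributor>"
                + "<text>Hello wiki</text></revision></page>";
        System.out.println(extract(page));
    }
}
```

The point is just that once the mapping lives in code like this, picking up 
<revision><contributor><username> is a matter of adding one more path to the 
table and a matching field to the schema.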


-- Ken

==================================================================
Where's the documentation on the WikipediaTokenizer?

Specifically I'm wondering how pieces from the source XML get mapped to field 
names in the Solr schema.

For example, <revision><timestamp> seems to be going into the "date" field for 
an example schema I've got.

And <revision><text> goes into "body".

But is there any way to get <revision><contributor><username>, for example?

Thanks,

-- Ken

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr






