Hi all,

Sorry for the noise. I finally realized that the script I was running uses some Java code (EnwikiContentSource, from Lucene benchmark) to explicitly set up the fields and then push the results to Solr, so the field mapping I asked about comes from that code rather than from WikipediaTokenizer.
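In case it helps anyone with the same question, here is a minimal sketch of that kind of explicit mapping, including the contributor username. It is not the EnwikiContentSource code: it uses the JDK's StAX parser, and the class name, the "title" and "username" field names, the path table, and the sample page are all made up for illustration. Only "date" and "body" match the schema I described. Pushing the resulting fields to Solr is left out.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Lucene or Solr.
class WikiFieldSketch {

    // Made-up sample shaped like one <page> element of a Wikipedia XML dump.
    static final String SAMPLE =
        "<page>"
        + "<title>Example</title>"
        + "<revision>"
        + "<timestamp>2012-01-01T00:00:00Z</timestamp>"
        + "<contributor><username>ExampleUser</username><id>42</id></contributor>"
        + "<text>Some wiki text.</text>"
        + "</revision>"
        + "</page>";

    // Assumed mapping from element paths to field names; adjust to your schema.
    static String fieldFor(String path) {
        switch (path) {
            case "page/title": return "title";
            case "page/revision/timestamp": return "date";
            case "page/revision/text": return "body";
            case "page/revision/contributor/username": return "username";
            default: return null;
        }
    }

    // Walks the XML once and collects the text of every mapped leaf element.
    static Map<String, String> extractFields(String pageXml) throws XMLStreamException {
        Map<String, String> fields = new LinkedHashMap<>();
        Deque<String> path = new ArrayDeque<>();   // currently open elements, root first
        StringBuilder text = new StringBuilder();  // text seen since the last tag
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(pageXml));
        while (reader.hasNext()) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    path.addLast(reader.getLocalName());
                    text.setLength(0);
                    break;
                case XMLStreamConstants.CHARACTERS:
                case XMLStreamConstants.CDATA:
                    text.append(reader.getText());
                    break;
                case XMLStreamConstants.END_ELEMENT: {
                    String field = fieldFor(String.join("/", path));
                    if (field != null) {
                        fields.put(field, text.toString().trim());
                    }
                    path.removeLast();
                    text.setLength(0);
                    break;
                }
                default:
                    break;
            }
        }
        reader.close();
        return fields;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Prints the extracted field map for the sample page.
        System.out.println(extractFields(SAMPLE));
    }
}
```

Each entry in the returned map would then become one field on the document sent to Solr, so getting the username is a matter of adding one more path to the table and one more field to the schema.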
-- Ken

==================================================================

Where's the documentation on the WikipediaTokenizer?

Specifically I'm wondering how pieces from the source XML get mapped to field names in the Solr schema. For example, <revision><timestamp> seems to be going into the "date" field for an example schema I've got. And <revision><text> goes into "body".

But is there any way to get <revision><contributor><username>, for example?

Thanks,

-- Ken

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr