Yes, the SolrEntityProcessor can be used for this, provided you stored the original document bodies in the Solr index.

You can also download the documents in JSON or CSV format and re-upload those to the old Solr. I don't know whether CSV will work for your docs. If CSV works, you can directly upload what you download. If you download JSON, you have to "unwrap" the outermost structure and upload the data as a plain array.
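As a sketch of that "unwrap" step: a wt=json select response nests the documents under response.docs, and that inner array is what you post back to /update. The file names here are made-up placeholders.

```python
import json

def unwrap(select_response):
    """Pull the plain document array out of a wt=json select response,
    so it can be re-posted to /update as a JSON array."""
    return select_response["response"]["docs"]

# Hypothetical usage: export.json is a saved select response,
# upload.json is what you would POST to the /update handler.
if __name__ == "__main__":
    with open("export.json") as f:
        dump = json.load(f)
    with open("upload.json", "w") as f:
        json.dump(unwrap(dump), f)
```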

There are problems with the SolrEntityProcessor:

1) It is single-threaded.
2) If you 'copyField' to a field and store that field, you have to be careful not to re-index the stored contents of the destination field, because the copyField will add a second copy from the 'source' field.
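For reference, a minimal DIH data-config sketch using SolrEntityProcessor might look like the following; the URL and core name are placeholders, not anything from this thread:

```xml
<dataConfig>
  <document>
    <!-- Pulls documents from an existing Solr instance and re-indexes
         them into the core this DIH config belongs to. -->
    <entity name="sourceSolr"
            processor="SolrEntityProcessor"
            url="http://localhost:8983/solr/old_core"
            query="*:*"
            rows="500"/>
  </document>
</dataConfig>
```

Note the caveats above still apply: the import runs single-threaded, and any copyField destinations that are stored will be duplicated unless you exclude them (e.g. via the fl attribute).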

On 03/01/2013 04:48 AM, Alexandre Rafalovitch wrote:
What about SolrEntityProcessor in DIH?
https://wiki.apache.org/solr/DataImportHandler#SolrEntityProcessor

Regards,
     Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Fri, Mar 1, 2013 at 5:16 AM, Dotan Cohen <dotanco...@gmail.com> wrote:

On Fri, Mar 1, 2013 at 11:59 AM, Rafał Kuć <r....@solr.pl> wrote:
Hello!

I assumed that re-indexing would be painful in your case; if it weren't, you would probably have re-indexed by now :) I guess (I didn't test it myself) that you can create another collection inside your cluster, use the old codec for Lucene 4.0 (setting the version in solrconfig.xml should be enough), and re-index, but re-indexing will still have to be done. Or maybe someone knows a better way?

Will I have to reindex via an external bridging script, such as a
Python script which requests N documents at a time, indexes them into
Solr 4.1, then requests the next N documents? Or is there an internal
Solr / Lucene facility for this? I've actually looked for such a
facility, but as I was unable to find one, I'm asking.
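The external-script approach described here could be sketched roughly as follows: page through the old core with start/rows (sorted on a unique key so paging neither skips nor repeats documents) and post each batch to the new core. The URLs, core names, `id` field, and batch size are all made-up placeholders, not details from this thread.

```python
import json
import urllib.request

OLD = "http://localhost:8983/solr/old_core"   # hypothetical source core
NEW = "http://localhost:8984/solr/new_core"   # hypothetical destination core
BATCH = 500                                   # arbitrary page size

def select_url(base, start, rows):
    # A deterministic sort (here on a hypothetical "id" field) keeps
    # start/rows paging stable across requests.
    return f"{base}/select?q=*:*&sort=id+asc&start={start}&rows={rows}&wt=json"

def fetch_batch(start, rows):
    """Fetch one page of documents from the old core as parsed JSON."""
    with urllib.request.urlopen(select_url(OLD, start, rows)) as resp:
        return json.load(resp)["response"]["docs"]

def post_batch(docs):
    """POST a JSON array of documents to the new core's update handler."""
    req = urllib.request.Request(
        NEW + "/update?commit=true",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req).close()

def reindex():
    start = 0
    while True:
        docs = fetch_batch(start, BATCH)
        if not docs:
            break
        post_batch(docs)
        start += BATCH

if __name__ == "__main__":
    reindex()
```

Note this only works if all fields you need are stored in the source index, and it inherits the copyField duplication caveat mentioned earlier in the thread.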


--
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com

