It wouldn't be that bad to merge the index externally and then reindex
the results, if it is as simple as your example. Search for id:[1 TO *]
with an fq for the category, and step the start offset of the result
slice until you have covered all of the docs in the category.
Request only the content field, extract it from the XML responses, and
save it "somewhere". When you have all the info, reindex it.
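
A rough sketch of that paging loop in Python, in case it helps. The
URL and the field names "category" and "content" are guesses at your
setup, not something I know about your schema, and it assumes content
is a single-valued stored field that comes back as a <str> element in
the standard XML response. Treat it as a starting point only.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumed location of the select handler; adjust to your install.
SOLR_SELECT = "http://localhost:8983/solr/select"


def build_url(category, start, rows):
    """Build a select URL for one slice of a category's docs."""
    params = {
        "q": "id:[1 TO *]",             # match every doc that has an id
        "fq": "category:" + category,   # "category" is an assumed field name
        "fl": "content",                # "content" is an assumed field name
        "start": start,
        "rows": rows,
    }
    return SOLR_SELECT + "?" + urllib.parse.urlencode(params)


def http_fetch(url):
    """Fetch one page of results as raw bytes."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def parse_page(xml_data):
    """Return (numFound, list of content strings) for one XML response."""
    root = ET.fromstring(xml_data)
    result = root.find("result")
    num_found = int(result.get("numFound"))
    texts = [s.text or "" for s in result.findall("doc/str[@name='content']")]
    return num_found, texts


def collect_category(category, rows=100, fetch=http_fetch):
    """Page through a category and join all of its stored content."""
    start, parts = 0, []
    while True:
        num_found, texts = parse_page(fetch(build_url(category, start, rows)))
        parts.extend(texts)
        start += rows
        if start >= num_found:
            break
    return "\n".join(parts)
```

Calling collect_category("A") would give you one big string for
category A, which you can then post to the new index as a single doc.
The fetch argument is only there so the loop can be tried out without
a running Solr.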

On Wednesday, 17.09.2008, at 10:00 -0400, Erick Erickson wrote:
> You *might* be able to reconstruct enough of the "original" documents
> from your indexes to create another without recrawling. I know Luke
> can reconstruct documents from an index, but for unstored data it's
> slow and may be lossy.
> 
> But it may suit your needs given how long it takes to make your index
> in the first place.
> 
> Best
> Erick
> 
> On Tue, Sep 16, 2008 at 9:14 PM, Gene Campbell <[EMAIL PROTECTED]> wrote:
> 
> > I was pretty sure you'd say that.  But it means a lot that you take the
> > time to confirm it.  Thanks Otis.
> >
> > I don't want to give details, but we crawl for our data, and we don't
> > save it in a DB or on disk.  It goes from download to index.  Was a
> > good idea at the time; when we thought our designs were done evolving.
> >  :)
> >
> > cheers
> > gene
> >
> >
> > On Wed, Sep 17, 2008 at 12:51 PM, Otis Gospodnetic
> > <[EMAIL PROTECTED]> wrote:
> > > You can't copy+merge+flatten indices like that.  Reindexing would be the
> > > easiest.  Indexing taking weeks sounds suspicious.  How much data are you
> > > reindexing and how big are your indices?
> > >
> > > Otis
> > > --
> > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > >
> > >
> > >
> > > ----- Original Message ----
> > >> From: ristretto.rb <[EMAIL PROTECTED]>
> > >> To: solr-user@lucene.apache.org
> > >> Sent: Tuesday, September 16, 2008 8:14:16 PM
> > >> Subject: How to copy a solr index to another index with a different
> > >> schema collapsing stored data?
> > >>
> > >> Is it possible to copy stored index data from one index to another,
> > >> concatenating it as you go?
> > >>
> > >> Suppose 2 categories A and B both with 20 docs, for a total of 40 docs
> > >> in the index.  The index has a stored field for the content from the
> > >> docs.
> > >>
> > >> I want a new index with only two docs in it, one for A and one for B.
> > >> And it would have a stored field that is the sum of all the stored
> > >> data for the 20 docs of A and of B respectively.
> > >>
> > >> So then a query on this index will give me a relevant list of
> > >> categories?
> > >>
> > >> Perhaps there's a solr query to get that data out, and then I can
> > >> handle concatenating it, and then indexing it in the new index.
> > >>
> > >> I'm hoping I don't have to reindex all this data from scratch?  It has
> > >> taken weeks!
> > >>
> > >> thanks
> > >> gene
> > >
> > >
> >
