Just trying to speed things up.  The solrindex step currently takes quite
a while because it reindexes all of my documents instead of just the
latest segment.
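
For reference, one pass of the loop quoted below can be sketched as a dry
run.  This is only a sketch: the bin/nutch path and the topN value of 1000
are assumptions, the arguments are copied from the loop as posted (a real
invocation may need more, such as the Solr URL), and nothing is executed.

```python
# Dry-run sketch of one crawl iteration: builds the commands, runs nothing.

NUTCH = "bin/nutch"  # assumed path to the Nutch launcher script
TOP_N = 1000         # hypothetical value; the original post gives none


def crawl_iteration(nutch=NUTCH, top_n=TOP_N):
    """Return the five steps of the loop as argument lists."""
    return [
        [nutch, "generate", "-topN", str(top_n)],
        [nutch, "fetch", "-all"],
        [nutch, "parse", "-all"],
        [nutch, "updatedb"],
        # The step in question: as posted, -all seems to resend everything.
        [nutch, "solrindex", "-all"],
    ]


if __name__ == "__main__":
    for cmd in crawl_iteration():
        print(" ".join(cmd))
```

The last step is the one that takes the time; the others only touch the
batch that was just generated.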


On Wed, May 1, 2013 at 1:43 PM, AC Nutch <[email protected]> wrote:

> I have also run into this issue. Our problem was that we were performing
> analysis on the URLs in Solr and adding data in various fields which get
> overwritten at the next index. We had to edit the source to fix our issue.
>
> In terms of solving it - what is your main issue with that? Is it that you
> are looking for a more efficient workflow or is it something else?
>
>
>
>
> On Wed, May 1, 2013 at 7:32 AM, Bai Shen <[email protected]> wrote:
>
> > My crawl loop consists of the following.
> >
> > generate -topN
> > fetch -all
> > parse -all
> > updatedb
> > solrindex -all
> >
> > With fetch and parse, -all only pulls the batch that was generated,
> > skipping all of the other URLs.  However, solrindex -all seems to be
> > equivalent to -reindex, committing everything, not just what hasn't
> > been sent.
> >
> > Anyone else run into this issue?
> >
> > Thanks.
> >
>
