I have an index of about 4M documents with an external file field
configured to do boosting based on pagerank scores of each document. The
pagerank file is about 93MB as of today -- it's pretty big.

Each day, I add about 1,000 new documents to the index, and I need them to
be available as soon as possible so that I can send out alerts to our users
about new content (this is Google Alerts, essentially).

Soft commits seem to be exactly the thing for this, but whenever I open a
new searcher (which soft commits seem to do), the external file is
reloaded, and all queries are halted until it finishes loading. When I just
measured, this took about 30 seconds to complete. Most soft commit
documentation talks about setting up soft commits with a <maxTime> of about a
second, which is far shorter than the 30-second reload each new searcher triggers.
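
For reference, the kind of setup that documentation describes looks roughly like the fragment below. This is only a sketch, not my actual config: the 1000 ms value is an assumed example, and it would go inside the <updateHandler> section of solrconfig.xml.

```xml
<!-- Sketch only: the 1000 ms interval is an assumed example value. -->
<autoSoftCommit>
  <!-- Open a new searcher at most this many milliseconds after an update. -->
  <maxTime>1000</maxTime>
</autoSoftCommit>
```

With a 30-second external file reload on every new searcher, an interval like this would presumably keep queries blocked almost continuously.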

Is there anything I can do to make the external file field not get reloaded
constantly? It only changes about once a month, and I want to use soft
commits to power the alerts feature.

Thanks,

Mike
