Re: Re: Re: Re: Recommended Update Batch Size?

2006-11-02 Thread Mike Klaas
On 11/2/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
On 11/2/06, Mike Klaas <[EMAIL PROTECTED]> wrote:
> The one thing I'm worried about is closing the writer while documents
> are being added to it. IndexWriter is nominally thread-safe, but I'm
> not sure what happens to documents that are being

Re: Re: Re: Recommended Update Batch Size?

2006-11-02 Thread Yonik Seeley
On 11/2/06, Mike Klaas <[EMAIL PROTECTED]> wrote: The one thing I'm worried about is closing the writer while documents are being added to it. IndexWriter is nominally thread-safe, but I'm not sure what happens to documents that are being added at the time. Looking at IndexWriter.java, it seems l

Re: Re: Re: Recommended Update Batch Size?

2006-11-02 Thread Mike Klaas
On 11/2/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
On 11/1/06, Mike Klaas <[EMAIL PROTECTED]> wrote:
> DUH2.doDeletions() would also highly benefit from sorting the id terms
> before looking them up in these types of cases (as it would trigger
> optimizations in lucene as well as being kinder to

Re: Recommended Update Batch Size?

2006-11-02 Thread Walter Underwood
A quick update on my experiments with update rate:

* 20 docs/sec using one wget call per POST
* 170 docs/sec using single doc POST over a persistent HTTP connection
* 250 docs/sec using 20 doc batches over persistent HTTP
* 250 docs/sec using 100 doc batches over persistent HTTP

The latter three
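The batching setup measured above can be sketched roughly as follows. This is a minimal Python sketch, not the script used for those numbers: the helper names (`batched`, `add_xml`, `post_batches`), the host, the port, and the `/solr/update` path are all assumptions for illustration. It groups documents into fixed-size batches, renders each batch as one `<add>` message, and sends every batch over a single reused HTTP connection.

```python
import http.client
from xml.sax.saxutils import escape, quoteattr


def batched(docs, size):
    """Yield consecutive lists of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def add_xml(batch):
    """Render one <add> message holding every document in the batch."""
    parts = ["<add>"]
    for doc in batch:
        parts.append("<doc>")
        for name, value in doc.items():
            parts.append(
                "<field name=%s>%s</field>" % (quoteattr(name), escape(str(value)))
            )
        parts.append("</doc>")
    parts.append("</add>")
    return "".join(parts)


def post_batches(docs, size, host="localhost", port=8983, path="/solr/update"):
    """POST each batch over one reused connection.

    host, port and path are assumed values; adjust them for your server.
    """
    conn = http.client.HTTPConnection(host, port)
    try:
        for batch in batched(docs, size):
            body = add_xml(batch).encode("utf-8")
            headers = {"Content-Type": "text/xml; charset=utf-8"}
            conn.request("POST", path, body=body, headers=headers)
            response = conn.getresponse()
            response.read()  # drain the body so the connection can be reused
            if response.status != 200:
                raise RuntimeError("update failed with HTTP %d" % response.status)
    finally:
        conn.close()
```

Calling `post_batches(docs, 20)` would correspond to the 20 doc batch case, and `post_batches(docs, 1)` to single doc POSTs over a persistent connection.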

Re: Re: Recommended Update Batch Size?

2006-11-02 Thread Yonik Seeley
On 11/1/06, Mike Klaas <[EMAIL PROTECTED]> wrote: DUH2.doDeletions() would also highly benefit from sorting the id terms before looking them up in these types of cases (as it would trigger optimizations in lucene as well as being kinder to the os' read-ahead buffers). Hmmm, good point. I wonde

Re: Re: Recommended Update Batch Size?

2006-11-01 Thread Mike Klaas
On 10/31/06, Yonik Seeley <[EMAIL PROTECTED]> wrote: Bigger batches before a commit will be more efficient in general... the only state that Solr keeps around before a commit is a HashTable entry per unique id deleted or overwritten. You might be able to do your entire collection. Note that _s

Re: Recommended Update Batch Size?

2006-10-31 Thread Chris Hostetter
: Right, I meant per HTTP POST. I was wondering about parallel
: update requests, so thanks for that info. --wunder

FYI: the last time I looked into it, there really wasn't any benefit in sending multiple docs in a single /update POST request compared to using Keep-Alive.

-Hoss

Re: Recommended Update Batch Size?

2006-10-31 Thread Walter Underwood
On 10/31/06 12:54 PM, "Mike Klaas" <[EMAIL PROTECTED]> wrote:
> On 10/31/06, Walter Underwood <[EMAIL PROTECTED]> wrote:
>> What is a good size for batching updates? My xml update docs are
>> around 600-700 bytes each right now.
>
> When I think of "batches" I think of documents sent before a
> <commit/>,

Re: Recommended Update Batch Size?

2006-10-31 Thread Yonik Seeley
On 10/31/06, Walter Underwood <[EMAIL PROTECTED]> wrote: What is a good size for batching updates? My xml update docs are around 600-700 bytes each right now. There are two types of batches... documents per request (I wouldn't go too big here) and documents added before a commit. Bigger batche
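The two batch sizes described here (documents per request and documents added before a commit) can be kept apart with a small planner. The sketch below is only an illustration in Python; the function name `plan_updates` and its parameters are hypothetical and not part of Solr. It yields the sequence of operations a client would send: one "add" per request-sized chunk, and a "commit" once enough documents have accumulated.

```python
def plan_updates(docs, docs_per_request, docs_per_commit):
    """Yield ("add", chunk) and ("commit", None) operations in send order.

    docs_per_request bounds the size of each add request;
    docs_per_commit is how many added documents trigger a commit.
    """
    pending = 0  # documents added since the last commit
    for start in range(0, len(docs), docs_per_request):
        chunk = docs[start:start + docs_per_request]
        yield ("add", chunk)
        pending += len(chunk)
        if pending >= docs_per_commit:
            yield ("commit", None)
            pending = 0
    if pending:
        # Commit whatever is left so the final documents become visible.
        yield ("commit", None)
```

Setting `docs_per_commit` to the size of the whole collection gives a single commit at the end, which matches the suggestion that bigger batches before a commit are generally more efficient.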

Re: Recommended Update Batch Size?

2006-10-31 Thread Mike Klaas
On 10/31/06, Walter Underwood <[EMAIL PROTECTED]> wrote: What is a good size for batching updates? My xml update docs are around 600-700 bytes each right now. When I think of "batches" I think of documents sent before a <commit/>, but it seems like you are talking about the number of documents sent in a

Recommended Update Batch Size?

2006-10-31 Thread Walter Underwood
What is a good size for batching updates? My xml update docs are around 600-700 bytes each right now.

wunder
--
Walter Underwood
Search Guru, Netflix