But how do you explain that within one hour (after commit) I got about
500,000 new documents, while within 30 hours (after commit) only 1,300,000?

Same _random_enough_ documents... 

BTW, the SOLR console was showing only a few hundred "deletesById", although I
don't use any deleteById explicitly; only "update" with "allowOverwrite" and
"uniqueId".




markrmiller wrote:
> 
> I'd say you have a lot of documents that have the same id.
> When you add a doc with the same id, first the old one is deleted, then the
> new one is added (atomically though).
> 
> The deleted docs are not removed from the index immediately though - the doc
> id is just marked as deleted.
> 
> Over time though, as segments are merged due to hitting triggers while
> adding new documents, deletes are removed (which deletes depends on which
> segments have been merged).
> 
> So if you add a ton of documents over time, many with the same ids, you
> would likely see this type of maxDoc / numDocs churn. maxDoc will include
> deleted docs while numDocs will not.
> 
> 
> -- 
> - Mark
> 
> http://www.lucidimagination.com
> 
> On Mon, Aug 17, 2009 at 11:09 PM, Funtick <f...@efendi.ca> wrote:
> 
>>
>> After running an application which heavily uses MD5 HEX-representation as
>> <uniqueKey> for SOLR v.1.4-dev-trunk:
>>
>> 1. After 30 hours:
>> 101,000,000 documents added
>>
>> 2. Commit:
>> numDocs = 783,714
>> maxDoc = 3,975,393
>>
>> 3. Upload new docs to SOLR during 1 hour(!!!!!!!), then commit, then
>> optimize:
>> numDocs = 1,281,851
>> maxDoc = 1,281,851
>>
>> It looks _extremely_ strange that within an hour I have such a huge increase
>> with the same 'average' document set...
>>
>> I suspect something goes wrong with Lucene buffer flush / index merge
>> OR SOLR - Unique ID handling...
>>
>> According to my own estimates, I should have about 10,000,000 new documents
>> now... I had 0.5 million within an hour, and 0.8 million within a day; same
>> 'random' documents.
>>
>> This morning the index size was about 4 GB, then it suddenly dropped below
>> 0.5 GB. Why? I haven't issued any "commit"...
>>
>> I am using ramBufferMB=8192
>>
>> --
>> View this message in context:
>> http://www.nabble.com/SOLR-%3CuniqueKey%3E---extremely-strange-behavior%21-Documents-disappeared...-tp25017728p25017728.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 
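
To make the explanation quoted above concrete, here is a toy Python model of
overwrite-by-id. It is only a sketch under stated assumptions: ToyIndex,
md5_key and the "page-N" strings are hypothetical names invented for
illustration, and none of this is Lucene or Solr code. It just mimics the
described behaviour: re-adding an id marks the old copy as deleted, and the
deleted copies are physically dropped only when segments are merged.

```python
import hashlib


def md5_key(content: str) -> str:
    """Hex MD5 of the content, used as the unique key (as in the original post)."""
    return hashlib.md5(content.encode("utf-8")).hexdigest()


class ToyIndex:
    """Toy model of overwrite-by-unique-key. Not Lucene's real implementation."""

    def __init__(self):
        self.segments = []  # flushed segments; each is a list of [doc_id, deleted]
        self.buffer = []    # the in-memory segment that has not been flushed yet

    def _all_segments(self):
        return self.segments + [self.buffer]

    def add(self, doc_id: str) -> None:
        # Overwrite semantics: mark any live doc with the same id as deleted...
        for segment in self._all_segments():
            for entry in segment:
                if entry[0] == doc_id and not entry[1]:
                    entry[1] = True
        # ...then append the new version. Nothing is physically removed here.
        self.buffer.append([doc_id, False])

    def flush(self) -> None:
        # Turn the in-memory buffer into a segment.
        if self.buffer:
            self.segments.append(self.buffer)
            self.buffer = []

    def merge_all(self) -> None:
        # Assumption: merging everything into one segment (like an optimize)
        # rewrites only the live docs, so the deleted ones finally disappear.
        self.flush()
        live = [e for segment in self.segments for e in segment if not e[1]]
        self.segments = [live] if live else []

    @property
    def num_docs(self) -> int:
        # Live documents only.
        return sum(1 for s in self._all_segments() for e in s if not e[1])

    @property
    def max_doc(self) -> int:
        # Live documents plus those marked deleted but not yet merged away.
        return sum(len(s) for s in self._all_segments())


index = ToyIndex()
for i in range(10):
    index.add(md5_key(f"page-{i % 4}"))  # 10 adds, but only 4 distinct ids
index.flush()
print(index.num_docs, index.max_doc)  # 4 10

index.merge_all()
print(index.num_docs, index.max_doc)  # 4 4
```

Applied to the numbers in the original post: at the first commit,
maxDoc - numDocs = 3,975,393 - 783,714 = 3,191,679 documents were marked
deleted but not yet merged away. After the optimize both counters are equal,
which is what the model above shows after merge_all().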

-- 
View this message in context: 
http://www.nabble.com/SOLR-%3CuniqueKey%3E---extremely-strange-behavior%21-Documents-disappeared...-tp25017728p25017826.html
Sent from the Solr - User mailing list archive at Nabble.com.
