[ 
https://issues.apache.org/jira/browse/SOLR-4589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-4589:
---------------------------

    Attachment: SOLR-4589.patch


I switched to using volatile per yonik's suggestion, but in the process of 
testing i encountered a test failure that has me convinced that either i'm 
going insane, or there is some weird state info that gets corrupted if you 
attempt to use an IndexReader to fetch the stored field values of a doc while a 
StoredFieldVisitor is already being used to fetch those fields -- and if i can 
encounter this kind of problem in a single threaded test, it scares me about 
any multi-threaded usages in any solr/lucene instance.

The attached patch updates the test with a bunch of nocommit System.out's as 
well as a new "sanity" test where i tried to recreate the underlying problem 
w/o using LazyDocument at all...

This is the first failure i encountered...

{noformat}
ant test  -Dtestcase=TestLazyDocument -Dtests.method=testLazy 
-Dtests.seed=553F22658CE2D9A9 -Dtests.slow=true -Dtests.locale=sk 
-Dtests.timezone=Africa/Ndjamena -Dtests.file.encoding=UTF-8

...

[junit4:junit4]    > Throwable #1: java.lang.AssertionError: fieldName count: b 
expected:<112> but was:<1229>
[junit4:junit4]    >    at 
__randomizedtesting.SeedInfo.seed([553F22658CE2D9A9:64E62F373CBE0D02]:0)
[junit4:junit4]    >    at org.junit.Assert.fail(Assert.java:93)
[junit4:junit4]    >    at org.junit.Assert.failNotEquals(Assert.java:647)
[junit4:junit4]    >    at org.junit.Assert.assertEquals(Assert.java:128)
[junit4:junit4]    >    at org.junit.Assert.assertEquals(Assert.java:472)
[junit4:junit4]    >    at 
org.apache.lucene.document.TestLazyDocument.testLazy(TestLazyDocument.java:170)
...
{noformat}

you can see from the sysout calls prior to the failure that the "needsField" 
method of the LazyTestingStoredFieldVisitor seems to be getting called the 
correct number of times, but with the wrong field name -- once it randomly 
calls {{assertNotNull(lazyDoc.getDocument());}} (which calls reader.doc(int)) 
every successive call to needsField winds up refering to the next field name 
("b") instead of all the other field names in the document.

if you comment out the {{assertNotNull(lazyDoc.getDocument());}} call, the test 
passes.

This lead me to write testHossssSanity which doesn't use any of hte lazy 
loading code at all, it just uses a simple StoredFieldsVisitor whose needsField 
method always returns NO but also arbitrarily calls {{reader.document(doc)}} 
for no reason once we start getting to the first lettered field name...

{noformat}
ant test  -Dtestcase=TestLazyDocument -Dtests.method=testHossssSanity 
-Dtests.seed=553F22658CE2D9A9 -Dtests.slow=true -Dtests.locale=sk 
-Dtests.timezone=Africa/Ndjamena -Dtests.file.encoding=UTF-8

...

[junit4:junit4]    > Throwable #1: java.lang.AssertionError: fieldName count: b 
expected:<139> but was:<1528>
[junit4:junit4]    >    at 
__randomizedtesting.SeedInfo.seed([553F22658CE2D9A9:FE3621EFA9383F64]:0)
[junit4:junit4]    >    at org.junit.Assert.fail(Assert.java:93)
[junit4:junit4]    >    at org.junit.Assert.failNotEquals(Assert.java:647)
[junit4:junit4]    >    at org.junit.Assert.assertEquals(Assert.java:128)
[junit4:junit4]    >    at org.junit.Assert.assertEquals(Assert.java:472)
[junit4:junit4]    >    at 
org.apache.lucene.document.TestLazyDocument.testHossssSanity(TestLazyDocument.java:120)
...
{noformat}

...you can again see from the sysout calls that needsField is called the 
expected number of times, but always with field named "b" (the first field name 
to exist _after_ the first field name that calls reader.document.

What finally convinced me that i wasn't insane was when i found a diffenret 
seed that produced a different low error at a lower level...

{noformat}
ant test  -Dtestcase=TestLazyDocument -Dtests.method=testHossssSanity 
-Dtests.seed=4C348A26E496AC38 -Dtests.slow=true -Dtests.locale=fr_CH 
-Dtests.timezone=Antarctica/Vostok -Dtests.file.encoding=UTF-8

...

[junit4:junit4]    > Throwable #1: java.lang.AssertionError: bits=5f
[junit4:junit4]    >    at 
__randomizedtesting.SeedInfo.seed([4C348A26E496AC38:E73D89ACC14C4AF5]:0)
[junit4:junit4]    >    at 
org.apache.lucene.codecs.lucene40.Lucene40StoredFieldsReader.visitDocument(Lucene40StoredFieldsReader.java:155)
[junit4:junit4]    >    at 
org.apache.lucene.index.SegmentReader.document(SegmentReader.java:139)
[junit4:junit4]    >    at 
org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:116)
[junit4:junit4]    >    at 
org.apache.lucene.document.TestLazyDocument.testHossssSanity(TestLazyDocument.java:118)
...
{noformat}

...which suggests to me that maybe there really is a bug in the underlying 
StoredFieldsVisitor logic (or perhaps just in the Lucene40StoredFieldsReader) ?

                
> 4.x + enableLazyFieldLoading + large multivalued fields + varying fl = 
> pathological CPU load & response time
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-4589
>                 URL: https://issues.apache.org/jira/browse/SOLR-4589
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.0, 4.1, 4.2
>            Reporter: Hoss Man
>            Assignee: Hoss Man
>             Fix For: 4.3, 5.0, 4.2.1
>
>         Attachments: SOLR-4589.patch, SOLR-4589.patch, SOLR-4589.patch, 
> SOLR-4589.patch, test-just-queries.out__4.0.0_mmap_lazy_using36index.txt, 
> test-just-queries.sh, test.out__3.6.1_mmap_lazy.txt, 
> test.out__3.6.1_mmap_nolazy.txt, test.out__3.6.1_nio_lazy.txt, 
> test.out__3.6.1_nio_nolazy.txt, test.out__4.0.0_mmap_lazy.txt, 
> test.out__4.0.0_mmap_nolazy.txt, test.out__4.0.0_nio_lazy.txt, 
> test.out__4.0.0_nio_nolazy.txt, test.out__4.2.0_mmap_lazy.txt, 
> test.out__4.2.0_mmap_nolazy.txt, test.out__4.2.0_nio_lazy.txt, 
> test.out__4.2.0_nio_nolazy.txt, test.sh
>
>
> Following up on a [user report of exterme CPU usage in 
> 4.1|http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201302.mbox/%[email protected]%3E],
>  I've discovered that the following combination of factors can result in 
> extreme CPU usage and excessively HTTP response times...
> * Solr 4.x (tested 3.6.1, 4.0.0, and 4.2.0)
> * enableLazyFieldLoading == true (included in example solrconfig.xml)
> * documents with a large number of values in multivalued fields (eg: tested 
> ~10-15K values)
> * multiple requests returning the same doc with different "fl" lists
> I haven't dug into the route cause yet, but the essential observations is: if 
> lazyloading is used in 4.x, then once a document has been fetched with an 
> initial fl list X, subsequent requests for that document using a differnet fl 
> list Y can be many orders of magnitute slower (while pegging the CPU) -- even 
> if those same requests using fl Y uncached (or w/o lazy laoding) would be 
> extremely fast.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to