[
https://issues.apache.org/jira/browse/SOLR-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13187889#comment-13187889
]
James Dyer commented on SOLR-2947:
----------------------------------
I think this patch does the right thing here: it calls "destroy()" down the
hierarchy of EntityProcessors, but waits until after doc-building is
complete. While I had it this way for the single-threaded code, I punted on
the multi-threaded case, simply hoping that because the unit tests were passing
everything would be all right :). I appreciate the effort to improve the
DIH multithreaded code. We really need to get rid of bugs like this, and
long-term it would pay to make the code more maintainable, get
better test coverage, etc.
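To illustrate the ordering described above, here is a minimal sketch in plain
Java. It is not the DIH code: SketchProcessor, DeferredDestroySketch and
buildAll() are hypothetical names used only for illustration. It assumes the
point of the fix is that a child processor must stay usable for every parent
row, so destroy() runs down the hierarchy only once, after all docs are built.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a DIH entity processor; not the real Solr class.
class SketchProcessor {
    final String name;
    final List<SketchProcessor> children = new ArrayList<>();
    boolean destroyed = false;

    SketchProcessor(String name) {
        this.name = name;
    }

    SketchProcessor addChild(SketchProcessor child) {
        children.add(child);
        return this;
    }

    // Returns one "row" per call; fails if the processor was destroyed too early.
    String nextRow(int i) {
        if (destroyed) {
            throw new IllegalStateException(name + " used after destroy()");
        }
        return name + "-row-" + i;
    }

    // Destroys the children first, then this processor, recording the order.
    void destroy(List<String> log) {
        for (SketchProcessor child : children) {
            child.destroy(log);
        }
        destroyed = true;
        log.add(name);
    }
}

public class DeferredDestroySketch {
    // Builds one doc per parent row, joining in the child rows, and only
    // then destroys the whole hierarchy.
    static List<String> buildAll(SketchProcessor parent, int parentRows, List<String> destroyLog) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < parentRows; i++) {
            StringBuilder doc = new StringBuilder(parent.nextRow(i));
            for (SketchProcessor child : parent.children) {
                doc.append('+').append(child.nextRow(i));
            }
            docs.add(doc.toString());
        }
        // Deferred: calling destroy() inside the loop would break the child
        // for the second and later parent rows.
        parent.destroy(destroyLog);
        return docs;
    }

    public static void main(String[] args) {
        SketchProcessor parent = new SketchProcessor("parent").addChild(new SketchProcessor("child"));
        List<String> log = new ArrayList<>();
        System.out.println(buildAll(parent, 2, log));
        System.out.println(log);
    }
}
```

If parent.destroy() were moved inside the loop, the second parent row would hit
the IllegalStateException in this sketch, which mirrors the broken
parent-child join reported in the issue.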
An example is the new "children()" method. Using just the first
ThreadedEntityProcessorWrapper from the list is, I think, valid because the
"children" will be the same on all the threads. But then again, looking at how
this all gets populated in the ThreadedEntityProcessorWrapper constructor, the
answer (to me) isn't obvious. The best I can say is that this is probably
correct and certainly a vast improvement over what is currently in Trunk.
A small point, but I prefer the TestEphemeralCache changes I made in the Dec
11, 2011 patch version. I switched to building the config file on the fly, and
testMultiThreaded() uses a random number of threads instead of always using 10.
Of course, if we go with this then we'd need to add "@Ignore" for
testMultiThreaded() until SOLR-3011 can be committed.
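For readers who have not seen that patch, the idea of building the config on
the fly with a random thread count might look roughly like the sketch below.
This is not the TestEphemeralCache code: the class, the helper names, the
entity names and the queries are all hypothetical, and it assumes the thread
count is passed through a "threads" attribute on the root entity.

```java
import java.util.Random;

public class RandomThreadsConfigSketch {
    // Hypothetical helper: builds a DIH-style config string for the given
    // thread count. Entity names and queries are illustrative only, and a
    // real config would also need a data source definition.
    static String buildConfig(int threads) {
        return "<dataConfig>\n"
             + "  <document>\n"
             + "    <entity name=\"parent\" threads=\"" + threads + "\""
             + " query=\"select * from parent\">\n"
             + "      <entity name=\"child\" query=\"select * from child\"/>\n"
             + "    </entity>\n"
             + "  </document>\n"
             + "</dataConfig>\n";
    }

    // Picks a thread count in [min, max], both bounds inclusive.
    static int randomThreads(Random random, int min, int max) {
        return min + random.nextInt(max - min + 1);
    }

    public static void main(String[] args) {
        // A fixed seed keeps a failing run reproducible; a test framework
        // would normally supply its own seeded Random.
        int threads = randomThreads(new Random(42L), 2, 10);
        System.out.println(buildConfig(threads));
    }
}
```

Varying the thread count between runs exercises more interleavings than a
fixed value of 10, at the cost of needing the seed to reproduce a failure.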
> DIH caching bug - EntityRunner destroys child entity processor
> --------------------------------------------------------------
>
> Key: SOLR-2947
> URL: https://issues.apache.org/jira/browse/SOLR-2947
> Project: Solr
> Issue Type: Sub-task
> Components: contrib - DataImportHandler
> Affects Versions: 4.0
> Reporter: Mikhail Khludnev
> Labels: noob
> Fix For: 4.0
>
> Attachments: SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch,
> SOLR-2947.patch, SOLR-2947.patch, dih-cache-destroy-on-threads-fix.patch,
> dih-cache-threads-enabling-bug.patch
>
>
> My intention is to fix multithreaded import with the SQL cache. Here is the 2nd stage.
> If I enable the DocBuilder.EntityRunner flow even for a single thread, it breaks
> a pretty basic piece of functionality: the parent-child join.
> The reason is that [line 473
> entityProcessor.destroy();|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DocBuilder.java?revision=1201659&view=markup]
> breaks the child entityProcessor.
> See the attachment comments for more details.