[
https://issues.apache.org/jira/browse/SOLR-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186263#comment-13186263
]
Mikhail Khludnev commented on SOLR-2947:
----------------------------------------
bq. ...why is it sufficient to get the children from only the first element of
that iterator?
* Yes, it is sufficient, simply because the DIH code works only with
threads=1, so we can assert entityProcessorWrapper.size()==1. But if we set
threads to 2 or more, every TEPW will have its own collection of child entity
runners (and every ER will have its own EP). I address this in SOLR-3011; see
my explanation above under "DocBuilder.java/object instantiating".
bq. are there any situations where the iterator may not return any elements?
* EntityRunner.<init> sets the size of this list to 1 by default. If the user
configures it to 0, the EntityRunner.run() method will fail at
entityProcessorWrapper.get(0).
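To illustrate the failure mode, here is a minimal sketch. EntityRunnerSketch
and firstWrapper() are hypothetical stand-ins, not the real DIH classes; the
only thing it relies on is that List.get(0) throws on an empty list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for DocBuilder.EntityRunner; not the real DIH class.
class EntityRunnerSketch {
    // Stand-in for the list of ThreadedEntityProcessorWrapper instances.
    final List<String> entityProcessorWrapper = new ArrayList<>();

    EntityRunnerSketch(int threads) {
        // One wrapper per configured thread, so threads=0 leaves the list empty.
        for (int i = 0; i < threads; i++) {
            entityProcessorWrapper.add("wrapper-" + i);
        }
    }

    // Mirrors the entityProcessorWrapper.get(0) access mentioned above.
    String firstWrapper() {
        return entityProcessorWrapper.get(0);
    }

    public static void main(String[] args) {
        System.out.println(new EntityRunnerSketch(1).firstWrapper());
        try {
            new EntityRunnerSketch(0).firstWrapper();
        } catch (IndexOutOfBoundsException e) {
            System.out.println("threads=0 fails: " + e.getMessage());
        }
    }
}
```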
bq. at first i thought this was because all of the
ThreadedEntityProcessorWrapper instances were initialized with identical
EntityProcessors
This is done in the SOLR-3011 patch.
{code}
for (int i = 0; i < threads; i++) {
  thepw = new ThreadedEntityProcessorWrapper( ...
      childrenRunners, i); // <-- identical EntityProcessors
  entityProcessorWrapper.add(thepw);
}
{code}
bq. Is this something that's just a hack that only works with 1 thread
until/unless SOLR-3011 is resolved?
As the subject says, it's a fix for a bug introduced by SOLR-2384. threads=2
didn't work before, but now threads=1 doesn't work either. I removed the
incorrect EP destroying and implemented destroying that works now for
threads=1 and will work for threads=10 after SOLR-3011.
bq. will this break things for people using multithreaded DIH without nested
entities?
The tests pass. It will even destroy the root EP properly, because the
collecting of ERs in doFullDump() starts by placing the root ER into the
result.
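Roughly, the idea looks like the sketch below. This is not the patch code:
Runner, collect() and destroyAll() are hypothetical names, and the real
doFullDump() differs. It only shows why a root without nested entities still
gets destroyed when the collection starts with the root itself.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; none of these names come from the DIH sources.
class DestroySketch {
    // Hypothetical stand-in for an entity runner and the processor it owns.
    static class Runner {
        final String name;
        final List<Runner> children = new ArrayList<>();
        boolean destroyed;

        Runner(String name) {
            this.name = name;
        }

        Runner child(Runner c) {
            children.add(c);
            return this;
        }
    }

    // Collect the root first, then all of its descendants.
    static List<Runner> collect(Runner root) {
        List<Runner> result = new ArrayList<>();
        addWithDescendants(root, result);
        return result;
    }

    private static void addWithDescendants(Runner r, List<Runner> out) {
        out.add(r);
        for (Runner c : r.children) {
            addWithDescendants(c, out);
        }
    }

    // Destroy every collected runner once, after the whole import is over.
    static void destroyAll(Runner root) {
        for (Runner r : collect(root)) {
            r.destroyed = true;
        }
    }

    public static void main(String[] args) {
        Runner flat = new Runner("root");
        destroyAll(flat);
        System.out.println(flat.name + " destroyed: " + flat.destroyed);
    }
}
```

Because the root is the first element of the result, an import without nested
entities still yields a one-element list, so the root is not skipped.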
bq. Either way: a comment clarifying that monstrosity would be a good idea.
I can't tell which monstrosity you are referring to.
> DIH caching bug - EntityRunner destroys child entity processor
> --------------------------------------------------------------
>
> Key: SOLR-2947
> URL: https://issues.apache.org/jira/browse/SOLR-2947
> Project: Solr
> Issue Type: Sub-task
> Components: contrib - DataImportHandler
> Affects Versions: 4.0
> Reporter: Mikhail Khludnev
> Labels: noob
> Fix For: 4.0
>
> Attachments: SOLR-2947.patch, SOLR-2947.patch, SOLR-2947.patch,
> SOLR-2947.patch, dih-cache-destroy-on-threads-fix.patch,
> dih-cache-threads-enabling-bug.patch
>
>
> My intention is to fix multithreaded import with SQL cache. Here is the 2nd
> stage. If I enable the DocBuilder.EntityRunner flow even for a single thread,
> it breaks pretty basic functionality: the parent-child join.
> The reason is that [line 473
> entityProcessor.destroy();|http://svn.apache.org/viewvc/lucene/dev/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DocBuilder.java?revision=1201659&view=markup]
> breaks the children's entityProcessor.
> See the attachment comments for more details.