[
https://issues.apache.org/jira/browse/SOLR-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14035910#comment-14035910
]
David Smiley commented on SOLR-3585:
------------------------------------
I like the new name to align with CUSS. I like that you pass through the
logger. I like the name ThreadBurster. I like your new processRollback logic.
You're right to not mess with CUSS logic (e.g. choice of locks); addressing any
of that now is a bad idea.
RE IOExceptions... see handleError() which does the wrapping in CUPF.
Furthermore, consider having an OOME pass straight through.
One thing not obvious to me (that should be clarified in comments/docs) is that
the "next" URP is actually used in addition to the separately constructed
backing chain. Only added docs go through threads; everything else falls
through to the next URP. CUSS works that way too, it appears, but it's not
obvious.
I like how you figured out how to create the backing chain of remaining URPs!
Although it's unfortunate this requires the re-creation of them but no biggie.
Also, I think I see a bug. UPRChain.createProcessor doesn't create any URPs
prior to DistributingUpdateProcessorFactory (which is an interface) when
receiving a distributed update. You could fix this by inserting
NoOpDistributingUpdateProcessorFactory at the head of the factories when
creating the chain.
Testing: Since its semantics are aligned with default behavior, it would be
nice to try and (randomly) use this URP in Solr's tests. If it proves to be too
awkward; you should instead write a distributed test. To augment the existing
testing, you could add to the default test solrconfig a chain that includes
this URP along with the other URPs one normally gets by default. Then,
assuming there is some spot that a test submits an update request that has
access to the parameters, give a ~10% chance to adding update.chain=concurrent
if update.chain isn't present. Hopefully this isn't gloriously hacky even if
it it's "just" for tests. Even if this specific aspect doesn't get committed,
it'd be great to at least do locally to find more bugs.
bq. the motivation of introducing Callback is to have single thread delegate.
It's not necessary for CUSS (where stateless http client is used). But update
processors shouldn't be shared across threads. if we go the way you propose we
need to keep update processor chains in ThreadLocal, which I'm not really happy
about.
I understand that Update processors shouldn't be shared across threads (yet
another Solr undocumented semantic; ugh), but It's still not apparent to me the
way you coded it is necessary (and I agree on avoiding ThreadLocal). I think
I'll try and change it and see if the test breaks. It's also unclear why you
clone the AddUpdateCommand in processAdd(). There should be a so-called
"happens-before" relationship between when the object is placed on the queue
and after a thread takes it to run it.
Thanks for your continuing effort! I'll try and get it committed next week,
assuming the remaining points are addressed. I'll probably take a stab at
improving the javadoc & comment wording prior to doing so.
> processing updates in multiple threads
> --------------------------------------
>
> Key: SOLR-3585
> URL: https://issues.apache.org/jira/browse/SOLR-3585
> Project: Solr
> Issue Type: Improvement
> Components: update
> Affects Versions: 4.0-ALPHA, 5.0
> Reporter: Mikhail Khludnev
> Attachments: SOLR-3585.patch, SOLR-3585.patch, SOLR-3585.patch,
> SOLR-3585.patch, multithreadupd.patch, report.tar.gz
>
>
> Hello,
> I'd like to contribute update processor which forks many threads which
> concurrently process the stream of commands. It may be beneficial for users
> who streams many docs through single request.
--
This message was sent by Atlassian JIRA
(v6.2#6252)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]