[
https://issues.apache.org/jira/browse/SOLR-4086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537437#comment-13537437
]
James Dyer commented on SOLR-4086:
----------------------------------
Dominik,
I've been testing the new DIH code with my data and if anything the newer
version seems slightly faster than the older one (I'm comparing the latest
snapshot with one from a couple of months ago, but I only updated the DIH jar
so as to isolate DIH changes). I then disabled the WeakHashMap in
VariableResolver and tried again and it didn't seem to be much slower, if any
(makes me wonder if caching here at all is misguided). Now I'm running it with
a TemplateTransformer on a child entity that has multiple children per parent
and it still doesn't seem to have slowed down. (The changes with this issue
could have dramatically slowed TemplateTransformer if I made a mistake...) The
data I'm indexing has about 50 child entities so the VariableResolver gets
plenty of exercise matching keys with the parent. Since you're seeing
slowdowns of 3x compared to what you had before, I also wonder whether
something else is going on. I doubt DIH's overhead is nearly enough to cause
something like that.
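To make the caching question concrete, here is a minimal sketch of the kind of WeakHashMap-backed lookup I mean, with a switch to turn it off for comparison. This is not the actual VariableResolver code: the class name TemplateCacheSketch, the placeholder regex and the parse counter are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the DIH VariableResolver implementation.
final class TemplateCacheSketch {
    // Matches ${name} placeholders and captures the name (assumed syntax for this sketch).
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Entries can be garbage collected once the key is no longer strongly referenced elsewhere.
    private final Map<String, List<String>> cache = new WeakHashMap<>();
    private final boolean cachingEnabled;
    private int parseCount = 0;

    TemplateCacheSketch(boolean cachingEnabled) {
        this.cachingEnabled = cachingEnabled;
    }

    // Returns the placeholder names in the template, parsing only on a cache miss.
    List<String> placeholders(String template) {
        if (!cachingEnabled) {
            return parse(template);
        }
        List<String> cached = cache.get(template);
        if (cached == null) {
            cached = parse(template);
            cache.put(template, cached);
        }
        return cached;
    }

    private List<String> parse(String template) {
        parseCount++;
        List<String> names = new ArrayList<>();
        Matcher m = PLACEHOLDER.matcher(template);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    int parseCount() {
        return parseCount;
    }

    public static void main(String[] args) {
        String template = "${parent.id}-${child.name}";
        // Run the same workload with and without the cache and report rough timings.
        for (boolean enabled : new boolean[] {true, false}) {
            TemplateCacheSketch sketch = new TemplateCacheSketch(enabled);
            long start = System.nanoTime();
            for (int i = 0; i < 1_000_000; i++) {
                sketch.placeholders(template);
            }
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("cache=" + enabled + " parses=" + sketch.parseCount()
                + " ms=" + elapsedMs);
        }
    }
}
```

A loop like the one in main is only a rough comparison (no JIT warm-up, one template), so treat any numbers from it as a hint rather than a benchmark.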
Can you try to narrow the cause down? Here are the steps I would take:
- Revert to the old Solr & DIH and re-index. Verify you get the old
  "good" performance back.
- Upgrade just DIH and not the rest of Solr. Verify the performance is "bad"
  again, and that the cause is something in DIH.
- If in DIH, try eliminating 1 feature at a time:
  - Try it without use of TemplateTransformer.
  - Try it without evaluators.
  - Try eliminating child entities to see if one particular child is causing
    the difficulties.
If this is indeed caused by DIH changes, it is something that you use that I am
not testing (properly or at all) on my end.
> Refactor DIH - VariableResolver & Evaluator
> -------------------------------------------
>
> Key: SOLR-4086
> URL: https://issues.apache.org/jira/browse/SOLR-4086
> Project: Solr
> Issue Type: Improvement
> Components: contrib - DataImportHandler
> Affects Versions: 4.0
> Reporter: James Dyer
> Assignee: James Dyer
> Priority: Minor
> Fix For: 4.1, 5.0
>
> Attachments: SOLR-4086.patch
>
>
> This simplifies VariableResolver and moves each built-in Evaluator into its
> own class. Compiler warnings / missing generics are fixed. Also, the Locale
> bug with DateFormatEvaluator is solved. Instead of using the machine
> default, the Root Locale is used by default. An optional 3rd parameter
> allows users to specify whatever locale they want.
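
The locale behaviour described above can be illustrated with plain JDK classes. This is only a sketch under assumptions, not the DateFormatEvaluator code: the class DateFormatLocaleSketch, its format method, and the use of a BCP 47 language tag for the optional locale are hypothetical, and the UTC time zone is fixed just to keep the output stable.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical illustration only; not the DIH DateFormatEvaluator implementation.
final class DateFormatLocaleSketch {

    // localeTag may be null, standing in for an omitted optional third parameter (assumption).
    static String format(Date date, String pattern, String localeTag) {
        // Fall back to Locale.ROOT rather than the machine default locale.
        Locale locale = (localeTag == null) ? Locale.ROOT : Locale.forLanguageTag(localeTag);
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, locale);
        // Fixed time zone so the result does not depend on the host configuration.
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(date);
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L);
        // No locale given: formatted with Locale.ROOT.
        System.out.println(format(epoch, "yyyy-MM-dd HH:mm", null));
        // Explicit locale: month name rendered in German.
        System.out.println(format(epoch, "d MMMM yyyy", "de"));
    }
}
```

The point of defaulting to Locale.ROOT is that the same configuration formats dates identically on every machine, while the explicit locale is there for anyone who needs localized output.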