[ 
https://issues.apache.org/jira/browse/SOLR-4086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13536180#comment-13536180
 ] 

James Dyer commented on SOLR-4086:
----------------------------------

Dominik,

Thank you for reporting this!  Could you post your data-config.xml so I can get 
an idea of which features you're using?  Are these full or delta imports?  Are 
you using DateFormatEvaluator ( ${dataimporter.functions.formatDate(...)} )? 

This issue involved extensive rework of the VariableResolver(Impl) class and 
the Evaluator framework, with the aim of making the code easier to understand 
and to maintain.  Whenever this type of change is made, it is always possible 
that the new implementation will suffer performance-wise.  I will look into 
this with you: we certainly do not want massive performance decreases to get 
released. :)  

One thing that jumps out at me is the old version stored a cache of parsed-out 
template strings (ex: ${foo.bar} ) in a HashMap.  I was worried that this could 
potentially consume too much memory and changed this to a WeakHashMap.  But if 
your JVM is clearing out these WeakReferences frequently it might require a lot 
more work to keep re-parsing these strings.  
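
To make the concern concrete, here is a minimal sketch of a template cache 
backed by a WeakHashMap.  The class and method names are hypothetical and are 
not the actual DIH code; the regex parsing is only a stand-in for the real 
template parser.  The point is that a cache miss (for example after the 
collector has dropped an entry) means parsing the string again.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the real DIH template class.
class TemplateCacheSketch {
  // Matches ${name} placeholders and captures the name.
  private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

  // WeakHashMap holds its keys weakly: once nothing else references a key,
  // the garbage collector may drop the entry, and the next lookup re-parses.
  private final Map<String, List<String>> cache = new WeakHashMap<>();
  private int parseCount = 0;

  // Returns the variable names found in the template, parsing only on a miss.
  synchronized List<String> variablesOf(String template) {
    List<String> vars = cache.get(template);
    if (vars == null) {
      vars = parse(template);
      cache.put(template, vars);
    }
    return vars;
  }

  // Number of times a template actually had to be parsed.
  synchronized int parseCount() {
    return parseCount;
  }

  private List<String> parse(String template) {
    parseCount++;
    List<String> vars = new ArrayList<>();
    Matcher m = VAR.matcher(template);
    while (m.find()) {
      vars.add(m.group(1));
    }
    return Collections.unmodifiableList(vars);
  }
}
```

As long as the caller keeps a strong reference to the template string, the 
entry stays and repeated lookups are cheap.  If the key strings are rebuilt 
for every row, nothing keeps them reachable and the cache may be emptied at 
any collection.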

Another problem could be with DateFormatEvaluator.  It used to keep a single 
instance of SimpleDateFormat (per-thread) and always use that as the 
"from-date-format" but now that Locales are involved this doesn't work.  To 
alleviate this it only creates one DateFormat per pattern/locale combination 
and caches these, also in a WeakHashMap.  This also might suffer from 
WeakReferences going away, but more seriously, I think having this map as an 
instance variable here entirely defeats its purpose if the VariableResolver is 
creating a new instance each time it is being used.  So I need to look into 
that.
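
A rough sketch of the difference, assuming Java 8+ and using made-up names 
(this is not the real DateFormatEvaluator, and the CREATED counter exists only 
for illustration): a cache held in an instance field is useless if a fresh 
evaluator is built for every call, whereas a static per-thread cache survives 
across instances.

```java
import java.text.SimpleDateFormat;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names do not match the real DateFormatEvaluator.
class DateFormatCacheSketch {
  // Counts how many SimpleDateFormat objects were built, for illustration.
  static final AtomicInteger CREATED = new AtomicInteger();

  // Instance-level cache: only helps if the same evaluator object is reused.
  private final Map<String, SimpleDateFormat> perInstance = new HashMap<>();

  // Per-thread cache that outlives any single evaluator instance.
  // SimpleDateFormat is not thread-safe, hence one map per thread.
  private static final ThreadLocal<Map<String, SimpleDateFormat>> PER_THREAD =
      ThreadLocal.withInitial(HashMap::new);

  SimpleDateFormat fromInstanceCache(String pattern, Locale locale) {
    return perInstance.computeIfAbsent(
        key(pattern, locale), k -> build(pattern, locale));
  }

  static SimpleDateFormat fromThreadCache(String pattern, Locale locale) {
    return PER_THREAD.get().computeIfAbsent(
        key(pattern, locale), k -> build(pattern, locale));
  }

  // One cache entry per pattern/locale combination.
  private static String key(String pattern, Locale locale) {
    return locale.toLanguageTag() + '|' + pattern;
  }

  private static SimpleDateFormat build(String pattern, Locale locale) {
    CREATED.incrementAndGet();
    return new SimpleDateFormat(pattern, locale);
  }
}
```

Calling fromInstanceCache on two freshly constructed objects builds two 
formatters for the same pattern and locale; calling fromThreadCache twice on 
the same thread builds one and returns it both times.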

Probably the biggest change is DateFormatEvaluator.
                
> Refactor DIH - VariableResolver & Evaluator
> -------------------------------------------
>
>                 Key: SOLR-4086
>                 URL: https://issues.apache.org/jira/browse/SOLR-4086
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - DataImportHandler
>    Affects Versions: 4.0
>            Reporter: James Dyer
>            Assignee: James Dyer
>            Priority: Minor
>             Fix For: 4.1, 5.0
>
>         Attachments: SOLR-4086.patch
>
>
> This simplifies VariableResolver and moves each built-in Evaluator into its 
> own class.  Compiler warnings / missing generics are fixed.  Also, the Locale 
> bug with DateFormatEvaluator is solved.  Instead of using the machine 
> default, the Root Locale is used by default.  An optional 3rd parameter 
> allows users to specify whatever locale they want.
