[ 
https://issues.apache.org/jira/browse/SOLR-4648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13622080#comment-13622080
 ] 

Andrzej Bialecki  commented on SOLR-4648:
-----------------------------------------

Thanks again for a thorough review. New patch is attached.

bq. in Solr, it's bad form to call Class.forName() ..

Fixed.

bq. minor: you're using 4-spaces indent instead of 2 in the main value loop in 
your URP

Fixed (it was actually a left-over from removing an outer-level loop, all other 
indents are 2 spaces).

bq. in those log.debug() calls, it's creating the string to potentially not 
even log it

Fixed.

bq.  Looking at what FieldType.createField() is doing, I propose you do the 
same in this URP ...

Funny thing, I had this in one version of the patch, and then decided to reuse 
SF.createField(..) to avoid code duplication. The problem is that 
SchemaField.isTokenized() has package-level visibility so it's not visible in 
the UP's package. I fixed this by providing a utility method in 
PreAnalyzedField to create a FieldType. Also, I moved there the chunk of logic 
for setting / resetting the Field content and type flags based on SchemaField. 
Overall, it simplifies the UP.

bq. The resulting "Field" instance can be shared from one document to the next 
I believe, and so you can cache this in the URP and reset its value & 
tokenStream.

Hmm, this doesn't seem feasible at all. First, this cache would have to be 
thread safe, and prevent reuse of Field instances until the document is 
actually processed by IndexWriter - I don't think there's a mechanism to 
enforce this in the context of this class? Also it would have to cache multiple 
instances of Field, because processing a single document may result in creating 
multiple instances (at least one per pre-analyzed field, more if fields are 
multi-valued).
                
> Create a PreAnalyzedUpdateProcessor
> -----------------------------------
>
>                 Key: SOLR-4648
>                 URL: https://issues.apache.org/jira/browse/SOLR-4648
>             Project: Solr
>          Issue Type: Bug
>          Components: update
>    Affects Versions: 4.3, 5.0
>            Reporter: Andrzej Bialecki 
>            Assignee: Andrzej Bialecki 
>             Fix For: 4.3, 5.0
>
>         Attachments: SOLR-4648.patch, SOLR-4648.patch, SOLR-4648.patch, 
> SOLR-4648.patch
>
>
> Spin-off from the discussion in SOLR-4619.
> Instead of using a PreAnalyzedField type we can use an UpdateRequestProcessor 
> that converts any input field values to StorableField-s, using the 
> PreAnalyzedParser-s, and then directly passes StorableField-s to 
> DocumentBuilder for indexing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to