[ 
https://issues.apache.org/jira/browse/SOLR-9883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated SOLR-9883:
-----------------------------
    Attachment: SOLR-9883.patch

Attaching a patch that reorders the example configs' add-unknown-fields-to-schema 
update chains so that the DUP ({{DistributedUpdateProcessorFactory}}) comes after 
the AddSchemaFields URPF.  In my manual testing (see below), this prevents the 
data corruption: the buffered tlog entry includes the date normalization. I also 
made the {{AddSchemaFields}} URPF implement 
{{UpdateRequestProcessorFactory.RunAlways}}, so that schema modifications will 
continue to be applied on all replicas (the original rationale for moving the 
DUP position in SOLR-6137). 
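
For illustration, the reordered chain might look roughly like the fragment below. This is an abbreviated sketch, not the contents of the attached patch: the processor class names are Solr's, but the other processors in the example chain and the nested settings of each processor are left out, and the placement of the logging processor is an assumption.

```xml
<!-- Sketch only: abbreviated, with per-processor settings omitted. -->
<updateRequestProcessorChain name="add-unknown-fields-to-schema">
  <!-- Value-parsing processors run first, so the date is normalized
       before the document reaches the distributed processor. -->
  <processor class="solr.ParseDateFieldUpdateProcessorFactory">
    <!-- date format list omitted -->
  </processor>
  <!-- Schema additions; with the patch this factory is marked RunAlways. -->
  <processor class="solr.AddSchemaFieldsUpdateProcessorFactory">
    <!-- default field type and type mappings omitted -->
  </processor>
  <!-- Assumed position for logging. -->
  <processor class="solr.LogUpdateProcessorFactory"/>
  <!-- The DUP now sits after AddSchemaFields. -->
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

With this ordering, whatever the DUP forwards or buffers has already passed through the parsing processors, which matches the observation that the buffered tlog entry includes the date normalization.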

Following an offline reproduction suggestion from [~hossman], I was able to 
manually reproduce the data corruption as follows:

# Added an artificial 1-minute delay in {{PeerSync}}
# {{bin/solr start -e cloud  # nodes=2, coll=gettingstarted, shards=1, rf=2, 
configset=data_driven_schema_configs}}
# {{curl -X POST -H 'Content-type: application/xml' 
http://localhost:8983/solr/gettingstarted/update -d '<add><doc><field 
name="f_dt">2015-06-09</field></doc></add>'}}
# {{kill -9 $(cat bin/solr-7574.pid)}}
# {{curl -X POST -H 'Content-type: application/xml' 
http://localhost:8983/solr/gettingstarted/update -d '<add><doc><field 
name="f_dt">2015-06-10</field></doc></add>'}}
# {{bin/solr start -cloud -p 7574 -s "example/cloud/node2/solr" -z 
localhost:9983}}
# {{curl -X POST -H 'Content-type: application/xml' 
http://localhost:8983/solr/gettingstarted/update -d '<add><doc><field 
name="f_dt">2015-06-11</field></doc></add>'}}

I had to add step #3 to create a transaction log entry on the 7574 replica 
prior to shutdown; otherwise on restart it would refuse to perform peer sync, 
because it didn't know where to start (due to no recent versions in the tlog) 
and instead initiated full recovery.

I'm working on an automated data corruption test.

I want to get this change into the 6.4 release.

> example solr config files can lead to invalid tlog replays when using 
> add-unknown-fields-to-schema update chain
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-9883
>                 URL: https://issues.apache.org/jira/browse/SOLR-9883
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 6.3, trunk
>            Reporter: Erick Erickson
>            Assignee: Steve Rowe
>         Attachments: SOLR-9883.patch
>
>
> The current basic_configs and data_driven_schema_configs try to create 
> unknown fields. The problem is that the date processing in 
> "ParseDateFieldUpdateProcessorFactory" is not invoked if the doc is replayed 
> from the tlog. I don't know whether there are other places where this is a 
> problem; this is a concrete example that fails in the field.
> So say I have a pattern for dates that omits the trailing 'Z', as:
> yyyy-MM-dd'T'HH:mm:ss.SSS
> This works fine when the doc is initially indexed. Now say the doc must be 
> replayed from the tlog. The doc errors out with "unknown date format" since 
> (apparently) this doesn't go through the same update chain, perhaps due to 
> the sample configs defining ParseDateFieldUpdateProcessorFactory after 
> DistributedUpdateProcessorFactory?



