[jira] [Commented] (SOLR-9883) example solr config files can lead to invalid tlog replays when using add-unknown-fields-to-schema update chain
[ https://issues.apache.org/jira/browse/SOLR-9883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15808200#comment-15808200 ]

ASF subversion and git services commented on SOLR-9883:
---

Commit 9a6ff177b6f7c776cc6bf4625ed2d5dd7cce81d2 in lucene-solr's branch refs/heads/branch_6x from [~steve_rowe]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=9a6ff17 ]

SOLR-9883: In example schemaless configs' default update chain, move the DUP to after the AddSchemaFields URP (which is now tagged as RunAlways), to avoid invalid buffered tlog entry replays.

> example solr config files can lead to invalid tlog replays when using
> add-unknown-fields-to-schema update chain
> --
>
> Key: SOLR-9883
> URL: https://issues.apache.org/jira/browse/SOLR-9883
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Affects Versions: 6.3, trunk
> Reporter: Erick Erickson
> Assignee: Steve Rowe
> Attachments: SOLR-9883.patch, SOLR-9883.patch, SOLR-9883.patch
>
>
> The current basic_configs and data_driven_schema_configs try to create
> unknown fields. The problem is that the date processing
> "ParseDateFieldUpdateProcessorFactory" is not invoked if the doc is replayed
> from the tlog. Whether there are other places where this is a problem I don't
> know; this is a concrete example that fails in the field.
> So say I have a pattern for dates that omits the trailing 'Z', as:
> yyyy-MM-dd'T'HH:mm:ss.SSS
> This works fine when the doc is initially indexed. Now say the doc must be
> replayed from the tlog. The doc errors out with "unknown date format" since
> (apparently) this doesn't go through the same update chain, perhaps due to
> the sample configs defining ParseDateFieldUpdateProcessorFactory after
> DistributedUpdateProcessorFactory?

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-9883) example solr config files can lead to invalid tlog replays when using add-unknown-fields-to-schema update chain
[ https://issues.apache.org/jira/browse/SOLR-9883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15808201#comment-15808201 ]

ASF subversion and git services commented on SOLR-9883:
---

Commit d817fd43eccd67a5d73c3bbc49561de65d3fc9cb in lucene-solr's branch refs/heads/master from [~steve_rowe]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d817fd4 ]

SOLR-9883: In example schemaless configs' default update chain, move the DUP to after the AddSchemaFields URP (which is now tagged as RunAlways), to avoid invalid buffered tlog entry replays.
[jira] [Commented] (SOLR-9883) example solr config files can lead to invalid tlog replays when using add-unknown-fields-to-schema update chain
[ https://issues.apache.org/jira/browse/SOLR-9883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15798902#comment-15798902 ]

Steve Rowe commented on SOLR-9883:
--

Forgot to mention: with the attached patch, I was no longer able to reproduce the data corruption with the above method.
[jira] [Commented] (SOLR-9883) example solr config files can lead to invalid tlog replays when using add-unknown-fields-to-schema update chain
[ https://issues.apache.org/jira/browse/SOLR-9883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15765491#comment-15765491 ]

Erick Erickson commented on SOLR-9883:
--

There's quite a bit of relevant discussion at SOLR-8030. I don't quite know whether the simple expedient of putting the URPs before the DistributedUpdateProcessorFactory is sufficient (or safe).
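
The ordering question raised in this thread can be illustrated with a toy model. The sketch below is not Solr code and makes no claim about how Solr implements tlog replay. It assumes, following the reporter's hypothesis, that a document is logged in whatever form it has when it reaches the logging stage, and that replay indexes the logged form without re-running the date-parsing stage. All names here (`parse_date`, `index`, `run_chain`, `replay`, the `ts` field, and both format strings) are hypothetical.

```python
from datetime import datetime

# Hypothetical lenient input pattern with no trailing 'Z', as in the issue's example.
LENIENT_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"
# Hypothetical strict format that the toy index accepts.
STRICT_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_date(doc):
    """Toy stand-in for a date-parsing stage: rewrite 'ts' in the strict format."""
    out = dict(doc)
    parsed = datetime.strptime(out["ts"], LENIENT_FORMAT)
    out["ts"] = parsed.strftime(STRICT_FORMAT)
    return out


def index(doc):
    """Toy stand-in for indexing: reject dates that are not in the strict format."""
    try:
        datetime.strptime(doc["ts"], STRICT_FORMAT)
    except ValueError:
        raise ValueError("unknown date format: " + doc["ts"])
    return doc


def run_chain(doc, stages, log):
    """Run stages in order; the "LOG" marker records the doc as it looks at that point."""
    for stage in stages:
        if stage == "LOG":
            log.append(dict(doc))
        else:
            doc = stage(doc)
    return index(doc)


def replay(log):
    """Assumption of this toy model: replay indexes logged docs without re-parsing."""
    return [index(doc) for doc in log]


if __name__ == "__main__":
    raw = {"ts": "2016-12-21T10:15:30.123"}

    # Logging before parsing: initial indexing succeeds, replay sees the raw string.
    log_before = []
    run_chain(raw, ["LOG", parse_date], log_before)
    try:
        replay(log_before)
    except ValueError as err:
        print("replay failed:", err)

    # Parsing before logging: the logged doc is already normalized.
    log_after = []
    run_chain(raw, [parse_date, "LOG"], log_after)
    print("replay ok:", replay(log_after))
```

In this model the initial indexing succeeds with either ordering, and only the replay differs, which matches the symptom described in the issue. Whether the same reasoning is sufficient or safe for a real Solr update chain is exactly the open question above, and the sketch does not settle it.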