[jira] [Commented] (SOLR-12816) Don't allow RunUpdateProcessorFactory to be set before DistributedUpdateProcessorFactory

2018-10-08 Thread David Smiley (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16642763#comment-16642763
 ] 

David Smiley commented on SOLR-12816:
-

See UpdateRequestProcessorChain line 155 (in init()) which detects that the 
chain doesn't have DURP and automatically adds it immediately before the Run 
URP.
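
A minimal sketch of that behaviour, not the actual Solr source: the chain is modelled as a plain list of factory names, and ChainSketch and withDistributed are hypothetical names made up for illustration. The distributed factory is inserted immediately before the run factory only when the chain has none; a chain that already lists both, in either order, comes back unchanged, which would be consistent with the misordered chain in the description not being repaired.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an update chain; factories are represented by name only.
final class ChainSketch {
    static final String DISTRIB = "DistributedUpdateProcessorFactory";
    static final String RUN = "RunUpdateProcessorFactory";

    /** Returns a copy of the chain with DISTRIB added right before RUN if DISTRIB is absent. */
    static List<String> withDistributed(List<String> chain) {
        List<String> result = new ArrayList<>(chain);
        if (result.contains(DISTRIB)) {
            return result; // explicitly configured: order is left exactly as given
        }
        int runIndex = result.indexOf(RUN);
        if (runIndex >= 0) {
            result.add(runIndex, DISTRIB); // shifts RUN one position to the right
        }
        return result;
    }

    public static void main(String[] args) {
        // DISTRIB missing: it is inserted before RUN.
        System.out.println(withDistributed(Arrays.asList("LogUpdateProcessorFactory", RUN)));
        // DISTRIB present but after RUN: nothing is changed.
        System.out.println(withDistributed(Arrays.asList(RUN, DISTRIB)));
    }
}
```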

It would be nice if the chain could be defined by phases, in which you add 
URPs to a URP phase.  In this way, built-ins could be handled more cleanly 
without having to mess with the constructors of URPs like I did for TRAs, which 
felt like a hack.
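
To make the phase idea concrete, here is a rough sketch under stated assumptions: the Phase enum, its four values, and PhasedChainSketch are all hypothetical and do not exist in Solr. Processors are registered against a phase, and the final order comes from the phase order rather than from the order in which they were declared, so a built-in could be placed without the user positioning it by hand.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical phase-based chain builder; processors are represented by name only.
final class PhasedChainSketch {
    // Assumed phases; the enum declaration order is the execution order.
    enum Phase { PRE_DISTRIB, DISTRIB, POST_DISTRIB, RUN }

    private final Map<Phase, List<String>> byPhase = new EnumMap<>(Phase.class);

    /** Registers a processor in a phase; within one phase, insertion order is kept. */
    PhasedChainSketch add(Phase phase, String processorName) {
        byPhase.computeIfAbsent(phase, p -> new ArrayList<>()).add(processorName);
        return this;
    }

    /** Flattens the phases into a single ordered chain. */
    List<String> flatten() {
        List<String> ordered = new ArrayList<>();
        for (Phase phase : Phase.values()) {
            ordered.addAll(byPhase.getOrDefault(phase, Collections.emptyList()));
        }
        return ordered;
    }

    public static void main(String[] args) {
        // Declared "run first", yet the phase order still puts it last.
        List<String> chain = new PhasedChainSketch()
                .add(Phase.RUN, "RunUpdateProcessorFactory")
                .add(Phase.DISTRIB, "DistributedUpdateProcessorFactory")
                .add(Phase.PRE_DISTRIB, "LogUpdateProcessorFactory")
                .flatten();
        System.out.println(chain);
    }
}
```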

> Don't allow RunUpdateProcessorFactory to be set before 
> DistributedUpdateProcessorFactory
> 
>
> Key: SOLR-12816
> URL: https://issues.apache.org/jira/browse/SOLR-12816
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Varun Thacker
>Priority: Major
>
> Here's the problem that came up on a customer call this morning - "My 
> documents are not getting replicated to the replicas and the doc counts don't 
> match up"
> It was a 3 node cluster. The collection was 1 shard x 3 replicas.
> This is a scary situation to be in. We started down the path of debugging 
> replica types, auto-commits, checking if the {{_version_}} field and 
> {{id}} fields were defined correctly etc.
>  
> The problem was the user had defined a custom update processor chain and had 
> RunUpdateProcessorFactory defined before DistributedUpdateProcessorFactory
> {code:java}
>
>   
>   
>   
> {code}
>  
> With this update chain, whichever node you index the document against will be 
> the only one indexing the document. It will never forward to the other nodes. 
> So you can index against a node hosting a replica and the leader will never 
> get this document.
> Is there any use-case where having RunUpdateProcessor before 
> DistributedUpdateProcessor is needed?
>  
> Perhaps we could borrow the idea from TRA or make these two update processors 
> default and remove them from the default configs?
> {code:java}
> When processing an update for a TRA, Solr initializes its 
> UpdateRequestProcessor chain as usual, but when DistributedUpdateProcessor 
> (DUP) initializes, it detects that the update targets a TRA and injects 
> TimeRoutedUpdateProcessor (TRUP) in front of itself.{code}
>  
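
One way the check named in the issue title might look, as a sketch only: the chain is again modelled as a list of factory names, ChainOrderCheck and validate are hypothetical names, and the choice of IllegalArgumentException is an assumption rather than what Solr would actually throw. It rejects a chain only when both factories are present and the run factory comes first.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical validation of processor order; not part of Solr's API.
final class ChainOrderCheck {
    static final String DISTRIB = "DistributedUpdateProcessorFactory";
    static final String RUN = "RunUpdateProcessorFactory";

    /** Throws if RUN is configured ahead of DISTRIB; otherwise does nothing. */
    static void validate(List<String> chain) {
        int distribIndex = chain.indexOf(DISTRIB);
        int runIndex = chain.indexOf(RUN);
        if (distribIndex >= 0 && runIndex >= 0 && runIndex < distribIndex) {
            throw new IllegalArgumentException(
                    RUN + " must not be configured before " + DISTRIB);
        }
    }

    public static void main(String[] args) {
        validate(Arrays.asList(DISTRIB, RUN)); // accepted
        try {
            validate(Arrays.asList(RUN, DISTRIB)); // the ordering from the description
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing at chain initialization would surface the misconfiguration when the core loads instead of as mismatched doc counts later.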



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12816) Don't allow RunUpdateProcessorFactory to be set before DistributedUpdateProcessorFactory

2018-10-08 Thread Alexandre Rafalovitch (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16642597#comment-16642597
 ] 

Alexandre Rafalovitch commented on SOLR-12816:
--

I did find the passage in the documentation about [URPs that may want to run 
after 
DistributedUpdateProcessor|https://lucene.apache.org/solr/guide/7_5/update-request-processors.html#atomic-update-processor-factory].
 Basically, if they need to operate on the full document even when only an atomic 
update was sent, they need to be in the chain after the 
DistributedUpdateProcessor (which reconstructs the document).

I think this must be what I was trying to remember.




[jira] [Commented] (SOLR-12816) Don't allow RunUpdateProcessorFactory to be set before DistributedUpdateProcessorFactory

2018-09-29 Thread Alexandre Rafalovitch (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16633108#comment-16633108
 ] 

Alexandre Rafalovitch commented on SOLR-12816:
--

I tried to find the use-case, but the only ones I can find seem to rely on 
_UpdateRequestProcessorFactory.RunAlways_ instead. So I probably misremembered.

The default solrconfig seems to use a mix of new and old syntax, perhaps to 
support enabling/disabling the schemaless mode.




[jira] [Commented] (SOLR-12816) Don't allow RunUpdateProcessorFactory to be set before DistributedUpdateProcessorFactory

2018-09-29 Thread Varun Thacker (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16633060#comment-16633060
 ] 

Varun Thacker commented on SOLR-12816:
--

{quote}I think the issue was that some URPs can be injected between those two 
and have a choice to be handled centrally or per-node.
{quote}
Interesting. I hadn't thought of that. Are you aware of a use-case where we 
want to leverage that?

 
{quote}Of course, if new style processor attribute is used, they are already 
default.
{quote}
Maybe the default solrconfig shouldn't mention it then?




[jira] [Commented] (SOLR-12816) Don't allow RunUpdateProcessorFactory to be set before DistributedUpdateProcessorFactory

2018-09-28 Thread Alexandre Rafalovitch (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-12816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16632257#comment-16632257
 ] 

Alexandre Rafalovitch commented on SOLR-12816:
--

I think the issue was that some URPs can be injected between those two and have 
a choice to be handled centrally or per-node.

Also, I could have sworn, we are already injecting one of them if missing.

But maybe there is a use-case for handling the default situation and then 
recognizing an explicit one with an extra check.

Of course, if new style processor attribute is used, they are already default.
