[jira] [Commented] (SOLR-9941) log replay redundently (pre-)applies DBQs as if they were out of order

2017-01-16 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15824621#comment-15824621
 ] 

ASF subversion and git services commented on SOLR-9941:
---

Commit 38af094d175daebe4093782cc06e964cfc2dd14b in lucene-solr's branch 
refs/heads/master from [~ichattopadhyaya]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=38af094 ]

SOLR-9941: Moving changelog entry from 7.0.0 to 6.5.0


> log replay redundently (pre-)applies DBQs as if they were out of order
> --
>
> Key: SOLR-9941
> URL: https://issues.apache.org/jira/browse/SOLR-9941
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
>Assignee: Ishan Chattopadhyaya
> Attachments: SOLR-9941.hoss-test-experiment.patch, SOLR-9941.patch, 
> SOLR-9941.patch, SOLR-9941.patch, SOLR-9941.patch, SOLR-9941.patch
>
>
> There's kind of an odd situation that arises when a Solr node starts up 
> (after a crash) and tries to recover from it's tlog that causes deletes to be 
> redundantly & excessively applied -- at a minimum it causes confusing really 
> log messages
> * {{UpdateLog.init(...)}} creates {{TransactionLog}} instances for the most 
> recent log files found (based on numRecordsToKeep) and then builds a 
> {{RecentUpdates}} instance from them
> * Delete entries from the {{RecentUpdates}} are used to populate 2 lists:
> ** {{deleteByQueries}}
> ** {{oldDeletes}} (for deleteById).
> * Then when {{UpdateLog.recoverFromLog}} is called a {{LogReplayer}} is used 
> to replay any (uncommited) {{TransactionLog}} enteries
> ** during replay {{UpdateLog}} delegates to the UpdateRequestProcessorChain 
> to for the various adds/deletes, etc...
> ** when an add makes it to {{RunUpdateProcessor}} it delegates to 
> {{DirectUpdateHandler2}}, which (independent of the fact that we're in log 
> replay) calls {{UpdateLog.getDBQNewer}} for every add, looking for any 
> "Reordered" deletes that have a version greater then the add
> *** if it finds _any_ DBQs "newer" then the document being added, it does a 
> low level {{IndexWriter.updateDocument}} and then immediately executes _all_ 
> the newer DBQs ... _once per add_
> ** these deletes are *also* still executed as part of the normal tlog replay, 
> because they are in the tlog.
> Which means if you are recovering from a tlog with 90 addDocs, followed by 5 
> DBQs, then *each* of those 5 DBQs will each be executed 91 times -- and for 
> 90 of those executions, a DUH2 INFO log messages will say {{"Reordered DBQs 
> detected. ..."}} even tough the only reason they are out of order is because 
> Solr is deliberately applying them out of order.
> * At a minimum we should improve the log messages
> * Ideally we should stop (pre-emptively) applying these deletes during tlog 
> replay.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9941) log replay redundently (pre-)applies DBQs as if they were out of order

2017-01-16 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15824614#comment-15824614
 ] 

ASF subversion and git services commented on SOLR-9941:
---

Commit 302ce326d5a2eee043445918fa3e3885dc003b2f in lucene-solr's branch 
refs/heads/branch_6x from [~ichattopadhyaya]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=302ce32 ]

SOLR-9941: Clear deletes lists before log replay


> log replay redundently (pre-)applies DBQs as if they were out of order
> --
>
> Key: SOLR-9941
> URL: https://issues.apache.org/jira/browse/SOLR-9941
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
>Assignee: Ishan Chattopadhyaya
> Attachments: SOLR-9941.hoss-test-experiment.patch, SOLR-9941.patch, 
> SOLR-9941.patch, SOLR-9941.patch, SOLR-9941.patch, SOLR-9941.patch
>
>
> There's kind of an odd situation that arises when a Solr node starts up 
> (after a crash) and tries to recover from it's tlog that causes deletes to be 
> redundantly & excessively applied -- at a minimum it causes confusing really 
> log messages
> * {{UpdateLog.init(...)}} creates {{TransactionLog}} instances for the most 
> recent log files found (based on numRecordsToKeep) and then builds a 
> {{RecentUpdates}} instance from them
> * Delete entries from the {{RecentUpdates}} are used to populate 2 lists:
> ** {{deleteByQueries}}
> ** {{oldDeletes}} (for deleteById).
> * Then when {{UpdateLog.recoverFromLog}} is called a {{LogReplayer}} is used 
> to replay any (uncommited) {{TransactionLog}} enteries
> ** during replay {{UpdateLog}} delegates to the UpdateRequestProcessorChain 
> to for the various adds/deletes, etc...
> ** when an add makes it to {{RunUpdateProcessor}} it delegates to 
> {{DirectUpdateHandler2}}, which (independent of the fact that we're in log 
> replay) calls {{UpdateLog.getDBQNewer}} for every add, looking for any 
> "Reordered" deletes that have a version greater then the add
> *** if it finds _any_ DBQs "newer" then the document being added, it does a 
> low level {{IndexWriter.updateDocument}} and then immediately executes _all_ 
> the newer DBQs ... _once per add_
> ** these deletes are *also* still executed as part of the normal tlog replay, 
> because they are in the tlog.
> Which means if you are recovering from a tlog with 90 addDocs, followed by 5 
> DBQs, then *each* of those 5 DBQs will each be executed 91 times -- and for 
> 90 of those executions, a DUH2 INFO log messages will say {{"Reordered DBQs 
> detected. ..."}} even tough the only reason they are out of order is because 
> Solr is deliberately applying them out of order.
> * At a minimum we should improve the log messages
> * Ideally we should stop (pre-emptively) applying these deletes during tlog 
> replay.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9941) log replay redundently (pre-)applies DBQs as if they were out of order

2017-01-16 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15824615#comment-15824615
 ] 

ASF subversion and git services commented on SOLR-9941:
---

Commit 7ef8cf7d6aad25888de4cffc4c20239694a67a45 in lucene-solr's branch 
refs/heads/branch_6x from [~ichattopadhyaya]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=7ef8cf7 ]

SOLR-9941: Adding the Optimizations section to the CHANGES.txt


> log replay redundently (pre-)applies DBQs as if they were out of order
> --
>
> Key: SOLR-9941
> URL: https://issues.apache.org/jira/browse/SOLR-9941
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
>Assignee: Ishan Chattopadhyaya
> Attachments: SOLR-9941.hoss-test-experiment.patch, SOLR-9941.patch, 
> SOLR-9941.patch, SOLR-9941.patch, SOLR-9941.patch, SOLR-9941.patch
>
>
> There's kind of an odd situation that arises when a Solr node starts up 
> (after a crash) and tries to recover from it's tlog that causes deletes to be 
> redundantly & excessively applied -- at a minimum it causes confusing really 
> log messages
> * {{UpdateLog.init(...)}} creates {{TransactionLog}} instances for the most 
> recent log files found (based on numRecordsToKeep) and then builds a 
> {{RecentUpdates}} instance from them
> * Delete entries from the {{RecentUpdates}} are used to populate 2 lists:
> ** {{deleteByQueries}}
> ** {{oldDeletes}} (for deleteById).
> * Then when {{UpdateLog.recoverFromLog}} is called a {{LogReplayer}} is used 
> to replay any (uncommited) {{TransactionLog}} enteries
> ** during replay {{UpdateLog}} delegates to the UpdateRequestProcessorChain 
> to for the various adds/deletes, etc...
> ** when an add makes it to {{RunUpdateProcessor}} it delegates to 
> {{DirectUpdateHandler2}}, which (independent of the fact that we're in log 
> replay) calls {{UpdateLog.getDBQNewer}} for every add, looking for any 
> "Reordered" deletes that have a version greater then the add
> *** if it finds _any_ DBQs "newer" then the document being added, it does a 
> low level {{IndexWriter.updateDocument}} and then immediately executes _all_ 
> the newer DBQs ... _once per add_
> ** these deletes are *also* still executed as part of the normal tlog replay, 
> because they are in the tlog.
> Which means if you are recovering from a tlog with 90 addDocs, followed by 5 
> DBQs, then *each* of those 5 DBQs will each be executed 91 times -- and for 
> 90 of those executions, a DUH2 INFO log messages will say {{"Reordered DBQs 
> detected. ..."}} even tough the only reason they are out of order is because 
> Solr is deliberately applying them out of order.
> * At a minimum we should improve the log messages
> * Ideally we should stop (pre-emptively) applying these deletes during tlog 
> replay.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9941) log replay redundently (pre-)applies DBQs as if they were out of order

2017-01-11 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15818754#comment-15818754
 ] 

ASF subversion and git services commented on SOLR-9941:
---

Commit 04f45aab768b053f32feece7343f994d25f0bb26 in lucene-solr's branch 
refs/heads/master from [~ichattopadhyaya]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=04f45aa ]

SOLR-9941: Clear deletes lists before log replay


> log replay redundently (pre-)applies DBQs as if they were out of order
> --
>
> Key: SOLR-9941
> URL: https://issues.apache.org/jira/browse/SOLR-9941
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
> Attachments: SOLR-9941.hoss-test-experiment.patch, SOLR-9941.patch, 
> SOLR-9941.patch, SOLR-9941.patch, SOLR-9941.patch, SOLR-9941.patch
>
>
> There's kind of an odd situation that arises when a Solr node starts up 
> (after a crash) and tries to recover from it's tlog that causes deletes to be 
> redundantly & excessively applied -- at a minimum it causes confusing really 
> log messages
> * {{UpdateLog.init(...)}} creates {{TransactionLog}} instances for the most 
> recent log files found (based on numRecordsToKeep) and then builds a 
> {{RecentUpdates}} instance from them
> * Delete entries from the {{RecentUpdates}} are used to populate 2 lists:
> ** {{deleteByQueries}}
> ** {{oldDeletes}} (for deleteById).
> * Then when {{UpdateLog.recoverFromLog}} is called a {{LogReplayer}} is used 
> to replay any (uncommited) {{TransactionLog}} enteries
> ** during replay {{UpdateLog}} delegates to the UpdateRequestProcessorChain 
> to for the various adds/deletes, etc...
> ** when an add makes it to {{RunUpdateProcessor}} it delegates to 
> {{DirectUpdateHandler2}}, which (independent of the fact that we're in log 
> replay) calls {{UpdateLog.getDBQNewer}} for every add, looking for any 
> "Reordered" deletes that have a version greater then the add
> *** if it finds _any_ DBQs "newer" then the document being added, it does a 
> low level {{IndexWriter.updateDocument}} and then immediately executes _all_ 
> the newer DBQs ... _once per add_
> ** these deletes are *also* still executed as part of the normal tlog replay, 
> because they are in the tlog.
> Which means if you are recovering from a tlog with 90 addDocs, followed by 5 
> DBQs, then *each* of those 5 DBQs will each be executed 91 times -- and for 
> 90 of those executions, a DUH2 INFO log messages will say {{"Reordered DBQs 
> detected. ..."}} even tough the only reason they are out of order is because 
> Solr is deliberately applying them out of order.
> * At a minimum we should improve the log messages
> * Ideally we should stop (pre-emptively) applying these deletes during tlog 
> replay.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9941) log replay redundently (pre-)applies DBQs as if they were out of order

2017-01-10 Thread Christine Poerschke (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15815615#comment-15815615
 ] 

Christine Poerschke commented on SOLR-9941:
---

Tentatively linking SOLR-8760 and SOLR-9941 as related (but haven't yet had 
chance to catchup on SOLR-9941 here to see if it would solve SOLR-8760 also or 
indeed if the issues are different).

> log replay redundently (pre-)applies DBQs as if they were out of order
> --
>
> Key: SOLR-9941
> URL: https://issues.apache.org/jira/browse/SOLR-9941
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
> Attachments: SOLR-9941.hoss-test-experiment.patch, SOLR-9941.patch, 
> SOLR-9941.patch, SOLR-9941.patch, SOLR-9941.patch
>
>
> There's kind of an odd situation that arises when a Solr node starts up 
> (after a crash) and tries to recover from it's tlog that causes deletes to be 
> redundantly & excessively applied -- at a minimum it causes confusing really 
> log messages
> * {{UpdateLog.init(...)}} creates {{TransactionLog}} instances for the most 
> recent log files found (based on numRecordsToKeep) and then builds a 
> {{RecentUpdates}} instance from them
> * Delete entries from the {{RecentUpdates}} are used to populate 2 lists:
> ** {{deleteByQueries}}
> ** {{oldDeletes}} (for deleteById).
> * Then when {{UpdateLog.recoverFromLog}} is called a {{LogReplayer}} is used 
> to replay any (uncommited) {{TransactionLog}} enteries
> ** during replay {{UpdateLog}} delegates to the UpdateRequestProcessorChain 
> to for the various adds/deletes, etc...
> ** when an add makes it to {{RunUpdateProcessor}} it delegates to 
> {{DirectUpdateHandler2}}, which (independent of the fact that we're in log 
> replay) calls {{UpdateLog.getDBQNewer}} for every add, looking for any 
> "Reordered" deletes that have a version greater then the add
> *** if it finds _any_ DBQs "newer" then the document being added, it does a 
> low level {{IndexWriter.updateDocument}} and then immediately executes _all_ 
> the newer DBQs ... _once per add_
> ** these deletes are *also* still executed as part of the normal tlog replay, 
> because they are in the tlog.
> Which means if you are recovering from a tlog with 90 addDocs, followed by 5 
> DBQs, then *each* of those 5 DBQs will each be executed 91 times -- and for 
> 90 of those executions, a DUH2 INFO log messages will say {{"Reordered DBQs 
> detected. ..."}} even tough the only reason they are out of order is because 
> Solr is deliberately applying them out of order.
> * At a minimum we should improve the log messages
> * Ideally we should stop (pre-emptively) applying these deletes during tlog 
> replay.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9941) log replay redundently (pre-)applies DBQs as if they were out of order

2017-01-10 Thread Cao Manh Dat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15814488#comment-15814488
 ] 

Cao Manh Dat commented on SOLR-9941:


+1 for remove {{boolean onStartup}}. 
The rest a good to me.

> log replay redundently (pre-)applies DBQs as if they were out of order
> --
>
> Key: SOLR-9941
> URL: https://issues.apache.org/jira/browse/SOLR-9941
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
> Attachments: SOLR-9941.hoss-test-experiment.patch, SOLR-9941.patch, 
> SOLR-9941.patch, SOLR-9941.patch
>
>
> There's kind of an odd situation that arises when a Solr node starts up 
> (after a crash) and tries to recover from it's tlog that causes deletes to be 
> redundantly & excessively applied -- at a minimum it causes confusing really 
> log messages
> * {{UpdateLog.init(...)}} creates {{TransactionLog}} instances for the most 
> recent log files found (based on numRecordsToKeep) and then builds a 
> {{RecentUpdates}} instance from them
> * Delete entries from the {{RecentUpdates}} are used to populate 2 lists:
> ** {{deleteByQueries}}
> ** {{oldDeletes}} (for deleteById).
> * Then when {{UpdateLog.recoverFromLog}} is called a {{LogReplayer}} is used 
> to replay any (uncommited) {{TransactionLog}} enteries
> ** during replay {{UpdateLog}} delegates to the UpdateRequestProcessorChain 
> to for the various adds/deletes, etc...
> ** when an add makes it to {{RunUpdateProcessor}} it delegates to 
> {{DirectUpdateHandler2}}, which (independent of the fact that we're in log 
> replay) calls {{UpdateLog.getDBQNewer}} for every add, looking for any 
> "Reordered" deletes that have a version greater then the add
> *** if it finds _any_ DBQs "newer" then the document being added, it does a 
> low level {{IndexWriter.updateDocument}} and then immediately executes _all_ 
> the newer DBQs ... _once per add_
> ** these deletes are *also* still executed as part of the normal tlog replay, 
> because they are in the tlog.
> Which means if you are recovering from a tlog with 90 addDocs, followed by 5 
> DBQs, then *each* of those 5 DBQs will each be executed 91 times -- and for 
> 90 of those executions, a DUH2 INFO log messages will say {{"Reordered DBQs 
> detected. ..."}} even tough the only reason they are out of order is because 
> Solr is deliberately applying them out of order.
> * At a minimum we should improve the log messages
> * Ideally we should stop (pre-emptively) applying these deletes during tlog 
> replay.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9941) log replay redundently (pre-)applies DBQs as if they were out of order

2017-01-09 Thread Cao Manh Dat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813726#comment-15813726
 ] 

Cao Manh Dat commented on SOLR-9941:


Here are the reason to explain why Hoss's test fails :
- When the test add updates, it will be written to tlog-
- After the core is restarted, it will replay the updates from tlog-
- In the same time, the test add another DEQ, because the ulog is on buffering 
mode, DUP will write this DEQ to another file ( tlog-0001) and ignore apply 
this DEQ to IW.
- After LogReplay finish, we call a commit, which will write a commit update at 
the end of tlog-0001

So this DEQ update will never be replayed ( because it belong tlog-0001 not 
tlog- ), and it've never been applied to IW, so that DEQ update is lost.
Even we restart the core, to hoping that it will replay tlog-0001, but because 
we write an commit at the end of tlog-0001, tlog-0001 will never be replayed.

So I think this fail belong to another issue.

> log replay redundently (pre-)applies DBQs as if they were out of order
> --
>
> Key: SOLR-9941
> URL: https://issues.apache.org/jira/browse/SOLR-9941
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
> Attachments: SOLR-9941.hoss-test-experiment.patch, SOLR-9941.patch, 
> SOLR-9941.patch, SOLR-9941.patch
>
>
> There's kind of an odd situation that arises when a Solr node starts up 
> (after a crash) and tries to recover from it's tlog that causes deletes to be 
> redundantly & excessively applied -- at a minimum it causes confusing really 
> log messages
> * {{UpdateLog.init(...)}} creates {{TransactionLog}} instances for the most 
> recent log files found (based on numRecordsToKeep) and then builds a 
> {{RecentUpdates}} instance from them
> * Delete entries from the {{RecentUpdates}} are used to populate 2 lists:
> ** {{deleteByQueries}}
> ** {{oldDeletes}} (for deleteById).
> * Then when {{UpdateLog.recoverFromLog}} is called a {{LogReplayer}} is used 
> to replay any (uncommited) {{TransactionLog}} enteries
> ** during replay {{UpdateLog}} delegates to the UpdateRequestProcessorChain 
> to for the various adds/deletes, etc...
> ** when an add makes it to {{RunUpdateProcessor}} it delegates to 
> {{DirectUpdateHandler2}}, which (independent of the fact that we're in log 
> replay) calls {{UpdateLog.getDBQNewer}} for every add, looking for any 
> "Reordered" deletes that have a version greater then the add
> *** if it finds _any_ DBQs "newer" then the document being added, it does a 
> low level {{IndexWriter.updateDocument}} and then immediately executes _all_ 
> the newer DBQs ... _once per add_
> ** these deletes are *also* still executed as part of the normal tlog replay, 
> because they are in the tlog.
> Which means if you are recovering from a tlog with 90 addDocs, followed by 5 
> DBQs, then *each* of those 5 DBQs will each be executed 91 times -- and for 
> 90 of those executions, a DUH2 INFO log messages will say {{"Reordered DBQs 
> detected. ..."}} even tough the only reason they are out of order is because 
> Solr is deliberately applying them out of order.
> * At a minimum we should improve the log messages
> * Ideally we should stop (pre-emptively) applying these deletes during tlog 
> replay.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9941) log replay redundently (pre-)applies DBQs as if they were out of order

2017-01-09 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813309#comment-15813309
 ] 

Ishan Chattopadhyaya commented on SOLR-9941:


{quote}
Seems to me that updates arriving during the log replay (in TestRecovery) are 
being silently dropped.
{quote}
One possible explanation came to mind: During log replay, the state of the core 
is REPLAYING. Hence, perhaps, incoming updates are not applied until the 
recovery has finished and state is back to ACTIVE?

> log replay redundently (pre-)applies DBQs as if they were out of order
> --
>
> Key: SOLR-9941
> URL: https://issues.apache.org/jira/browse/SOLR-9941
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
> Attachments: SOLR-9941.hoss-test-experiment.patch, SOLR-9941.patch, 
> SOLR-9941.patch, SOLR-9941.patch
>
>
> There's kind of an odd situation that arises when a Solr node starts up 
> (after a crash) and tries to recover from it's tlog that causes deletes to be 
> redundantly & excessively applied -- at a minimum it causes confusing really 
> log messages
> * {{UpdateLog.init(...)}} creates {{TransactionLog}} instances for the most 
> recent log files found (based on numRecordsToKeep) and then builds a 
> {{RecentUpdates}} instance from them
> * Delete entries from the {{RecentUpdates}} are used to populate 2 lists:
> ** {{deleteByQueries}}
> ** {{oldDeletes}} (for deleteById).
> * Then when {{UpdateLog.recoverFromLog}} is called a {{LogReplayer}} is used 
> to replay any (uncommited) {{TransactionLog}} enteries
> ** during replay {{UpdateLog}} delegates to the UpdateRequestProcessorChain 
> to for the various adds/deletes, etc...
> ** when an add makes it to {{RunUpdateProcessor}} it delegates to 
> {{DirectUpdateHandler2}}, which (independent of the fact that we're in log 
> replay) calls {{UpdateLog.getDBQNewer}} for every add, looking for any 
> "Reordered" deletes that have a version greater then the add
> *** if it finds _any_ DBQs "newer" then the document being added, it does a 
> low level {{IndexWriter.updateDocument}} and then immediately executes _all_ 
> the newer DBQs ... _once per add_
> ** these deletes are *also* still executed as part of the normal tlog replay, 
> because they are in the tlog.
> Which means if you are recovering from a tlog with 90 addDocs, followed by 5 
> DBQs, then *each* of those 5 DBQs will each be executed 91 times -- and for 
> 90 of those executions, a DUH2 INFO log messages will say {{"Reordered DBQs 
> detected. ..."}} even tough the only reason they are out of order is because 
> Solr is deliberately applying them out of order.
> * At a minimum we should improve the log messages
> * Ideally we should stop (pre-emptively) applying these deletes during tlog 
> replay.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9941) log replay redundently (pre-)applies DBQs as if they were out of order

2017-01-09 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813256#comment-15813256
 ] 

Ishan Chattopadhyaya commented on SOLR-9941:


{quote}
I wanted to beef up Ishan's testLogReplayWithReorderedDBQ to prove that if 
another (probably ordered) DBQ arrived during log replay it would correctly be 
applied – even if some affected docs hadn't been added yet as part of replay. 
(ie: prove that "RecentUpdates" was still being used during replay)
{quote}

Seems to me that updates arriving *during* the log replay (in TestRecovery) are 
being silently dropped. There's definitely something fishy about it, but I 
think that is another issue altogether, since even adds during the log replay 
are not persisting after the replay. Here's [0] a modified version of your test 
that applies the DBQ *after* recovery has finished. Of course, not nearly as 
effective in proving anything, and also doesn't test for the premise you were 
after: {{prove that "RecentUpdates" was still being used during replay}}.

[0] - https://paste.fedoraproject.org/524485/05608148

> log replay redundently (pre-)applies DBQs as if they were out of order
> --
>
> Key: SOLR-9941
> URL: https://issues.apache.org/jira/browse/SOLR-9941
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
> Attachments: SOLR-9941.hoss-test-experiment.patch, SOLR-9941.patch, 
> SOLR-9941.patch, SOLR-9941.patch
>
>
> There's kind of an odd situation that arises when a Solr node starts up 
> (after a crash) and tries to recover from it's tlog that causes deletes to be 
> redundantly & excessively applied -- at a minimum it causes confusing really 
> log messages
> * {{UpdateLog.init(...)}} creates {{TransactionLog}} instances for the most 
> recent log files found (based on numRecordsToKeep) and then builds a 
> {{RecentUpdates}} instance from them
> * Delete entries from the {{RecentUpdates}} are used to populate 2 lists:
> ** {{deleteByQueries}}
> ** {{oldDeletes}} (for deleteById).
> * Then when {{UpdateLog.recoverFromLog}} is called a {{LogReplayer}} is used 
> to replay any (uncommited) {{TransactionLog}} enteries
> ** during replay {{UpdateLog}} delegates to the UpdateRequestProcessorChain 
> to for the various adds/deletes, etc...
> ** when an add makes it to {{RunUpdateProcessor}} it delegates to 
> {{DirectUpdateHandler2}}, which (independent of the fact that we're in log 
> replay) calls {{UpdateLog.getDBQNewer}} for every add, looking for any 
> "Reordered" deletes that have a version greater then the add
> *** if it finds _any_ DBQs "newer" then the document being added, it does a 
> low level {{IndexWriter.updateDocument}} and then immediately executes _all_ 
> the newer DBQs ... _once per add_
> ** these deletes are *also* still executed as part of the normal tlog replay, 
> because they are in the tlog.
> Which means if you are recovering from a tlog with 90 addDocs, followed by 5 
> DBQs, then *each* of those 5 DBQs will each be executed 91 times -- and for 
> 90 of those executions, a DUH2 INFO log messages will say {{"Reordered DBQs 
> detected. ..."}} even tough the only reason they are out of order is because 
> Solr is deliberately applying them out of order.
> * At a minimum we should improve the log messages
> * Ideally we should stop (pre-emptively) applying these deletes during tlog 
> replay.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9941) log replay redundently (pre-)applies DBQs as if they were out of order

2017-01-09 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15813195#comment-15813195
 ] 

Ishan Chattopadhyaya commented on SOLR-9941:


{quote}
One question i have is: if the only code paths that call 
recoverFromLog(boolean) are "startup" paths that pass true why do we need the 
optional argument? why not just refactor the method to always use the new logic?
{quote}
My thought there was that if someone wanted to reuse the doLogReplay() method 
later for use during a live node, and not during the startup of a node/core 
(for whatever reason that I cannot forsee right now), that they not end up 
clearing their deletes lists in the process. Though, given the current use of 
the method now, I am also open to eliminate that extra parameter if you suggest.

> log replay redundently (pre-)applies DBQs as if they were out of order
> --
>
> Key: SOLR-9941
> URL: https://issues.apache.org/jira/browse/SOLR-9941
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
> Attachments: SOLR-9941.hoss-test-experiment.patch, SOLR-9941.patch, 
> SOLR-9941.patch, SOLR-9941.patch
>
>
> There's kind of an odd situation that arises when a Solr node starts up 
> (after a crash) and tries to recover from it's tlog that causes deletes to be 
> redundantly & excessively applied -- at a minimum it causes confusing really 
> log messages
> * {{UpdateLog.init(...)}} creates {{TransactionLog}} instances for the most 
> recent log files found (based on numRecordsToKeep) and then builds a 
> {{RecentUpdates}} instance from them
> * Delete entries from the {{RecentUpdates}} are used to populate 2 lists:
> ** {{deleteByQueries}}
> ** {{oldDeletes}} (for deleteById).
> * Then when {{UpdateLog.recoverFromLog}} is called a {{LogReplayer}} is used 
> to replay any (uncommited) {{TransactionLog}} enteries
> ** during replay {{UpdateLog}} delegates to the UpdateRequestProcessorChain 
> to for the various adds/deletes, etc...
> ** when an add makes it to {{RunUpdateProcessor}} it delegates to 
> {{DirectUpdateHandler2}}, which (independent of the fact that we're in log 
> replay) calls {{UpdateLog.getDBQNewer}} for every add, looking for any 
> "Reordered" deletes that have a version greater then the add
> *** if it finds _any_ DBQs "newer" then the document being added, it does a 
> low level {{IndexWriter.updateDocument}} and then immediately executes _all_ 
> the newer DBQs ... _once per add_
> ** these deletes are *also* still executed as part of the normal tlog replay, 
> because they are in the tlog.
> Which means if you are recovering from a tlog with 90 addDocs, followed by 5 
> DBQs, then *each* of those 5 DBQs will each be executed 91 times -- and for 
> 90 of those executions, a DUH2 INFO log messages will say {{"Reordered DBQs 
> detected. ..."}} even tough the only reason they are out of order is because 
> Solr is deliberately applying them out of order.
> * At a minimum we should improve the log messages
> * Ideally we should stop (pre-emptively) applying these deletes during tlog 
> replay.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9941) log replay redundently (pre-)applies DBQs as if they were out of order

2017-01-09 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812443#comment-15812443
 ] 

Hoss Man commented on SOLR-9941:


The patch looks pretty good to me -- but i'm not a tlog expert.

One question i have is: if the only code paths that call 
{{recoverFromLog(boolean)}} are "startup" paths that pass {{true}} why do we 
need the optional argument?  why not just refactor the method to always use the 
new logic?

> log replay redundently (pre-)applies DBQs as if they were out of order
> --
>
> Key: SOLR-9941
> URL: https://issues.apache.org/jira/browse/SOLR-9941
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
> Attachments: SOLR-9941.patch, SOLR-9941.patch, SOLR-9941.patch
>
>
> There's kind of an odd situation that arises when a Solr node starts up 
> (after a crash) and tries to recover from it's tlog that causes deletes to be 
> redundantly & excessively applied -- at a minimum it causes confusing really 
> log messages
> * {{UpdateLog.init(...)}} creates {{TransactionLog}} instances for the most 
> recent log files found (based on numRecordsToKeep) and then builds a 
> {{RecentUpdates}} instance from them
> * Delete entries from the {{RecentUpdates}} are used to populate 2 lists:
> ** {{deleteByQueries}}
> ** {{oldDeletes}} (for deleteById).
> * Then when {{UpdateLog.recoverFromLog}} is called a {{LogReplayer}} is used 
> to replay any (uncommited) {{TransactionLog}} enteries
> ** during replay {{UpdateLog}} delegates to the UpdateRequestProcessorChain 
> to for the various adds/deletes, etc...
> ** when an add makes it to {{RunUpdateProcessor}} it delegates to 
> {{DirectUpdateHandler2}}, which (independent of the fact that we're in log 
> replay) calls {{UpdateLog.getDBQNewer}} for every add, looking for any 
> "Reordered" deletes that have a version greater then the add
> *** if it finds _any_ DBQs "newer" then the document being added, it does a 
> low level {{IndexWriter.updateDocument}} and then immediately executes _all_ 
> the newer DBQs ... _once per add_
> ** these deletes are *also* still executed as part of the normal tlog replay, 
> because they are in the tlog.
> Which means if you are recovering from a tlog with 90 addDocs, followed by 5 
> DBQs, then *each* of those 5 DBQs will each be executed 91 times -- and for 
> 90 of those executions, a DUH2 INFO log messages will say {{"Reordered DBQs 
> detected. ..."}} even tough the only reason they are out of order is because 
> Solr is deliberately applying them out of order.
> * At a minimum we should improve the log messages
> * Ideally we should stop (pre-emptively) applying these deletes during tlog 
> replay.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9941) log replay redundently (pre-)applies DBQs as if they were out of order

2017-01-09 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812165#comment-15812165
 ] 

Ishan Chattopadhyaya commented on SOLR-9941:


Fyi, [~shalinmangar], [~caomanhdat] please review.

> log replay redundently (pre-)applies DBQs as if they were out of order
> --
>
> Key: SOLR-9941
> URL: https://issues.apache.org/jira/browse/SOLR-9941
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Hoss Man
> Attachments: SOLR-9941.patch, SOLR-9941.patch, SOLR-9941.patch
>
>
> There's kind of an odd situation that arises when a Solr node starts up 
> (after a crash) and tries to recover from it's tlog that causes deletes to be 
> redundantly & excessively applied -- at a minimum it causes confusing really 
> log messages
> * {{UpdateLog.init(...)}} creates {{TransactionLog}} instances for the most 
> recent log files found (based on numRecordsToKeep) and then builds a 
> {{RecentUpdates}} instance from them
> * Delete entries from the {{RecentUpdates}} are used to populate 2 lists:
> ** {{deleteByQueries}}
> ** {{oldDeletes}} (for deleteById).
> * Then when {{UpdateLog.recoverFromLog}} is called a {{LogReplayer}} is used 
> to replay any (uncommited) {{TransactionLog}} enteries
> ** during replay {{UpdateLog}} delegates to the UpdateRequestProcessorChain 
> to for the various adds/deletes, etc...
> ** when an add makes it to {{RunUpdateProcessor}} it delegates to 
> {{DirectUpdateHandler2}}, which (independent of the fact that we're in log 
> replay) calls {{UpdateLog.getDBQNewer}} for every add, looking for any 
> "Reordered" deletes that have a version greater then the add
> *** if it finds _any_ DBQs "newer" then the document being added, it does a 
> low level {{IndexWriter.updateDocument}} and then immediately executes _all_ 
> the newer DBQs ... _once per add_
> ** these deletes are *also* still executed as part of the normal tlog replay, 
> because they are in the tlog.
> Which means if you are recovering from a tlog with 90 addDocs, followed by 5 
> DBQs, then *each* of those 5 DBQs will each be executed 91 times -- and for 
> 90 of those executions, a DUH2 INFO log messages will say {{"Reordered DBQs 
> detected. ..."}} even tough the only reason they are out of order is because 
> Solr is deliberately applying them out of order.
> * At a minimum we should improve the log messages
> * Ideally we should stop (pre-emptively) applying these deletes during tlog 
> replay.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-9941) log replay redundently (pre-)applies DBQs as if they were out of order

2017-01-06 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15806431#comment-15806431
 ] 

Hoss Man commented on SOLR-9941:



I'm guessing this situation exists because the {{RecentUpdates}} code was added 
to {{UpdateLog}} to handle (true) out of order updates while a system was 
"live", but it's not clear to me why the {{RecentUpdates}} are used the way 
they are during {{UpdateLog.init}} ... it seems like at a minimum DUH2 could 
inspect the UpdateCommand flags to see if this is a {{REPLAY}} command and if 
it is skip the {{UpdateLog.getDBQNewer}} call?

Here's some steps that make it easy to see how weird this situation can get...

{noformat}
bin/solr -e techproducts

curl -X POST http://localhost:8983/solr/techproducts/config -H 'Content-Type: 
application/json' --data-binary 
'{"set-property":{"updateHandler.autoCommit.maxTime":"-1"}}'

curl -X POST http://localhost:8983/solr/techproducts/update -H 'Content-Type: 
application/json' --data-binary '{
  "add":{
"doc": {
  "id": "DOCX",
  "foo_i": 41
}
   },
  "delete": { "query":"foo_i:42" },
  "delete": { "query":"foo_i:43" },
  "add":{
 "doc": {
  "id": "DOCX",
  "foo_i": { "inc" : "1" }
}
  },
  "delete": { "query":"foo_i:41" },
  "delete": { "query":"foo_i:43" },
  "add":{
"doc": {
  "id": "DOCX",
  "foo_i": { "inc" : "1" }
}
  },
  "delete": { "query":"foo_i:41" },
  "delete": { "query":"foo_i:42" }
}'

# verify the updates were applied correctly and doc didn't get deleted by 
mistake
curl 'http://localhost:8983/solr/techproducts/get?wt=json=DOCX'
{
  "doc":
  {
"id":"DOCX",
"foo_i":43,
"_version_":1555827152896655360}}


kill -9 PID # use whatever PID you get from "ps -ef | grep start.jar | grep 
techproducts"

bin/solr start -s example/techproducts/solr/

# re-verify the updates were applied correctly during replay
curl 'http://localhost:8983/solr/techproducts/get?wt=json=DOCX'
{
  "doc":
  {
"id":"DOCX",
"foo_i":43,
"_version_":1555827152896655360}}

{noformat}

And here's a snippet of solr's logs during the replay on (second) startup...

{noformat}
WARN  - 2017-01-07 01:27:45.408; [   x:techproducts] 
org.apache.solr.update.UpdateLog$LogReplayer; Starting log replay 
tlog{file=/home/hossman/lucene/dev/solr/example/techproducts/solr/techproducts/data/tlog/tlog.001
 refcount=2} active=false starting pos=0
INFO  - 2017-01-07 01:27:45.422; [   x:techproducts] 
org.apache.solr.update.DirectUpdateHandler2; Reordered DBQs detected.  
Update=add{flags=a,_version_=1555827152848420864,id=DOCX} 
DBQs=[DBQ{version=1555827152905043968,q=foo_i:42}, 
DBQ{version=1555827152897703936,q=foo_i:41}, 
DBQ{version=1555827152894558208,q=foo_i:43}, 
DBQ{version=1555827152883023872,q=foo_i:41}, 
DBQ{version=1555827152872538112,q=foo_i:43}, 
DBQ{version=1555827152849469440,q=foo_i:42}]
INFO  - 2017-01-07 01:27:45.434; [   x:techproducts] 
org.apache.solr.core.SolrCore; [techproducts]  webapp=null path=null 
params={q=static+firstSearcher+warming+in+solrconfig.xml=false=firstSearcher}
 hits=3 status=0 QTime=62
INFO  - 2017-01-07 01:27:45.446; [   x:techproducts] 
org.apache.solr.core.QuerySenderListener; QuerySenderListener done.
INFO  - 2017-01-07 01:27:45.446; [   x:techproducts] 
org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener; 
Loading spell index for spellchecker: default
INFO  - 2017-01-07 01:27:45.446; [   x:techproducts] 
org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener; 
Loading spell index for spellchecker: wordbreak
INFO  - 2017-01-07 01:27:45.448; [   x:techproducts] 
org.apache.solr.core.SolrCore; [techproducts] Registered new searcher 
Searcher@1826a6c8[techproducts] 
main{ExitableDirectoryReader(UninvertingDirectoryReader(Uninverting(_0(7.0.0):C32)))}
INFO  - 2017-01-07 01:27:45.532; [   x:techproducts] 
org.apache.solr.search.SolrIndexSearcher; Opening 
[Searcher@5bd6b40f[techproducts] realtime]
INFO  - 2017-01-07 01:27:45.536; [   x:techproducts] 
org.apache.solr.update.DirectUpdateHandler2; Reordered DBQs detected.  
Update=add{flags=a,_version_=1555827152881975296,id=DOCX} 
DBQs=[DBQ{version=1555827152905043968,q=foo_i:42}, 
DBQ{version=1555827152897703936,q=foo_i:41}, 
DBQ{version=1555827152894558208,q=foo_i:43}, 
DBQ{version=1555827152883023872,q=foo_i:41}]
INFO  - 2017-01-07 01:27:45.546; [   x:techproducts] 
org.apache.solr.search.SolrIndexSearcher; Opening 
[Searcher@667a386f[techproducts] realtime]
INFO  - 2017-01-07 01:27:45.555; [   x:techproducts] 
org.apache.solr.update.DirectUpdateHandler2; Reordered DBQs detected.  
Update=add{flags=a,_version_=1555827152896655360,id=DOCX} 
DBQs=[DBQ{version=1555827152905043968,q=foo_i:42}, 
DBQ{version=1555827152897703936,q=foo_i:41}]
INFO  - 2017-01-07 01:27:45.573; [   x:techproducts] 
org.apache.solr.search.SolrIndexSearcher; Opening 
[Searcher@3ed8aa7e[techproducts]