[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953281#comment-16953281 ] ASF subversion and git services commented on SOLR-13741: Commit 28c1049a258bbd060a80803c72e1c6cadc784dab in lucene-solr's branch refs/heads/branch_8x from Chris M. Hostetter [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=28c1049 ] SOLR-13741: Harden AuditLoggerIntegrationTest (cherry picked from commit 63e9bcf5d150e6324e5133a001613bd7f738a183) > possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest > -- > > Key: SOLR-13741 > URL: https://issues.apache.org/jira/browse/SOLR-13741 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, > SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, > SOLR-13741.patch > > > A while back i saw a weird non-reproducible failure from > AuditLoggerIntegrationTest. When i started reading through that code, 2 > things jumped out at me: > # the way the 'delay' option works is brittle, and makes assumptions about > CPU scheduling that aren't neccessarily going to be true (and also suffers > from the problem that Thread.sleep isn't garunteed to sleep as long as you > ask it too) > # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by > checking the size of a (List) {{buffer}} of recieved events in a sleep/poll > loop, until it contains at least N items -- but the code that adds items to > that buffer in the async Callback thread async _before_ the code that updates > other state variables (like the global {{count}} and the patch specific > {{resourceCounts}}) meaning that a test waiting on 3 events could "see" 3 > events added to the buffer, but calling {{assertEquals(3, > receiver.getTotalCount())}} could subsequently fail because that variable > hadn't been udpated yet. > #2 was the source of the failures I was seeing, and while a quick fix for > that specific problem would be to update all other state _before_ adding the > event to the buffer, I set out to try and make more general improvements to > the test: > * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data > structures > * harden the assertions made about the expected events recieved (updating > some test methods that currently just assert the number of events recieved) > * add new assertions that _only_ the expected events are recieved. > In the process of doing this, I've found several oddities/descrepencies > between things the test currently claims/asserts, and what *actually* happens > under more rigerous scrutiny/assertions. > I'll attach a patch shortly that has my (in progress) updates and inlcudes > copious nocommits about things seem suspect. the summary of these concerns > is: > * SolrException status codes that do not match what the existing test says > they should (but doesn't assert) > * extra AuditEvents occuring that the existing test does not expect > * AuditEvents for incorrect credentials that do not at all match the expected > AuditEvent in the existing test -- which the current test seems to miss in > it's assertions because it's picking up some extra events from triggered by > previuos requests earlier in the test that just happen to also match the > asserctions. > ...it's not clear to me if the test logic is correct and these are "code > bugs" or if the test is faulty. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953279#comment-16953279 ] ASF subversion and git services commented on SOLR-13741: Commit 63e9bcf5d150e6324e5133a001613bd7f738a183 in lucene-solr's branch refs/heads/master from Chris M. Hostetter [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=63e9bcf ] SOLR-13741: Harden AuditLoggerIntegrationTest > possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest > -- > > Key: SOLR-13741 > URL: https://issues.apache.org/jira/browse/SOLR-13741 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, > SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, > SOLR-13741.patch > > > A while back i saw a weird non-reproducible failure from > AuditLoggerIntegrationTest. When i started reading through that code, 2 > things jumped out at me: > # the way the 'delay' option works is brittle, and makes assumptions about > CPU scheduling that aren't neccessarily going to be true (and also suffers > from the problem that Thread.sleep isn't garunteed to sleep as long as you > ask it too) > # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by > checking the size of a (List) {{buffer}} of recieved events in a sleep/poll > loop, until it contains at least N items -- but the code that adds items to > that buffer in the async Callback thread async _before_ the code that updates > other state variables (like the global {{count}} and the patch specific > {{resourceCounts}}) meaning that a test waiting on 3 events could "see" 3 > events added to the buffer, but calling {{assertEquals(3, > receiver.getTotalCount())}} could subsequently fail because that variable > hadn't been udpated yet. > #2 was the source of the failures I was seeing, and while a quick fix for > that specific problem would be to update all other state _before_ adding the > event to the buffer, I set out to try and make more general improvements to > the test: > * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data > structures > * harden the assertions made about the expected events recieved (updating > some test methods that currently just assert the number of events recieved) > * add new assertions that _only_ the expected events are recieved. > In the process of doing this, I've found several oddities/descrepencies > between things the test currently claims/asserts, and what *actually* happens > under more rigerous scrutiny/assertions. > I'll attach a patch shortly that has my (in progress) updates and inlcudes > copious nocommits about things seem suspect. the summary of these concerns > is: > * SolrException status codes that do not match what the existing test says > they should (but doesn't assert) > * extra AuditEvents occuring that the existing test does not expect > * AuditEvents for incorrect credentials that do not at all match the expected > AuditEvent in the existing test -- which the current test seems to miss in > it's assertions because it's picking up some extra events from triggered by > previuos requests earlier in the test that just happen to also match the > asserctions. > ...it's not clear to me if the test logic is correct and these are "code > bugs" or if the test is faulty. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953269#comment-16953269 ] Jan Høydahl commented on SOLR-13741: {quote}Jan: you just beat me to it ... my updated patch looks exactly like yours, but with more lazy whitespace :) {quote} :) I'll let you take it from here and do the merge. > possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest > -- > > Key: SOLR-13741 > URL: https://issues.apache.org/jira/browse/SOLR-13741 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, > SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, > SOLR-13741.patch > > > A while back i saw a weird non-reproducible failure from > AuditLoggerIntegrationTest. When i started reading through that code, 2 > things jumped out at me: > # the way the 'delay' option works is brittle, and makes assumptions about > CPU scheduling that aren't neccessarily going to be true (and also suffers > from the problem that Thread.sleep isn't garunteed to sleep as long as you > ask it too) > # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by > checking the size of a (List) {{buffer}} of recieved events in a sleep/poll > loop, until it contains at least N items -- but the code that adds items to > that buffer in the async Callback thread async _before_ the code that updates > other state variables (like the global {{count}} and the patch specific > {{resourceCounts}}) meaning that a test waiting on 3 events could "see" 3 > events added to the buffer, but calling {{assertEquals(3, > receiver.getTotalCount())}} could subsequently fail because that variable > hadn't been udpated yet. > #2 was the source of the failures I was seeing, and while a quick fix for > that specific problem would be to update all other state _before_ adding the > event to the buffer, I set out to try and make more general improvements to > the test: > * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data > structures > * harden the assertions made about the expected events recieved (updating > some test methods that currently just assert the number of events recieved) > * add new assertions that _only_ the expected events are recieved. > In the process of doing this, I've found several oddities/descrepencies > between things the test currently claims/asserts, and what *actually* happens > under more rigerous scrutiny/assertions. > I'll attach a patch shortly that has my (in progress) updates and inlcudes > copious nocommits about things seem suspect. the summary of these concerns > is: > * SolrException status codes that do not match what the existing test says > they should (but doesn't assert) > * extra AuditEvents occuring that the existing test does not expect > * AuditEvents for incorrect credentials that do not at all match the expected > AuditEvent in the existing test -- which the current test seems to miss in > it's assertions because it's picking up some extra events from triggered by > previuos requests earlier in the test that just happen to also match the > asserctions. > ...it's not clear to me if the test logic is correct and these are "code > bugs" or if the test is faulty. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953268#comment-16953268 ] Chris M. Hostetter commented on SOLR-13741: --- Jan: you just beat me to it ... my updated patch looks exactly like yours, but with more lazy whitespace :) Feel free to commit. > possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest > -- > > Key: SOLR-13741 > URL: https://issues.apache.org/jira/browse/SOLR-13741 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, > SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, > SOLR-13741.patch > > > A while back i saw a weird non-reproducible failure from > AuditLoggerIntegrationTest. When i started reading through that code, 2 > things jumped out at me: > # the way the 'delay' option works is brittle, and makes assumptions about > CPU scheduling that aren't neccessarily going to be true (and also suffers > from the problem that Thread.sleep isn't garunteed to sleep as long as you > ask it too) > # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by > checking the size of a (List) {{buffer}} of recieved events in a sleep/poll > loop, until it contains at least N items -- but the code that adds items to > that buffer in the async Callback thread async _before_ the code that updates > other state variables (like the global {{count}} and the patch specific > {{resourceCounts}}) meaning that a test waiting on 3 events could "see" 3 > events added to the buffer, but calling {{assertEquals(3, > receiver.getTotalCount())}} could subsequently fail because that variable > hadn't been udpated yet. > #2 was the source of the failures I was seeing, and while a quick fix for > that specific problem would be to update all other state _before_ adding the > event to the buffer, I set out to try and make more general improvements to > the test: > * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data > structures > * harden the assertions made about the expected events recieved (updating > some test methods that currently just assert the number of events recieved) > * add new assertions that _only_ the expected events are recieved. > In the process of doing this, I've found several oddities/descrepencies > between things the test currently claims/asserts, and what *actually* happens > under more rigerous scrutiny/assertions. > I'll attach a patch shortly that has my (in progress) updates and inlcudes > copious nocommits about things seem suspect. the summary of these concerns > is: > * SolrException status codes that do not match what the existing test says > they should (but doesn't assert) > * extra AuditEvents occuring that the existing test does not expect > * AuditEvents for incorrect credentials that do not at all match the expected > AuditEvent in the existing test -- which the current test seems to miss in > it's assertions because it's picking up some extra events from triggered by > previuos requests earlier in the test that just happen to also match the > asserctions. > ...it's not clear to me if the test logic is correct and these are "code > bugs" or if the test is faulty. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953265#comment-16953265 ] Jan Høydahl commented on SOLR-13741: SOLR-13835 merged and updated this patch to master. > possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest > -- > > Key: SOLR-13741 > URL: https://issues.apache.org/jira/browse/SOLR-13741 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, > SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, > SOLR-13741.patch > > > A while back i saw a weird non-reproducible failure from > AuditLoggerIntegrationTest. When i started reading through that code, 2 > things jumped out at me: > # the way the 'delay' option works is brittle, and makes assumptions about > CPU scheduling that aren't neccessarily going to be true (and also suffers > from the problem that Thread.sleep isn't garunteed to sleep as long as you > ask it too) > # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by > checking the size of a (List) {{buffer}} of recieved events in a sleep/poll > loop, until it contains at least N items -- but the code that adds items to > that buffer in the async Callback thread async _before_ the code that updates > other state variables (like the global {{count}} and the patch specific > {{resourceCounts}}) meaning that a test waiting on 3 events could "see" 3 > events added to the buffer, but calling {{assertEquals(3, > receiver.getTotalCount())}} could subsequently fail because that variable > hadn't been udpated yet. > #2 was the source of the failures I was seeing, and while a quick fix for > that specific problem would be to update all other state _before_ adding the > event to the buffer, I set out to try and make more general improvements to > the test: > * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data > structures > * harden the assertions made about the expected events recieved (updating > some test methods that currently just assert the number of events recieved) > * add new assertions that _only_ the expected events are recieved. > In the process of doing this, I've found several oddities/descrepencies > between things the test currently claims/asserts, and what *actually* happens > under more rigerous scrutiny/assertions. > I'll attach a patch shortly that has my (in progress) updates and inlcudes > copious nocommits about things seem suspect. the summary of these concerns > is: > * SolrException status codes that do not match what the existing test says > they should (but doesn't assert) > * extra AuditEvents occuring that the existing test does not expect > * AuditEvents for incorrect credentials that do not at all match the expected > AuditEvent in the existing test -- which the current test seems to miss in > it's assertions because it's picking up some extra events from triggered by > previuos requests earlier in the test that just happen to also match the > asserctions. > ...it's not clear to me if the test logic is correct and these are "code > bugs" or if the test is faulty. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951197#comment-16951197 ] Chris M. Hostetter commented on SOLR-13741: --- +1 > possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest > -- > > Key: SOLR-13741 > URL: https://issues.apache.org/jira/browse/SOLR-13741 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, > SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch > > > A while back i saw a weird non-reproducible failure from > AuditLoggerIntegrationTest. When i started reading through that code, 2 > things jumped out at me: > # the way the 'delay' option works is brittle, and makes assumptions about > CPU scheduling that aren't neccessarily going to be true (and also suffers > from the problem that Thread.sleep isn't garunteed to sleep as long as you > ask it too) > # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by > checking the size of a (List) {{buffer}} of recieved events in a sleep/poll > loop, until it contains at least N items -- but the code that adds items to > that buffer in the async Callback thread async _before_ the code that updates > other state variables (like the global {{count}} and the patch specific > {{resourceCounts}}) meaning that a test waiting on 3 events could "see" 3 > events added to the buffer, but calling {{assertEquals(3, > receiver.getTotalCount())}} could subsequently fail because that variable > hadn't been udpated yet. > #2 was the source of the failures I was seeing, and while a quick fix for > that specific problem would be to update all other state _before_ adding the > event to the buffer, I set out to try and make more general improvements to > the test: > * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data > structures > * harden the assertions made about the expected events recieved (updating > some test methods that currently just assert the number of events recieved) > * add new assertions that _only_ the expected events are recieved. > In the process of doing this, I've found several oddities/descrepencies > between things the test currently claims/asserts, and what *actually* happens > under more rigerous scrutiny/assertions. > I'll attach a patch shortly that has my (in progress) updates and inlcudes > copious nocommits about things seem suspect. the summary of these concerns > is: > * SolrException status codes that do not match what the existing test says > they should (but doesn't assert) > * extra AuditEvents occuring that the existing test does not expect > * AuditEvents for incorrect credentials that do not at all match the expected > AuditEvent in the existing test -- which the current test seems to miss in > it's assertions because it's picking up some extra events from triggered by > previuos requests earlier in the test that just happen to also match the > asserctions. > ...it's not clear to me if the test logic is correct and these are "code > bugs" or if the test is faulty. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16950798#comment-16950798 ] Jan Høydahl commented on SOLR-13741: So the next steps are * Merge SOLR-13835 now in time for 8.3 * Update patch in this issue, removing the conditional logic depending on this * Merge these test improvements * Continue in SOLR-13837 and other issues with improvements for other known issues > possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest > -- > > Key: SOLR-13741 > URL: https://issues.apache.org/jira/browse/SOLR-13741 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, > SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch > > > A while back i saw a weird non-reproducible failure from > AuditLoggerIntegrationTest. When i started reading through that code, 2 > things jumped out at me: > # the way the 'delay' option works is brittle, and makes assumptions about > CPU scheduling that aren't neccessarily going to be true (and also suffers > from the problem that Thread.sleep isn't garunteed to sleep as long as you > ask it too) > # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by > checking the size of a (List) {{buffer}} of recieved events in a sleep/poll > loop, until it contains at least N items -- but the code that adds items to > that buffer in the async Callback thread async _before_ the code that updates > other state variables (like the global {{count}} and the patch specific > {{resourceCounts}}) meaning that a test waiting on 3 events could "see" 3 > events added to the buffer, but calling {{assertEquals(3, > receiver.getTotalCount())}} could subsequently fail because that variable > hadn't been udpated yet. > #2 was the source of the failures I was seeing, and while a quick fix for > that specific problem would be to update all other state _before_ adding the > event to the buffer, I set out to try and make more general improvements to > the test: > * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data > structures > * harden the assertions made about the expected events recieved (updating > some test methods that currently just assert the number of events recieved) > * add new assertions that _only_ the expected events are recieved. > In the process of doing this, I've found several oddities/descrepencies > between things the test currently claims/asserts, and what *actually* happens > under more rigerous scrutiny/assertions. > I'll attach a patch shortly that has my (in progress) updates and inlcudes > copious nocommits about things seem suspect. the summary of these concerns > is: > * SolrException status codes that do not match what the existing test says > they should (but doesn't assert) > * extra AuditEvents occuring that the existing test does not expect > * AuditEvents for incorrect credentials that do not at all match the expected > AuditEvent in the existing test -- which the current test seems to miss in > it's assertions because it's picking up some extra events from triggered by > previuos requests earlier in the test that just happen to also match the > asserctions. > ...it's not clear to me if the test logic is correct and these are "code > bugs" or if the test is faulty. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16950466#comment-16950466 ] Jan Høydahl commented on SOLR-13741: Removed the code changes from from patch again so it only contains test improvements. > possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest > -- > > Key: SOLR-13741 > URL: https://issues.apache.org/jira/browse/SOLR-13741 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, > SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch > > > A while back i saw a weird non-reproducible failure from > AuditLoggerIntegrationTest. When i started reading through that code, 2 > things jumped out at me: > # the way the 'delay' option works is brittle, and makes assumptions about > CPU scheduling that aren't neccessarily going to be true (and also suffers > from the problem that Thread.sleep isn't garunteed to sleep as long as you > ask it too) > # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by > checking the size of a (List) {{buffer}} of recieved events in a sleep/poll > loop, until it contains at least N items -- but the code that adds items to > that buffer in the async Callback thread async _before_ the code that updates > other state variables (like the global {{count}} and the patch specific > {{resourceCounts}}) meaning that a test waiting on 3 events could "see" 3 > events added to the buffer, but calling {{assertEquals(3, > receiver.getTotalCount())}} could subsequently fail because that variable > hadn't been udpated yet. > #2 was the source of the failures I was seeing, and while a quick fix for > that specific problem would be to update all other state _before_ adding the > event to the buffer, I set out to try and make more general improvements to > the test: > * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data > structures > * harden the assertions made about the expected events recieved (updating > some test methods that currently just assert the number of events recieved) > * add new assertions that _only_ the expected events are recieved. > In the process of doing this, I've found several oddities/descrepencies > between things the test currently claims/asserts, and what *actually* happens > under more rigerous scrutiny/assertions. > I'll attach a patch shortly that has my (in progress) updates and inlcudes > copious nocommits about things seem suspect. the summary of these concerns > is: > * SolrException status codes that do not match what the existing test says > they should (but doesn't assert) > * extra AuditEvents occuring that the existing test does not expect > * AuditEvents for incorrect credentials that do not at all match the expected > AuditEvent in the existing test -- which the current test seems to miss in > it's assertions because it's picking up some extra events from triggered by > previuos requests earlier in the test that just happen to also match the > asserctions. > ...it's not clear to me if the test logic is correct and these are "code > bugs" or if the test is faulty. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16950421#comment-16950421 ] Jan Høydahl commented on SOLR-13741: Spun off SOLR-13840 for the bugs when generating event based on HttpServletRequest > possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest > -- > > Key: SOLR-13741 > URL: https://issues.apache.org/jira/browse/SOLR-13741 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, > SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch > > > A while back i saw a weird non-reproducible failure from > AuditLoggerIntegrationTest. When i started reading through that code, 2 > things jumped out at me: > # the way the 'delay' option works is brittle, and makes assumptions about > CPU scheduling that aren't neccessarily going to be true (and also suffers > from the problem that Thread.sleep isn't garunteed to sleep as long as you > ask it too) > # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by > checking the size of a (List) {{buffer}} of recieved events in a sleep/poll > loop, until it contains at least N items -- but the code that adds items to > that buffer in the async Callback thread async _before_ the code that updates > other state variables (like the global {{count}} and the patch specific > {{resourceCounts}}) meaning that a test waiting on 3 events could "see" 3 > events added to the buffer, but calling {{assertEquals(3, > receiver.getTotalCount())}} could subsequently fail because that variable > hadn't been udpated yet. > #2 was the source of the failures I was seeing, and while a quick fix for > that specific problem would be to update all other state _before_ adding the > event to the buffer, I set out to try and make more general improvements to > the test: > * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data > structures > * harden the assertions made about the expected events recieved (updating > some test methods that currently just assert the number of events recieved) > * add new assertions that _only_ the expected events are recieved. > In the process of doing this, I've found several oddities/descrepencies > between things the test currently claims/asserts, and what *actually* happens > under more rigerous scrutiny/assertions. > I'll attach a patch shortly that has my (in progress) updates and inlcudes > copious nocommits about things seem suspect. the summary of these concerns > is: > * SolrException status codes that do not match what the existing test says > they should (but doesn't assert) > * extra AuditEvents occuring that the existing test does not expect > * AuditEvents for incorrect credentials that do not at all match the expected > AuditEvent in the existing test -- which the current test seems to miss in > it's assertions because it's picking up some extra events from triggered by > previuos requests earlier in the test that just happen to also match the > asserctions. > ...it's not clear to me if the test logic is correct and these are "code > bugs" or if the test is faulty. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16950410#comment-16950410 ] Jan Høydahl commented on SOLR-13741: Hmm, will check where that json file could have hidden.. > possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest > -- > > Key: SOLR-13741 > URL: https://issues.apache.org/jira/browse/SOLR-13741 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, > SOLR-13741.patch, SOLR-13741.patch > > > A while back i saw a weird non-reproducible failure from > AuditLoggerIntegrationTest. When i started reading through that code, 2 > things jumped out at me: > # the way the 'delay' option works is brittle, and makes assumptions about > CPU scheduling that aren't neccessarily going to be true (and also suffers > from the problem that Thread.sleep isn't garunteed to sleep as long as you > ask it too) > # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by > checking the size of a (List) {{buffer}} of recieved events in a sleep/poll > loop, until it contains at least N items -- but the code that adds items to > that buffer in the async Callback thread async _before_ the code that updates > other state variables (like the global {{count}} and the patch specific > {{resourceCounts}}) meaning that a test waiting on 3 events could "see" 3 > events added to the buffer, but calling {{assertEquals(3, > receiver.getTotalCount())}} could subsequently fail because that variable > hadn't been udpated yet. > #2 was the source of the failures I was seeing, and while a quick fix for > that specific problem would be to update all other state _before_ adding the > event to the buffer, I set out to try and make more general improvements to > the test: > * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data > structures > * harden the assertions made about the expected events recieved (updating > some test methods that currently just assert the number of events recieved) > * add new assertions that _only_ the expected events are recieved. > In the process of doing this, I've found several oddities/descrepencies > between things the test currently claims/asserts, and what *actually* happens > under more rigerous scrutiny/assertions. > I'll attach a patch shortly that has my (in progress) updates and inlcudes > copious nocommits about things seem suspect. the summary of these concerns > is: > * SolrException status codes that do not match what the existing test says > they should (but doesn't assert) > * extra AuditEvents occuring that the existing test does not expect > * AuditEvents for incorrect credentials that do not at all match the expected > AuditEvent in the existing test -- which the current test seems to miss in > it's assertions because it's picking up some extra events from triggered by > previuos requests earlier in the test that just happen to also match the > asserctions. > ...it's not clear to me if the test logic is correct and these are "code > bugs" or if the test is faulty. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949658#comment-16949658 ] Chris M. Hostetter commented on SOLR-13741: --- {quote}I will attach a new patch with some of this fixed: {quote} Jan, your latest patch seem to be incomplete? it's definitely missing the {{auditlog_plugin_security.json}} changes from my earlier patch ... I didn't look closely to be certain if anything else is wonky? In any case: since what you're describing seems like an actual bug fix (or improvement) to the behavior of the AuditLogger that will actual affect users – and not just a "test fix" -- i think we should spin it off to it's own Jira for tracking purposes. I'm also not sure if you're confident in those changes and want to move forward & commit them now, or if you wanted to discuss & consider them more in the context of how much active parsing the AuditLogger/AuditEvent code should _do_ in this un-authenticated sitaution since it is an "untrusted request". Either way, we can move forward independently with both efforts: * a new jira to track improvements to the logging in the failed-authentication situation * this jira for tracking test improvements that can document & assert the behavior of the auditlogger on un-authenticated requests until/unless the new jira is fixed first (similar to what i have in the patch for SOLR-13835) does that sound good? > possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest > -- > > Key: SOLR-13741 > URL: https://issues.apache.org/jira/browse/SOLR-13741 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, > SOLR-13741.patch, SOLR-13741.patch > > > A while back i saw a weird non-reproducible failure from > AuditLoggerIntegrationTest. When i started reading through that code, 2 > things jumped out at me: > # the way the 'delay' option works is brittle, and makes assumptions about > CPU scheduling that aren't neccessarily going to be true (and also suffers > from the problem that Thread.sleep isn't garunteed to sleep as long as you > ask it too) > # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by > checking the size of a (List) {{buffer}} of recieved events in a sleep/poll > loop, until it contains at least N items -- but the code that adds items to > that buffer in the async Callback thread async _before_ the code that updates > other state variables (like the global {{count}} and the patch specific > {{resourceCounts}}) meaning that a test waiting on 3 events could "see" 3 > events added to the buffer, but calling {{assertEquals(3, > receiver.getTotalCount())}} could subsequently fail because that variable > hadn't been udpated yet. > #2 was the source of the failures I was seeing, and while a quick fix for > that specific problem would be to update all other state _before_ adding the > event to the buffer, I set out to try and make more general improvements to > the test: > * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data > structures > * harden the assertions made about the expected events recieved (updating > some test methods that currently just assert the number of events recieved) > * add new assertions that _only_ the expected events are recieved. > In the process of doing this, I've found several oddities/descrepencies > between things the test currently claims/asserts, and what *actually* happens > under more rigerous scrutiny/assertions. > I'll attach a patch shortly that has my (in progress) updates and inlcudes > copious nocommits about things seem suspect. the summary of these concerns > is: > * SolrException status codes that do not match what the existing test says > they should (but doesn't assert) > * extra AuditEvents occuring that the existing test does not expect > * AuditEvents for incorrect credentials that do not at all match the expected > AuditEvent in the existing test -- which the current test seems to miss in > it's assertions because it's picking up some extra events from triggered by > previuos requests earlier in the test that just happen to also match the > asserctions. > ...it's not clear to me if the test logic is correct and these are "code > bugs" or if the test is faulty. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949366#comment-16949366 ] Jan Høydahl commented on SOLR-13741: Ok, uploaded yet another patch with a new test for V2 API. Discovered that the path is {{/v2/c}} and not {{/api/c}} as expected, so modified the ADMIN detection based on that. Also for V2 request we are lacking in several ways: * We do not audit log the BODY of the request (which is where the action is) * We do not detect what collections the request is for (so the AuditEvent#collections array is null) * The resource path is internal format {{/v2/c}} instead of {{/api/c}} (should we convert the prefix in the AuditEvent?) I spun V2 improvements off into SOLR-13837 to not delay this effort > possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest > -- > > Key: SOLR-13741 > URL: https://issues.apache.org/jira/browse/SOLR-13741 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-13741.patch, SOLR-13741.patch, SOLR-13741.patch, > SOLR-13741.patch, SOLR-13741.patch > > > A while back i saw a weird non-reproducible failure from > AuditLoggerIntegrationTest. When i started reading through that code, 2 > things jumped out at me: > # the way the 'delay' option works is brittle, and makes assumptions about > CPU scheduling that aren't neccessarily going to be true (and also suffers > from the problem that Thread.sleep isn't garunteed to sleep as long as you > ask it too) > # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by > checking the size of a (List) {{buffer}} of recieved events in a sleep/poll > loop, until it contains at least N items -- but the code that adds items to > that buffer in the async Callback thread async _before_ the code that updates > other state variables (like the global {{count}} and the patch specific > {{resourceCounts}}) meaning that a test waiting on 3 events could "see" 3 > events added to the buffer, but calling {{assertEquals(3, > receiver.getTotalCount())}} could subsequently fail because that variable > hadn't been udpated yet. > #2 was the source of the failures I was seeing, and while a quick fix for > that specific problem would be to update all other state _before_ adding the > event to the buffer, I set out to try and make more general improvements to > the test: > * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data > structures > * harden the assertions made about the expected events recieved (updating > some test methods that currently just assert the number of events recieved) > * add new assertions that _only_ the expected events are recieved. > In the process of doing this, I've found several oddities/descrepencies > between things the test currently claims/asserts, and what *actually* happens > under more rigerous scrutiny/assertions. > I'll attach a patch shortly that has my (in progress) updates and inlcudes > copious nocommits about things seem suspect. the summary of these concerns > is: > * SolrException status codes that do not match what the existing test says > they should (but doesn't assert) > * extra AuditEvents occuring that the existing test does not expect > * AuditEvents for incorrect credentials that do not at all match the expected > AuditEvent in the existing test -- which the current test seems to miss in > it's assertions because it's picking up some extra events from triggered by > previuos requests earlier in the test that just happen to also match the > asserctions. > ...it's not clear to me if the test logic is correct and these are "code > bugs" or if the test is faulty. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949321#comment-16949321 ] Jan Høydahl commented on SOLR-13741: {quote}why did the comment for a "wrong password" claim it was going to get a 403 exception + audit log ? {quote} It should expect 401 when wrong password, this was probably all confused by SOLR-13835 in the initial test. {quote}Are {{/admin/info/key}} events expect when auth is enabled ? ...is it ok for the test to explicitly ignore these events {quote} I can't recall dealing specially with this path. So muting during tests sounds like the right thing to do. Guess you could argue that it could be muted by default but the framework since it is a public always-open path? {quote}why the _actual_ audit log recieved in the "wrong password" situation is so different (and sparse) compared to other audit log events ? // - the resource is *JUST* '/solr' // - note that "resource" for every other expected event in this test class doesn't even // *START* with (or include) the "/solr" portion of the URL // - event 'resource' values are typically "/admin/etc..." // - the requestType is 'UNKNOWN' // - as opposed to the ADMIN that the existing test exists (and seems like should be correct){quote} I will attach a new patch with some of this fixed: * Parsing "resource" from {{httpRequest.getPathInfo()}} instead of {{httpRequest.getContextPath()}} which is always /solr. * Detecting {{/admin/..}} as admin path in {{AuditEvent.findRequestType}} now that the resource is changed, giving requestType=ADMIN * However, principal is not filled since BasicAuth failed, which I believe is correct. But the HTTP headers are there for inspection... It would be nice to have the user field in AuditEvent also in this case, but that would mean that AuthPlugins would need to set it on MDC or something. It would be wrong to set principal on the request since that always means authenticated user, not? {quote}// - this event has no solrParams at all // - even though the httpQueryString show it's from the CREATE test2 req{quote} This event is generated based on {{HttpServletRequest}} so we have no solrParams at this stage. In the new patch I have initialized the solrParams map from the httpRequest for a more consistent AuditEvent experience. Hoss, this test is now so much better than what I managed to whip up the first time, thanks a ton for digging! > possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest > -- > > Key: SOLR-13741 > URL: https://issues.apache.org/jira/browse/SOLR-13741 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. Hostetter >Assignee: Chris M. Hostetter >Priority: Major > Attachments: SOLR-13741.patch, SOLR-13741.patch > > > A while back i saw a weird non-reproducible failure from > AuditLoggerIntegrationTest. When i started reading through that code, 2 > things jumped out at me: > # the way the 'delay' option works is brittle, and makes assumptions about > CPU scheduling that aren't neccessarily going to be true (and also suffers > from the problem that Thread.sleep isn't garunteed to sleep as long as you > ask it too) > # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by > checking the size of a (List) {{buffer}} of recieved events in a sleep/poll > loop, until it contains at least N items -- but the code that adds items to > that buffer in the async Callback thread async _before_ the code that updates > other state variables (like the global {{count}} and the patch specific > {{resourceCounts}}) meaning that a test waiting on 3 events could "see" 3 > events added to the buffer, but calling {{assertEquals(3, > receiver.getTotalCount())}} could subsequently fail because that variable > hadn't been udpated yet. > #2 was the source of the failures I was seeing, and while a quick fix for > that specific problem would be to update all other state _before_ adding the > event to the buffer, I set out to try and make more general improvements to > the test: > * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data > structures > * harden the assertions made about the expected events recieved (updating > some test methods that currently just assert the number of events recieved) > * add new assertions that _only_ the expected events are recieved. > In the process of doing this, I've found several oddities/descrepencies > between things the test currently claims/asserts, and what *actually* happens > under more rigerous scrutiny/assertions. > I'll attach a patch shortly that has my (in progress) updates and inlcudes > copiou
[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16932435#comment-16932435 ] Jan Høydahl commented on SOLR-13741: Btw: I just love IntelliJ's "right-click->git->Show history for selection" feature, it makes it so easy to inspect the history of a small code block! I just discovered the feature a few weeks ago! > possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest > -- > > Key: SOLR-13741 > URL: https://issues.apache.org/jira/browse/SOLR-13741 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Hoss Man >Assignee: Hoss Man >Priority: Major > Attachments: SOLR-13741.patch > > > A while back i saw a weird non-reproducible failure from > AuditLoggerIntegrationTest. When i started reading through that code, 2 > things jumped out at me: > # the way the 'delay' option works is brittle, and makes assumptions about > CPU scheduling that aren't neccessarily going to be true (and also suffers > from the problem that Thread.sleep isn't garunteed to sleep as long as you > ask it too) > # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by > checking the size of a (List) {{buffer}} of recieved events in a sleep/poll > loop, until it contains at least N items -- but the code that adds items to > that buffer in the async Callback thread async _before_ the code that updates > other state variables (like the global {{count}} and the patch specific > {{resourceCounts}}) meaning that a test waiting on 3 events could "see" 3 > events added to the buffer, but calling {{assertEquals(3, > receiver.getTotalCount())}} could subsequently fail because that variable > hadn't been udpated yet. > #2 was the source of the failures I was seeing, and while a quick fix for > that specific problem would be to update all other state _before_ adding the > event to the buffer, I set out to try and make more general improvements to > the test: > * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data > structures > * harden the assertions made about the expected events recieved (updating > some test methods that currently just assert the number of events recieved) > * add new assertions that _only_ the expected events are recieved. > In the process of doing this, I've found several oddities/descrepencies > between things the test currently claims/asserts, and what *actually* happens > under more rigerous scrutiny/assertions. > I'll attach a patch shortly that has my (in progress) updates and inlcudes > copious nocommits about things seem suspect. the summary of these concerns > is: > * SolrException status codes that do not match what the existing test says > they should (but doesn't assert) > * extra AuditEvents occuring that the existing test does not expect > * AuditEvents for incorrect credentials that do not at all match the expected > AuditEvent in the existing test -- which the current test seems to miss in > it's assertions because it's picking up some extra events from triggered by > previuos requests earlier in the test that just happen to also match the > asserctions. > ...it's not clear to me if the test logic is correct and these are "code > bugs" or if the test is faulty. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13741) possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest
[ https://issues.apache.org/jira/browse/SOLR-13741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16932428#comment-16932428 ] Jan Høydahl commented on SOLR-13741: Thanks for taking a look at this. I see you have some great improvements on how to assert exceptions etc. Wrt the REJECTED + UNAUTHORIZED events I see the same as you, and I believe there is a code bug, not a test bug. In HttpSolrCall#471 in the {{authorize()}} call, if authResponse == PROMPT, it will actually match both blocks and emit two audit events: [https://github.com/apache/lucene-solr/blob/26ede632e6259eb9d16861a3c0f782c9c8999762/solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java#L475:L493] {code:java} if (authResponse.statusCode == AuthorizationResponse.PROMPT.statusCode) {...} if (!(authResponse.statusCode == HttpStatus.SC_ACCEPTED) && !(authResponse.statusCode == HttpStatus.SC_OK)) {...} {code} When code==401, it is also true that code!=200. Intuitively there should be both a sendErrora and return RETURN before line #484 in the first if block? The first if block was introduced back in 2005 as part of SOLR-7757. [~noble.paul] why does the if not return? It will *always* fall through to and trigger the next if block! > possible AuditLogger bugs uncovered while hardening AuditLoggerIntegrationTest > -- > > Key: SOLR-13741 > URL: https://issues.apache.org/jira/browse/SOLR-13741 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Hoss Man >Assignee: Hoss Man >Priority: Major > Attachments: SOLR-13741.patch > > > A while back i saw a weird non-reproducible failure from > AuditLoggerIntegrationTest. When i started reading through that code, 2 > things jumped out at me: > # the way the 'delay' option works is brittle, and makes assumptions about > CPU scheduling that aren't neccessarily going to be true (and also suffers > from the problem that Thread.sleep isn't garunteed to sleep as long as you > ask it too) > # the way the existing {{waitForAuditEventCallbacks(number)}} logic works by > checking the size of a (List) {{buffer}} of recieved events in a sleep/poll > loop, until it contains at least N items -- but the code that adds items to > that buffer in the async Callback thread async _before_ the code that updates > other state variables (like the global {{count}} and the patch specific > {{resourceCounts}}) meaning that a test waiting on 3 events could "see" 3 > events added to the buffer, but calling {{assertEquals(3, > receiver.getTotalCount())}} could subsequently fail because that variable > hadn't been udpated yet. > #2 was the source of the failures I was seeing, and while a quick fix for > that specific problem would be to update all other state _before_ adding the > event to the buffer, I set out to try and make more general improvements to > the test: > * eliminate the dependency on sleep loops by {{await}}-ing on concurrent data > structures > * harden the assertions made about the expected events recieved (updating > some test methods that currently just assert the number of events recieved) > * add new assertions that _only_ the expected events are recieved. > In the process of doing this, I've found several oddities/descrepencies > between things the test currently claims/asserts, and what *actually* happens > under more rigerous scrutiny/assertions. > I'll attach a patch shortly that has my (in progress) updates and inlcudes > copious nocommits about things seem suspect. the summary of these concerns > is: > * SolrException status codes that do not match what the existing test says > they should (but doesn't assert) > * extra AuditEvents occuring that the existing test does not expect > * AuditEvents for incorrect credentials that do not at all match the expected > AuditEvent in the existing test -- which the current test seems to miss in > it's assertions because it's picking up some extra events from triggered by > previuos requests earlier in the test that just happen to also match the > asserctions. > ...it's not clear to me if the test logic is correct and these are "code > bugs" or if the test is faulty. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org