[jira] [Commented] (CASSANDRA-10848) Upgrade paging dtests involving deletion flap on CassCI
[ https://issues.apache.org/jira/browse/CASSANDRA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15428379#comment-15428379 ]

Russ Hatch commented on CASSANDRA-10848:

Given the current (happy) state of upgrade tests, I think this issue is safe to resolve. If there turns out to be a recurrence I think it would be better handled by a fresh issue for clarity.

> Upgrade paging dtests involving deletion flap on CassCI
> -------------------------------------------------------
>
>                 Key: CASSANDRA-10848
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10848
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jim Witschey
>            Assignee: Russ Hatch
>              Labels: dtest
>             Fix For: 3.0.x, 3.x
>
>
> A number of dtests in the {{upgrade_tests.paging_tests}} that involve
> deletion flap with the following error:
> {code}
> Requested pages were not delivered before timeout.
> {code}
> This may just be an effect of CASSANDRA-10730, but it's worth having a look
> at separately. Here are some examples of tests flapping in this way:

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10848) Upgrade paging dtests involving deletion flap on CassCI
[ https://issues.apache.org/jira/browse/CASSANDRA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15412448#comment-15412448 ]

Russ Hatch commented on CASSANDRA-10848:

Of the remaining failures, I think a large number (maybe all) can be attributed to CASSANDRA-12249. I'd like to keep this open until CASSANDRA-12249 is resolved, but that fix combined with the protocol v4 test fixes might leave us in pretty good shape with these tests.
[jira] [Commented] (CASSANDRA-10848) Upgrade paging dtests involving deletion flap on CassCI
[ https://issues.apache.org/jira/browse/CASSANDRA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15412149#comment-15412149 ]

Russ Hatch commented on CASSANDRA-10848:

Ah OK, I misunderstood. I've got a PR open for that change too:
https://github.com/riptano/cassandra-dtest/pull/1190
[jira] [Commented] (CASSANDRA-10848) Upgrade paging dtests involving deletion flap on CassCI
[ https://issues.apache.org/jira/browse/CASSANDRA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409018#comment-15409018 ]

Sylvain Lebresne commented on CASSANDRA-10848:
----------------------------------------------

bq. I created a dtest fix for the 2.2/3.0 problem

We should really do that for any 2.x -> 3.0 upgrade, so 2.1/3.0 too.

> Upgrade paging dtests involving deletion flap on CassCI
> -------------------------------------------------------
>
>                 Key: CASSANDRA-10848
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10848
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jim Witschey
>            Assignee: DS Test Eng
>              Labels: dtest
>             Fix For: 3.0.x, 3.x
>
>
> A number of dtests in the {{upgrade_tests.paging_tests}} that involve
> deletion flap with the following error:
> {code}
> Requested pages were not delivered before timeout.
> {code}
> This may just be an effect of CASSANDRA-10730, but it's worth having a look
> at separately. Here are some examples of tests flapping in this way:
[jira] [Commented] (CASSANDRA-10848) Upgrade paging dtests involving deletion flap on CassCI
[ https://issues.apache.org/jira/browse/CASSANDRA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408551#comment-15408551 ]

Russ Hatch commented on CASSANDRA-10848:

I created a dtest fix for the 2.2/3.0 problem (CASSANDRA-10880), so hopefully that can help at least a few tests. Dtest PR here: https://github.com/riptano/cassandra-dtest/pull/1176

Digging into the results, it looks like we're still seeing timeouts on other version combinations, so I don't think the 2.2/3.0 v4 problem could account for everything. I'm still not quite sure what else could be going on; perhaps there is a subtle problem with the test code itself when the timeouts are encountered.
[jira] [Commented] (CASSANDRA-10848) Upgrade paging dtests involving deletion flap on CassCI
[ https://issues.apache.org/jira/browse/CASSANDRA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408261#comment-15408261 ]

Russ Hatch commented on CASSANDRA-10848:

I'll look into the protocol version issue you mentioned; that could very well be impacting the tests.
[jira] [Commented] (CASSANDRA-10848) Upgrade paging dtests involving deletion flap on CassCI
[ https://issues.apache.org/jira/browse/CASSANDRA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403925#comment-15403925 ]

Sylvain Lebresne commented on CASSANDRA-10848:
----------------------------------------------

As the tests we're talking about here are paging tests, one thing that could be worth a double check is CASSANDRA-10880. That is, we should make sure that, during upgrades from 2.X to 3.0.x, the native protocol version is fixed to version 3 (we should not be using version 4 *even* when talking to 3.0 nodes, and this until the upgrade is over). If that's not guaranteed by the tests, bad things will happen, including server-side errors that would break queries and produce the {{Requested pages were not delivered before timeout}} error above.

I'm mentioning this because I ran into it with another test today, and it seems that by default the python driver will try to connect with the max possible version to each node. I'm not familiar enough with the upgrade test code to say whether it fixes the protocol version or not, but if it doesn't, it should.
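The protocol pinning described in the comment above could be sketched roughly as follows. This is a minimal illustration, not the actual dtest fix: `major_version`, `protocol_version_for_upgrade` and `connect_during_upgrade` are hypothetical helpers, version strings are assumed to be numeric (for example "2.2.5"), and the starting version is assumed to be 2.1 or 2.2, both of which speak native protocol v3. The only driver API relied on is the `protocol_version` argument of `cassandra.cluster.Cluster` in the DataStax Python driver.

```python
def major_version(version):
    """Return the major component of a version string such as '2.2.5'."""
    return int(version.split(".")[0])


def protocol_version_for_upgrade(from_version, to_version):
    """Pick a native protocol version for a cluster that is mid-upgrade.

    Hypothetical helper: returns 3 for any 2.x -> 3.x upgrade, and None
    otherwise, meaning the caller leaves the driver's default in place.
    """
    if major_version(from_version) == 2 and major_version(to_version) >= 3:
        return 3
    return None


def connect_during_upgrade(contact_points, from_version, to_version):
    """Open a session with the protocol version pinned when needed."""
    # Imported lazily so the helpers above work without the driver installed.
    from cassandra.cluster import Cluster

    pinned = protocol_version_for_upgrade(from_version, to_version)
    if pinned is None:
        cluster = Cluster(contact_points=contact_points)
    else:
        # Pin the version instead of letting the driver pick the maximum.
        cluster = Cluster(contact_points=contact_points, protocol_version=pinned)
    return cluster.connect()
```

With this sketch, a 2.1 -> 3.0 or 2.2 -> 3.0 run would connect with version 3 for the whole upgrade, while a 3.0 -> 3.x run would keep the driver default.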
[jira] [Commented] (CASSANDRA-10848) Upgrade paging dtests involving deletion flap on CassCI
[ https://issues.apache.org/jira/browse/CASSANDRA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399643#comment-15399643 ]

Russ Hatch commented on CASSANDRA-10848:

I've had similar difficulty trying to get a test error to repro with 500 iterations. But the CI results still stand, so I'm not sure how we repro and fix the test issue.
[jira] [Commented] (CASSANDRA-10848) Upgrade paging dtests involving deletion flap on CassCI
[ https://issues.apache.org/jira/browse/CASSANDRA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15395342#comment-15395342 ]

Sylvain Lebresne commented on CASSANDRA-10848:
----------------------------------------------

bq. can't seem to get a local repro after approx 200 runs of test_single_partitions_deletions

To be honest, I'm not really sure how to track this problem down if I can't reproduce it locally at all. I guess it would be interesting to have a look at the node logs for one of the CI failures, but I'm not sure how to do that, assuming CI does keep the logs around (and if it doesn't, there is no hope of making progress here unless we change that). Also, all the links on that ticket now 404; do we have links to recent failures of this?

I'm also not even sure how to actually run one of those paging upgrade tests locally if I wanted to reproduce the problem. Any hints here?

> Upgrade paging dtests involving deletion flap on CassCI
> -------------------------------------------------------
>
>                 Key: CASSANDRA-10848
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10848
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jim Witschey
>            Assignee: Sylvain Lebresne
>              Labels: dtest
>             Fix For: 3.0.x, 3.x
>
>
> A number of dtests in the {{upgrade_tests.paging_tests}} that involve
> deletion flap with the following error:
> {code}
> Requested pages were not delivered before timeout.
> {code}
> This may just be an effect of CASSANDRA-10730, but it's worth having a look
> at separately. Here are some examples of tests flapping in this way:
[jira] [Commented] (CASSANDRA-10848) Upgrade paging dtests involving deletion flap on CassCI
[ https://issues.apache.org/jira/browse/CASSANDRA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246798#comment-15246798 ]

Russ Hatch commented on CASSANDRA-10848:

Interestingly, tripling the timeout value didn't reduce test failures by a significant amount. I think this should probably be considered a bug, given that even such a long timeout is still not enough to deliver pages in these random mixed-version failures.

> Upgrade paging dtests involving deletion flap on CassCI
> -------------------------------------------------------
>
>                 Key: CASSANDRA-10848
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10848
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Jim Witschey
>            Assignee: Russ Hatch
>              Labels: dtest
>             Fix For: 3.0.x, 3.x
>
>
> A number of dtests in the {{upgrade_tests.paging_tests}} that involve
> deletion flap with the following error:
> {code}
> Requested pages were not delivered before timeout.
> {code}
> This may just be an effect of CASSANDRA-10730, but it's worth having a look
> at separately. Here are some examples of tests flapping in this way:
[jira] [Commented] (CASSANDRA-10848) Upgrade paging dtests involving deletion flap on CassCI
[ https://issues.apache.org/jira/browse/CASSANDRA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246679#comment-15246679 ]

Russ Hatch commented on CASSANDRA-10848:

We're already at a 10-second timeout, but I'm going to try giving the dtest 30-second timeouts just to see if that improves things (then perhaps the next question would be 'why is it taking so long?'):
http://cassci.datastax.com/view/Parameterized/job/parameterized_dtest_multiplexer/70/
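For context, the error quoted in the issue description is the kind of message a test helper raises when it polls for pages and gives up at a deadline. The following is a simplified sketch of such a wait loop, not the dtest code itself: `wait_for_pages`, its parameters and the polling interval are assumptions made for illustration.

```python
import time


def wait_for_pages(pages_received, expected_pages, timeout=10.0, poll_interval=0.1):
    """Block until pages_received() reports at least expected_pages.

    pages_received is a zero-argument callable returning how many pages
    have been delivered so far (hypothetical interface).
    Raises RuntimeError if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while pages_received() < expected_pages:
        if time.monotonic() >= deadline:
            raise RuntimeError("Requested pages were not delivered before timeout.")
        time.sleep(poll_interval)
    return pages_received()
```

In a loop like this, raising `timeout` only helps when pages are slow rather than missing; if a page is never delivered, a longer timeout just delays the same error.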
[jira] [Commented] (CASSANDRA-10848) Upgrade paging dtests involving deletion flap on CassCI
[ https://issues.apache.org/jira/browse/CASSANDRA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238235#comment-15238235 ]

Russ Hatch commented on CASSANDRA-10848:

Results aggregation was a bit weird on the last run, so I'm running one more here to try to fix that (so we have an accurate idea of how often this actually occurs):
http://cassci.datastax.com/view/Parameterized/job/parameterized_dtest_multiplexer/66/
[jira] [Commented] (CASSANDRA-10848) Upgrade paging dtests involving deletion flap on CassCI
[ https://issues.apache.org/jira/browse/CASSANDRA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238233#comment-15238233 ]

Russ Hatch commented on CASSANDRA-10848:

Still appears to be happening on CI, however.
[jira] [Commented] (CASSANDRA-10848) Upgrade paging dtests involving deletion flap on CassCI
[ https://issues.apache.org/jira/browse/CASSANDRA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238066#comment-15238066 ]

Russ Hatch commented on CASSANDRA-10848:

I can't seem to get a local repro after approx 200 runs of test_single_partitions_deletions. I'm going to try a multiplexer run for that particular test here:
http://cassci.datastax.com/view/Parameterized/job/parameterized_dtest_multiplexer/65/
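Repeating a flaky test many times to estimate how often it fails can be sketched like this. It is an illustration only, not the CassCI multiplexer job: `run_repeatedly` and `flake_rate` are hypothetical helpers, and a real run would invoke the dtest itself rather than a plain callable.

```python
def run_repeatedly(test, iterations):
    """Run test() the given number of times and collect the failures.

    Returns a list of (iteration_index, exception) pairs.
    """
    failures = []
    for i in range(iterations):
        try:
            test()
        except Exception as exc:  # a failing test surfaces as an exception
            failures.append((i, exc))
    return failures


def flake_rate(failures, iterations):
    """Fraction of iterations that failed (0.0 when nothing was run)."""
    return len(failures) / iterations if iterations else 0.0
```

Zero failures over a few hundred local iterations does not rule out the problem; it only suggests that the local failure rate is low, which fits a failure that shows up mainly in the CI environment.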
[jira] [Commented] (CASSANDRA-10848) Upgrade paging dtests involving deletion flap on CassCI
[ https://issues.apache.org/jira/browse/CASSANDRA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15237797#comment-15237797 ]

Russ Hatch commented on CASSANDRA-10848:

Never mind, it looks like at least one of the failing tests is getting 10 seconds for each page, which should be plenty. There must be something else going on here.
[jira] [Commented] (CASSANDRA-10848) Upgrade paging dtests involving deletion flap on CassCI
[ https://issues.apache.org/jira/browse/CASSANDRA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15237584#comment-15237584 ]

Russ Hatch commented on CASSANDRA-10848:

It looks like the default of 5 seconds might just not be enough time in some cases; the fix may just be to bump up the default timeout used in these tests.
[jira] [Commented] (CASSANDRA-10848) Upgrade paging dtests involving deletion flap on CassCI
[ https://issues.apache.org/jira/browse/CASSANDRA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15237581#comment-15237581 ]

Russ Hatch commented on CASSANDRA-10848:

Some more recent failures:
http://cassci.datastax.com/view/Upgrades/job/upgrade_tests-all/33/testReport/upgrade_tests.paging_test/TestPagingWithDeletionsNodes3RF3_2_1_UpTo_3_0_HEAD/test_single_partition_deletions/
http://cassci.datastax.com/view/Upgrades/job/upgrade_tests-all/32/testReport/upgrade_tests.paging_test/TestPagingWithDeletionsNodes2RF1_2_2_HEAD_UpTo_Trunk/test_single_partition_deletions/
http://cassci.datastax.com/view/Upgrades/job/upgrade_tests-all/32/testReport/upgrade_tests.paging_test/TestPagingWithDeletionsNodes2RF1_2_2_UpTo_Trunk/test_single_partition_deletions/

> Upgrade paging dtests involving deletion flap on CassCI
> -------------------------------------------------------
>
>                 Key: CASSANDRA-10848
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10848
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Jim Witschey
>            Assignee: Russ Hatch
>              Labels: dtest
>             Fix For: 3.0.x, 3.x
>
>
> A number of dtests in the {{upgrade_tests.paging_tests}} that involve
> deletion flap with the following error:
> {code}
> Requested pages were not delivered before timeout.
> {code}
> This may just be an effect of CASSANDRA-10730, but it's worth having a look
> at separately. Here are some examples of tests flapping in this way:
> http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest/422/testReport/junit/upgrade_tests.paging_test/TestPagingWithDeletionsNodes2RF1/test_multiple_partition_deletions/
> http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest/422/testReport/junit/upgrade_tests.paging_test/
[jira] [Commented] (CASSANDRA-10848) Upgrade paging dtests involving deletion flap on CassCI
[ https://issues.apache.org/jira/browse/CASSANDRA-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15108981#comment-15108981 ]

Jim Witschey commented on CASSANDRA-10848:
------------------------------------------

This seems to not be fixed despite CASSANDRA-10730 being resolved. It could be a manifestation of CASSANDRA-11016, but we should keep an eye on it even once that ticket is fixed.

> Upgrade paging dtests involving deletion flap on CassCI
> -------------------------------------------------------
>
>                 Key: CASSANDRA-10848
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10848
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Jim Witschey
>             Fix For: 3.0.x, 3.x
>
>
> A number of dtests in the {{upgrade_tests.paging_tests}} that involve
> deletion flap with the following error:
> {code}
> Requested pages were not delivered before timeout.
> {code}
> This may just be an effect of CASSANDRA-10730, but it's worth having a look
> at separately. Here are some examples of tests flapping in this way:
> http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest/422/testReport/junit/upgrade_tests.paging_test/TestPagingWithDeletionsNodes2RF1/test_multiple_partition_deletions/
> http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest/422/testReport/junit/upgrade_tests.paging_test/