[jira] [Updated] (CASSANDRA-13313) Compaction leftovers not removed on upgrade 2.1/2.2 -> 3.0

2018-01-03 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13313:
---
Labels: Correctness  (was: )

> Compaction leftovers not removed on upgrade 2.1/2.2 -> 3.0
> --
>
> Key: CASSANDRA-13313
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13313
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jeff Jirsa
>Assignee: Jeff Jirsa
>Priority: Minor
>  Labels: Correctness
> Fix For: 3.0.x, 3.11.x
>
>
> Before 3.0 we used sstable ancestors to figure out if an sstable was left 
> over after a compaction. In 3.0 the ancestors are ignored and instead we use 
> LogTransaction files to figure it out. 3.0 should still clean up 2.1/2.2 
> compaction leftovers using the on-disk sstable ancestors when available.
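
A rough sketch of the ancestor-based idea described above, for illustration only. It assumes each sstable records the generations it was compacted from, and that an input sstable still on disk next to its compacted output is a leftover. The `SSTable` class and `find_compaction_leftovers` function are hypothetical names, not Cassandra code, and the real check has more conditions (for example, whether the output sstable is complete).

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for an on-disk sstable and its metadata."""
    generation: int
    ancestors: FrozenSet[int] = frozenset()  # generations this sstable was compacted from


def find_compaction_leftovers(sstables: Iterable[SSTable]) -> Set[int]:
    """Return generations that are still on disk although another
    on-disk sstable lists them as an ancestor (assumed leftover rule)."""
    tables = list(sstables)
    present = {t.generation for t in tables}
    leftovers: Set[int] = set()
    for table in tables:
        # Any ancestor that is still present was not cleaned up after compaction.
        leftovers |= table.ancestors & present
    return leftovers


if __name__ == "__main__":
    on_disk = [SSTable(1), SSTable(2), SSTable(3, frozenset({1, 2})), SSTable(4)]
    print(sorted(find_compaction_leftovers(on_disk)))
```

In this sketch generation 3 was compacted from 1 and 2, so 1 and 2 are reported as leftovers while 4 is untouched.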



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13313) Compaction leftovers not removed on upgrade 2.1/2.2 -> 3.0

2018-01-03 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13313:
---
Priority: Major  (was: Minor)

> Compaction leftovers not removed on upgrade 2.1/2.2 -> 3.0
> --
>
> Key: CASSANDRA-13313
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13313
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jeff Jirsa
>Assignee: Jeff Jirsa
>  Labels: Correctness
> Fix For: 3.0.x, 3.11.x
>
>
> Before 3.0 we used sstable ancestors to figure out if an sstable was left 
> over after a compaction. In 3.0 the ancestors are ignored and instead we use 
> LogTransaction files to figure it out. 3.0 should still clean up 2.1/2.2 
> compaction leftovers using the on-disk sstable ancestors when available.






[jira] [Updated] (CASSANDRA-13313) Compaction leftovers not removed on upgrade 2.1/2.2 -> 3.0

2018-01-03 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-13313:
---
Fix Version/s: 3.11.x
   3.0.x

> Compaction leftovers not removed on upgrade 2.1/2.2 -> 3.0
> --
>
> Key: CASSANDRA-13313
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13313
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Jeff Jirsa
>Assignee: Jeff Jirsa
>Priority: Minor
>  Labels: Correctness
> Fix For: 3.0.x, 3.11.x
>
>
> Before 3.0 we used sstable ancestors to figure out if an sstable was left 
> over after a compaction. In 3.0 the ancestors are ignored and instead we use 
> LogTransaction files to figure it out. 3.0 should still clean up 2.1/2.2 
> compaction leftovers using the on-disk sstable ancestors when available.






[jira] [Commented] (CASSANDRA-14134) Migrate dtests to use pytest and python3

2018-01-03 Thread Michael Kjellman (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310724#comment-16310724
 ] 

Michael Kjellman commented on CASSANDRA-14134:
--

I committed quite a lot of extra fixes to the branch today. Some were to deal 
with broken tests that weren't being executed because the regex didn't match 
some of the test classes, as [~spo...@gmail.com] noticed this morning. I think 
that's fully resolved now, but I'd appreciate it if you could eyeball the 
branch as it exists now!

I also merged in a few tests fixed today by [~beobal] and [~bdeggleston]. With 
all of that work, we are *very* close to passing without any test failures but 
there are a few flaky tests that keep popping up and preventing victory...

+The latest two runs are below:+
* With vnodes
** https://circleci.com/gh/mkjellman/cassandra/339
** ran 771 tests with 3 failures (run time 18:51)
* Without vnodes 
** https://circleci.com/gh/mkjellman/cassandra/338
** ran 796 tests with 3 failures (run time 11:51)

> Migrate dtests to use pytest and python3
> 
>
> Key: CASSANDRA-14134
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14134
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Testing
>Reporter: Michael Kjellman
>Assignee: Michael Kjellman
>
> h4. Get the C* dtests running on the pytest framework.
> C* DTests currently run using the python test framework nosetest. This 
> framework has been largely abandoned with no releases since 2015 and a 
> general strong consensus in the python community that pytest is the future.
> h4. Why should we do this.
> Currently (and historically) dtests have always been difficult to run, flaky 
> and unpredictable in CI environments, and almost impossible to debug.
> On November 28th, 2017, I proposed on the dev@ list that we move the dtests 
> from nosetests to pytests. I got replies from Jon Haddad, Philip Thompson, 
> and kurt greaves with really only "+1" like replies to the proposal.
> Since then I've been working pretty much non stop to complete the large 
> refactor of dtests to pytests. As part of this effort (and because the 
> existing migration tools require it) I also ported the code to python3 
> (from the current python 2.7 based code-base).
> h4. High-level summary of key changes, improvements, and new features.
> * Migrate dtests from executing using the nosetest framework to pytest
> * Port the entire code base from Python 2.7 to Python 3.6
> * Update run_dtests.py to work with pytest
> * Add --dtest-print-tests-only option to run_dtests.py to get easily parsable 
> list of all available collected tests
> * Update README.md for executing the dtests with pytest
> * Add new debugging tips section to README.md to help with some basics of 
> debugging python3 and pytest
> * Migrate all existing environment variable usage as a means to control dtest 
> operation modes to argparse command line options with documented help on each 
> toggle's intended usage
> * Migration of old unitTest and nose based test structure to modern pytest 
> fixture approach
> * Automatic detection of physical system resources to automatically determine 
> if @pytest.mark.resource_intensive annotated tests should be collected and 
> run on the system where they are being executed
> * new pytest fixture replacements for @since and @pytest.mark.upgrade_test 
> annotations
> * Migration to python logging framework
> * Upgrade thrift bindings to latest version with full python3 compatibility
> * Remove deprecated cql and pycassa dependencies and migrate any remaining 
> tests to fully remove those dependencies
> * Fixed dozens of tests that would hang the pytest framework forever when run 
> in CI environments
> * Ran the code nearly 300 times in CircleCI during the migration to find, 
> identify, and fix any tests capable of hanging CI
> * Upgrade Tests do not yet run in CI and still need additional migration work 
> (although all upgrade test classes compile successfully)
> I started with the *nose2pytest* [https://github.com/pytest-dev/nose2pytest] 
> migration tool. As this required python 3 language support I found myself 
> down the 2to3 python migration path. While painful to do this, the benefits 
> of python3 over python2.7 are numerous and moving to python3 for the 
> additional debugging tools now available to use when fixing dtests makes the 
> effort worth it for that reason alone!
> After the automated tools did their thing I began what was a much longer and 
> more tedious manual process than I ever could have expected due to the many 
> custom ways we did things in dtests (frequently to work around nosetest 
> limitations or missing features that thankfully are now all included with the 
> pytest framework). I've done nearly 300 test runs of my migration branch with 
> 

[jira] [Commented] (CASSANDRA-14041) test_dead_sync_initiator - repair_tests.repair_test.TestRepair fails: Unexpected error in log, see stdout

2018-01-03 Thread Michael Kjellman (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310681#comment-16310681
 ] 

Michael Kjellman commented on CASSANDRA-14041:
--

This is failing very reliably, so we should prioritize fixing this one. 
Some more contents of stdout with the actual failure:

{code}
    def handle_external_tool_process(process, cmd_args):
        out, err = process.communicate()
        rc = process.returncode

        if rc != 0:
>           raise ToolError(cmd_args, rc, out, err)
E   ccmlib.node.ToolError: Subprocess ['stress', 'write', 'n=1k', 
'no-warmup', 'cl=ONE', '-schema', 'replication(factor=3)', '-rate', 
'threads=10'] exited with non-zero status; exit status: 1; 
E   stdout:  Stress Settings 
E   Command:
E Type: write
E Count: 1,000
E No Warmup: true
E Consistency Level: ONE
E Target Uncertainty: not applicable
E Key Size (bytes): 10
E Counter Increment Distibution: add=fixed(1)
E   Rate:
E Auto: false
E Thread Count: 10
E OpsPer Sec: 0
E   Population:
E Sequence: 1..1000
E Order: ARBITRARY
E Wrap: true
E   Insert:
E Revisits: Uniform:  min=1,max=100
E Visits: Fixed:  key=1
E Row Population Ratio: Ratio: divisor=1.00;delegate=Fixed:  
key=1
E Batch Type: not batching
E   Columns:
E Max Columns Per Key: 5
E Column Names: [C0, C1, C2, C3, C4]
E Comparator: AsciiType
E Timestamp: null
E Variable Column Count: false
E Slice: false
E Size Distribution: Fixed:  key=34
E Count Distribution: Fixed:  key=5
E   Errors:
E Ignore: false
E Tries: 10
E   Log:
E No Summary: false
E No Settings: false
E File: null
E Interval Millis: 1000
E Level: NORMAL
E   Mode:
E API: JAVA_DRIVER_NATIVE
E Connection Style: CQL_PREPARED
E CQL Version: CQL3
E Protocol Version: V4
E Username: null
E Password: null
E Auth Provide Class: null
E Max Pending Per Connection: 128
E Connections Per Host: 8
E Compression: NONE
E   Node:
E Nodes: [127.0.0.1]
E Is White List: false
E Datacenter: null
E   Schema:
E Keyspace: keyspace1
E Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
E Replication Strategy Options: {replication_factor=3}
E Table Compression: null
E Table Compaction Strategy: null
E Table Compaction Strategy Options: {}
E   Transport:
E truststore=null; truststore-password=null; keystore=null; 
keystore-password=null; ssl-protocol=TLS; ssl-alg=SunX509; 
ssl-ciphers=TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA; 
E   Port:
E Native Port: 9042
E JMX Port: 7100
E   Send To Daemon:
E *not set*
E   Graph:
E File: null
E Revision: unknown
E Title: null
E Operation: WRITE
E   TokenRange:
E Wrap: false
E Split Factor: 1
E   
E   Connected to cluster: test, max pending requests per connection 
128, max connections per host 8
E   Datacenter: datacenter1; Host: /127.0.0.1; Rack: rack1
E   Datacenter: datacenter1; Host: /127.0.0.2; Rack: rack1
E   Datacenter: datacenter1; Host: /127.0.0.3; Rack: rack1
E   ; 
E   stderr: WARN  03:35:11,529 Error creating connection to 
/127.0.0.3:9042
E   com.datastax.driver.core.exceptions.TransportException: 
[/127.0.0.3:9042] Cannot connect
E   at 
com.datastax.driver.core.Connection$1.operationComplete(Connection.java:165) 
[cassandra-driver-core-3.3.2-0461ed35-SNAPSHOT-shaded.jar:na]
E   at 
com.datastax.driver.core.Connection$1.operationComplete(Connection.java:148) 
[cassandra-driver-core-3.3.2-0461ed35-SNAPSHOT-shaded.jar:na]
E   at 
com.datastax.shaded.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
 [cassandra-driver-core-3.3.2-0461ed35-SNAPSHOT-shaded.jar:na]
E   at 
com.datastax.shaded.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:500)
 [cassandra-driver-core-3.3.2-0461ed35-SNAPSHOT-shaded.jar:na]
E   at 
com.datastax.shaded.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:479)
 [cassandra-driver-core-3.3.2-0461ed35-SNAPSHOT-shaded.jar:na]
E 

[jira] [Created] (CASSANDRA-14146) [DTEST] cdc_test::TestCDC::test_insertion_and_commitlog_behavior_after_reaching_cdc_total_space assertion always fails (Extra items in the left set)

2018-01-03 Thread Michael Kjellman (JIRA)
Michael Kjellman created CASSANDRA-14146:


 Summary: [DTEST] 
cdc_test::TestCDC::test_insertion_and_commitlog_behavior_after_reaching_cdc_total_space
 assertion always fails (Extra items in the left set)
 Key: CASSANDRA-14146
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14146
 Project: Cassandra
  Issue Type: Bug
  Components: Testing
Reporter: Michael Kjellman


Dtest 
cdc_test::TestCDC::test_insertion_and_commitlog_behavior_after_reaching_cdc_total_space
 always fails on an assertion.

The assert is the final step of the test and it checks that 
pre_non_cdc_write_cdc_raw_segments == _get_cdc_raw_files(node.get_path())

This fails 100% of the time locally, 100% of the time on circleci executed 
under pytest, and 100% of the time for the past 40 test runs on ASF Jenkins 
runs against trunk.

This is the only test failure (excluding flaky one-off failures) remaining on 
the pytest dtest branch. I'm going to annotate the test with a skip marker 
(including a reason reference to this JIRA)... when it's fixed we should also 
remove the skip annotation from the test.

{code}
>   assert pre_non_cdc_write_cdc_raw_segments == 
> _get_cdc_raw_files(node.get_path())
E   AssertionError: assert {'/tmp/dtest-...169.log', ...} == 
{'/tmp/dtest-v...169.log', ...}
E Extra items in the left set:
E '/tmp/dtest-vrn4k8ov/test/node1/cdc_raw/CommitLog-7-1515030005097.log'
E '/tmp/dtest-vrn4k8ov/test/node1/cdc_raw/CommitLog-7-1515030005098.log'
E Extra items in the right set:
E '/tmp/dtest-vrn4k8ov/test/node1/cdc_raw/CommitLog-7-1515030005099.log'
E '/tmp/dtest-vrn4k8ov/test/node1/cdc_raw/CommitLog-7-1515030005100.log'
E Use -v to get the full diff
{code}
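
For anyone reading the pytest output above: the "extra items" lines are simply the two set differences. A minimal sketch, using the four file names shown in the failure plus one made-up shared entry (`CommitLog-7-shared-example.log` is hypothetical, as is the `diff_sets` helper):

```python
def diff_sets(left, right):
    """Return (items only in left, items only in right), each sorted."""
    return sorted(left - right), sorted(right - left)


# Segment names taken from the failure output; the shared entry is made up.
pre_write_segments = {
    "CommitLog-7-1515030005097.log",
    "CommitLog-7-1515030005098.log",
    "CommitLog-7-shared-example.log",
}
post_write_segments = {
    "CommitLog-7-1515030005099.log",
    "CommitLog-7-1515030005100.log",
    "CommitLog-7-shared-example.log",
}

extra_left, extra_right = diff_sets(pre_write_segments, post_write_segments)

if __name__ == "__main__":
    print("Extra items in the left set:", extra_left)
    print("Extra items in the right set:", extra_right)
```

Here two segments are present before the non-CDC write but absent afterwards, and two others appear only afterwards, which is the same shape as the failure reported above.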






[jira] [Updated] (CASSANDRA-14145) Detecting data resurrection during read

2018-01-03 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-14145:
---
Description: 
We have seen several bugs in which deleted data gets resurrected. We should try 
to see if we can detect this on the read path and possibly fix it. Here are a 
few examples which brought back data

A replica lost an sstable on startup which caused one replica to lose the 
tombstone and not the data. This tombstone was past gc grace which means this 
could resurrect data. We can detect such invalid states by looking at other 
replicas. 

If we are running incremental repair, Cassandra will keep repaired and 
non-repaired data separate. Every time incremental repair runs, it moves 
data from non-repaired to repaired. Repaired data across all replicas 
should be 100% consistent. 

Here is an example of how we can detect and mitigate the issue in most cases. 
Say we have 3 machines, A,B and C. All these machines will have data split b/w 
repaired and non-repaired. 
1. Machine A, due to some bug, brings back data D. This data D is in the 
repaired dataset. All other replicas will have data D and tombstone T. 
2. A read for data D comes from the application and involves replicas A and B. 
The data being read is in the repaired state. A will respond to the 
co-ordinator with data D, and B will send nothing as the tombstone is past gc 
grace. This will cause a digest mismatch. 
3. This patch will only kick in when there is a digest mismatch. The 
co-ordinator will ask both replicas to send back all data, as we do today, but 
with this patch each replica will also report whether the data it returns 
comes from the repaired or the non-repaired set. If the data coming from 
repaired does not match, we know something is wrong. At this point, the 
co-ordinator cannot determine whether replica A has resurrected some data or 
replica B has lost some data. We can still log an error saying we hit an 
invalid state.
4. Besides the log, we can take this further and even correct the response to 
the query. After logging an invalid state, we can ask replicas A and B (and 
also C if alive) to send back all data for this read, including gcable 
tombstones. If any machine returns a tombstone which is newer than this data, 
we know we cannot return this data. This way we can avoid returning data which 
has been deleted. 

Some Challenges with this 
1. When data is moved from non-repaired to repaired, there could be a race. 
We can look at which incremental repairs have promoted things on which 
replica to avoid false positives. 
2. If the third replica is down and the live replicas do not have any 
tombstone, we won't be able to break the tie in deciding whether data was 
actually deleted or resurrected. 
3. If the read is for the latest data only, we won't be able to detect it as 
the read will be served from non-repaired data. 
4. If the replica where we lose a tombstone is the last replica to compact the 
tombstone, we won't be able to decide if data is coming back or the rest of 
the replicas have lost that data. But we will still detect that something is 
wrong. 
5. We won't affect 99.9% of the read queries as we only do extra work during 
a digest mismatch.
6. CL.ONE reads will not be able to detect this. 

  was:
We have seen several bugs in which deleted data gets resurrected. We should try 
to see if we can detect this on the read path and possibly fix it. Here are a 
few examples which brought back data

A replica lost an sstable on startup which caused one replica to lose the 
tombstone and not the data. This tombstone was past gc grace which means this 
could resurrect data. We can deduct such invalid states by looking at other 
replicas. 

If we are running incremental repair, Cassandra will keep repaired and 
non-repaired data separate. Every-time incremental repair will run, it will 
move the data from non-repaired to repaired. Repaired data across all replicas 
should be 100% consistent. 

Here is an example of how we can detect and mitigate the issue in most cases. 
Say we have 3 machines, A,B and C. All these machines will have data split b/w 
repaired and non-repaired. 
1. Machine A due to some bug bring backs data D. This data D is in repaired 
dataset. All other replicas will have data D and tombstone T 
2. Read for data D comes from application which involve replicas A and B. The 
data being read involves data which is in repaired state.  A will respond back 
to co-ordinator with data D and B will send nothing as tombstone is past gc 
grace. This will cause digest mismatch. 
3. This patch will only kick in when there is a digest mismatch. Co-ordinator 
will ask both replicas to send back all data like we do today but with this 
patch, replicas will respond back what data it is returning is coming from 
repaired vs non-repaired. If data coming from repaired does not match, we know 
there is a something wrong!! At this 

[jira] [Created] (CASSANDRA-14145) Detecting data resurrection during read

2018-01-03 Thread sankalp kohli (JIRA)
sankalp kohli created CASSANDRA-14145:
-

 Summary:  Detecting data resurrection during read
 Key: CASSANDRA-14145
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14145
 Project: Cassandra
  Issue Type: Improvement
Reporter: sankalp kohli
Priority: Minor


We have seen several bugs in which deleted data gets resurrected. We should try 
to see if we can detect this on the read path and possibly fix it. Here are a 
few examples which brought back data

A replica lost an sstable on startup which caused one replica to lose the 
tombstone and not the data. This tombstone was past gc grace which means this 
could resurrect data. We can detect such invalid states by looking at other 
replicas. 

If we are running incremental repair, Cassandra will keep repaired and 
non-repaired data separate. Every time incremental repair runs, it moves 
data from non-repaired to repaired. Repaired data across all replicas 
should be 100% consistent. 

Here is an example of how we can detect and mitigate the issue in most cases. 
Say we have 3 machines, A,B and C. All these machines will have data split b/w 
repaired and non-repaired. 
1. Machine A, due to some bug, brings back data D. This data D is in the 
repaired dataset. All other replicas will have data D and tombstone T. 
2. A read for data D comes from the application and involves replicas A and B. 
The data being read is in the repaired state. A will respond to the 
co-ordinator with data D, and B will send nothing as the tombstone is past gc 
grace. This will cause a digest mismatch. 
3. This patch will only kick in when there is a digest mismatch. The 
co-ordinator will ask both replicas to send back all data, as we do today, but 
with this patch each replica will also report whether the data it returns 
comes from the repaired or the non-repaired set. If the data coming from 
repaired does not match, we know something is wrong. At this point, the 
co-ordinator cannot determine whether replica A has resurrected some data or 
replica B has lost some data. We can still log an error saying we hit an 
invalid state.
4. Besides the log, we can take this further and even correct the response to 
the query. After logging an invalid state, we can ask replicas A and B (and 
also C if alive) to send back all data for this read, including gcable 
tombstones. If any machine returns a tombstone which is newer than this data, 
we know we cannot return this data. This way we can avoid returning data which 
has been deleted. 
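
To make steps 3 and 4 concrete, here is a small sketch of the coordinator-side decision. It is only an illustration of the proposal, not a patch: `ReplicaResponse`, `repaired_mismatch` and `resolve_read` are hypothetical names, and timestamps are plain integers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReplicaResponse:
    """Hypothetical full-data response from one replica."""
    replica: str
    repaired_data_ts: Optional[int]     # write timestamp of D from the repaired set, None if absent
    tombstone_ts: Optional[int] = None  # deletion timestamp, gcable tombstones included


def repaired_mismatch(responses: List[ReplicaResponse]) -> bool:
    """Step 3: repaired data should be identical on every replica."""
    return len({r.repaired_data_ts for r in responses}) > 1


def resolve_read(responses: List[ReplicaResponse]) -> Optional[int]:
    """Step 4: return the timestamp of the data to serve, or None if any
    replica holds a tombstone newer than that data."""
    data = [r.repaired_data_ts for r in responses if r.repaired_data_ts is not None]
    if not data:
        return None
    newest_data = max(data)
    tombstones = [r.tombstone_ts for r in responses if r.tombstone_ts is not None]
    if tombstones and max(tombstones) > newest_data:
        return None  # the data was deleted; do not return it
    return newest_data


if __name__ == "__main__":
    first_round = [ReplicaResponse("A", 10), ReplicaResponse("B", None)]
    if repaired_mismatch(first_round):
        print("invalid state: repaired data differs between replicas")
    # Second round: replicas also send gcable tombstones, and C joins in.
    second_round = first_round + [ReplicaResponse("C", 10, tombstone_ts=20)]
    print(resolve_read(second_round))
```

If no replica can still produce the tombstone, `resolve_read` returns the data, which corresponds to challenge 2 below: the mismatch is detected but the tie cannot be broken.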

Some Challenges with this 
1. When data is moved from non-repaired to repaired, there could be a race. 
We can look at which incremental repairs have promoted things on which 
replica to avoid false positives. 
2. If the third replica is down and the live replicas do not have any 
tombstone, we won't be able to break the tie in deciding whether data was 
actually deleted or resurrected. 
3. If the read is for the latest data only, we won't be able to detect it as 
the read will be served from non-repaired data. 
4. If the replica where we lose a tombstone is the last replica to compact the 
tombstone, we won't be able to decide if data is coming back or the rest of 
the replicas have lost that data. But we will still detect that something is 
wrong. 
5. We won't affect 99.9% of the read queries as we only do extra work during 
a digest mismatch.
6. CL.ONE reads will not be able to detect this. 






[jira] [Comment Edited] (CASSANDRA-12125) ERROR [MemtableFlushWriter:4] 2016-07-01 06:20:41,137 CassandraDaemon.java:185 - Exception in thread Thread[MemtableFlushWriter:4,5,main] java.lang.RuntimeEx

2018-01-03 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310485#comment-16310485
 ] 

Jeff Jirsa edited comment on CASSANDRA-12125 at 1/4/18 12:26 AM:
-

{{DecoratedKey}} is really two parts: {{Token token, ByteBuffer key}}

The {{token}} is a hash of the {{key}}, and we write sorted by {{token}} (not 
by {{key}}).


Those of you hitting this bug, can you please post:
- What version of Cassandra you're using
- Schema of the impacted table (anonymize column names if needed)
- What memtable settings you're using (onheap, offheap)




was (Author: jjirsa):
{{DecoratedKey}} is really two parts: {{Token token, ByteBuffer key}}

The {{token}} is a hash of the {{key}}, and we write sorted by {{token}} (not 
by {{key}}).



> ERROR [MemtableFlushWriter:4] 2016-07-01 06:20:41,137 
> CassandraDaemon.java:185 - Exception in thread 
> Thread[MemtableFlushWriter:4,5,main]  java.lang.RuntimeException: Last 
> written key DecoratedKey(.XX, X) >= current key DecoratedKey
> 
>
> Key: CASSANDRA-12125
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12125
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: RHEL-6.5 64-bit Apache Cassandra 2.2.5v
>Reporter: Relish Chackochan
> Fix For: 2.2.x
>
>
> We are running on RHEL-6.5 64-bit with Apache Cassandra 2.2.5v on 4 node 
> cluster and getting the following error on multiple node while running the 
> repair job and when getting the error repair job is hang.
> Can some one help to identify the issue.
> {code}
> ERROR [MemtableFlushWriter:4] 2016-07-01 06:20:41,137 
> CassandraDaemon.java:185 - Exception in thread 
> Thread[MemtableFlushWriter:4,5,main]
> java.lang.RuntimeException: Last written key DecoratedKey(1467371986.8870, 
> 313436373337313938362e38383730) >= current key DecoratedKey(, 
> 313436373337323030312e38383730) writing into 
> /opt/cassandra/data/proddb/log_data1-0a5092a0a4fa11e5872fc1ce0a46dc27/.maxdatetimeindex_idx/tmp-la-470-big-Data.db
> {code}






[jira] [Commented] (CASSANDRA-12125) ERROR [MemtableFlushWriter:4] 2016-07-01 06:20:41,137 CassandraDaemon.java:185 - Exception in thread Thread[MemtableFlushWriter:4,5,main] java.lang.RuntimeExcepti

2018-01-03 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310485#comment-16310485
 ] 

Jeff Jirsa commented on CASSANDRA-12125:


{{DecoratedKey}} is really two parts: {{Token token, ByteBuffer key}}

The {{token}} is a hash of the {{key}}, and we write sorted by {{token}} (not 
by {{key}}).
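
A minimal sketch of that ordering, assuming comparison is by token first and by key bytes only as a tie-breaker. The class below is a hypothetical Python stand-in, not the Java class; the values are the ones quoted in the earlier comment on this issue.

```python
from functools import total_ordering


@total_ordering
class DecoratedKey:
    """Hypothetical stand-in: ordered by (token, key), token first."""

    def __init__(self, token, key):
        self.token = token  # hash of the key
        self.key = key      # raw key bytes

    def _sort_key(self):
        return (self.token, self.key)

    def __eq__(self, other):
        return self._sort_key() == other._sort_key()

    def __lt__(self, other):
        return self._sort_key() < other._sort_key()


last_written = DecoratedKey(0, bytes.fromhex("fffe04cc"))
current = DecoratedKey(-129843, bytes.fromhex("fffe04cd"))

if __name__ == "__main__":
    # The raw key bytes are ascending...
    print(last_written.key < current.key)
    # ...but the tokens are not, so the "last written >= current" check trips.
    print(last_written >= current)
```

So the message can be correct even when the key bytes look ascending: token 0 sorts after token -129843.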



> ERROR [MemtableFlushWriter:4] 2016-07-01 06:20:41,137 
> CassandraDaemon.java:185 - Exception in thread 
> Thread[MemtableFlushWriter:4,5,main]  java.lang.RuntimeException: Last 
> written key DecoratedKey(.XX, X) >= current key DecoratedKey
> 
>
> Key: CASSANDRA-12125
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12125
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: RHEL-6.5 64-bit Apache Cassandra 2.2.5v
>Reporter: Relish Chackochan
> Fix For: 2.2.x
>
>
> We are running on RHEL-6.5 64-bit with Apache Cassandra 2.2.5v on 4 node 
> cluster and getting the following error on multiple node while running the 
> repair job and when getting the error repair job is hang.
> Can some one help to identify the issue.
> {code}
> ERROR [MemtableFlushWriter:4] 2016-07-01 06:20:41,137 
> CassandraDaemon.java:185 - Exception in thread 
> Thread[MemtableFlushWriter:4,5,main]
> java.lang.RuntimeException: Last written key DecoratedKey(1467371986.8870, 
> 313436373337313938362e38383730) >= current key DecoratedKey(, 
> 313436373337323030312e38383730) writing into 
> /opt/cassandra/data/proddb/log_data1-0a5092a0a4fa11e5872fc1ce0a46dc27/.maxdatetimeindex_idx/tmp-la-470-big-Data.db
> {code}






[jira] [Commented] (CASSANDRA-12125) ERROR [MemtableFlushWriter:4] 2016-07-01 06:20:41,137 CassandraDaemon.java:185 - Exception in thread Thread[MemtableFlushWriter:4,5,main] java.lang.RuntimeExcepti

2018-01-03 Thread Aditya Bharadwaj (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310275#comment-16310275
 ] 

Aditya Bharadwaj commented on CASSANDRA-12125:
--

A slightly odd thing in the logs that I keep seeing, similar to the above 
comment:

{quote}Last written key DecoratedKey(0, fffe04cc) >= current key 
DecoratedKey(-129843, fffe04cd){quote}

fffe04cc is actually less than fffe04cd.
Why does the log message say the opposite?

> ERROR [MemtableFlushWriter:4] 2016-07-01 06:20:41,137 
> CassandraDaemon.java:185 - Exception in thread 
> Thread[MemtableFlushWriter:4,5,main]  java.lang.RuntimeException: Last 
> written key DecoratedKey(.XX, X) >= current key DecoratedKey
> 
>
> Key: CASSANDRA-12125
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12125
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: RHEL-6.5 64-bit Apache Cassandra 2.2.5v
>Reporter: Relish Chackochan
> Fix For: 2.2.x
>
>
> We are running on RHEL-6.5 64-bit with Apache Cassandra 2.2.5v on 4 node 
> cluster and getting the following error on multiple node while running the 
> repair job and when getting the error repair job is hang.
> Can some one help to identify the issue.
> {code}
> ERROR [MemtableFlushWriter:4] 2016-07-01 06:20:41,137 
> CassandraDaemon.java:185 - Exception in thread 
> Thread[MemtableFlushWriter:4,5,main]
> java.lang.RuntimeException: Last written key DecoratedKey(1467371986.8870, 
> 313436373337313938362e38383730) >= current key DecoratedKey(, 
> 313436373337323030312e38383730) writing into 
> /opt/cassandra/data/proddb/log_data1-0a5092a0a4fa11e5872fc1ce0a46dc27/.maxdatetimeindex_idx/tmp-la-470-big-Data.db
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14134) Migrate dtests to use pytest and python3

2018-01-03 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310274#comment-16310274
 ] 

Ariel Weisberg commented on CASSANDRA-14134:


I also liked the environment variables, but I think they aren't the best way 
forward. One of my biggest complaints is the extensive incantations you had to 
remember to get a usable, debuggable run out of the dtests:

* KEEP_TEST_DIR
* CASSANDRA_DIR
* A bunch of stuff related to having stdout do something reasonable, logging to a 
file, making output visible immediately rather than at the end, including 
output for succeeding tests and not just failed tests.
* Something related to logging so serious errors didn't just silently cause tests 
to fail with tool output not logged

I had all that stuff exported in my profile, but maybe a better way is to have 
a documented config file that is picked up so that rather than have to search 
for the common options you just edit the file and set/uncomment the ones you 
want. Environment variables aren't really that great because they aren't 
discoverable like a well documented config file.

If we really wanted to be minimal it could just be a shell script that invokes 
pytest and has all the optional stuff as environment variables that you can 
modify/uncomment to get the common behaviors people want. Then you don't have 
to do any config file plumbing and it's consistent with the pytest help output.
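
One possible middle ground, sketched below purely as an illustration: documented command line options whose defaults fall back to the old environment variables, so existing profiles keep working while `--help` makes the options discoverable. The flag names and the `build_parser` helper are hypothetical and not necessarily what the branch uses; `CASSANDRA_DIR` and `KEEP_TEST_DIR` are the variables mentioned above.

```python
import argparse
import os


def build_parser(env=os.environ):
    """Build a parser whose defaults come from the legacy environment variables."""
    parser = argparse.ArgumentParser(
        description="Hypothetical dtest launcher options")
    parser.add_argument(
        "--cassandra-dir",
        default=env.get("CASSANDRA_DIR"),
        help="path to the Cassandra checkout under test "
             "(defaults to $CASSANDRA_DIR)")
    parser.add_argument(
        "--keep-test-dir",
        action="store_true",
        default=env.get("KEEP_TEST_DIR", "").lower() in ("1", "true", "yes"),
        help="keep the per-test data directory after the run "
             "(defaults to $KEEP_TEST_DIR)")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.cassandra_dir, args.keep_test_dir)
```

An explicit flag on the command line overrides whatever the environment provides.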

> Migrate dtests to use pytest and python3
> 
>
> Key: CASSANDRA-14134
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14134
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Testing
>Reporter: Michael Kjellman
>Assignee: Michael Kjellman
>
> h4. Get the C* dtests running on the pytest framework.
> C* DTests currently run using the python test framework nosetest. This 
> framework has been largely abandoned with no releases since 2015 and a 
> general strong consensus in the python community that pytest is the future.
> h4. Why should we do this.
> Currently (and historically) dtests have always been difficult to run, flaky 
> and unpredictable in CI environments, and almost impossible to debug.
> On November 28th, 2017, I proposed on the dev@ list that we move the dtests 
> from nosetests to pytests. I got replies from Jon Haddad, Philip Thompson, 
> and kurt greaves with really only "+1" like replies to the proposal.
> Since then I've been working pretty much non stop to complete the large 
> refactor of dtests to pytests. As part of this effort (and because the 
> existing migration tools require it) I also ported the code to python3 
> (from the current python 2.7 based code-base).
> h4. High-level summary of key changes, improvements, and new features.
> * Migrate dtests from executing using the nosetest framework to pytest
> * Port the entire code base from Python 2.7 to Python 3.6
> * Update run_dtests.py to work with pytest
> * Add --dtest-print-tests-only option to run_dtests.py to get easily parsable 
> list of all available collected tests
> * Update README.md for executing the dtests with pytest
> * Add new debugging tips section to README.md to help with some basics of 
> debugging python3 and pytest
> * Migrate all existing Environment Variable usage as a means to control dtest 
> operation modes to argparse command line options with documented help on each 
> toggle's intended usage
> * Migration of old unitTest and nose based test structure to modern pytest 
> fixture approach
> * Automatic detection of physical system resources to automatically determine 
> if @pytest.mark.resource_intensive annotated tests should be collected and 
> run on the system where they are being executed
> * new pytest fixture replacements for @since and @pytest.mark.upgrade_test 
> annotations
> * Migration to python logging framework
> * Upgrade thrift bindings to latest version with full python3 compatibility
> * Remove deprecated cql and pycassa dependencies and migrate any remaining 
> tests to fully remove those dependencies
> * Fixed dozens of tests that would hang the pytest framework forever when run 
> in CI environments
> * Ran code nearly 300 times in CircleCI during the migration to find, 
> identify, and fix any tests capable of hanging CI
> * Upgrade Tests do not yet run in CI and still need additional migration work 
> (although all upgrade test classes compile successfully)
> I started with the *nose2pytest* [https://github.com/pytest-dev/nose2pytest] 
> migration tool. As this required python 3 language support I found myself 
> down the 2to3 python migration path. While painful to do this, the benefits 
> of python3 over python2.7 are numerous and moving to python3 for the 
> additional debugging tools now available to use when fixing dtests makes the 
> effort worth it 

[jira] [Comment Edited] (CASSANDRA-14136) MemtableFlushWriter DecoratedKey Exception

2018-01-03 Thread Aditya Bharadwaj (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305216#comment-16305216
 ] 

Aditya Bharadwaj edited comment on CASSANDRA-14136 at 1/3/18 8:58 PM:
--

Looks related. I have seen this issue 38 times in the last month; in all but 
one case, it occurred while writing to a secondary index.
This is the exception

bq. java.lang.RuntimeException: Last written key 
DecoratedKey(-6638873113115166967, 81e77da2723b483d8c0d49f800c1e288) >= current 
key DecoratedKey(-8794293631676762023, 
9130cbbaa8e911e79641aba7018ec35280ee000807b600085bcc0006d8add8acd985000ad983d8a8d98ad8b1d8a90008000100992f2f696d616765732d63646e2d79756d2e6d6172746a61636b2e636f6d2f617a7572652f79756d2d7265736f75726365732f38316537376461322d373233622d343833642d386330642d3439663830306331653238382f496d616765732f50726f64756374496d616765732f5377617463682f4c617267655f49636f6e2f69636f6e5f73617563655f4242512e706e673b77696474683d33360008000200010004000261723436007e4d081e06000182ed830ffc0590395ad535f900f0010034080850697a7a6148757408063230373236340c0808546f7070696e677308065361756365730c0c0c080a43553030323135373638080a435530303231353737380c0c0c0803e80880852f2f696d616765732d63646e2d79756d2e6d6172746a61636b2e636f6d2f617a7572652f79756d2d7265736f75726365732f38316537376461322d373233622d343833642d386330642d3439663830306331653238382f496d616765732f50726f64756374496d616765732f536f757263652f69636f6e5f73617563655f4242512e706e670c08080001083dcd0c080817d8b5d984d8b5d8a920d8a7d984d8a8d98ad8aad8b2d8a708014108007e4d080c080e4f6d6e2d537563732d507a5375630880852f2f696d616765732d63646e2d79756d2e6d6172746a61636b2e636f6d2f617a7572652f79756d2d7265736f75726365732f38316537376461322d373233622d343833642d386330642d3439663830306331653238382f496d616765732f50726f64756374496d616765732f536f757263652f69636f6e5f73617563655f4242512e706e670c0806d8add8acd985085bce080cd985d8aad988d8b3d8b7d8a9083dcd01081091408324a8e911e7ae1b578dff8303d380f807b600085bce0006d8add8acd985000cd985d8aad988d8b3d8b7d8a90008000100992f2f696d616765732d63646e2d79756d2e6d6172746a61636b2e636f6d2f617a7572652f79756d2d7265736f75726365732f38316537376461322d373233622d343833642d386330642d3439663830306331653238382f496d616765732f50726f64756374496d616765732f5377617463682f4c617267655f49636f6e2f69636f6e5f73617563655f4242512e706e673b77696474683d33360008000300010004000261723436007e4d081e08000182e98314fc0590395c213df900f0010034080850697a7a6148757408063230373236340c0808546f7070696e677308065361756365730c0c0c080a43553030323135373638080a435530303231353737380c0c0c0803e80880852f2f696d61
6765732d63646e2d79756d2e6d6172746a61636b2e636f6d2f617a7572652f79756d2d7265736f75726365732f38316537376461322d373233622d343833642d386330642d3439663830306331653238382f496d616765732f50726f64756374496d616765732f536f757263652f69636f6e5f73617563655f4242512e706e670c08080001083dcd0c080817d8b5d984d8b5d8a920d8a7d984d8a8d98ad8aad8b2d8a708014108007e4d080c080e4f6d6e2d537563732d507a5375630880852f2f696d616765732d63646e2d79756d2e6d6172746a61636b2e636f6d2f617a7572652f79756d2d7265736f75726365732f38316537376461322d373233622d343833642d386330642d3439663830306331653238382f496d616765732f50726f64756374496d616765732f536f757263652f69636f6e5f73617563655f4242512e706e670c0806d8add8acd985085bd0080ad8b5d8bad98ad8b1d8a9083dcd010810914d7b73a8e911e7ae1b578dff8303d380ee000807b600085bd6d8add8acd985000ad8b5d8bad98ad8b1d8a90008000100992f2f696d616765732d63646e2d79756d2e6d6172746a61636b2e636f6d2f617a7572652f79756d2d7265736f75726365732f38316537376461322d373233622d343833642d386330642d3439663830306331653238382f496d616765732f50726f64756374496d616765732f5377617463682f4c617267655f49636f6e2f69636f6e5f73617563655f4242512e706e673b77696474683d33360008000400010004000261723436007e4d3a1e22000183148310fc0889665cb9adf90400050030080850697a7a61487574080632303732363408046e756c6c0c080543727573740c0c0c0c080a435530303231353738320c0c0c0c089c400c0808000108409a0c080812d985d8a7d8b1d8acd8a7d8b1d98ad8aad8a708014108007e4d3a0c0c0c08104d6172676865726974612043727573740c080c4f6d6e4372742d4d726774610c0815d986d988d8b920d8a7d984d8b9d8acd9)
 writing into 
/mnt/DATA/cassandra/data/product/productdetails_by_storeid_variants-ec590ad0108611e7a92033b648576005/mc-25-big-Data.db

Even in that one case, though, the table has a secondary index.

{quote}CREATE TABLE 

[jira] [Updated] (CASSANDRA-14136) MemtableFlushWriter DecoratedKey Exception

2018-01-03 Thread Aditya Bharadwaj (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Bharadwaj updated CASSANDRA-14136:
-
Description: 
I periodically run into this issue on my cluster, for different tables. After 
this error is encountered, all the post flushes stop and eventually the system 
runs out of memory.
On a restart all the commit logs get replayed normally and things go back to 
normal. 

I'm unable to understand the scenario, but the issue recurs every few 
days.

{code}
DEBUG [MemtableFlushWriter:884] 2017-12-26 18:19:40,883 Memtable.java:401 - Completed flushing /mnt/DATA/cassandra/data/products/products_by_hierarchy5storeid_pascdesc-411cabe0632411e7b25a1b665c06298b/.idx_hierarchy1category/mc-2050-big-Data.db (508.127KiB) for commitlog position ReplayPosition(segmentId=1513929386900, position=19110822)
DEBUG [MemtableFlushWriter:884] 2017-12-26 18:19:41,150 Memtable.java:368 - Writing Memtable-products_by_hierarchy5storeid_pascdesc.idx_hierarchy3category@551487729(545.926KiB serialized bytes, 324073 ops, 0%/0% of on/off-heap limit)
ERROR [MemtableFlushWriter:884] 2017-12-26 18:19:41,316 CassandraDaemon.java:205 - Exception in thread Thread[MemtableFlushWriter:884,5,main]
java.lang.RuntimeException: Last written key DecoratedKey(CU00328612, 43553030333238363132) >= current key DecoratedKey(^@^@^@^@^@^@^@^@^@^@, 43553030333238363838) writing into /mnt/DATA/cassandra/data/products/products_by_hierarchy5storeid_pascdesc-411cabe0632411e7b25a1b665c06298b/.idx_hierarchy3category/mc-2134-big-Data.db
    at org.apache.cassandra.io.sstable.format.big.BigTableWriter.beforeAppend(BigTableWriter.java:106) ~[apache-cassandra-3.0.9.jar:3.0.9]
    at org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:145) ~[apache-cassandra-3.0.9.jar:3.0.9]
    at org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.append(SimpleSSTableMultiWriter.java:45) ~[apache-cassandra-3.0.9.jar:3.0.9]
    at org.apache.cassandra.io.sstable.SSTableTxnWriter.append(SSTableTxnWriter.java:52) ~[apache-cassandra-3.0.9.jar:3.0.9]
    at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:394) ~[apache-cassandra-3.0.9.jar:3.0.9]
    at org.apache.cassandra.db.Memtable.flush(Memtable.java:332) ~[apache-cassandra-3.0.9.jar:3.0.9]
    at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1054) ~[apache-cassandra-3.0.9.jar:3.0.9]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_112]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_112]
    at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_112]
{code}
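
As a reading aid for the trace above: the second field of each {{DecoratedKey}} appears to be the hex-encoded key bytes, which can be decoded with a few lines of Python. The helper name is made up for illustration, and treating the bytes as ASCII is an assumption based on the first key printing as {{CU00328612}}.

```python
def decode_key(hex_key):
    """Decode a hex-encoded key from the log, assuming it holds ASCII text."""
    return bytes.fromhex(hex_key).decode("ascii")


last_written = decode_key("43553030333238363132")
current = decode_key("43553030333238363838")
print(last_written, current)
```

Decoded this way, the current key's bytes ({{CU00328688}}) sort after the last written key ({{CU00328612}}), even though the first field of the current key was printed as a run of {{^@}} characters.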

  was:
Running into this issue on my cluster periodically for different tables. After 
this error is encountered, all the post flushes stop and eventually the system 
runs out of memory.
On a restart all the commit logs get played normally and things go back to 
normal. 

I'm unable to understand the scenario, but the issue is recreating every few 
days.

{code}DEBUG [MemtableFlushWriter:884] 2017-12-26 18:19:40,883 Memtable.java:401 
- Completed flushing 
/mnt/DATA/cassandra/data/martjack/products_by_hierarchy5storeid_pascdesc-411cabe0632411e7b25a1b665c06298b/.id
x_hierarchy1category/mc-2050-big-Data.db (508.127KiB) for commitlog position 
ReplayPosition(segmentId=1513929386900,
 position=19110822)
DEBUG [MemtableFlushWriter:884] 2017-12-26 18:19:41,150 Memtable.java:368 - 
Writing 
Memtable-products_by_hierarchy5storeid_pascdesc.idx_hierarchy3category@551487729(545.926KiB
 serialized bytes, 324073 ops, 0%/0% of on/off-heap limit)
ERROR [MemtableFlushWriter:884] 2017-12-26 18:19:41,316 
CassandraDaemon.java:205 - Exception in thread 
Thread[MemtableFlushWriter:884,5,main]
java.lang.RuntimeException: Last written key DecoratedKey(CU00328612, 
43553030333238363132) >= current key DecoratedKey(^@^@^@^@^@^@^@^@^@^@, 
43553030333238363838) writing into 
/mnt/DATA/cassandra/data/martjack/products_by_hierarchy5storeid_pascdesc-411cabe0632411e7b25a1b665c06298b/.idx_hierarchy3category/mc-2134-big-Data.db
at 
org.apache.cassandra.io.sstable.format.big.BigTableWriter.beforeAppend(BigTableWriter.java:106)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:145)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.append(SimpleSSTableMultiWriter.java:45)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.io.sstable.SSTableTxnWriter.append(SSTableTxnWriter.java:52)
 ~[apache-cassandra-3.0.9.jar:3.0.9]
at 
org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:394) 
~[apache-cassandra-3.0.9.jar:3.0.9]
at 

[jira] [Comment Edited] (CASSANDRA-12125) ERROR [MemtableFlushWriter:4] 2016-07-01 06:20:41,137 CassandraDaemon.java:185 - Exception in thread Thread[MemtableFlushWriter:4,5,main] java.lang.RuntimeEx

2018-01-03 Thread Aditya Bharadwaj (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305231#comment-16305231
 ] 

Aditya Bharadwaj edited comment on CASSANDRA-12125 at 1/3/18 8:56 PM:
--

I'm running into the same issue every few days (19 times in the last 2 months, 
to be precise).

In all but one case, it happened on secondary index tables.

The one exception occurred on an SSTable of this table:
{{CREATE TABLE productdetails_by_storeid_variants (
merchantid uuid,
languagecode text,
storeid bigint,
productid bigint,
variantproductid bigint,
quantity bigint,
brand text,
brandid text,
bundlegroups text,
catalogcode text,
deliverymode text,
deliverytime text,
h1catname text,
h2catname text,
h3catname text,
h4catname text,
h5catname text,
hierarchy1category text,
hierarchy2category text,
hierarchy3category text,
hierarchy4category text,
hierarchy5category text,
image text,
inventory bigint,
largeimage text,
longdescription text,
maximumorderquantity bigint,
minimumorderquantity bigint,
mrp float,
offerdesc text,
primaryproductid bigint,
producttitle text,
producttype text,
refid bigint,
seodescription text,
seokeywords text,
seopagetitle text,
seourlkey text,
shortdescription text,
sku text,
smallimage text,
tags text,
variantproducts list,
variantproperty text,
variantpropertyvalueid bigint,
variantvalue text,
webprice float,
PRIMARY KEY (merchantid, languagecode, storeid, productid, variantproductid, 
quantity)
)  WITH CLUSTERING ORDER BY (languagecode ASC, storeid ASC, productid ASC, 
variantproductid ASC, quantity ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 86400
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
CREATE INDEX productdetails_by_storeid_variants_refid_idx ON 
productdetails_by_storeid_variants (refid);}}

At this frequency, it is a blocker for us too.


was (Author: adityabharadwaj):
I'm running into the same issue, every few days (19 times in the last 2 months 
to be precise).

In all but one scenario, it happened on the Secondary Index tables.

The exception is that it happened on the SSTable of this table
{{CREATE TABLE martjack.productdetails_by_storeid_variants (
merchantid uuid,
languagecode text,
storeid bigint,
productid bigint,
variantproductid bigint,
quantity bigint,
brand text,
brandid text,
bundlegroups text,
catalogcode text,
deliverymode text,
deliverytime text,
h1catname text,
h2catname text,
h3catname text,
h4catname text,
h5catname text,
hierarchy1category text,
hierarchy2category text,
hierarchy3category text,
hierarchy4category text,
hierarchy5category text,
image text,
inventory bigint,
largeimage text,
longdescription text,
maximumorderquantity bigint,
minimumorderquantity bigint,
mrp float,
offerdesc text,
primaryproductid bigint,
producttitle text,
producttype text,
refid bigint,
seodescription text,
seokeywords text,
seopagetitle text,
seourlkey text,
shortdescription text,
sku text,
smallimage text,
tags text,
variantproducts list,
variantproperty text,
variantpropertyvalueid bigint,
variantvalue text,
webprice float,
PRIMARY KEY (merchantid, languagecode, storeid, productid, variantproductid, 
quantity)
)  WITH CLUSTERING ORDER BY (languagecode ASC, storeid ASC, productid ASC, 
variantproductid ASC, quantity ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 86400
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';
CREATE INDEX productdetails_by_storeid_variants_refid_idx ON 
martjack.productdetails_by_storeid_variants (refid);}}

At this frequency, it is a blocker for us too.

> ERROR [MemtableFlushWriter:4] 2016-07-01 06:20:41,137 
> CassandraDaemon.java:185 - Exception in thread 
> Thread[MemtableFlushWriter:4,5,main]  

[jira] [Commented] (CASSANDRA-14134) Migrate dtests to use pytest and python3

2018-01-03 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310148#comment-16310148
 ] 

Blake Eggleston commented on CASSANDRA-14134:
-

I'm not saying we should remove the command line parameters, just that we 
shouldn't remove the existing environment variables and ini stuff.

Also, regarding the {{~}} expansion, {{os.path.expanduser}} will do that. 
Probably better to do that internally than make users put in the full path?
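
A small sketch of the internal expansion suggested above. {{os.path.expanduser}} is the standard-library call named in the comment; the wrapper name {{normalize_dir}} and the example path are hypothetical.

```python
import os


def normalize_dir(path):
    """Expand a leading ~ and return an absolute path."""
    return os.path.abspath(os.path.expanduser(path))


print(normalize_dir("~/cassandra"))
```

Paths that do not start with {{~}} pass through {{expanduser}} unchanged, so the same helper can be applied to every directory option.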

[jira] [Commented] (CASSANDRA-14134) Migrate dtests to use pytest and python3

2018-01-03 Thread Michael Kjellman (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310129#comment-16310129
 ] 

Michael Kjellman commented on CASSANDRA-14134:
--

[~bdeggleston] Thanks. Committed the changes for the two tests you fixed above.

> Migrate dtests to use pytest and python3
> 
>
> Key: CASSANDRA-14134
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14134
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Testing
>Reporter: Michael Kjellman
>Assignee: Michael Kjellman
>
> h4. Get the C* dtests running on the pytest framework.
> C* DTests currently run using the python test framework nosetest. This 
> framework has been largely abandoned with no releases since 2015 and a 
> general strong consensus in the python community that pytest is the future.
> h4. Why should we do this.
> Currently (and historically) dtests have always been difficult to run, flaky 
> and unpredictable in CI environments, and almost impossible to debug.
> On November 28th, 2017, I proposed on the dev@ list that we move the dtests 
> from nosetests to pytests. I got replies from Jon Haddad, Philip Thompson, 
> and kurt greaves with really only "+1" like replies to the proposal.
> Since then I've been working pretty much non stop to complete the large 
> refactor of dtests to pytests. As part of this effort (and because the 
> existing migration tools require it) I also ported the code to python3 
> (from the current python 2.7 based code-base).
> h4. High-level summary of key changes, improvements, and new features.
> * Migrate dtests from executing using the nosetest framework to pytest
> * Port the entire code base from Python 2.7 to Python 3.6
> * Update run_dtests.py to work with pytest
> * Add --dtest-print-tests-only option to run_dtests.py to get easily parsable 
> list of all available collected tests
> * Update README.md for executing the dtests with pytest
> * Add new debugging tips section to README.md to help with some basics of 
> debugging python3 and pytest
> * Migrate all existing Environment Variable usage as a means to control dtest 
> operation modes to argparse command line options with documented help on each 
> toggle's intended usage
> * Migration of old unitTest and nose based test structure to modern pytest 
> fixture approach
> * Automatic detection of physical system resources to automatically determine 
> if @pytest.mark.resource_intensive annotated tests should be collected and 
> run on the system where they are being executed
> * new pytest fixture replacements for @since and @pytest.mark.upgrade_test 
> annotations
> * Migration to python logging framework
> * Upgrade thrift bindings to latest version with full python3 compatibility
> * Remove deprecated cql and pycassa dependencies and migrate any remaining 
> tests to fully remove those dependencies
> * Fixed dozens of tests that would hang the pytest framework forever when run 
> in CI environments
> * Ran code nearly 300 times in CircleCI during the migration to find, 
> identify, and fix any tests capable of hanging CI
> * Upgrade Tests do not yet run in CI and still need additional migration work 
> (although all upgrade test classes compile successfully)
> I started with the *nose2pytest* [https://github.com/pytest-dev/nose2pytest] 
> migration tool. As this required python 3 language support I found myself 
> down the 2to3 python migration path. While painful to do this, the benefits 
> of python3 over python2.7 are numerous and moving to python3 for the 
> additional debugging tools now available to use when fixing dtests makes the 
> effort worth it for that reason alone!
> After the automated tools did their thing I began what was a much longer and 
> more tedious manual process than I ever could have expected due to the many 
> custom ways we did things in dtests (frequently to work around nosetest limitations 
> of missing features that thankfully are now all included with the pytest 
> framework). I've done nearly 300 test runs of my migration branch with 
> CircleCI.
> The latest CircleCI runs can be found at:
> (dtests without vnodes) [https://circleci.com/gh/mkjellman/cassandra/277]
> (dtests with vnodes) [https://circleci.com/gh/mkjellman/cassandra/278]
> With vnodes, there are currently only 6 remaining dtest test failures.
> Without vnodes, there are 12 remaining dtest failures.
> It turns out that after the dtests were moved to ASF Jenkins from cassci, the 
> jobs were misconfigured and we actually haven't been running the dtests in 
> the non-vnodes configuration. The current most recent trunk dtest job to 
> complete on ASF Jenkins (with vnodes) was 
> [https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-trunk-dtest/387/].
>  That test run had 36 test failures.
> There are 

[jira] [Commented] (CASSANDRA-14134) Migrate dtests to use pytest and python3

2018-01-03 Thread Michael Kjellman (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310125#comment-16310125
 ] 

Michael Kjellman commented on CASSANDRA-14134:
--

I strongly disagree with this, [~bdeggleston]: you shouldn't need to look 
through the source or read random documentation to know how to run the dtests. 
These are required parameters, and we should have good --help around them and 
validation when they are provided as arguments to see if they exist, etc.
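
A minimal sketch of what "good --help plus validation" could look like with argparse. The option name --cassandra-dir comes from this thread; the help text, the existing_dir helper, and the parser layout are assumptions for illustration, not the code in the migration branch.

```python
import argparse
import os


def existing_dir(value):
    """argparse type callback (hypothetical helper): accept only real directories."""
    if not os.path.isdir(value):
        # ArgumentTypeError makes argparse print a usage message and exit with status 2.
        raise argparse.ArgumentTypeError(f"{value!r} is not an existing directory")
    return value


def build_parser():
    parser = argparse.ArgumentParser(description="Illustrative dtest runner options")
    parser.add_argument(
        "--cassandra-dir",
        required=True,          # missing option fails fast with a clear error
        type=existing_dir,      # validated while parsing, before any test starts
        help="path to the Cassandra source tree to run the dtests against",
    )
    return parser


# Demo with an explicit argument list so the sketch runs anywhere.
args = build_parser().parse_args(["--cassandra-dir", os.getcwd()])
print(args.cassandra_dir)
```

With this shape, running with --help lists the option and its description, and a bad path is rejected at parse time rather than partway through a test run.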

> Migrate dtests to use pytest and python3
> 
>
> Key: CASSANDRA-14134
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14134
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Testing
>Reporter: Michael Kjellman
>Assignee: Michael Kjellman
>
> h4. Get the C* dtests running on the pytest framework.
> C* DTests currently run using the python test framework nosetest. This 
> framework has been largely abandoned with no releases since 2015 and a 
> general strong consensus in the python community that pytest is the future.
> h4. Why should we do this.
> Currently (and historically) dtests have always been difficult to run, flaky 
> and unpredictable in CI environments, and almost impossible to debug.
> On November 28th, 2017, I proposed on the dev@ list that we move the dtests 
> from nosetests to pytests. I got replies from Jon Haddad, Philip Thompson, 
> and kurt greaves with really only "+1" like replies to the proposal.
> Since then I've been working pretty much non-stop to complete the large 
> refactor of dtests to pytests. As part of this effort (and because the 
> existing migration tools require it) I also ported the code to python3 
> (from the current python 2.7 based code-base).
> h4. High-level summary of key changes, improvements, and new features.
> * Migrate dtests from executing using the nosetest framework to pytest
> * Port the entire code base from Python 2.7 to Python 3.6
> * Update run_dtests.py to work with pytest
> * Add --dtest-print-tests-only option to run_dtests.py to get easily parsable 
> list of all available collected tests
> * Update README.md for executing the dtests with pytest
> * Add new debugging tips section to README.md to help with some basics of 
> debugging python3 and pytest
> * Migrate all existing environment variable usage as a means to control dtest 
> operation modes to argparse command line options with documented help on each 
> toggle's intended usage
> * Migration of old unittest and nose based test structure to modern pytest 
> fixture approach
> * Automatic detection of physical system resources to automatically determine 
> if @pytest.mark.resource_intensive annotated tests should be collected and 
> run on the system where they are being executed
> * new pytest fixture replacements for @since and @pytest.mark.upgrade_test 
> annotations
> * Migration to python logging framework
> * Upgrade thrift bindings to latest version with full python3 compatibility
> * Remove deprecated cql and pycassa dependencies and migrate any remaining 
> tests to fully remove those dependencies
> * Fixed dozens of tests that would hang the pytest framework forever when run 
> in CI environments
> * Ran code nearly 300 times in CircleCI during the migration to find, 
> identify, and fix any tests capable of hanging CI
> * Upgrade Tests do not yet run in CI and still need additional migration work 
> (although all upgrade test classes compile successfully)
> I started with the *nose2pytest* [https://github.com/pytest-dev/nose2pytest] 
> migration tool. As this required python 3 language support I found myself 
> down the 2to3 python migration path. While painful to do this, the benefits 
> of python3 over python2.7 are numerous and moving to python3 for the 
> additional debugging tools now available to use when fixing dtests makes the 
> effort worth it for that reason alone!
> After the automated tools did their thing I began what was a much longer and 
> more tedious manual process than I ever could have expected, due to the many 
> custom ways we did things in dtests (frequently to work around nosetest 
> limitations or missing features that thankfully are now all included with the 
> pytest framework). I've done nearly 300 test runs of my migration branch with 
> CircleCI.
> The latest CircleCI runs can be found at:
> (dtests without vnodes) [https://circleci.com/gh/mkjellman/cassandra/277]
> (dtests with vnodes) [https://circleci.com/gh/mkjellman/cassandra/278]
> With vnodes, there are currently only 6 remaining dtest test failures.
> Without vnodes, there are 12 remaining dtest failures.
> It turns out that after the dtests were moved to ASF Jenkins from cassci, the 
> jobs were misconfigured and we actually haven't been running the dtests in 
> the non-vnodes configuration. The 

[jira] [Commented] (CASSANDRA-14134) Migrate dtests to use pytest and python3

2018-01-03 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310124#comment-16310124
 ] 

Blake Eggleston commented on CASSANDRA-14134:
-

also, here are fixes for 2 tests that were broken by the 2->3 / nose->pytest 
translation:

[fixing 
repair_tests.incremental_repair_test.TestIncRepair.test_subrange|https://github.com/bdeggleston/cassandra-dtest/commit/82b5179bc3bc35c13ef2caa1274def041421160e]
[fixing 
sstable_generation_loading_test.TestSSTableGenerationAndLoading.test_remove_index_file|https://github.com/bdeggleston/cassandra-dtest/commit/86bd7945e40c5bf0d2835b50fd408e847a2dd643]

> Migrate dtests to use pytest and python3
> 
>
> Key: CASSANDRA-14134
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14134
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Testing
>Reporter: Michael Kjellman
>Assignee: Michael Kjellman
>

[jira] [Commented] (CASSANDRA-14134) Migrate dtests to use pytest and python3

2018-01-03 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310118#comment-16310118
 ] 

Blake Eggleston commented on CASSANDRA-14134:
-

I think we should also preserve the CASSANDRA_DIR/CASSANDRA_VERSION 
configuration options. Being able to set an environment variable or keep an ini 
file can be much more convenient than having to always include it as a CLI option.
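
One way both approaches could coexist, sketched under assumptions: the command line option wins when given, and the environment variable supplies the default otherwise. The variable names CASSANDRA_DIR and CASSANDRA_VERSION are from this comment; the --cassandra-version flag and the build_parser function are hypothetical and only illustrate the fallback order.

```python
import argparse
import os


def build_parser(environ=None):
    """Build a parser whose defaults fall back to environment variables.

    `environ` is injectable so the behaviour can be exercised without
    touching the real process environment.
    """
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(description="Illustrative dtest options")
    parser.add_argument(
        "--cassandra-dir",
        default=environ.get("CASSANDRA_DIR"),
        help="Cassandra source tree (default: $CASSANDRA_DIR)",
    )
    parser.add_argument(
        "--cassandra-version",  # hypothetical flag name
        default=environ.get("CASSANDRA_VERSION"),
        help="Cassandra version to test (default: $CASSANDRA_VERSION)",
    )
    return parser


# An explicit CLI value overrides the environment; otherwise the env value is used.
parser = build_parser({"CASSANDRA_DIR": "/opt/cassandra"})
print(parser.parse_args([]).cassandra_dir)
print(parser.parse_args(["--cassandra-dir", "/tmp/other"]).cassandra_dir)
```

An ini file could be layered in the same way by reading it before building the parser and using its values as a further fallback.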

> Migrate dtests to use pytest and python3
> 
>
> Key: CASSANDRA-14134
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14134
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Testing
>Reporter: Michael Kjellman
>Assignee: Michael Kjellman
>

[jira] [Commented] (CASSANDRA-14134) Migrate dtests to use pytest and python3

2018-01-03 Thread Michael Kjellman (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310108#comment-16310108
 ] 

Michael Kjellman commented on CASSANDRA-14134:
--

[~aweisberg] just pushed up a fix for the missing pytest import on 
metadata_test.py

> Migrate dtests to use pytest and python3
> 
>
> Key: CASSANDRA-14134
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14134
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Testing
>Reporter: Michael Kjellman
>Assignee: Michael Kjellman
>

[jira] [Commented] (CASSANDRA-14134) Migrate dtests to use pytest and python3

2018-01-03 Thread Michael Kjellman (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310099#comment-16310099
 ] 

Michael Kjellman commented on CASSANDRA-14134:
--

[~spo...@gmail.com] good catch! I just pushed a commit to rename the additional 
snowflake test classes that I missed. I assume it's going to take me a few 
additional commits to fix runtime exceptions from Python 3 fallout in these 
test classes, as they weren't being run in my testing thus far.

> Migrate dtests to use pytest and python3
> 
>
> Key: CASSANDRA-14134
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14134
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Testing
>Reporter: Michael Kjellman
>Assignee: Michael Kjellman
>

[jira] [Commented] (CASSANDRA-14134) Migrate dtests to use pytest and python3

2018-01-03 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310095#comment-16310095
 ] 

Ariel Weisberg commented on CASSANDRA-14134:


For --cassandra-dir= you can't use ~/somepath because it won't expand ~. So 
sanitize the readme to not rely on that.
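
An alternative to changing the readme would be to expand ~ in the option handler itself. Shells such as bash typically leave a ~ that follows = in an ordinary argument unexpanded, so the program receives it literally. The sketch below assumes an argparse-style option and a hypothetical expanded_path helper; it is not taken from the branch.

```python
import argparse
import os


def expanded_path(value):
    """argparse type callback (hypothetical): expand a leading ~ and normalise."""
    return os.path.abspath(os.path.expanduser(value))


parser = argparse.ArgumentParser()
parser.add_argument("--cassandra-dir", type=expanded_path,
                    help="Cassandra source tree; a leading ~ is expanded")

# The same form used in the report above, passed through unexpanded by the shell.
args = parser.parse_args(["--cassandra-dir=~/repos/cassandra"])
print(args.cassandra_dir)
```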

> Migrate dtests to use pytest and python3
> 
>
> Key: CASSANDRA-14134
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14134
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Testing
>Reporter: Michael Kjellman
>Assignee: Michael Kjellman
>

[jira] [Commented] (CASSANDRA-14134) Migrate dtests to use pytest and python3

2018-01-03 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310049#comment-16310049
 ] 

Ariel Weisberg commented on CASSANDRA-14134:


Some minor readme feedback.

* Remove INSTALL.md and reference from README.md
* There is no virtualenv brew package; people should probably brew install 
brew-pip and then pip install virtualenv, although I haven't tested this
* ~/.cassandra-dtest is still referenced in the readme and doesn't work anymore

I followed the instructions in the README, patched CCM in the virtual env, used 
commit 05e434d1930298635f1de993483d1f682c3a9380 (HEAD, 
kjellman/dtests_on_pytest_v2) and cassandra commit 
21be9d2f50cc6a6bdceff56389adc015f811d5d6 (HEAD, apache/trunk), and got:
{noformat}
(venv-dtest) aweisberg-MacBook-Pro:cassandra-dtest aweisberg$ pytest --cassandra-dir=~/repos/cassandra
===== test session starts =====
platform darwin -- Python 3.6.4, pytest-3.3.1, py-1.5.2, pluggy-0.6.0
rootdir: /Users/aweisberg/repos/cassandra-dtest, inifile: pytest.ini
plugins: timeout-1.2.1, flaky-3.4.0
collected 1885 items / 1 errors

===Flaky Test Report===

===End Flaky Test Report===
===== ERRORS =====
_____ ERROR collecting metadata_test.py _____
metadata_test.py:10: in <module>
    class TestMetadata(Tester):
metadata_test.py:29: in TestMetadata
    @pytest.mark.skip(reason='hangs CI')
E   NameError: name 'pytest' is not defined
!!!!! Interrupted: 1 errors during collection !!!!!
===== 1133 tests deselected =====
===== 1133 deselected, 1 error in 4.32 seconds =====
(venv-dtest) aweisberg-MacBook-Pro:cassandra-dtest aweisberg$ ./run_dtests.py --cassandra-dir=~/repos/cassandra
===== test session starts =====
{noformat}

> Migrate dtests to use pytest and python3
> 
>
> Key: CASSANDRA-14134
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14134
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Testing
>Reporter: Michael Kjellman
>Assignee: Michael Kjellman
>
> h4. Get the C* dtests running on the pytest framework.
> C* DTests currently run using the python test framework nosetest. This 
> framework has been largely abandoned with no releases since 2015 and a 
> general strong consensus in the python community that pytest is the future.
> h4. Why should we do this.
> Currently (and historically) dtests have always been difficult to run, flaky 
> and unpredictable in CI environments, and almost impossible to debug.
> On November 28th, 2017, I proposed on the dev@ list that we move the dtests 
> from nosetests to pytests. I got replies from Jon Haddad, Philip Thompson, 
> and kurt greaves with really only "+1" like replies to the proposal.
> Since then I've been working pretty much non-stop to complete the large 
> refactor of dtests to pytests. As part of this effort (and because the 
> existing migration tools require it) I also ported the code to python3 
> (from the current python 2.7 based code-base).
> h4. High-level summary of key changes, improvements, and new features.
> * Migrate dtests from executing using the nosetest framework to pytest
> * Port the entire code base from Python 2.7 to Python 3.6
> * Update run_dtests.py to work with pytest
> * Add --dtest-print-tests-only option to run_dtests.py to get easily parsable 
> list of all available collected tests
> * Update README.md for executing the dtests with pytest
> * Add new debugging tips section to README.md to help with some basics of 
> debugging python3 and pytest
> * Migrate all existing Environment Variable usage as a means to control dtest 
> operation modes to argparse command line options with documented help on each 
> toggle's intended usage
> * Migration of old unittest and nose based test structure to modern pytest 
> fixture approach
> * Automatic detection of physical system resources to automatically determine 
> if @pytest.mark.resource_intensive annotated tests should be collected and 
> run on the system where they are being executed
> * new pytest fixture replacements for @since and @pytest.mark.upgrade_test 
> annotations
> * Migration to python logging framework
> * Upgrade thrift bindings to latest version with full python3 compatibility
> * 

[jira] [Commented] (CASSANDRA-13851) Allow existing nodes to use all peers in shadow round

2018-01-03 Thread Sam Tunnicliffe (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309845#comment-16309845
 ] 

Sam Tunnicliffe commented on CASSANDRA-13851:
-

I'm +1 on this latest version, though it occurs to me that there is something 
else we could do to help full cluster bounces that are done in one shot 
(per-replica set or otherwise partial bounces will now proceed ok).

Failure to receive an ack within RING_DELAY will terminate the shadow round, 
fatally for a node not in its own seed list. So if we make non-seeds remain in 
the SR for longer than seeds (e.g. for RING_DELAY * 2), then as long as a 
single seed is contactable, startup should be able to proceed.
 
e.g. all peers have nodes 1, 2 & 3 configured as seeds, but 2 & 3 have failed. 
If the cluster is completely stopped and restarted, node1 will exit its SR 
after RING_DELAY and be available to ack the other nodes' syn requests. Once 
other, non-seeds start to come up, they will also now ack shadow round syns. 
This would increase startup times for a full bounce when some seeds are 
failing/missing, but in "normal" circumstances it would have no impact. 
It wouldn't help if all of the seeds 1, 2 & 3 were down during a full bounce, 
but I'd consider that tradeoff acceptable.
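
To make the timing argument concrete, here is a small sketch of the proposal. It is not Cassandra code: the function names and the {{non_seed_factor}} parameter are made up for illustration, and the 30 second value is only assumed as the ring delay.

```python
RING_DELAY_MS = 30_000  # assumed ring delay for this sketch


def shadow_round_timeout_ms(is_seed, ring_delay_ms=RING_DELAY_MS, non_seed_factor=2):
    """How long a node waits in the shadow round for an ack.

    Seeds wait RING_DELAY; non-seeds wait RING_DELAY * non_seed_factor.
    """
    return ring_delay_ms if is_seed else ring_delay_ms * non_seed_factor


def can_start(is_seed, first_ack_at_ms, ring_delay_ms=RING_DELAY_MS, non_seed_factor=2):
    """Whether startup proceeds.

    A seed may leave the shadow round without an ack. A non-seed must have
    received an ack before its own timeout expires, otherwise it fails.
    first_ack_at_ms is None when no ack ever arrives.
    """
    if is_seed:
        return True
    if first_ack_at_ms is None:
        return False
    return first_ack_at_ms <= shadow_round_timeout_ms(False, ring_delay_ms, non_seed_factor)


# Full bounce, seeds 2 and 3 down: node1 (a seed) leaves its shadow round
# after RING_DELAY, so the earliest ack a non-seed can get is just after that.
ack_time = RING_DELAY_MS + 1_000

print(can_start(False, ack_time, non_seed_factor=1))  # same timeout as seeds
print(can_start(False, ack_time, non_seed_factor=2))  # proposed longer timeout
print(can_start(False, None))                         # all seeds down
```

With equal timeouts the non-seed gives up before node1 can answer; with the doubled timeout it gets the ack in time, and the all-seeds-down case still fails as described.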


> Allow existing nodes to use all peers in shadow round
> -
>
> Key: CASSANDRA-13851
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13851
> Project: Cassandra
>  Issue Type: Bug
>  Components: Lifecycle
>Reporter: Kurt Greaves
>Assignee: Kurt Greaves
> Fix For: 3.11.x, 4.x
>
>
> In CASSANDRA-10134 we made collision checks necessary on every startup. A 
> side-effect was introduced that then requires a node's seeds to be contacted 
> on every startup. Prior to this change an existing node could start up 
> regardless of whether it could contact a seed node or not (because 
> checkForEndpointCollision() was only called for bootstrapping nodes). 
> Now if a node's seeds are removed/deleted/fail it will no longer be able to 
> start up until live seeds are configured (or it is itself made a seed), even 
> though it already knows about the rest of the ring. This is inconvenient for 
> operators and has the potential to cause some nasty surprises and increase 
> downtime.
> One solution would be to use all of a node's existing peers as seeds in the 
> shadow round. Not a Gossip guru though so not sure of implications.






[jira] [Updated] (CASSANDRA-14103) Fix potential race during compaction strategy reload

2018-01-03 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-14103:

Status: Patch Available  (was: Open)

The patch below implements the solution described above: the 
{{CompactionStrategyManager}} keeps its own SSTable set, which is updated when 
it receives notifications from the tracker. This should prevent sstables from 
being added twice if the strategies are reloaded by another thread while a 
tracker notification is being processed. I also added a test to check that the 
sstables are properly added to the compaction strategies when receiving tracker 
notifications.
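
As a rough illustration of the idea (not the actual patch, which is Java), the sketch below keeps a manager-owned set that only changes on notifications and rebuilds the strategy from that set on reload. The class and method names are hypothetical, and the list stands in for a strategy that does not deduplicate on its own.

```python
import threading


class StrategyManagerSketch:
    """Toy model of a manager that tracks its own sstable set."""

    def __init__(self):
        self._lock = threading.Lock()
        self._sstables = set()   # manager's view, updated only from notifications
        self._strategy = []      # stand-in for a non-deduplicating strategy

    def handle_added(self, sstable):
        """Process an 'sstable added' notification at most once per sstable."""
        with self._lock:
            if sstable in self._sstables:
                return  # already known, so do not add it to the strategy again
            self._sstables.add(sstable)
            self._strategy.append(sstable)

    def reload(self):
        """Rebuild the strategy from the manager's own set, not from the tracker."""
        with self._lock:
            self._strategy = sorted(self._sstables)

    def strategy_contents(self):
        with self._lock:
            return list(self._strategy)


manager = StrategyManagerSketch()
manager.handle_added("sstable-1")
manager.reload()                   # e.g. triggered by a disk boundary change
manager.handle_added("sstable-1")  # duplicate notification is ignored
manager.handle_added("sstable-2")
print(manager.strategy_contents())
```

Because both paths go through the same set under one lock, it no longer matters which of the reload and the notification runs first.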

On the trunk patch I also fixed a bad merge from CASSANDRA-14082 
([here|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-14103#diff-1d4755900f9e76a3cf93810d98189951L706]).

CI looks good:

||3.11||trunk||
|[branch|https://github.com/apache/cassandra/compare/cassandra-3.11...pauloricardomg:3.11-14103]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-14103]|
|[testall|https://issues.apache.org/jira/secure/attachment/12904399/3.11-14103-testall.png]|[testall|https://issues.apache.org/jira/secure/attachment/12904401/trunk-14103-testall.png]|
|[dtest|https://issues.apache.org/jira/secure/attachment/12904398/3.11-14103-dtest.png]|[dtest|https://issues.apache.org/jira/secure/attachment/12904400/trunk-14103-dtest.png]|

> Fix potential race during compaction strategy reload
> 
>
> Key: CASSANDRA-14103
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14103
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Minor
> Attachments: 3.11-14103-dtest.png, 3.11-14103-testall.png, 
> trunk-14103-dtest.png, trunk-14103-testall.png
>
>
> When the compaction strategies are reloaded after disk boundary changes 
> (CASSANDRA-13948), it's possible that a recently finished SSTable is added 
> twice to the compaction strategy: once when the compaction strategies are 
> reloaded due to the disk boundary change ({{maybeReloadDiskBoundaries}}), and 
> again when the {{CompactionStrategyManager}} is processing the 
> {{SSTableAddedNotification}}.
> This should be quite unlikely because a compaction would have to finish just 
> as the disk boundary changes, and even if it happens most compaction strategies 
> would not be affected by it since they deduplicate sstables internally, but 
> we should protect against such a scenario. 
> For more context see [this 
> comment|https://issues.apache.org/jira/browse/CASSANDRA-13948?focusedCommentId=16280448&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16280448]
>  from Marcus.






[jira] [Updated] (CASSANDRA-14103) Fix potential race during compaction strategy reload

2018-01-03 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-14103:

Attachment: trunk-14103-testall.png
trunk-14103-dtest.png
3.11-14103-testall.png
3.11-14103-dtest.png

> Fix potential race during compaction strategy reload
> 
>
> Key: CASSANDRA-14103
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14103
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Minor
> Attachments: 3.11-14103-dtest.png, 3.11-14103-testall.png, 
> trunk-14103-dtest.png, trunk-14103-testall.png
>
>
> When the compaction strategies are reloaded after disk boundary changes 
> (CASSANDRA-13948), it's possible that a recently finished SSTable is added 
> twice to the compaction strategy: once when the compaction strategies are 
> reloaded due to the disk boundary change ({{maybeReloadDiskBoundaries}}), and 
> again when the {{CompactionStrategyManager}} is processing the 
> {{SSTableAddedNotification}}.
> This should be quite unlikely because a compaction would have to finish just 
> as the disk boundary changes, and even if it happens most compaction strategies 
> would not be affected by it since they deduplicate sstables internally, but 
> we should protect against such a scenario. 
> For more context see [this 
> comment|https://issues.apache.org/jira/browse/CASSANDRA-13948?focusedCommentId=16280448&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16280448]
>  from Marcus.






[jira] [Comment Edited] (CASSANDRA-14134) Migrate dtests to use pytest and python3

2018-01-03 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309483#comment-16309483
 ] 

Stefan Podkowinski edited comment on CASSANDRA-14134 at 1/3/18 11:10 AM:
-

There are a couple of tests left that need renaming to get picked up by pytest. 
Class names now need to start with {{Test}} if they have {{test_}} methods that 
should be run and are not used purely as a superclass.

{noformat}
> find . -name \*.py | xargs egrep '^class [^(]+\(Tester\)' | grep -v 'class Test'
./thrift_hsha_test.py:class ThriftHSHATest(Tester):
./cql_test.py:class CQLTester(Tester):
./delete_insert_test.py:class DeleteInsertTest(Tester):
./sstable_generation_loading_test.py:class BaseSStableLoaderTest(Tester):
./replace_address_test.py:class BaseReplaceAddressTest(Tester):
./snapshot_test.py:class SnapshotTester(Tester):
./paging_test.py:class BasePagingTester(Tester):
./replication_test.py:class ReplicationTest(Tester):
./replication_test.py:class SnitchConfigurationUpdateTest(Tester):
./thrift_test.py:class ThriftTester(Tester):
./cqlsh_tests/cqlsh_copy_tests.py:class CqlshCopyTest(Tester):
./cqlsh_tests/cqlsh_tests.py:class CqlshSmokeTest(Tester):
./cqlsh_tests/cqlsh_tests.py:class CqlLoginTest(Tester):
./sstableutil_test.py:class SSTableUtilTest(Tester):
./upgrade_tests/compatibility_flag_test.py:class CompatibilityFlagTest(Tester):
./upgrade_tests/thrift_upgrade_test.py:class UpgradeSuperColumnsThrough(Tester):
./upgrade_tests/upgrade_through_versions_test.py:class UpgradeTester(Tester):
./upgrade_tests/upgrade_compact_storage.py:class UpgradeSuperColumnsThrough(Tester):
./json_test.py:class ToJsonSelectTests(Tester):
./json_test.py:class FromJsonUpdateTests(Tester):
./json_test.py:class FromJsonSelectTests(Tester):
./json_test.py:class FromJsonInsertTests(Tester):
./json_test.py:class FromJsonDeleteTests(Tester):
./json_test.py:class JsonFullRowInsertSelect(Tester):
./native_transport_ssl_test.py:class NativeTransportSSL(Tester):
./repair_tests/preview_repair_test.py:class PreviewRepairTest(Tester):
./repair_tests/repair_test.py:class BaseRepairTest(Tester):
{noformat}
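
For anyone who prefers to run the same check from Python, the following is a rough equivalent of the find/egrep/grep pipeline above. The function name is made up for this sketch. It filters on the class name itself rather than on the whole line, which should give the same result for lines that match the first pattern.

```python
import re
from pathlib import Path

# Same pattern as the egrep above: a class directly subclassing Tester.
CLASS_RE = re.compile(r'^class ([^(]+)\(Tester\)', re.MULTILINE)


def unprefixed_tester_classes(root):
    """Return (relative path, class name) pairs for Tester subclasses
    whose name does not start with 'Test'."""
    hits = []
    for path in sorted(Path(root).rglob('*.py')):
        text = path.read_text(encoding='utf-8')
        for name in CLASS_RE.findall(text):
            if not name.startswith('Test'):
                hits.append((path.relative_to(root).as_posix(), name))
    return hits


if __name__ == '__main__':
    # Run from the root of a cassandra-dtest checkout.
    for rel_path, name in unprefixed_tester_classes('.'):
        print(f'./{rel_path}:class {name}(Tester):')
```

Note that base classes such as BasePagingTester show up too; as said above, those only need renaming if their {{test_}} methods are meant to run directly.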




was (Author: spo...@gmail.com):
There are a couple of tests left that need renaming to get picked up by pytest. 
Class names need to start with {{Test}} now.

{noformat}
> egrep '^class [^(]+\(Tester\)' *.py | grep -v 'class Test'
cql_test.py:class CQLTester(Tester):
delete_insert_test.py:class DeleteInsertTest(Tester):
json_test.py:class ToJsonSelectTests(Tester):
json_test.py:class FromJsonUpdateTests(Tester):
json_test.py:class FromJsonSelectTests(Tester):
json_test.py:class FromJsonInsertTests(Tester):
json_test.py:class FromJsonDeleteTests(Tester):
json_test.py:class JsonFullRowInsertSelect(Tester):
native_transport_ssl_test.py:class NativeTransportSSL(Tester):
paging_test.py:class BasePagingTester(Tester):
replace_address_test.py:class BaseReplaceAddressTest(Tester):
replication_test.py:class ReplicationTest(Tester):
replication_test.py:class SnitchConfigurationUpdateTest(Tester):
snapshot_test.py:class SnapshotTester(Tester):
sstable_generation_loading_test.py:class BaseSStableLoaderTest(Tester):
sstableutil_test.py:class SSTableUtilTest(Tester):
thrift_hsha_test.py:class ThriftHSHATest(Tester):
thrift_test.py:class ThriftTester(Tester):
{noformat}


> Migrate dtests to use pytest and python3
> 
>
> Key: CASSANDRA-14134
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14134
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Testing
>Reporter: Michael Kjellman
>Assignee: Michael Kjellman
>
> h4. Get the C* dtests running on the pytest framework.
> C* DTests currently run using the python test framework nosetest. This 
> framework has been largely abandoned with no releases since 2015 and a 
> general strong consensus in the python community that pytest is the future.
> h4. Why should we do this.
> Currently (and historically) dtests have always been difficult to run, flaky 
> and unpredictable in CI environments, and almost impossible to debug.
> On November 28th, 2017, I proposed on the dev@ list that we move the dtests 
> from nosetests to pytests. I got replies from Jon Haddad, Philip Thompson, 
> and kurt greaves with really only "+1" like replies to the proposal.
> Since then I've been working pretty much non-stop to complete the large 
> refactor of dtests to pytests. As part of this effort (and because the 
> existing migration tools require it) I also ported the code to python3 
> (from the current python 2.7 based code-base).
> h4. High-level summary of key changes, improvements, and new features.
> * Migrate dtests from executing using the nosetest framework to pytest
> * Port the entire code base from Python 2.7 to Python 3.6
> * Update run_dtests.py 

[jira] [Commented] (CASSANDRA-14134) Migrate dtests to use pytest and python3

2018-01-03 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309483#comment-16309483
 ] 

Stefan Podkowinski commented on CASSANDRA-14134:


There are a couple of tests left that need renaming to get picked up by pytest. 
Class names need to start with {{Test}} now.

{noformat}
> egrep '^class [^(]+\(Tester\)' *.py | grep -v 'class Test'
cql_test.py:class CQLTester(Tester):
delete_insert_test.py:class DeleteInsertTest(Tester):
json_test.py:class ToJsonSelectTests(Tester):
json_test.py:class FromJsonUpdateTests(Tester):
json_test.py:class FromJsonSelectTests(Tester):
json_test.py:class FromJsonInsertTests(Tester):
json_test.py:class FromJsonDeleteTests(Tester):
json_test.py:class JsonFullRowInsertSelect(Tester):
native_transport_ssl_test.py:class NativeTransportSSL(Tester):
paging_test.py:class BasePagingTester(Tester):
replace_address_test.py:class BaseReplaceAddressTest(Tester):
replication_test.py:class ReplicationTest(Tester):
replication_test.py:class SnitchConfigurationUpdateTest(Tester):
snapshot_test.py:class SnapshotTester(Tester):
sstable_generation_loading_test.py:class BaseSStableLoaderTest(Tester):
sstableutil_test.py:class SSTableUtilTest(Tester):
thrift_hsha_test.py:class ThriftHSHATest(Tester):
thrift_test.py:class ThriftTester(Tester):
{noformat}


> Migrate dtests to use pytest and python3
> 
>
> Key: CASSANDRA-14134
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14134
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Testing
>Reporter: Michael Kjellman
>Assignee: Michael Kjellman
>
> h4. Get the C* dtests running on the pytest framework.
> C* DTests currently run using the python test framework nosetest. This 
> framework has been largely abandoned with no releases since 2015 and a 
> general strong consensus in the python community that pytest is the future.
> h4. Why should we do this.
> Currently (and historically) dtests have always been difficult to run, flaky 
> and unpredictable in CI environments, and almost impossible to debug.
> On November 28th, 2017, I proposed on the dev@ list that we move the dtests 
> from nosetests to pytests. I got replies from Jon Haddad, Philip Thompson, 
> and kurt greaves with really only "+1" like replies to the proposal.
> Since then I've been working pretty much non-stop to complete the large 
> refactor of dtests to pytests. As part of this effort (and because the 
> existing migration tools require it) I also ported the code to python3 
> (from the current python 2.7 based code-base).
> h4. High-level summary of key changes, improvements, and new features.
> * Migrate dtests from executing using the nosetest framework to pytest
> * Port the entire code base from Python 2.7 to Python 3.6
> * Update run_dtests.py to work with pytest
> * Add --dtest-print-tests-only option to run_dtests.py to get easily parsable 
> list of all available collected tests
> * Update README.md for executing the dtests with pytest
> * Add new debugging tips section to README.md to help with some basics of 
> debugging python3 and pytest
> * Migrate all existing Environment Variable usage as a means to control dtest 
> operation modes to argparse command line options with documented help on each 
> toggle's intended usage
> * Migration of old unittest and nose based test structure to modern pytest 
> fixture approach
> * Automatic detection of physical system resources to automatically determine 
> if @pytest.mark.resource_intensive annotated tests should be collected and 
> run on the system where they are being executed
> * new pytest fixture replacements for @since and @pytest.mark.upgrade_test 
> annotations
> * Migration to python logging framework
> * Upgrade thrift bindings to latest version with full python3 compatibility
> * Remove deprecated cql and pycassa dependencies and migrate any remaining 
> tests to fully remove those dependencies
> * Fixed dozens of tests that would hang the pytest framework forever when run 
> in CI environments
> * Ran code nearly 300 times in CircleCI during the migration to find, 
> identify, and fix any tests capable of hanging CI
> * Upgrade Tests do not yet run in CI and still need additional migration work 
> (although all upgrade test classes compile successfully)
> I started with the *nose2pytest* [https://github.com/pytest-dev/nose2pytest] 
> migration tool. As this required python 3 language support I found myself 
> down the 2to3 python migration path. While painful to do this, the benefits 
> of python3 over python2.7 are numerous and moving to python3 for the 
> additional debugging tools now available to use when fixing dtests makes the 
> effort worth it for that reason alone!
> After the automated tools did their thing I began what was a much longer and