[jira] [Assigned] (CASSANDRA-16083) Missing JMX objects and attributes upgrading from 3.0 to 4.0

2020-10-01 Thread Uchenna (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uchenna reassigned CASSANDRA-16083:
---

Assignee: Uchenna

> Missing JMX objects and attributes upgrading from 3.0 to 4.0
> 
>
> Key: CASSANDRA-16083
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16083
> Project: Cassandra
>  Issue Type: Task
>  Components: Observability/Metrics
>Reporter: David Capwell
>Assignee: Uchenna
>Priority: Normal
> Fix For: 4.0-beta, 4.0-triage
>
>
> Using the tools added in CASSANDRA-16082, below is the list of metrics 
> missing in 4.0 but present in 3.0.  The work here is to make sure each metric 
> was properly deprecated and, if not, to add it back.
> {code}
> $ tools/bin/jmxtool diff -f yaml cassandra-3.0-jmx.yaml trunk-jmx.yaml 
> --ignore-missing-on-left
> Objects not in right:
> org.apache.cassandra.metrics:type=Table,keyspace=system,scope=schema_columnfamilies,name=CasPrepareLatency
> org.apache.cassandra.metrics:type=Table,keyspace=system,scope=batchlog,name=EstimatedPartitionSizeHistogram
> org.apache.cassandra.metrics:type=Table,keyspace=system,scope=hints,name=BloomFilterFalseRatio
> org.apache.cassandra.metrics:type=Table,keyspace=system,scope=views_builds_in_progress,name=ReplicaFilteringProtectionRequests
> org.apache.cassandra.metrics:type=ColumnFamily,keyspace=system,scope=batchlog,name=RowCacheHitOutOfRange
> org.apache.cassandra.metrics:type=Table,keyspace=system,scope=views_builds_in_progress,name=CasPrepareLatency
> org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=CounterMutationStage,name=MaxPoolSize
> org.apache.cassandra.metrics:type=ColumnFamily,keyspace=system,scope=views_builds_in_progress,name=ColUpdateTimeDeltaHistogram
> org.apache.cassandra.metrics:type=ColumnFamily,keyspace=system,scope=batchlog,name=TombstoneScannedHistogram
> org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=ReadStage,name=ActiveTasks
> org.apache.cassandra.metrics:type=Table,keyspace=system,scope=hints,name=WaitingOnFreeMemtableSpace
> org.apache.cassandra.metrics:type=Table,keyspace=system,scope=schema_columnfamilies,name=CasCommitTotalLatency
> org.apache.cassandra.metrics:type=Table,keyspace=system,scope=hints,name=MemtableOnHeapSize
> org.apache.cassandra.metrics:type=Table,keyspace=system,scope=schema_aggregates,name=CasProposeLatency
> org.apache.cassandra.metrics:type=ColumnFamily,keyspace=system,scope=batchlog,name=AllMemtablesLiveDataSize
> org.apache.cassandra.metrics:type=ColumnFamily,keyspace=system,scope=hints,name=ViewReadTime
> org.apache.cassandra.db:type=HintedHandoffManager
> org.apache.cassandra.metrics:type=Table,keyspace=system,scope=batchlog,name=BloomFilterDiskSpaceUsed
> org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=RequestResponseStage,name=PendingTasks
> org.apache.cassandra.metrics:type=ColumnFamily,keyspace=system,scope=views_builds_in_progress,name=MemtableSwitchCount
> org.apache.cassandra.metrics:type=ColumnFamily,keyspace=system,scope=hints,name=MemtableOnHeapSize
> org.apache.cassandra.metrics:type=Table,keyspace=system,scope=range_xfers,name=ReplicaFilteringProtectionRowsCachedPerQuery
> org.apache.cassandra.metrics:type=ColumnFamily,keyspace=system,scope=views_builds_in_progress,name=SnapshotsSize
> org.apache.cassandra.metrics:type=Table,keyspace=system,scope=batchlog,name=RecentBloomFilterFalsePositives
> org.apache.cassandra.metrics:type=Table,keyspace=system,scope=views_builds_in_progress,name=ColUpdateTimeDeltaHistogram
> org.apache.cassandra.metrics:type=ColumnFamily,keyspace=system,scope=range_xfers,name=SpeculativeRetries
> org.apache.cassandra.metrics:type=Table,keyspace=system,scope=batchlog,name=LiveDiskSpaceUsed
> org.apache.cassandra.metrics:type=Table,keyspace=system,scope=views_builds_in_progress,name=ViewReadTime
> org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=MigrationStage,name=CompletedTasks
> org.apache.cassandra.metrics:type=Table,keyspace=system,scope=batchlog,name=AllMemtablesLiveDataSize
> org.apache.cassandra.metrics:type=Table,keyspace=system,scope=batchlog,name=ViewReadTime
> org.apache.cassandra.metrics:type=ColumnFamily,keyspace=system,scope=hints,name=BloomFilterFalsePositives
> org.apache.cassandra.metrics:type=ColumnFamily,keyspace=system,scope=range_xfers,name=CompressionMetadataOffHeapMemoryUsed
> org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=ReadStage,name=TotalBlockedTasks
> org.apache.cassandra.metrics:type=ColumnFamily,keyspace=system,scope=views_builds_in_progress,name=LiveScannedHistogram
> org.apache.cassandra.db:type=Tables,keyspace=system,table=views_builds_in_progress
> org.apache.cassandra.metrics:type=ThreadPools,path=internal,scope=MiscStage,name=ActiveTasks
> 
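
For readers who want to reproduce the comparison above without the tool, the core of such a diff can be sketched as a set difference over MBean object names. This is a minimal Python sketch and not how jmxtool is implemented; `parse_object_name`, `missing_on_right`, and `group_by_type` are hypothetical helper names, and the `right` list is made up purely for illustration. It assumes unquoted key property values with no embedded commas.

```python
def parse_object_name(name):
    """Split 'domain:key=value,...' into (domain, dict of key properties)."""
    domain, _, props = name.partition(":")
    pairs = (item.split("=", 1) for item in props.split(",") if item)
    return domain, dict(pairs)


def missing_on_right(left, right):
    """Return the names present in `left` but absent from `right`, sorted."""
    return sorted(set(left) - set(right))


def group_by_type(names):
    """Bucket object names by their 'type' key property."""
    grouped = {}
    for name in names:
        _, props = parse_object_name(name)
        grouped.setdefault(props.get("type", "<none>"), []).append(name)
    return grouped


# Names on the left are taken from the list above.
left = [
    "org.apache.cassandra.db:type=HintedHandoffManager",
    "org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=ReadStage,name=ActiveTasks",
    "org.apache.cassandra.metrics:type=Table,keyspace=system,scope=hints,name=MemtableOnHeapSize",
]
# Hypothetical right-hand side: pretend only one name survived.
right = [
    "org.apache.cassandra.metrics:type=Table,keyspace=system,scope=hints,name=MemtableOnHeapSize",
]

missing = missing_on_right(left, right)
by_type = group_by_type(missing)
```

Grouping the missing names by `type` makes it easier to see whether a whole family (for example every `ThreadPools` object) disappeared or only single attributes did.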

[jira] [Commented] (CASSANDRA-16153) Cassandra 4b2 - JVM options from *.options not read/set

2020-10-01 Thread Thomas Steinmaurer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205973#comment-17205973
 ] 

Thomas Steinmaurer commented on CASSANDRA-16153:


[^system.log.2020-10-01.0.zip]

Search for:
{noformat}
...
INFO  [main] 2020-10-01 06:17:53,135 CassandraDaemon.java:507 - Hostname: 
ip-X-Y-68-230:7000:7001
...
{noformat}

> Cassandra 4b2 - JVM options from *.options not read/set
> ---
>
> Key: CASSANDRA-16153
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16153
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Scripts
>Reporter: Thomas Steinmaurer
>Priority: Normal
> Attachments: system.log.2020-10-01.0.zip
>
>
> Trying out Cassandra 4 beta 2 with Java 8 (AdoptOpenJDK) in AWS.
> {noformat}
> NAME="Amazon Linux AMI"
> VERSION="2018.03"
> ID="amzn"
> ID_LIKE="rhel fedora"
> VERSION_ID="2018.03"
> PRETTY_NAME="Amazon Linux AMI 2018.03"
> ANSI_COLOR="0;33"
> CPE_NAME="cpe:/o:amazon:linux:2018.03:ga"
> HOME_URL="http://aws.amazon.com/amazon-linux-ami/"
> {noformat}
> It seems the Cassandra JVM ends up using Parallel GC.
> {noformat}
> INFO  [Service Thread] 2020-10-01 00:00:56,233 GCInspector.java:299 - PS 
> Scavenge GC in 541ms.  PS Old Gen: 5152844776 -> 5726724752;
> WARN  [Service Thread] 2020-10-01 00:00:56,234 GCInspector.java:297 - PS 
> MarkSweep GC in 1969ms.  PS Eden Space: 2111307776 -> 0; PS Old Gen: 
> 5726724752 -> 2581334376; PS Survivor Space: 363850224 -> 0
> {noformat}
> Although {{jvm8-server.options}} configures CMS.
> {noformat}
> #
> #  GC SETTINGS  #
> #
> ### CMS Settings
> -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC
> -XX:+CMSParallelRemarkEnabled
> -XX:SurvivorRatio=8
> -XX:MaxTenuringThreshold=1
> -XX:CMSInitiatingOccupancyFraction=75
> -XX:+UseCMSInitiatingOccupancyOnly
> -XX:CMSWaitDuration=1
> -XX:+CMSParallelInitialMarkEnabled
> -XX:+CMSEdenChunksRecordAlways
> ## some JVMs will fill up their heap when accessed via JMX, see CASSANDRA-6541
> -XX:+CMSClassUnloadingEnabled
> ...
> {noformat}
> In Cassandra 3, the default has been CMS.
> So, possibly there is something wrong in reading/processing 
> {{jvm8-server.options}}?
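
The mismatch described above can be checked mechanically: compare the collector requested in the options file with the collector names that show up in the GCInspector log lines. The following is a minimal Python sketch under a few assumptions: the log lines are unwrapped (one entry per line), the collector names are the ones HotSpot reports ("PS Scavenge"/"PS MarkSweep" for Parallel GC, "ParNew"/"ConcurrentMarkSweep" for CMS), and `requested_collector` and `collectors_in_log` are hypothetical helper names, not part of Cassandra.

```python
import re

# Collector names as reported by HotSpot's garbage collector MXBeans.
COLLECTOR_FAMILIES = {
    "PS Scavenge": "Parallel",
    "PS MarkSweep": "Parallel",
    "ParNew": "CMS",
    "ConcurrentMarkSweep": "CMS",
    "G1 Young Generation": "G1",
    "G1 Old Generation": "G1",
}

# Matches e.g. "GCInspector.java:299 - PS Scavenge GC in 541ms."
GC_LINE = re.compile(r"GCInspector\.java:\d+ - (?P<name>.+?) GC in \d+ms")


def requested_collector(option_lines):
    """Return the collector family selected by uncommented -XX flags, if any."""
    flags = {line.strip() for line in option_lines}
    if "-XX:+UseConcMarkSweepGC" in flags:
        return "CMS"
    if "-XX:+UseG1GC" in flags:
        return "G1"
    if "-XX:+UseParallelGC" in flags:
        return "Parallel"
    return None


def collectors_in_log(log_lines):
    """Return the set of collector families seen in GCInspector log lines."""
    families = set()
    for line in log_lines:
        match = GC_LINE.search(line)
        if match:
            families.add(COLLECTOR_FAMILIES.get(match.group("name"), "unknown"))
    return families


options = ["### CMS Settings", "-XX:+UseParNewGC", "-XX:+UseConcMarkSweepGC"]
log = [
    "INFO  [Service Thread] 2020-10-01 00:00:56,233 GCInspector.java:299 - "
    "PS Scavenge GC in 541ms.  PS Old Gen: 5152844776 -> 5726724752;",
    "WARN  [Service Thread] 2020-10-01 00:00:56,234 GCInspector.java:297 - "
    "PS MarkSweep GC in 1969ms.  PS Eden Space: 2111307776 -> 0;",
]

wanted = requested_collector(options)
seen = collectors_in_log(log)
mismatch = wanted is not None and seen != {wanted}
```

Parallel GC is the HotSpot default on server-class machines under Java 8, so seeing it in the log is consistent with no GC flag having reached the JVM at all, which is what the report suspects.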



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16153) Cassandra 4b2 - JVM options from *.options not read/set

2020-10-01 Thread Thomas Steinmaurer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Steinmaurer updated CASSANDRA-16153:
---
Attachment: system.log.2020-10-01.0.zip







[jira] [Commented] (CASSANDRA-16153) Cassandra 4b2 - JVM options from *.options not read/set

2020-10-01 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205969#comment-17205969
 ] 

Brandon Williams commented on CASSANDRA-16153:
--

Can you attach a log from startup?







[jira] [Commented] (CASSANDRA-16153) Cassandra 4b2 - JVM options from *.options not read/set

2020-10-01 Thread Thomas Steinmaurer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205967#comment-17205967
 ] 

Thomas Steinmaurer commented on CASSANDRA-16153:


[~brandon.williams], no. 4 vCores (m5.xlarge).







[jira] [Commented] (CASSANDRA-15583) 4.0 quality testing: Tooling, Bundled and First Party

2020-10-01 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205962#comment-17205962
 ] 

Berenguer Blasi commented on CASSANDRA-15583:
-

All tool flags are being run and the tools don't 'explode' so far, so to 
speak. Even testing 'non-esoteric' flags imo would be a huge amount of work: test the 
param, invalid values, corner cases, observe the effects, fix the tool as you 
go along (I can tell you from what I've seen so far that 95% of them only 
account for the happy path)... Nodetool alone is mammoth in that regard. I 
don't think that middle ground can be done in a timely manner and it could 
easily be 50% of the overall work left to be done. Finger in the air, personal 
opinion though...

> 4.0 quality testing: Tooling, Bundled and First Party
> -
>
> Key: CASSANDRA-15583
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15583
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/python, Test/unit
>Reporter: Josh McKenzie
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta, 4.0-triage
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> *Shepherd: Sam Tunnicliffe*
> Test plans should cover bundled first-party tooling and CLIs such as 
> nodetool, cqlsh, and new tools supporting full query and audit logging 
> (CASSANDRA-13983, CASSANDRA-12151).
> *Progress as of Aug 2020*
> {{ToolRunner}} has been added, enabling us to test tools in Java unit tests. 
> This includes handling their stdout/stderr and stdin. Most tools have a 
> starting unit test covering the happy path of their command-line args. Tickets 
> have been created to improve coverage of those and flagged LHF. Also, for tools 
> too big to be addressed in a single ticket, such as nodetool, a 
> placeholder ticket for future improvements has been created as well. Tickets 
> and status are:
> ||Tool||UX test||UT coverage||dtest coverage||Comments||
> |Nodetool|(x)|(x) CASSANDRA-16026|(!)|Not all the sub commands are tested. Dtest also test nodetool as a side effect|
> |Cqlsh|(x)|(x) CASSANDRA-16025|(!)| |
> |Cassandra-stress|(x)|(x) CASSANDRA-16024|(x)| |
> |debug-cql|(x)|(x) CASSANDRA-16023|(x)| |
> |fqltool|(x)|(/) CASSANDRA-16022|(!)| |
> |auditlogviewer|(/) CASSANDRA-15991|(!) CASSANDRA-16021|(!)| |
> |*Sstable utilities*| | | | |
> |sstabledump|(/) CASSANDRA-15991|(/) CASSANDRA-16020|(!)| |
> |sstableexpiredblockers|(/) CASSANDRA-15991|(x) CASSANDRA-16019|(!)| |
> |sstablelevelreset|(/) CASSANDRA-15991|(x) CASSANDRA-16018|(!)| |
> |sstableloader|(x)|(x) CASSANDRA-16017|(!)| |
> |sstablemetadata|(/) CASSANDRA-15991|(x) CASSANDRA-16016|(x)|Ran in dtests, no dedicated test|
> |sstableofflinerelevel|(/) CASSANDRA-15991|(x) CASSANDRA-16015|(!)| |
> |sstablerepairedset|(/) CASSANDRA-15991|(x) CASSANDRA-16014|(x)|Ran in dtests, no dedicated test|
> |sstablescrub|(/) CASSANDRA-15991|(x) CASSANDRA-16013|(!)| |
> |sstablesplit|(/) CASSANDRA-15991|(x) CASSANDRA-16012|(!)| |
> |sstableupgrade|(/) CASSANDRA-15991|(x) CASSANDRA-16011|(!)| |
> |sstableutil|(/) CASSANDRA-15991|(x) CASSANDRA-16010|(!)| |
> |sstableverify|(/) CASSANDRA-15991|(x) CASSANDRA-16009|(!)| |






[jira] [Commented] (CASSANDRA-15585) 4.0 quality testing: Test Frameworks, Tooling, Infra / Automation

2020-10-01 Thread Jordan West (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205951#comment-17205951
 ] 

Jordan West commented on CASSANDRA-15585:
-

I think focusing the remainder of this ticket on better automation (primarily 
around 4.0 testing for now, but I agree re: improving it for other contributors 
as well) would be a good way to drive it to done. We can break out the 
cassandra-diff project into something separate, as that is more ambitious and it 
sounds like it is likely to evolve outside of the 4.0 path. 

It would be nice to see some Harry tests running regularly on, say, ASF hardware 
(I am not really the one to comment on where the best place to run them is), 
either on a cadence or as part of builds (whichever makes more sense depending 
on the test). 

> 4.0 quality testing: Test Frameworks, Tooling, Infra / Automation
> -
>
> Key: CASSANDRA-15585
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15585
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/python
>Reporter: Josh McKenzie
>Priority: Normal
> Fix For: 4.0-beta, 4.0-triage
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> *Shepherd: Jordan West*
> This area refers to contributions to test frameworks/tooling (e.g., dtests, 
> QuickTheories, CASSANDRA-14821), and automation enabling those tools to be 
> applied at scale (e.g., replay testing via Spark-based replay of captured FQL 
> logs).






[jira] [Commented] (CASSANDRA-16148) Test failures caused by merging CASSANDRA-15833

2020-10-01 Thread Jordan West (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205950#comment-17205950
 ] 

Jordan West commented on CASSANDRA-16148:
-

Added the memoization changes as well and pushed. Unfortunately the build will 
be broken until we get the in-jvm dtest changes released. 

> Test failures caused by merging CASSANDRA-15833
> ---
>
> Key: CASSANDRA-16148
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16148
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
>
> Three issues were caused by merging CASSANDRA-15833:
> 1. `GossiperTest#testHaveAnyVersion3Nodes` was failing on trunk: 
> https://app.circleci.com/pipelines/github/jrwest/cassandra/53/workflows/95f9f401-1ef8-4b8d-9c64-3703d9669d95/jobs/771
> 2. python dtest ReadRepairTest#test_atomic_writes[blocking] was failing
> 3. In-jvm dtests being worked on as part of CASSANDRA-15977 uncovered an 
> issue with how CASSANDRA-15833 changes interacted with in-jvm dtests running 
> without {{Feature.GOSSIP}}






[jira] [Commented] (CASSANDRA-15921) 4.0 quality testing: Materialized View

2020-10-01 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205941#comment-17205941
 ] 

Josh McKenzie commented on CASSANDRA-15921:
---

[~jasonstack] - any movement on this? Do we have a backlog of things we've 
identified we want to work on or suspect need TLC that we can enumerate here?

> 4.0 quality testing: Materialized View
> --
>
> Key: CASSANDRA-15921
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15921
> Project: Cassandra
>  Issue Type: Task
>  Components: Feature/Materialized Views
>Reporter: Zhao Yang
>Assignee: Zhao Yang
>Priority: Normal
> Fix For: 4.x
>
> Attachments: C40_MV.png
>
>
> The main purpose of this ticket is to get a better understanding of 4.0 MV 
> status as a guideline for future improvements. I don't think it should block 
> the 4.0 release since it's already marked as experimental.
> Main areas to test:
>  * Write perf: We expect to see [10% write throughput drop per MV 
> added|https://www.datastax.com/blog/2016/05/materialized-view-performance-cassandra-3x].
> ** Attached C40_MV.png is alpha-4, 5-node, rf3 MV write tests: with 1 MV, 
> throughput dropped 50%
>  * Read perf: identical to normal table
>  * Bootstrap/Decommission: no write-path required since CASSANDRA-13065
>  * Repair: write path required
>  * Chaos monkey: take down coordinator/base-replica/view-replica during 
> read/write/token-movement and verify data consistency (may need a tool)
>  * Hint Replay: able to throttle if table has MV - CASSANDRA-13810
>  * Schema race: create/drop - CASSANDRA-15845/CASSANDRA-15918






[jira] [Commented] (CASSANDRA-15588) 4.0 quality testing: Cluster Upgrade

2020-10-01 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205940#comment-17205940
 ] 

Josh McKenzie commented on CASSANDRA-15588:
---

[~xingh] We don't have a shepherd here, which is concerning item #1 (probably 
should hit up the ML for this and the couple of other tickets on this epic 
without one; I can take that on shortly). Diff testing workloads has proven to 
be one of the most powerful ways to confirm mixed version clusters are behaving, 
and we can reasonably expect an upgrade from a post-CASSANDRA-8099 cluster to 
4.0 to have significantly less risk and exposure to defects than one that 
straddles it, since it doesn't rely on LegacyLayout.

I don't see much of a path other than having a corpus of real user schemas we 
can run through cassandra-diff w/a generative workload and forward + reverse 
iteration to confirm correctness; we're not there yet, and while we should have 
the framework to accept that in the relatively near future, I think blocking a 
4.0 release on us building up that collection of anonymized schemas is a pretty 
big set of unknowns.

So the current, relatively inadequate technical coverage we have (afaik) is in 
upgrade_tests in the dtest repo 
([link|https://github.com/apache/cassandra-dtest/tree/master/upgrade_tests]). 
I frame it as inadequate because they didn't catch a bunch of the things that had 
sharp edges in the StorageEngine rewrite, so clearly our mixed version cluster 
testing wasn't as robust as we'd hoped.

Looks like we have a candidate to build from in UpgradeTestBase.java we could 
start to specifically flesh out if we had a PoV on things to test. I'd advocate 
for us building more there than investing further in dtests.

So: long-winded way to say "Hm. Not sure what we should do here."

[~jjirsa] - got any ideas in terms of a straw man proposal of things we should 
test on mixed version clusters from a unit perspective, or is integration the 
way to go here?

> 4.0 quality testing: Cluster Upgrade
> 
>
> Key: CASSANDRA-15588
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15588
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/java, Test/dtest/python
>Reporter: Josh McKenzie
>Priority: Normal
> Fix For: 4.0-beta, 4.0-triage
>
>
> We've historically had numerous bugs concerning upgrading clusters from one 
> version to the other. Let's establish the supported upgrade path and ensure 
> that users can safely perform the upgrades in production.






[jira] [Updated] (CASSANDRA-16161) Validation Compactions causing Java GC pressure

2020-10-01 Thread Cameron Zemek (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cameron Zemek updated CASSANDRA-16161:
--
Change Category: Performance
 Complexity: Low Hanging Fruit
  Fix Version/s: 3.11.8

> Validation Compactions causing Java GC pressure
> ---
>
> Key: CASSANDRA-16161
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16161
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Cameron Zemek
>Priority: Normal
> Fix For: 3.11.x, 3.11.8
>
> Attachments: 16161.patch
>
>
> Validation Compactions are not rate limited which can cause Java GC pressure 
> and result in spikes in latency.






[jira] [Commented] (CASSANDRA-16161) Validation Compactions causing Java GC pressure

2020-10-01 Thread Cameron Zemek (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205936#comment-17205936
 ] 

Cameron Zemek commented on CASSANDRA-16161:
---

In order not to break existing behavior, I have made validation compactions have 
their own separate throttle in this patch. However, for 4.x I think it makes more 
sense to reuse the same rate limiter that is used for normal compactions.
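
To illustrate the idea of a throughput throttle (this is not the attached patch, which is not reproduced here), the following is a minimal Python sketch of a token bucket measured in bytes. The class name `ThroughputThrottle` and its parameters are hypothetical; the clock and sleep functions are injectable only so the behavior can be checked without real waiting.

```python
import time


class ThroughputThrottle:
    """Token bucket in bytes; the bucket holds at most one second of budget."""

    def __init__(self, bytes_per_second, clock=time.monotonic, sleep=time.sleep):
        if bytes_per_second <= 0:
            raise ValueError("bytes_per_second must be positive")
        self.rate = float(bytes_per_second)
        self._clock = clock
        self._sleep = sleep
        self._available = self.rate  # start with a full bucket
        self._last = clock()

    def acquire(self, nbytes):
        """Account for nbytes of work; sleep if over budget. Returns seconds waited."""
        now = self._clock()
        # Refill for the time elapsed since the last call, capped at capacity.
        refill = (now - self._last) * self.rate
        self._available = min(self.rate, self._available + refill)
        self._last = now
        self._available -= nbytes
        if self._available >= 0:
            return 0.0
        # Negative balance is a debt that is paid off by waiting.
        wait = -self._available / self.rate
        self._sleep(wait)
        return wait


# A separate instance per activity mirrors the "own separate throttle" choice;
# passing one shared instance to both would mirror the 4.x suggestion.
validation_throttle = ThroughputThrottle(16 * 1024 * 1024)
```

A caller would invoke `acquire(len(chunk))` before processing each chunk it reads. Whether validation and normal compaction share one instance or each get their own is exactly the trade-off raised in the comment above.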

> Validation Compactions causing Java GC pressure
> ---
>
> Key: CASSANDRA-16161
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16161
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Cameron Zemek
>Priority: Normal
> Fix For: 3.11.x
>
> Attachments: 16161.patch
>
>
> Validation Compactions are not rate limited which can cause Java GC pressure 
> and result in spikes in latency.






[jira] [Updated] (CASSANDRA-16161) Validation Compactions causing Java GC pressure

2020-10-01 Thread Cameron Zemek (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cameron Zemek updated CASSANDRA-16161:
--
Fix Version/s: 3.11.x







[jira] [Commented] (CASSANDRA-15587) 4.0 quality testing: Platforms and Runtimes

2020-10-01 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205935#comment-17205935
 ] 

Josh McKenzie commented on CASSANDRA-15587:
---

[~gianluca] - have you taken a crack at this at all yet? With no shepherd we're 
at something of a disadvantage from a PoV on what JDK11 support in 4.0 should 
look like.

 

From a selfish dev perspective, I'm always amenable to supporting new JDKs 
for new language features that make life a little cleaner, and the sooner we 
have formal JDK11 support the sooner we can start discussing deprecating JDK8. 
Further, I think the surveying we did on the dev list / twitter / other 
informal channels indicated quite a few folks running JDK11.

> 4.0 quality testing: Platforms and Runtimes
> ---
>
> Key: CASSANDRA-15587
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15587
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/java, Test/dtest/python
>Reporter: Josh McKenzie
>Assignee: Gianluca Righetto
>Priority: Normal
> Fix For: 4.0-beta, 4.0-triage
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> *Shepherd: {color:#ff}NONE{color}*
> CASSANDRA-9608 introduces support for Java 11. We'll want to verify that 
> Cassandra under Java 11 meets expectations of stability.






[jira] [Updated] (CASSANDRA-16161) Validation Compactions causing Java GC pressure

2020-10-01 Thread Cameron Zemek (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cameron Zemek updated CASSANDRA-16161:
--
Attachment: 16161.patch







[jira] [Created] (CASSANDRA-16161) Validation Compactions causing Java GC pressure

2020-10-01 Thread Cameron Zemek (Jira)
Cameron Zemek created CASSANDRA-16161:
-

 Summary: Validation Compactions causing Java GC pressure
 Key: CASSANDRA-16161
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16161
 Project: Cassandra
  Issue Type: Improvement
Reporter: Cameron Zemek


Validation compactions are not rate limited, which can cause Java GC pressure 
and result in latency spikes.
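
To illustrate what rate limiting means here, below is a minimal token-bucket sketch. It is not the attached 16161.patch (whose contents are not shown in this message) and it is not Cassandra code; the class name, the `delay_for` method, and the numbers are hypothetical. The idea is that a caller asks the bucket how long to wait before processing the next chunk, so sustained throughput cannot exceed the configured rate.

```python
import time


class TokenBucket:
    """Hypothetical throttle: allows at most `rate` units per second on average."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)          # units refilled per second
        self.capacity = float(capacity)  # largest burst allowed without waiting
        self.tokens = float(capacity)
        self.clock = clock               # injectable so the example is deterministic
        self.last = clock()

    def delay_for(self, units):
        """Consume `units`; return the seconds the caller should sleep first."""
        now = self.clock()
        # Refill for the time elapsed since the previous call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        self.tokens -= units
        if self.tokens >= 0:
            return 0.0
        # The bucket is in debt: wait until the refill has paid it back.
        return -self.tokens / self.rate


# Example with a fake clock: 10 units/s, bursts of up to 10 units.
fake_now = [0.0]
bucket = TokenBucket(rate=10, capacity=10, clock=lambda: fake_now[0])
print(bucket.delay_for(10))  # burst fits in the bucket, no wait
print(bucket.delay_for(5))   # bucket is empty, caller should wait
```

A real throttle would sleep for the returned delay between chunks of work; the delay is returned rather than slept on so that the behaviour can be checked without waiting.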






[jira] [Commented] (CASSANDRA-15586) 4.0 quality testing: Cluster Setup and Maintenance

2020-10-01 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205933#comment-17205933
 ] 

Josh McKenzie commented on CASSANDRA-15586:
---

{quote}Windows - currently ongoing mailing list thread. As soon as the 
community comes to an agreement, this will be taken care of
{quote}
[~e.dimitrova] where did we land on this? [~yukim]?

I DO have a nice new 8-core ryzen windows laptop here...

> 4.0 quality testing: Cluster Setup and Maintenance
> --
>
> Key: CASSANDRA-15586
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15586
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/python
>Reporter: Josh McKenzie
>Assignee: Ekaterina Dimitrova
>Priority: Normal
>  Labels: 4.0-QA
> Fix For: 4.0-beta, 4.0-triage
>
>
> We want 4.0 to be easy for users to set up out of the box and just work. This 
> means having low friction when users download the Cassandra package and start 
> running it. For example, users should be able to easily configure and start 
> new 4.0 clusters and have tokens distributed evenly. Another example is 
> packaging: it should be easy to install Cassandra on all supported platforms 
> and have Cassandra use standard platform integrations.






[jira] [Commented] (CASSANDRA-15585) 4.0 quality testing: Test Frameworks, Tooling, Infra / Automation

2020-10-01 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205932#comment-17205932
 ] 

Josh McKenzie commented on CASSANDRA-15585:
---

{quote}automation – are we running tests regularly with the tools we built so 
that they have the potential to find things:
{quote}
Currently the Jenkins / Circle straddle seems pretty unhealthy / unclear for 
new contributors. We should probably figure this out (orthogonal to this 
ticket, just thought about it reading your comment).

Definitely think any new heavier test suites we're writing (Harry, fallout, 
else) we'll want to get into less frequent cadenced builds (thinking nightly 
runs, weekly runs) and figure out the right way to raise signals to the dev 
community that things are awry. Not sure if nagging slack bot or emails to 
subbed w/a specific ML is the right answer. Also not sure what infra that 
should run on (ASF Jenkins or something paid for or hosted by a set of 
contributors but openly configured ??)

 
{quote} the project you mentioned

automated cassandra-diff framework
{quote}
In my ideal utopian world, we'd be running a suite of diffing tools with real 
user schemas nightly between trunk and a last released correct build and 
hitting the alarm klaxons when things diverge in slack. That project is super 
nascent but it has the bones of what we need and all of us are smarter than 
just a few of us; should have something to talk to the list about within a week 
or so when ducks are in a row.
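
As a rough illustration of the kind of comparison described above (same query against trunk and a last released correct build, alarm on divergence), here is a minimal sketch. It is not the cassandra-diff implementation; the function name, the `{primary_key: row}` data shape, and the sample rows are all hypothetical.

```python
def diff_results(baseline, candidate):
    """Compare two {primary_key: row} mappings and return what diverged."""
    missing = sorted(k for k in baseline if k not in candidate)
    unexpected = sorted(k for k in candidate if k not in baseline)
    mismatched = sorted(
        k for k in baseline if k in candidate and baseline[k] != candidate[k]
    )
    return {"missing": missing, "unexpected": unexpected, "mismatched": mismatched}


# Hypothetical rows read from a released build and from trunk.
released = {1: ("a",), 2: ("b",), 3: ("c",)}
trunk = {1: ("a",), 2: ("x",), 4: ("d",)}

report = diff_results(released, trunk)
if any(report.values()):
    # This is where a nightly job would raise its signal (slack, email, ...).
    print("divergence detected:", report)
```

Keeping the three categories separate matters for triage: a missing row suggests data loss, an unexpected row suggests resurrection, and a mismatch suggests corruption or an incorrect response.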

> 4.0 quality testing: Test Frameworks, Tooling, Infra / Automation
> -
>
> Key: CASSANDRA-15585
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15585
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/python
>Reporter: Josh McKenzie
>Priority: Normal
> Fix For: 4.0-beta, 4.0-triage
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> *Shepherd: Jordan West*
> This area refers to contributions to test frameworks/tooling (e.g., dtests, 
> QuickTheories, CASSANDRA-14821), and automation enabling those tools to be 
> applied at scale (e.g., replay testing via Spark-based replay of captured FQL 
> logs).






[jira] [Updated] (CASSANDRA-16160) Regression in cqlsh with regard to row id display

2020-10-01 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-16160:
--
Fix Version/s: 4.0-beta

> Regression in cqlsh with regard to row id display
> -
>
> Key: CASSANDRA-16160
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16160
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/cqlsh
>Reporter: David Capwell
>Priority: Normal
> Fix For: 4.0-beta
>
>
> When you run a query such as
> {code}
> expand on; 
> select * from table_with_clustering_keys where token(partition_key) = 
> 1192326969048244361;
> {code}
> We print out a header for each row that looks like the following
> @ Row 1
> In 3.0 all values printed were unique, but in 4.0 they are no longer unique
> {code}
> $ grep Row 3.0-rows.results | sort | uniq -c | sort -k1 -h -r | head -n 10
>   1 @ Row 999
>   1 @ Row 998
>   1 @ Row 997
>   1 @ Row 996
>   1 @ Row 995
>   1 @ Row 994
>   1 @ Row 993
>   1 @ Row 992
>   1 @ Row 991
>   1 @ Row 990
> {code}
> {code}
> $ grep Row 4.0-rows.results | sort | uniq -c | sort -k1 -h -r | head -n 10
>  10 @ Row 9
>  10 @ Row 8
>  10 @ Row 7
>  10 @ Row 6
>  10 @ Row 5
>  10 @ Row 48
>  10 @ Row 47
>  10 @ Row 46
>  10 @ Row 45
>  10 @ Row 44
> {code}






[jira] [Commented] (CASSANDRA-15584) 4.0 quality testing: Tooling - External Ecosystem

2020-10-01 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205931#comment-17205931
 ] 

Josh McKenzie commented on CASSANDRA-15584:
---

[~blerer] here's a frame we might use. For new users to the C* ecosystem, do we 
have at least one option in the ecosystem for each major functionality that we 
know works for 4.0? i.e.
 * k8s operator
 * backup and restore
 * sidecar
 * repair operations

etc.

And if the answer is yes, we probably shouldn't hold up the 4.0 release on 
out-of-project tools. My instinct is that if we can offer a complete ecosystem 
to operate C* 4.0 in, we should consider that ecosystem covered rather than 
coupling the progress of the project on tooling that's governed outside the ASF.

wdyt? In terms of the scope / energy of this ticket, that seems appropriate 
w/the caveat that, for anything listed in this ticket, we can create follow-up 
JIRAs for 4.0-rc to keep channels open and ping / follow up w/folks to see if 
they're updating their tooling or need any assistance.

> 4.0 quality testing: Tooling - External Ecosystem
> -
>
> Key: CASSANDRA-15584
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15584
> Project: Cassandra
>  Issue Type: Task
>  Components: Tool/external
>Reporter: Josh McKenzie
>Assignee: Benjamin Lerer
>Priority: Normal
> Fix For: 4.0-rc, 4.0-triage
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> *Shepherd: Benjamin Lerer*
> Many users of Apache Cassandra employ open source tooling to automate 
> Cassandra configuration, runtime management, and repair scheduling. Prior to 
> release, we need to confirm that popular third-party tools function properly. 
> Current list of tools:
> || Name || Status || Contact ||
> | [Priam|http://netflix.github.io/Priam/] | *NOT STARTED* | 
> [~sumanth.pasupuleti]| 
> | [sstabletools|https://github.com/instaclustr/cassandra-sstable-tools] | 
> *NOT STARTED* | [~stefan.miklosovic]| 
> | [cassandra-exporter|https://github.com/instaclustr/cassandra-exporter]| 
> *NOT STARTED* | [~stefan.miklosovic]|
> | [Instaclustr Cassandra 
> operator|https://github.com/instaclustr/cassandra-operator]|  
> {color:#00875A}*DONE*{color} | [~stefan.miklosovic]|
> | [Instaclustr Cassandra Backup Restore | 
> https://github.com/instaclustr/cassandra-backup]|{color:#00875A}*DONE*{color} 
> | [~stefan.miklosovic]|
> | [Instaclustr Cassandra Sidecar | 
> https://github.com/instaclustr/cassandra-sidecar]|{color:#00875A}*DONE*{color}
>  | [~stefan.miklosovic]|
> | [Cassandra SSTable generator | 
> https://github.com/instaclustr/cassandra-sstable-generator]|{color:#00875A}*DONE*{color}|
>  [~stefan.miklosovic]|
> | [Reaper|http://cassandra-reaper.io/]| {color:#00875A}*AUTOMATIC*{color} | 
> [~adejanovski]|
> | [Medusa|https://github.com/thelastpickle/cassandra-medusa]| *NOT STARTED*| 
> [~adejanovski]|
> | [Casskop|https://orange-opensource.github.io/casskop/]| *NOT STARTED*| 
> Franck Dehay|
> | 
> [spark-cassandra-connector|https://github.com/datastax/spark-cassandra-connector]|
>  {color:#00875A}*DONE*{color}| [~jtgrabowski]|
> | [cass operator|https://github.com/datastax/cass-operator]| 
> {color:#00875A}*DONE*{color}| [~jimdickinson]|
> | [metric 
> collector|https://github.com/datastax/metric-collector-for-apache-cassandra]| 
> {color:#00875A}*DONE*{color}| [~tjake]|
> | [management 
> API|https://github.com/datastax/management-api-for-apache-cassandra]| 
> {color:#00875A}*DONE*{color}| [~tjake]|  
> Columns descriptions:
> * *Name*: Name and link to the tool official page
> * *Status*: {{NOT STARTED}}, {{IN PROGRESS}}, {{BLOCKED}} if you hit any 
> issue and have to wait for it to be solved, {{DONE}}, {{AUTOMATIC}} if 
> testing 4.0 is part of your CI process.
> * *Contact*: The person acting as the contact point for that tool. 






[jira] [Commented] (CASSANDRA-15583) 4.0 quality testing: Tooling, Bundled and First Party

2020-10-01 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205928#comment-17205928
 ] 

Josh McKenzie commented on CASSANDRA-15583:
---

Is there a middle ground where we can say "here's the basics of tools that 
we've tested" (i.e. non-esoteric flags) and we know they work?

> 4.0 quality testing: Tooling, Bundled and First Party
> -
>
> Key: CASSANDRA-15583
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15583
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/python, Test/unit
>Reporter: Josh McKenzie
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta, 4.0-triage
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> *Shepherd: Sam Tunnicliffe*
> Test plans should cover bundled first-party tooling and CLIs such as 
> nodetool, cqlsh, and new tools supporting full query and audit logging 
> (CASSANDRA-13983, CASSANDRA-12151).
> *Progress as of Aug 2020*
> {{ToolRunner}} has been added, enabling us to test tools in Java unit tests. 
> This includes capturing their stdout/stderr and handling their stdin. Most 
> tools have a starting unit test covering the happy path of their command-line 
> args. Tickets have been created to improve that coverage and flagged LHF. For 
> tools too big to be addressed in a single ticket, such as nodetool, a 
> placeholder ticket for future improvements has been created as well. Tickets 
> and status are:
> ||Tool||UX test||UT coverage||dtest coverage||Comments||
> |Nodetool|(x)|(x) CASSANDRA-16026|(!)|Not all the subcommands are tested. 
> Dtests also test nodetool as a side effect|
> |Cqlsh|(x)|(x) CASSANDRA-16025|(!)| |
> |Cassandra-stress|(x)|(x) CASSANDRA-16024|(x)| |
> |debug-cql|(x)|(x) CASSANDRA-16023|(x)| |
> |fqltool|(x)|(/) CASSANDRA-16022|(!)| |
> |auditlogviewer|(/) CASSANDRA-15991|(!) CASSANDRA-16021|(!)| |
> |*Sstable utilities*| | | | |
> |sstabledump|(/) CASSANDRA-15991|(/) CASSANDRA-16020|(!)| |
> |sstableexpiredblockers|(/) CASSANDRA-15991|(x) CASSANDRA-16019|(!)| |
> |sstablelevelreset|(/) CASSANDRA-15991|(x) CASSANDRA-16018|(!)| |
> |sstableloader|(x)|(x) CASSANDRA-16017|(!)| |
> |sstablemetadata|(/) CASSANDRA-15991|(x) CASSANDRA-16016|(x)|Ran in dtests, 
> no dedicated test|
> |sstableofflinerelevel|(/) CASSANDRA-15991|(x) CASSANDRA-16015|(!)| |
> |sstablerepairedset|(/) CASSANDRA-15991|(x) CASSANDRA-16014|(x)|Ran in 
> dtests, no dedicated test|
> |sstablescrub|(/) CASSANDRA-15991|(x) CASSANDRA-16013|(!)| |
> |sstablesplit|(/) CASSANDRA-15991|(x) CASSANDRA-16012|(!)| |
> |sstableupgrade|(/) CASSANDRA-15991|(x) CASSANDRA-16011|(!)| |
> |sstableutil|(/) CASSANDRA-15991|(x) CASSANDRA-16010|(!)| |
> |sstableverify|(/) CASSANDRA-15991|(x) CASSANDRA-16009|(!)| |






[jira] [Commented] (CASSANDRA-15581) 4.0 quality testing: Compaction

2020-10-01 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205926#comment-17205926
 ] 

Josh McKenzie commented on CASSANDRA-15581:
---

[~marcuse] quick sanity check - you have the bandwidth to shepherd this or 
should we drum up someone else that knows the domain?

Digging around for another assignee (volunteers welcome ;) )

> 4.0 quality testing: Compaction
> ---
>
> Key: CASSANDRA-15581
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15581
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/python
>Reporter: Josh McKenzie
>Priority: Normal
> Fix For: 4.0-beta, 4.0-triage
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> *Shepherd: Marcus Eriksson*
> Alongside the local and distributed read/write paths, we'll also want to 
> validate compaction. CASSANDRA-6696 introduced substantial 
> changes/improvements that require testing (esp. JBOD).






[jira] [Issue Comment Deleted] (CASSANDRA-15581) 4.0 quality testing: Compaction

2020-10-01 Thread Josh McKenzie (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh McKenzie updated CASSANDRA-15581:
--
Comment: was deleted

(was: @pauloricard...@gmail.com   No, I have not had
the time to start working on this ticket yet.
I will remove myself as assignee so that anybody with more bandwidth than
me can jump in.

On Wed, 30 Sep 2020 at 00:18, Paulo Motta (Jira) wrote:

)

> 4.0 quality testing: Compaction
> ---
>
> Key: CASSANDRA-15581
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15581
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/python
>Reporter: Josh McKenzie
>Priority: Normal
> Fix For: 4.0-beta, 4.0-triage
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> *Shepherd: Marcus Eriksson*
> Alongside the local and distributed read/write paths, we'll also want to 
> validate compaction. CASSANDRA-6696 introduced substantial 
> changes/improvements that require testing (esp. JBOD).






[jira] [Commented] (CASSANDRA-15579) 4.0 quality testing: Distributed Read/Write Path: Coordination, Replication, and Read Repair

2020-10-01 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205925#comment-17205925
 ] 

Josh McKenzie commented on CASSANDRA-15579:
---

[~adelapena] I see the great work on read repair. If we haven't moved on 
coordination or replication, should we maybe create sub-tasks for those and drum 
up some assignees to work on them? wdyt?

> 4.0 quality testing: Distributed Read/Write Path: Coordination, Replication, 
> and Read Repair
> 
>
> Key: CASSANDRA-15579
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15579
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/unit
>Reporter: Josh McKenzie
>Assignee: Andres de la Peña
>Priority: Normal
> Fix For: 4.0-beta, 4.0-triage
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> *Shepherd: Blake Eggleston*
> Testing in this area focuses on non-node-local aspects of the read-write 
> path: coordination, replication, read repair, etc.






[jira] [Commented] (CASSANDRA-15538) 4.0 quality testing: Local Read/Write Path: Other Areas

2020-10-01 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205924#comment-17205924
 ] 

Josh McKenzie commented on CASSANDRA-15538:
---

So does that mean yes you are but are happy to defer, or no you're not? :)

re: Harry, without having dug into it proper, I was under the impression it's 
incredibly helpful for reproducing things discovered in diff testing. Is there also 
a "generative fuzz without specific workload context" role it serves? If so, is 
that something we can enumerate here? Thinking something like a gdoc to brain 
dump "here's the scenarios we can and should test with tool X to confirm 
subsystem Y" or something like that.

> 4.0 quality testing: Local Read/Write Path: Other Areas
> ---
>
> Key: CASSANDRA-15538
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15538
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/java, Test/dtest/python
>Reporter: Josh McKenzie
>Priority: Normal
> Fix For: 4.0-beta, 4.0-triage
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> *Shepherd: Aleksey Yeschenko*
> Testing in this area refers to the local read/write path (StorageProxy, 
> ColumnFamilyStore, Memtable, SSTable reading/writing, etc). We are still 
> finding numerous bugs and issues with the 3.0 storage engine rewrite 
> (CASSANDRA-8099). For 4.0 we want to ensure that we thoroughly cover the 
> local read/write path with techniques such as property-based testing, fuzzing 
> ([example|http://cassandra.apache.org/blog/2018/10/17/finding_bugs_with_property_based_testing.html]),
>  and a source audit.






[jira] [Commented] (CASSANDRA-15537) 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test

2020-10-01 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205923#comment-17205923
 ] 

Josh McKenzie commented on CASSANDRA-15537:
---

[~yifanc] is there a point at which you'd say "Here's our definition of done 
for this ticket" in terms of confidence in the release? Should we consider this 
ticket "open until all others are closed then we close this" as just ongoing 
testing w/internal cluster workloads?

Given the # of things this ticket has surfaced there's obviously a ton of value 
in this work. In terms of Jira workload and project mgt to 4.0, however, this 
ticket is currently something of an oddball. :)

> 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test
> -
>
> Key: CASSANDRA-15537
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15537
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/java, Test/dtest/python
>Reporter: Josh McKenzie
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 4.0-beta, 4.0-triage
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> Execution of upgrade and diff tests via cassandra-diff has proven to be one 
> of the most effective approaches toward identifying issues with the local 
> read/write path. These include instances of data loss, data corruption, data 
> resurrection, incorrect responses to queries, incomplete responses, and 
> others. Upgrade and diff tests can be executed concurrently with fault 
> injection (such as host or network failure), as well as during mixed-version 
> scenarios (such as upgrading half of the instances in a cluster, and running 
> upgradesstables on only half of the upgraded instances).
> Upgrade and diff tests are expected to continue through the release cycle, 
> and are a great way for contributors to gain confidence in the correctness of 
> the database under their own workloads.






[jira] [Commented] (CASSANDRA-14746) Ensure Netty Internode Messaging Refactor is Solid

2020-10-01 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205922#comment-17205922
 ] 

Josh McKenzie commented on CASSANDRA-14746:
---

{quote}4.0 should have better latency, more throughput, fewer threads, fewer 
context switches, less GC allocation, and faster recovery time. 
{quote}
Was this the goal of the MS rewrite? I have no horse in this race - I just 
thought the goal of it was to tighten up some of the things that were present / 
still troublesome after Jason's rewrite of things rather than specifically 
targeting performance improvements.

I'd personally advocate for "no regression on categories a-e" with better 
backpressure, tolerance for failure, etc. etc. that I understood to come along 
w/the MS rewrite. At least in terms of what we should consider a blocker for 
4.0, I think "don't regress" is a stance that makes sense, especially as 
incremental performance improvements are reasonable to consider for patch 
releases IMO.

And fwiw, the benchmarks I've seen on 4.0 show a pretty significant improvement 
in throughput if nothing else, but in terms of bar - no regression for a 
rewrite seems like a good low water mark to block on.

 

> Ensure Netty Internode Messaging Refactor is Solid
> --
>
> Key: CASSANDRA-14746
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14746
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Streaming and Messaging
>Reporter: Joey Lynch
>Assignee: Joey Lynch
>Priority: Normal
>  Labels: 4.0-QA
> Fix For: 4.0-beta, 4.0-triage
>
>
> Before we release 4.0 let's ensure that the internode messaging refactor is 
> 100% solid. As internode messaging is naturally used in many code paths and 
> widely configurable we have a large number of cluster configurations and test 
> configurations that must be vetted.
> We plan to vary the following:
>  * Version of Cassandra 3.0.17 vs 4.0-alpha
>  * Cluster sizes with *multi-dc* deployments ranging from 6 - 100 nodes
>  * Client request rates varying between 1k QPS and 100k QPS of varying sizes 
> and shapes (BATCH, INSERT, SELECT point, SELECT range, etc ...)
>  * Internode compression
>  * Internode SSL (as well as openssl vs jdk)
>  * Internode Coalescing options
> We are looking to measure the following as appropriate:
>  * Latency distributions of reads and writes (lower is better)
>  * Scaling limit, aka maximum throughput before violating p99 latency 
> deadline of 10ms @ LOCAL_QUORUM, on a fixed hardware deployment for 100% 
> writes, 100% reads and 50-50 writes+reads (higher is better)
>  * Thread counts (lower is better)
>  * Context switches (lower is better)
>  * On-CPU time of tasks (longer periods without a context switch are better)
>  * GC allocation rates / throughput for a fixed size heap (lower allocation 
> is better)
>  * Streaming recovery time for a single node failure, i.e. can Cassandra 
> saturate the NIC
>  
> The goal is that 4.0 should have better latency, more throughput, fewer 
> threads, fewer context switches, less GC allocation, and faster recovery 
> time. I'm putting Jason Brown as the reviewer since he implemented most of 
> the internode refactor.
> Current collaborators driving this QA task: Dinesh Joshi, Jordan West, Joey 
> Lynch (Netflix), Vinay Chella (Netflix)
> Owning committer(s): Jason Brown
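
The "scaling limit" measurement in the description (maximum throughput before violating a p99 latency deadline of 10ms) can be sketched as follows. This is an illustration only, not the tooling used for this ticket: the function names, the nearest-rank percentile definition, and the `{QPS: latencies}` data shape are assumptions.

```python
def p99(latencies_ms):
    """Nearest-rank 99th percentile of a non-empty list of latencies (ms)."""
    ordered = sorted(latencies_ms)
    # ceil(0.99 * n) in integer arithmetic, to avoid floating-point rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def scaling_limit(runs, deadline_ms=10.0):
    """Highest tested request rate reached before p99 first exceeds the deadline.

    `runs` maps a request rate (QPS) to the latencies observed at that rate.
    Returns None if even the lowest tested rate violates the deadline.
    """
    limit = None
    for qps in sorted(runs):
        if p99(runs[qps]) > deadline_ms:
            break
        limit = qps
    return limit


# Hypothetical measurements at three request rates, 100 samples each.
runs = {
    1_000: [2.0] * 100,
    10_000: [3.0] * 99 + [9.0],
    100_000: [5.0] * 98 + [50.0] * 2,
}
print(scaling_limit(runs))  # 10000
```

Running the same function over results from 3.0.17 and 4.0-alpha on identical hardware would give two numbers that can be compared directly (higher is better).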






[jira] [Commented] (CASSANDRA-16148) Test failures caused by merging CASSANDRA-15833

2020-10-01 Thread Jordan West (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205909#comment-17205909
 ] 

Jordan West commented on CASSANDRA-16148:
-

{quote} This logic isn't correct as you need the release version of the node 
you are adding {quote}

To address this I've opened:
https://github.com/apache/cassandra-in-jvm-dtest-api/pull/21

And updated all of the branches to implement the new method, as well as updated 
trunk to use it correctly:
[2.2 | https://github.com/jrwest/cassandra/tree/jwest/16148-2.2]
[3.0 | https://github.com/jrwest/cassandra/tree/jwest/16148-3.0]
[3.11 | https://github.com/jrwest/cassandra/tree/jwest/16148-3.11]
[trunk | https://github.com/jrwest/cassandra/tree/jwest/16148] 

Also moved the test to {{ReadRepairTest}} and added the requested comment about 
CASSANDRA-15977 deleting it. 

Memoization changes to follow. 

> Test failures caused by merging CASSANDRA-15833
> ---
>
> Key: CASSANDRA-16148
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16148
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
>
> Three issues were caused by merging CASSANDRA-15833:
> 1. `GossiperTest#testHaveAnyVersion3Nodes` was failing on trunk: 
> https://app.circleci.com/pipelines/github/jrwest/cassandra/53/workflows/95f9f401-1ef8-4b8d-9c64-3703d9669d95/jobs/771
> 2. python dtest ReadRepairTest#test_atomic_writes[blocking] was failing
> 3. In-jvm dtests being worked on as part of CASSANDRA-15977 uncovered an 
> issue with how CASSANDRA-15833 changes interacted with in-jvm dtests running 
> without {{Feature.GOSSIP}}






[jira] [Issue Comment Deleted] (CASSANDRA-16160) Regression in cqlsh with regard to row id display

2020-10-01 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-16160:
--
Comment: was deleted

(was: When you run a query such as

{code}
expand on; 
select * from table_with_clustering_keys where token(partition_key) = 
1192326969048244361;
{code}

We print out a header for each row that looks like the following

@ Row 1

In 3.0 all values printed were unique, but in 4.0 they are no longer unique

{code}
$ grep Row 3.0-rows.results | sort | uniq -c | sort -k1 -h -r | head -n 10
  1 @ Row 999
  1 @ Row 998
  1 @ Row 997
  1 @ Row 996
  1 @ Row 995
  1 @ Row 994
  1 @ Row 993
  1 @ Row 992
  1 @ Row 991
  1 @ Row 990
{code}

{code}
$ grep Row 4.0-rows.results | sort | uniq -c | sort -k1 -h -r | head -n 10
 10 @ Row 9
 10 @ Row 8
 10 @ Row 7
 10 @ Row 6
 10 @ Row 5
 10 @ Row 48
 10 @ Row 47
 10 @ Row 46
 10 @ Row 45
 10 @ Row 44
{code})

> Regression in cqlsh with regard to row id display
> -
>
> Key: CASSANDRA-16160
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16160
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/cqlsh
>Reporter: David Capwell
>Priority: Normal
>
> When you run a query such as
> {code}
> expand on; 
> select * from table_with_clustering_keys where token(partition_key) = 
> 1192326969048244361;
> {code}
> We print out a header for each row that looks like the following
> @ Row 1
> In 3.0 all values printed were unique, but in 4.0 they are no longer unique
> {code}
> $ grep Row 3.0-rows.results | sort | uniq -c | sort -k1 -h -r | head -n 10
>   1 @ Row 999
>   1 @ Row 998
>   1 @ Row 997
>   1 @ Row 996
>   1 @ Row 995
>   1 @ Row 994
>   1 @ Row 993
>   1 @ Row 992
>   1 @ Row 991
>   1 @ Row 990
> {code}
> {code}
> $ grep Row 4.0-rows.results | sort | uniq -c | sort -k1 -h -r | head -n 10
>  10 @ Row 9
>  10 @ Row 8
>  10 @ Row 7
>  10 @ Row 6
>  10 @ Row 5
>  10 @ Row 48
>  10 @ Row 47
>  10 @ Row 46
>  10 @ Row 45
>  10 @ Row 44
> {code}






[jira] [Updated] (CASSANDRA-16160) Regression in cqlsh with regard to row id display

2020-10-01 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-16160:
--
 Bug Category: Parent values: Correctness(12982)Level 1 values: API / 
Semantic Implementation(12988)
   Complexity: Normal
Discovered By: Workload Replay
 Severity: Normal
   Status: Open  (was: Triage Needed)

> Regression in cqlsh with regard to row id display
> -
>
> Key: CASSANDRA-16160
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16160
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/cqlsh
>Reporter: David Capwell
>Priority: Normal
>
> When you run a query such as
> {code}
> expand on; 
> select * from table_with_clustering_keys where token(partition_key) = 
> 1192326969048244361;
> {code}
> We print out a header for each row that looks like the following
> @ Row 1
> In 3.0 all row numbers printed were unique, but in 4.0 they are repeated
> {code}
> $ grep Row 3.0-rows.results | sort | uniq -c | sort -k1 -h -r | head -n 10
>   1 @ Row 999
>   1 @ Row 998
>   1 @ Row 997
>   1 @ Row 996
>   1 @ Row 995
>   1 @ Row 994
>   1 @ Row 993
>   1 @ Row 992
>   1 @ Row 991
>   1 @ Row 990
> {code}
> {code}
> $ grep Row 4.0-rows.results | sort | uniq -c | sort -k1 -h -r | head -n 10
>  10 @ Row 9
>  10 @ Row 8
>  10 @ Row 7
>  10 @ Row 6
>  10 @ Row 5
>  10 @ Row 48
>  10 @ Row 47
>  10 @ Row 46
>  10 @ Row 45
>  10 @ Row 44
> {code}






[jira] [Created] (CASSANDRA-16160) Regression in cqlsh with regard to row id display

2020-10-01 Thread David Capwell (Jira)
David Capwell created CASSANDRA-16160:
-

 Summary: Regression in cqlsh with regard to row id display
 Key: CASSANDRA-16160
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16160
 Project: Cassandra
  Issue Type: Bug
  Components: Tool/cqlsh
Reporter: David Capwell


When you run a query such as

{code}
expand on; 
select * from table_with_clustering_keys where token(partition_key) = 
1192326969048244361;
{code}

We print out a header for each row that looks like the following

@ Row 1

In 3.0 all row numbers printed were unique, but in 4.0 they are repeated

{code}
$ grep Row 3.0-rows.results | sort | uniq -c | sort -k1 -h -r | head -n 10
  1 @ Row 999
  1 @ Row 998
  1 @ Row 997
  1 @ Row 996
  1 @ Row 995
  1 @ Row 994
  1 @ Row 993
  1 @ Row 992
  1 @ Row 991
  1 @ Row 990
{code}

{code}
$ grep Row 4.0-rows.results | sort | uniq -c | sort -k1 -h -r | head -n 10
 10 @ Row 9
 10 @ Row 8
 10 @ Row 7
 10 @ Row 6
 10 @ Row 5
 10 @ Row 48
 10 @ Row 47
 10 @ Row 46
 10 @ Row 45
 10 @ Row 44
{code}






[jira] [Updated] (CASSANDRA-16144) TLS connections to the storage port on a node without server encryption configured causes java.io.IOException accessing missing keystore

2020-10-01 Thread Dinesh Joshi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-16144:
-
Reviewers: Dinesh Joshi

> TLS connections to the storage port on a node without server encryption 
> configured causes java.io.IOException accessing missing keystore
> 
>
> Key: CASSANDRA-16144
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16144
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0-beta, 4.0-triage
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If a TLS connection is requested against a node that has all encryption 
> disabled, i.e. a node configured with
> {code}
> server_encryption_options: {optional:false, internode_encryption: none}
> {code}
> it logs the following error if no keystore exists for the node.
> {code}
> INFO  [Messaging-EventLoop-3-3] 2020-09-15T14:30:02,952 : - 
> 127.0.0.1:7000->127.0.1.1:7000-URGENT_MESSAGES-[no-channel] failed to connect
> io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection 
> refused: local1-i1/127.0.1.1:7000
> Caused by: java.net.ConnectException: Connection refused
>at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
>at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779) ~[?:?]
>at 
> io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702) 
> ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576) 
> ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at java.lang.Thread.run(Thread.java:834) [?:?]
> WARN  [Messaging-EventLoop-3-9] 2020-09-15T14:30:06,375 : - Failed to 
> initialize a channel. Closing: [id: 0x0746c157, L:/127.0.0.1:7000 - 
> R:/127.0.0.1:59623]
> java.io.IOException: failed to build trust manager store for secure 
> connections
>at 
> org.apache.cassandra.security.SSLFactory.buildKeyManagerFactory(SSLFactory.java:232)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.security.SSLFactory.createNettySslContext(SSLFactory.java:300)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.security.SSLFactory.getOrCreateSslContext(SSLFactory.java:276)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.security.SSLFactory.getOrCreateSslContext(SSLFactory.java:257)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.net.InboundConnectionInitiator$Initializer.initChannel(InboundConnectionInitiator.java:107)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.net.InboundConnectionInitiator$Initializer.initChannel(InboundConnectionInitiator.java:71)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> io.netty.channel.ChannelInitializer.initChannel(ChannelInitializer.java:129) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.ChannelInitializer.handlerAdded(ChannelInitializer.java:112) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.AbstractChannelHandlerContext.callHandlerAdded(AbstractChannelHandlerContext.java:938)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.DefaultChannelPipeline.callHandlerAdded0(DefaultChannelPipeline.java:609)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> 

[jira] [Updated] (CASSANDRA-16144) TLS connections to the storage port on a node without server encryption configured causes java.io.IOException accessing missing keystore

2020-10-01 Thread Jon Meredith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Meredith updated CASSANDRA-16144:
-
Test and Documentation Plan: New tests added under unit and distributed. 
 Status: Patch Available  (was: Open)

[PR|https://github.com/apache/cassandra/pull/763]
[Branch|https://github.com/jonmeredith/cassandra/tree/C16144]
[CircleCI|https://app.circleci.com/pipelines/github/jonmeredith/cassandra?branch=C16144]
 

> TLS connections to the storage port on a node without server encryption 
> configured causes java.io.IOException accessing missing keystore
> 
>
> Key: CASSANDRA-16144
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16144
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
> Fix For: 4.0-beta, 4.0-triage
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If a TLS connection is requested against a node that has all encryption 
> disabled, i.e. a node configured with
> {code}
> server_encryption_options: {optional:false, internode_encryption: none}
> {code}
> it logs the following error if no keystore exists for the node.
> {code}
> INFO  [Messaging-EventLoop-3-3] 2020-09-15T14:30:02,952 : - 
> 127.0.0.1:7000->127.0.1.1:7000-URGENT_MESSAGES-[no-channel] failed to connect
> io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection 
> refused: local1-i1/127.0.1.1:7000
> Caused by: java.net.ConnectException: Connection refused
>at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
>at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779) ~[?:?]
>at 
> io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702) 
> ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
>  ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576) 
> ~[netty-all-4.1.50.Final.jar:4.1.50.Final]
>at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at java.lang.Thread.run(Thread.java:834) [?:?]
> WARN  [Messaging-EventLoop-3-9] 2020-09-15T14:30:06,375 : - Failed to 
> initialize a channel. Closing: [id: 0x0746c157, L:/127.0.0.1:7000 - 
> R:/127.0.0.1:59623]
> java.io.IOException: failed to build trust manager store for secure 
> connections
>at 
> org.apache.cassandra.security.SSLFactory.buildKeyManagerFactory(SSLFactory.java:232)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.security.SSLFactory.createNettySslContext(SSLFactory.java:300)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.security.SSLFactory.getOrCreateSslContext(SSLFactory.java:276)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.security.SSLFactory.getOrCreateSslContext(SSLFactory.java:257)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.net.InboundConnectionInitiator$Initializer.initChannel(InboundConnectionInitiator.java:107)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> org.apache.cassandra.net.InboundConnectionInitiator$Initializer.initChannel(InboundConnectionInitiator.java:71)
>  ~[apache-cassandra-4.0-beta1-SNAPSHOT.jar:4.0-beta1-SNAPSHOT]
>at 
> io.netty.channel.ChannelInitializer.initChannel(ChannelInitializer.java:129) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> io.netty.channel.ChannelInitializer.handlerAdded(ChannelInitializer.java:112) 
> [netty-all-4.1.50.Final.jar:4.1.50.Final]
>at 
> 

[jira] [Updated] (CASSANDRA-13325) Bring back the accepted encryption protocols list as configurable option

2020-10-01 Thread Jon Meredith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Meredith updated CASSANDRA-13325:
-
Status: Open  (was: Patch Available)

> Bring back the accepted encryption protocols list as configurable option
> 
>
> Key: CASSANDRA-13325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Nachiket Patil
>Assignee: Jon Meredith
>Priority: Low
> Fix For: 4.x
>
> Attachments: trunk.diff
>
>
> With CASSANDRA-10508, the hard coded list of accepted encryption protocols 
> was eliminated. For some use cases, it is necessary to restrict the 
> encryption protocols used for communication between client and server. 
> The default JVM negotiation allows the best encryption protocol that the 
> client can use. 
> For example, I have set Cassandra to use encryption. Ideally, client and 
> server negotiate to use the best protocol (TLSv1.2). But a malicious client 
> might force TLSv1.0, which is susceptible to POODLE attacks.
> At the moment the only way to restrict the encryption protocol is using the 
> {{jdk.tls.client.protocols}} system property. If I don't have enough access 
> to modify this property, I don't have any way of restricting the encryption 
> protocols.
> I am proposing to bring back the accepted_protocols property but make it 
> configurable. If not specified, let the JVM take care of the TLS negotiations.






[jira] [Commented] (CASSANDRA-13325) Bring back the accepted encryption protocols list as configurable option

2020-10-01 Thread Jon Meredith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205879#comment-17205879
 ] 

Jon Meredith commented on CASSANDRA-13325:
--

I've been investigating restricting the TLS protocols to prevent use of TLSv1 & 
TLSv1.1 for secure internode messaging and streaming connections and think the 
current implementation needs improvement before the final 4.0 release, so I'd 
like to pick this up again.

The Apache Cassandra documentation page on security 
https://cassandra.apache.org/doc/latest/operating/security.html mentions

"...  the JVM defaults for supported protocols and cipher suites are used when 
encryption is enabled. These can be overidden using the settings in 
cassandra.yaml, but this is not recommended unless there are policies in place 
which dictate certain settings or a need to disable vulnerable ciphers or 
protocols in cases where the JVM cannot be updated."

The implication to me is that the preferred mechanism is to configure the 
JSSE subsystem. From trawling through the documentation, the operator can 
disable older TLS protocols at the JVM level by creating a new security 
properties file

{code}
$ cat conf/cassandra-security.properties
jdk.tls.disabledAlgorithms=SSLv3, RC4, DES, MD5withRSA, DH keySize < 1024, \
EC keySize < 224, 3DES_EDE_CBC, anon, NULL, TLSv1, TLSv1.1
{code}

And appending to the current security properties using

{code}
  -Djava.security.properties=conf/cassandra-security.properties
{code}

This works fine pre-4.0. However, with the introduction of Netty tcnative, 
which uses OpenSSL under the hood, {{java.security.properties}} is no longer 
used to restrict anything. Neither does tcnative implement the calls for 
supporting the OpenSSL configuration file. It only seems possible to restrict 
the protocol & ciphers through the Netty SSLContext API. It is possible to 
disable OpenSSL by setting the Java system property 
{{cassandra.disable_tcactive_openssl=true}}, but it seems undesirable to lose 
the performance benefit there.

Looking in {{cassandra.yaml}}, under 'More advanced defaults' there is a 
{{protocol}} setting, which an operator might expect restricts which TLS 
protocols are accepted.

{code}
# More advanced defaults:
# protocol: TLS
{code}

However, setting that to {{TLSv1.2}} had no effect on the protocols the server 
accepted. Running {{openssl}} will connect without issue and negotiate a 
TLSv1.0 session.

{code}
openssl s_client -tlsv1 -connect 127.0.0.1:7000
{code}
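
The same probe can be scripted. A rough Python sketch using the standard `ssl` module, assuming the local OpenSSL build and its security policy still permit the requested version on the client side (many distributions disable TLSv1/TLSv1.1, in which case the probe fails regardless of what the server accepts). `negotiated_tls_version` is a hypothetical helper name, and certificate verification is switched off because only the protocol negotiation is of interest here:

```python
import socket
import ssl


def negotiated_tls_version(host, port, version, timeout=5.0):
    """Try a handshake pinned to one TLS version.

    Returns the negotiated protocol name (e.g. "TLSv1.2") on success,
    or None if the connection or the handshake fails.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Only the protocol version matters for this probe, not the peer identity.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    # Pin both bounds so the client offers exactly one protocol version.
    ctx.minimum_version = version
    ctx.maximum_version = version
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock) as tls:
                return tls.version()
    except OSError:
        # ssl.SSLError is a subclass of OSError, so refused connections
        # and rejected handshakes both end up here.
        return None
```

Calling `negotiated_tls_version("127.0.0.1", 7000, ssl.TLSVersion.TLSv1)` against the storage port and getting a non-None result would indicate that the server still accepts a TLSv1.0 session.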

I found two previous tickets that addressed TLS protocols: the first 
explicitly hard-coded the accepted TLS protocols to disable SSLv3 (due to 
POODLE) in CASSANDRA-8265 / b93f48a5db321bf7c9fb55a800ed6ab2d6f6b102, and the 
second went back to relying on the Java 8 defaults in CASSANDRA-10508 / 
e4a0a4bf65a87c3aabae4ee0cc35009879e2d455 once the defaults were fixed.

CASSANDRA-10508 mentions the 'protocol' field as a mechanism for specifying the 
protocol. However, according to the Java docs, that only verifies that the 
protocol is supported by the SSL engine and does not restrict negotiation to 
using it, as the openssl s_client test demonstrates.

From a quick search of the internet, a couple of blog posts recommend setting 
the cipher suite to only {{TLSv1.2}} valid ciphers and I can confirm that does 
work, leading to this being logged (at ERROR).

{code}
ERROR [Messaging-EventLoop-3-2] 2020-09-19T16:17:48,023 : - Failed to properly 
handshake with peer /127.0.0.1:33826. Closing the channel. 
io.netty.handler.codec.DecoderException:
javax.net.ssl.SSLHandshakeException: Client requested protocol TLSv1.1 is not 
enabled or supported in server context
Caused by: javax.net.ssl.SSLHandshakeException: Client requested protocol 
TLSv1.1 is not enabled or supported in server context
{code}

While it does work to restrict the protocol, if we start logging the accepted 
protocols the log will show that the server will negotiate TLSv1/TLSv1.1, which 
may get flagged by anybody validating the operator's configuration/logs.

The current state of the code and documentation is unsatisfactory to me.  We 
should at least improve the documentation to give clear guidance to operators 
on how they can secure their systems under 4.0/tcnative. However, I think we 
should go further and make the encryption_option.protocol field behave as 
intended.

Here's my proposal:

1) Interpret the current protocol string as a comma separated list of protocols 
that are accepted. Replace the default {{EncryptionOptions.protocol}} of 
{{"TLS"}} with null.
2) If protocol is non-null, call {{SslContextBuilder.protocols()}} with the 
configured protocols in 
{{org.apache.cassandra.security.SSLFactory#createNettySslContext}}
3) Special case the protocol configuration {{"TLS"}} to mean {{["TLSv1", 
"TLSv1.1", "TLSv1.2"]}} for users that have uncommented the default value. 
Passing {{"TLS"}} is invalid in the {{protocols()}} call.
4) Hard-code 
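
Steps 1 and 3 above could be sketched roughly as follows. This is Python purely for illustration (the actual change would live in Cassandra's Java code); `parse_accepted_protocols` is a hypothetical name, and treating an empty string like an unset value is an assumption of this sketch, not something the proposal specifies:

```python
# Expansion for users who uncommented the old default value "TLS" (step 3).
LEGACY_TLS_EXPANSION = ["TLSv1", "TLSv1.1", "TLSv1.2"]


def parse_accepted_protocols(protocol):
    """Turn the configured protocol string into a list of accepted protocols.

    Returns None when nothing is configured, meaning "do not restrict;
    leave the protocol choice to the JVM / SSL engine defaults".
    """
    if protocol is None:
        return None
    # Step 1: interpret the string as a comma separated list.
    names = [name.strip() for name in protocol.split(",") if name.strip()]
    if not names:
        # Assumption: an empty setting behaves like an unset one.
        return None
    # Step 3: the bare value "TLS" is not a valid protocol name for the
    # restriction call, so expand it to the explicit versions.
    if names == ["TLS"]:
        return list(LEGACY_TLS_EXPANSION)
    return names
```

A non-None result is what step 2 would hand to the protocol restriction call; a None result means that call is skipped.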

[jira] [Assigned] (CASSANDRA-13325) Bring back the accepted encryption protocols list as configurable option

2020-10-01 Thread Jon Meredith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Meredith reassigned CASSANDRA-13325:


Assignee: Jon Meredith  (was: Nachiket Patil)

> Bring back the accepted encryption protocols list as configurable option
> 
>
> Key: CASSANDRA-13325
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13325
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Nachiket Patil
>Assignee: Jon Meredith
>Priority: Low
> Fix For: 4.x
>
> Attachments: trunk.diff
>
>
> With CASSANDRA-10508, the hard coded list of accepted encryption protocols 
> was eliminated. For some use cases, it is necessary to restrict the 
> encryption protocols used for communication between client and server. 
> The default JVM negotiation allows the best encryption protocol that the 
> client can use. 
> For example, I have set Cassandra to use encryption. Ideally, client and 
> server negotiate to use the best protocol (TLSv1.2). But a malicious client 
> might force TLSv1.0, which is susceptible to POODLE attacks.
> At the moment the only way to restrict the encryption protocol is using the 
> {{jdk.tls.client.protocols}} system property. If I don't have enough access 
> to modify this property, I don't have any way of restricting the encryption 
> protocols.
> I am proposing to bring back the accepted_protocols property but make it 
> configurable. If not specified, let the JVM take care of the TLS negotiations.






[jira] [Updated] (CASSANDRA-16159) Reduce the Severity of Errors Reported in FailureDetector#isAlive()

2020-10-01 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-16159:

 Bug Category: Parent values: Code(13163)Level 1 values: Bug - Unclear 
Impact(13164)
   Complexity: Low Hanging Fruit
Discovered By: User Report
 Severity: Low
   Status: Open  (was: Triage Needed)

> Reduce the Severity of Errors Reported in FailureDetector#isAlive()
> ---
>
> Key: CASSANDRA-16159
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16159
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Caleb Rackliffe
>Priority: Normal
>
> Noticed the following error in the failure detector during a host replacement:
> {noformat}
> java.lang.IllegalArgumentException: Unknown endpoint: 10.38.178.98:7000
>   at 
> org.apache.cassandra.gms.FailureDetector.isAlive(FailureDetector.java:281)
>   at 
> org.apache.cassandra.service.StorageService.handleStateBootreplacing(StorageService.java:2502)
>   at 
> org.apache.cassandra.service.StorageService.onChange(StorageService.java:2182)
>   at 
> org.apache.cassandra.service.StorageService.onJoin(StorageService.java:3145)
>   at 
> org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1242)
>   at 
> org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1368)
>   at 
> org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:50)
>   at 
> org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:77)
>   at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:93)
>   at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:44)
>   at 
> org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:884)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> {noformat}
> This particular error looks benign, given that even if it occurs, the node 
> continues to handle the {{BOOT_REPLACE}} state. There are two things we might 
> be able to do to improve {{FailureDetector#isAlive()}} though:
> 1.) We don’t short circuit in the case that the endpoint in question is in 
> quarantine after being removed. It may be useful to check for this so we can 
> avoid logging an ERROR when the endpoint is clearly doomed/dead. (Quarantine 
> works great when the gossip message is _from_ a quarantined endpoint, but in 
> this case, that would be the new/replacing and not the old/replaced one.)
> 2.) We can reduce the severity of the logging from ERROR to WARN and provide 
> better context around how to determine whether or not there’s actually a 
> problem. (ex. “If this occurs while trying to determine liveness for a node 
> that is currently being replaced, it can be safely ignored.”)






[jira] [Updated] (CASSANDRA-16159) Reduce the Severity of Errors Reported in FailureDetector#isAlive()

2020-10-01 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-16159:

Fix Version/s: 4.0-rc

> Reduce the Severity of Errors Reported in FailureDetector#isAlive()
> ---
>
> Key: CASSANDRA-16159
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16159
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Noticed the following error in the failure detector during a host replacement:
> {noformat}
> java.lang.IllegalArgumentException: Unknown endpoint: 10.38.178.98:7000
>   at 
> org.apache.cassandra.gms.FailureDetector.isAlive(FailureDetector.java:281)
>   at 
> org.apache.cassandra.service.StorageService.handleStateBootreplacing(StorageService.java:2502)
>   at 
> org.apache.cassandra.service.StorageService.onChange(StorageService.java:2182)
>   at 
> org.apache.cassandra.service.StorageService.onJoin(StorageService.java:3145)
>   at 
> org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1242)
>   at 
> org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1368)
>   at 
> org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:50)
>   at 
> org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:77)
>   at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:93)
>   at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:44)
>   at 
> org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:884)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> {noformat}
> This particular error looks benign, given that even if it occurs, the node 
> continues to handle the {{BOOT_REPLACE}} state. There are two things we might 
> be able to do to improve {{FailureDetector#isAlive()}} though:
> 1.) We don’t short circuit in the case that the endpoint in question is in 
> quarantine after being removed. It may be useful to check for this so we can 
> avoid logging an ERROR when the endpoint is clearly doomed/dead. (Quarantine 
> works great when the gossip message is _from_ a quarantined endpoint, but in 
> this case, that would be the new/replacing and not the old/replaced one.)
> 2.) We can reduce the severity of the logging from ERROR to WARN and provide 
> better context around how to determine whether or not there’s actually a 
> problem. (ex. “If this occurs while trying to determine liveness for a node 
> that is currently being replaced, it can be safely ignored.”)






[jira] [Updated] (CASSANDRA-16159) Reduce the Severity of Errors Reported in FailureDetector#isAlive()

2020-10-01 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-16159:

Summary: Reduce the Severity of Errors Reported in 
FailureDetector#isAlive()  (was: Reduce the Severity of Errors Reported in 
FailureDetector#iaAlive())

> Reduce the Severity of Errors Reported in FailureDetector#isAlive()
> ---
>
> Key: CASSANDRA-16159
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16159
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Caleb Rackliffe
>Priority: Normal
>
> Noticed the following error in the failure detector during a host replacement:
> {noformat}
> java.lang.IllegalArgumentException: Unknown endpoint: 10.38.178.98:7000
>   at 
> org.apache.cassandra.gms.FailureDetector.isAlive(FailureDetector.java:281)
>   at 
> org.apache.cassandra.service.StorageService.handleStateBootreplacing(StorageService.java:2502)
>   at 
> org.apache.cassandra.service.StorageService.onChange(StorageService.java:2182)
>   at 
> org.apache.cassandra.service.StorageService.onJoin(StorageService.java:3145)
>   at 
> org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1242)
>   at 
> org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1368)
>   at 
> org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:50)
>   at 
> org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:77)
>   at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:93)
>   at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:44)
>   at 
> org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:884)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> {noformat}
> This particular error looks benign, given that even if it occurs, the node 
> continues to handle the {{BOOT_REPLACE}} state. There are two things we might 
> be able to do to improve {{FailureDetector#isAlive()}} though:
> 1.) We don’t short circuit in the case that the endpoint in question is in 
> quarantine after being removed. It may be useful to check for this so we can 
> avoid logging an ERROR when the endpoint is clearly doomed/dead. (Quarantine 
> works great when the gossip message is _from_ a quarantined endpoint, but in 
> this case, that would be the new/replacing and not the old/replaced one.)
> 2.) We can reduce the severity of the logging from ERROR to WARN and provide 
> better context around how to determine whether or not there’s actually a 
> problem. (ex. “If this occurs while trying to determine liveness for a node 
> that is currently being replaced, it can be safely ignored.”)






[jira] [Created] (CASSANDRA-16159) Reduce the Severity of Errors Reported in FailureDetector#iaAlive()

2020-10-01 Thread Caleb Rackliffe (Jira)
Caleb Rackliffe created CASSANDRA-16159:
---

 Summary: Reduce the Severity of Errors Reported in 
FailureDetector#iaAlive()
 Key: CASSANDRA-16159
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16159
 Project: Cassandra
  Issue Type: Bug
  Components: Cluster/Gossip
Reporter: Caleb Rackliffe


Noticed the following error in the failure detector during a host replacement:

{noformat}
java.lang.IllegalArgumentException: Unknown endpoint: 10.38.178.98:7000
at 
org.apache.cassandra.gms.FailureDetector.isAlive(FailureDetector.java:281)
at 
org.apache.cassandra.service.StorageService.handleStateBootreplacing(StorageService.java:2502)
at 
org.apache.cassandra.service.StorageService.onChange(StorageService.java:2182)
at 
org.apache.cassandra.service.StorageService.onJoin(StorageService.java:3145)
at 
org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:1242)
at 
org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:1368)
at 
org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:50)
at 
org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:77)
at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:93)
at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:44)
at 
org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:884)
at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
{noformat}

This particular error looks benign, given that even if it occurs, the node 
continues to handle the {{BOOT_REPLACE}} state. There are two things we might 
be able to do to improve {{FailureDetector#isAlive()}} though:

1.) We don’t short circuit in the case that the endpoint in question is in 
quarantine after being removed. It may be useful to check for this so we can 
avoid logging an ERROR when the endpoint is clearly doomed/dead. (Quarantine 
works great when the gossip message is _from_ a quarantined endpoint, but in 
this case, that would be the new/replacing and not the old/replaced one.)

2.) We can reduce the severity of the logging from ERROR to WARN and provide 
better context around how to determine whether or not there’s actually a 
problem. (ex. “If this occurs while trying to determine liveness for a node 
that is currently being replaced, it can be safely ignored.”)
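
The two suggestions above could fit together roughly as follows. This is only a minimal sketch under stated assumptions: {{QuarantineAwareDetector}}, its maps, and its method names are hypothetical stand-ins and are not taken from the Cassandra code base; the real {{FailureDetector}} and {{Gossiper}} track this state differently.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for a failure detector; not the real Cassandra class.
final class QuarantineAwareDetector
{
    private static final Logger logger = Logger.getLogger(QuarantineAwareDetector.class.getName());

    // Endpoints with known state, mapped to their last reported liveness.
    private final Map<String, Boolean> liveness = new ConcurrentHashMap<>();
    // Endpoints that were removed recently and are still quarantined.
    private final Set<String> quarantined = ConcurrentHashMap.newKeySet();

    void report(String endpoint, boolean alive)
    {
        liveness.put(endpoint, alive);
    }

    void quarantine(String endpoint)
    {
        liveness.remove(endpoint);
        quarantined.add(endpoint);
    }

    boolean isAlive(String endpoint)
    {
        // Suggestion 1: a quarantined endpoint is already known to be gone,
        // so answer "not alive" without logging anything.
        if (quarantined.contains(endpoint))
            return false;

        Boolean alive = liveness.get(endpoint);
        if (alive == null)
        {
            // Suggestion 2: log at WARNING instead of ERROR and say when it is harmless.
            logger.log(Level.WARNING,
                       "Unknown endpoint: {0}. If this occurs while trying to determine liveness "
                       + "for a node that is currently being replaced, it can be safely ignored.",
                       endpoint);
            return false;
        }
        return alive;
    }

    public static void main(String[] args)
    {
        QuarantineAwareDetector detector = new QuarantineAwareDetector();
        detector.report("10.0.0.1:7000", true);
        System.out.println(detector.isAlive("10.0.0.1:7000")); // known endpoint
        detector.quarantine("10.0.0.1:7000");
        System.out.println(detector.isAlive("10.0.0.1:7000")); // quarantined, nothing logged
        System.out.println(detector.isAlive("10.0.0.2:7000")); // unknown, logs a WARNING
    }
}
```

Checking the quarantine set first keeps the log quiet for the old/replaced endpoint, while an endpoint that is unknown for any other reason still produces a message, only at a lower severity.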



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16158) Avoid Unnecessary Chunk Cache Usage During Compaction

2020-10-01 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-16158:

Change Category: Performance
 Complexity: Normal
  Fix Version/s: 4.x
 Status: Open  (was: Triage Needed)

> Avoid Unnecessary Chunk Cache Usage During Compaction
> -
>
> Key: CASSANDRA-16158
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16158
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Compression, Feature/Encryption, Local/Compaction
>Reporter: Caleb Rackliffe
>Priority: Normal
> Fix For: 4.x
>
>
> We have at least some evidence from CASSANDRA-16036 that compaction can cause 
> significant churn in the chunk cache for mixed workloads. Conceptually, 
> this makes sense, as the files compaction is scanning are destined for 
> deletion. It seems like we could avoid most of the cache churn mess by having 
> the {{ISSTableScanner}} implementations returned by {{SSTableReader}} during 
> compaction use file handles that don't use {{CachingRebufferer}}. 
> {{FileHandle.Builder#complete()}} already seems to roughly have the logic we 
> would need to produce the correct (uncached) {{RebuffererFactory}}.
> (NOTE: {{SSTableReader#getScanner(ColumnFilter, DataRange, 
> SSTableReadsListener)}} is used on the read path, and we likely don't want to 
> touch it here.)
> Given that CASSANDRA-16036 settled on disabling the chunk cache by default in 
> 4.0, CASSANDRA-15229 further segregates the negative effects of this, and it 
> isn't entirely clear that the decompression/decryption cache is actually 
> adding measurable value for any workload, this may not be a critical priority 
> for 4.0.






[jira] [Commented] (CASSANDRA-16150) Upgrade to snakeyaml >= 1.26 version for CVE-2017-18640 fix

2020-10-01 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205851#comment-17205851
 ] 

Brandon Williams commented on CASSANDRA-16150:
--

If we're doing this, 1.27 is now out.

> Upgrade to snakeyaml >= 1.26 version for CVE-2017-18640 fix
> ---
>
> Key: CASSANDRA-16150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16150
> Project: Cassandra
>  Issue Type: Bug
>  Components: Dependencies
>Reporter: Rahul Nandi
>Assignee: Rahul Nandi
>Priority: Normal
> Fix For: 4.x
>
>
> A critical-level CVE (CVE-2017-18640) has been discovered in snakeyaml 
> versions earlier than 1.26. It has been patched in snakeyaml version 1.26.
> Reference: [https://nvd.nist.gov/vuln/detail/CVE-2017-18640]
> This card is expected to upgrade the snakeyaml version to 1.26.






[jira] [Created] (CASSANDRA-16158) Avoid Unnecessary Chunk Cache Usage During Compaction

2020-10-01 Thread Caleb Rackliffe (Jira)
Caleb Rackliffe created CASSANDRA-16158:
---

 Summary: Avoid Unnecessary Chunk Cache Usage During Compaction
 Key: CASSANDRA-16158
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16158
 Project: Cassandra
  Issue Type: Improvement
  Components: Feature/Compression, Feature/Encryption, Local/Compaction
Reporter: Caleb Rackliffe


We have at least some evidence from CASSANDRA-16036 that compaction can cause 
significant churn in the chunk cache for mixed workloads. Conceptually, this 
makes sense, as the files compaction is scanning are destined for deletion. It 
seems like we could avoid most of the cache churn mess by having the 
{{ISSTableScanner}} implementations returned by {{SSTableReader}} during 
compaction use file handles that don't use {{CachingRebufferer}}. 
{{FileHandle.Builder#complete()}} already seems to roughly have the logic we 
would need to produce the correct (uncached) {{RebuffererFactory}}.

(NOTE: {{SSTableReader#getScanner(ColumnFilter, DataRange, 
SSTableReadsListener)}} is used on the read path, and we likely don't want to 
touch it here.)

Given that CASSANDRA-16036 settled on disabling the chunk cache by default in 
4.0, CASSANDRA-15229 further segregates the negative effects of this, and it 
isn't entirely clear that the decompression/decryption cache is actually adding 
measurable value for any workload, this may not be a critical priority for 4.0.
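
The idea above, that scanners created for compaction bypass the cache while the read path keeps using it, can be illustrated with a simplified model. This is a sketch under stated assumptions, not the proposed patch: {{ScannerSourceSketch}}, {{ChunkSource}}, {{DirectSource}}, {{CachingSource}}, and {{sourceFor}} are all hypothetical names, and the real {{FileHandle.Builder}} / {{RebuffererFactory}} wiring is considerably more involved.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model; none of these types are Cassandra classes.
final class ScannerSourceSketch
{
    enum Purpose { READ, COMPACTION }

    interface ChunkSource
    {
        byte[] chunkAt(long position);
    }

    // Loads a chunk on every call (stands in for an uncached rebufferer).
    static final class DirectSource implements ChunkSource
    {
        int reads = 0;

        public byte[] chunkAt(long position)
        {
            reads++;
            return new byte[] { (byte) position };
        }
    }

    // Serves chunks from a shared cache (stands in for a caching rebufferer).
    static final class CachingSource implements ChunkSource
    {
        private final ChunkSource delegate;
        private final Map<Long, byte[]> cache;

        CachingSource(ChunkSource delegate, Map<Long, byte[]> cache)
        {
            this.delegate = delegate;
            this.cache = cache;
        }

        public byte[] chunkAt(long position)
        {
            return cache.computeIfAbsent(position, delegate::chunkAt);
        }
    }

    // Compaction reads files that are about to be deleted, so it skips the cache.
    static ChunkSource sourceFor(Purpose purpose, ChunkSource direct, Map<Long, byte[]> sharedCache)
    {
        return purpose == Purpose.COMPACTION ? direct : new CachingSource(direct, sharedCache);
    }

    public static void main(String[] args)
    {
        Map<Long, byte[]> sharedCache = new HashMap<>();
        DirectSource direct = new DirectSource();

        ChunkSource compaction = sourceFor(Purpose.COMPACTION, direct, sharedCache);
        for (long position = 0; position < 3; position++)
            compaction.chunkAt(position);
        System.out.println("cache entries after compaction scan: " + sharedCache.size());

        ChunkSource read = sourceFor(Purpose.READ, direct, sharedCache);
        read.chunkAt(7L);
        read.chunkAt(7L);
        System.out.println("cache entries after reads: " + sharedCache.size());
    }
}
```

In this model the compaction scan leaves the shared cache untouched, which is the churn the ticket wants to avoid, while repeated reads of the same position are still served from the cache.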






[jira] [Commented] (CASSANDRA-16153) Cassandra 4b2 - JVM options from *.options not read/set

2020-10-01 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205848#comment-17205848
 ] 

Brandon Williams commented on CASSANDRA-16153:
--

Are you using an instance type with only one core?

> Cassandra 4b2 - JVM options from *.options not read/set
> ---
>
> Key: CASSANDRA-16153
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16153
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Scripts
>Reporter: Thomas Steinmaurer
>Priority: Normal
>
> Trying out Cassandra 4 beta 2 with Java 8 (AdoptOpenJDK) in AWS.
> {noformat}
> NAME="Amazon Linux AMI"
> VERSION="2018.03"
> ID="amzn"
> ID_LIKE="rhel fedora"
> VERSION_ID="2018.03"
> PRETTY_NAME="Amazon Linux AMI 2018.03"
> ANSI_COLOR="0;33"
> CPE_NAME="cpe:/o:amazon:linux:2018.03:ga"
> HOME_URL="http://aws.amazon.com/amazon-linux-ami/"
> {noformat}
> It seems the Cassandra JVM results in using Parallel GC.
> {noformat}
> INFO  [Service Thread] 2020-10-01 00:00:56,233 GCInspector.java:299 - PS 
> Scavenge GC in 541ms.  PS Old Gen: 5152844776 -> 5726724752;
> WARN  [Service Thread] 2020-10-01 00:00:56,234 GCInspector.java:297 - PS 
> MarkSweep GC in 1969ms.  PS Eden Space: 2111307776 -> 0; PS Old Gen: 
> 5726724752 -> 2581334376; PS Survivor Space: 363850224 -> 0
> {noformat}
> Although {{jvm8-server.options}} is using CMS.
> {noformat}
> #
> #  GC SETTINGS  #
> #
> ### CMS Settings
> -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC
> -XX:+CMSParallelRemarkEnabled
> -XX:SurvivorRatio=8
> -XX:MaxTenuringThreshold=1
> -XX:CMSInitiatingOccupancyFraction=75
> -XX:+UseCMSInitiatingOccupancyOnly
> -XX:CMSWaitDuration=1
> -XX:+CMSParallelInitialMarkEnabled
> -XX:+CMSEdenChunksRecordAlways
> ## some JVMs will fill up their heap when accessed via JMX, see CASSANDRA-6541
> -XX:+CMSClassUnloadingEnabled
> ...
> {noformat}
> In Cassandra 3, default has been CMS.
> So, possibly there is something wrong in reading/processing 
> {{jvm8-server.options}}?
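
One way to confirm which collector and flags a running JVM actually ended up with is to ask the standard platform MXBeans. The sketch below is a generic, stand-alone check and not part of Cassandra; the class name {{ActiveGcReport}} is an assumption made for illustration. On a Java 8 HotSpot JVM, Parallel GC reports the collector names {{PS Scavenge}} and {{PS MarkSweep}} (as in the GCInspector lines quoted above), whereas CMS reports {{ParNew}} and {{ConcurrentMarkSweep}}.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; uses only standard java.lang.management APIs.
final class ActiveGcReport
{
    // Names of the garbage collectors the current JVM is running with.
    static List<String> collectorNames()
    {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans())
            names.add(gc.getName());
        return names;
    }

    public static void main(String[] args)
    {
        // The core count matters when start-up defaults depend on the hardware.
        System.out.println("Available processors: " + Runtime.getRuntime().availableProcessors());
        System.out.println("Collectors: " + collectorNames());
        // The arguments the JVM was really started with, e.g. to see whether
        // -XX:+UseConcMarkSweepGC from the options file made it onto the command line.
        System.out.println("JVM arguments: " + ManagementFactory.getRuntimeMXBean().getInputArguments());
    }
}
```

Comparing the printed JVM arguments with the contents of the options file shows whether the flags were dropped before the JVM started or were passed and then overridden.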






[jira] [Commented] (CASSANDRA-16154) OOM Error (Direct buffer memory) during intensive reading from large SSTables

2020-10-01 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205847#comment-17205847
 ] 

Brandon Williams commented on CASSANDRA-16154:
--

I would highly recommend seeing if this reproduces on the latest 3.11 release.

> OOM Error (Direct buffer memory) during intensive reading from large SSTables
> -
>
> Key: CASSANDRA-16154
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16154
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Vygantas Gedgaudas
>Priority: Normal
>
> Hello,
> We have a certain database, and reading from it intensively leads to the 
> following OOM error:
> {noformat}
>  java.lang.OutOfMemoryError: Direct buffer memory
>  at java.nio.Bits.reserveMemory(Bits.java:694) ~[na:1.8.0_212]
>  at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) 
> ~[na:1.8.0_212]
>  at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311) ~[na:1.8.0_212]
>  at 
> org.apache.cassandra.utils.memory.BufferPool.allocate(BufferPool.java:110) 
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.utils.memory.BufferPool.access$1000(BufferPool.java:46) 
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.utils.memory.BufferPool$LocalPool.allocate(BufferPool.java:407)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.utils.memory.BufferPool$LocalPool.access$000(BufferPool.java:334)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.utils.memory.BufferPool.takeFromPool(BufferPool.java:122)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at org.apache.cassandra.utils.memory.BufferPool.get(BufferPool.java:94) 
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at org.apache.cassandra.cache.ChunkCache.load(ChunkCache.java:155) 
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at org.apache.cassandra.cache.ChunkCache.load(ChunkCache.java:39) 
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> com.github.benmanes.caffeine.cache.BoundedLocalCache$BoundedLocalLoadingCache.lambda$new$0(BoundedLocalCache.java:2949)
>  ~[caffeine-2.2.6.jar:na]
>  at 
> com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$15(BoundedLocalCache.java:1807)
>  ~[caffeine-2.2.6.jar:na]
>  at 
> java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853) 
> ~[na:1.8.0_212]
>  at 
> com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:1805)
>  ~[caffeine-2.2.6.jar:na]
>  at 
> com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1788)
>  ~[caffeine-2.2.6.jar:na]
>  at 
> com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
>  ~[caffeine-2.2.6.jar:na]
>  at 
> com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
>  ~[caffeine-2.2.6.jar:na]
>  at 
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:213)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:207)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at org.apache.cassandra.io.util.FileHandle.createReader(FileHandle.java:150) 
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.io.sstable.format.SSTableReader.getFileDataInput(SSTableReader.java:1767)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.db.columniterator.AbstractSSTableIterator.<init>(AbstractSSTableIterator.java:103)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.db.columniterator.SSTableIterator.<init>(SSTableIterator.java:49)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:72)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:65)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.initializeIterator(UnfilteredRowIteratorWithLowerBound.java:107)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.maybeInit(LazilyInitializedUnfilteredRowIterator.java:48)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.getPartitionIndexLowerBound(UnfilteredRowIteratorWithLowerBound.java:191)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> 

[jira] [Updated] (CASSANDRA-15584) 4.0 quality testing: Tooling - External Ecosystem

2020-10-01 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-15584:
--
Description: 
Reference [doc from 
NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
 for context.

*Shepherd: Benjamin Lerer*

Many users of Apache Cassandra employ open source tooling to automate Cassandra 
configuration, runtime management, and repair scheduling. Prior to release, we 
need to confirm that popular third-party tools function properly. 

Current list of tools:
|| Name || Status || Contact ||
| [Priam|http://netflix.github.io/Priam/] | *NOT STARTED* | 
[~sumanth.pasupuleti]| 
| [sstabletools|https://github.com/instaclustr/cassandra-sstable-tools] | *NOT 
STARTED* | [~stefan.miklosovic]| 
| [cassandra-exporter|https://github.com/instaclustr/cassandra-exporter]| *NOT 
STARTED* | [~stefan.miklosovic]|
| [Instaclustr Cassandra 
operator|https://github.com/instaclustr/cassandra-operator]|  
{color:#00875A}*DONE*{color} | [~stefan.miklosovic]|
| [Instaclustr Cassandra Backup Restore | 
https://github.com/instaclustr/cassandra-backup]|{color:#00875A}*DONE*{color} | 
[~stefan.miklosovic]|
| [Instaclustr Cassandra Sidecar | 
https://github.com/instaclustr/cassandra-sidecar]|{color:#00875A}*DONE*{color} 
| [~stefan.miklosovic]|
| [Cassandra SSTable generator | 
https://github.com/instaclustr/cassandra-sstable-generator]|{color:#00875A}*DONE*{color}|
 [~stefan.miklosovic]|
| [Reaper|http://cassandra-reaper.io/]| {color:#00875A}*AUTOMATIC*{color} | 
[~adejanovski]|
| [Medusa|https://github.com/thelastpickle/cassandra-medusa]| *NOT STARTED*| 
[~adejanovski]|
| [Casskop|https://orange-opensource.github.io/casskop/]| *NOT STARTED*| Franck 
Dehay|
| 
[spark-cassandra-connector|https://github.com/datastax/spark-cassandra-connector]|
 {color:#00875A}*DONE*{color}| [~jtgrabowski]|
| [cass operator|https://github.com/datastax/cass-operator]| 
{color:#00875A}*DONE*{color}| [~jimdickinson]|
| [metric 
collector|https://github.com/datastax/metric-collector-for-apache-cassandra]| 
{color:#00875A}*DONE*{color}| [~tjake]|
| [management 
API|https://github.com/datastax/management-api-for-apache-cassandra]| 
{color:#00875A}*DONE*{color}| [~tjake]|  

Column descriptions:
* *Name*: Name of and link to the tool's official page
* *Status*: {{NOT STARTED}}, {{IN PROGRESS}}, {{BLOCKED}} if you hit any issue 
and have to wait for it to be solved, {{DONE}}, {{AUTOMATIC}} if testing 4.0 is 
part of your CI process.
* *Contact*: The person acting as the contact point for that tool. 

  was:
Reference [doc from 
NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
 for context.

*Shepherd: Benjamin Lerer*

Many users of Apache Cassandra employ open source tooling to automate Cassandra 
configuration, runtime management, and repair scheduling. Prior to release, we 
need to confirm that popular third-party tools function properly. 

Current list of tools:
|| Name || Status || Contact ||
| [Priam|http://netflix.github.io/Priam/] | *NOT STARTED* | 
[~sumanth.pasupuleti]| 
| [sstabletools|https://github.com/instaclustr/cassandra-sstable-tools] | *NOT 
STARTED* | [~stefan.miklosovic]| 
| [cassandra-exporter|https://github.com/instaclustr/cassandra-exporter]| *NOT 
STARTED* | [~stefan.miklosovic]|
| [Cassandra operator|https://github.com/instaclustr/cassandra-operator]|  
{color:#00875A}*DONE*{color} | [~stefan.miklosovic]|
| [Cassandra SSTable generator | 
https://github.com/instaclustr/cassandra-sstable-generator]|{color:#00875A}*DONE*{color}|
 [~stefan.miklosovic]|
| [Reaper|http://cassandra-reaper.io/]| {color:#00875A}*AUTOMATIC*{color} | 
[~adejanovski]|
| [Medusa|https://github.com/thelastpickle/cassandra-medusa]| *NOT STARTED*| 
[~adejanovski]|
| [Casskop|https://orange-opensource.github.io/casskop/]| *NOT STARTED*| Franck 
Dehay|
| 
[spark-cassandra-connector|https://github.com/datastax/spark-cassandra-connector]|
 {color:#00875A}*DONE*{color}| [~jtgrabowski]|
| [cass operator|https://github.com/datastax/cass-operator]| 
{color:#00875A}*DONE*{color}| [~jimdickinson]|
| [metric 
collector|https://github.com/datastax/metric-collector-for-apache-cassandra]| 
{color:#00875A}*DONE*{color}| [~tjake]|
| [management 
API|https://github.com/datastax/management-api-for-apache-cassandra]| 
{color:#00875A}*DONE*{color}| [~tjake]|  

Column descriptions:
* *Name*: Name of and link to the tool's official page
* *Status*: {{NOT STARTED}}, {{IN PROGRESS}}, {{BLOCKED}} if you hit any issue 
and have to wait for it to be solved, {{DONE}}, {{AUTOMATIC}} if testing 4.0 is 
part of your CI process.
* *Contact*: The person acting as the contact point for that tool. 


> 4.0 quality testing: Tooling - External Ecosystem
> -
>
> Key: CASSANDRA-15584
> URL: 

[jira] [Commented] (CASSANDRA-14030) disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: Missing: ['127.0.0.5.* now UP']:

2020-10-01 Thread Adam Holmberg (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205797#comment-17205797
 ] 

Adam Holmberg commented on CASSANDRA-14030:
---

The two failures indicated by Ekaterina and Jon are fixed in CASSANDRA-16089.

I'll try to reproduce the original flake for the test under its new name, 
[{{test_disk_balance_bootstrap}}|https://github.com/apache/cassandra-dtest/commit/49b2dda4e6643d2b18376d504b5fea4c0b3354a7#diff-1ef92939c7765f8c4041bada71208eebR41]

> disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: 
> Missing: ['127.0.0.5.* now UP']:
> ---
>
> Key: CASSANDRA-14030
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14030
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Testing
>Reporter: Michael Kjellman
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: 
> Missing: ['127.0.0.5.* now UP']:
> 15 Nov 2017 11:28:03 [node4] Missing: ['127.0.0.5.* now UP']:
> .
> See system.log for remainder
>  >> begin captured logging << 
> dtest: DEBUG: cluster ccm directory: /tmp/dtest-NZzhNb
> dtest: DEBUG: Done setting configuration options:
> {   'initial_token': None,
> 'num_tokens': '32',
> 'phi_convict_threshold': 5,
> 'range_request_timeout_in_ms': 1,
> 'read_request_timeout_in_ms': 1,
> 'request_timeout_in_ms': 1,
> 'truncate_request_timeout_in_ms': 1,
> 'write_request_timeout_in_ms': 1}
> - >> end captured logging << -
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/cassandra/cassandra-dtest/disk_balance_test.py", line 44, in 
> disk_balance_bootstrap_test
> node5.start(wait_for_binary_proto=True)
>   File 
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 706, in start
> node.watch_log_for_alive(self, from_mark=mark)
>   File 
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 520, in watch_log_for_alive
> self.watch_log_for(tofind, from_mark=from_mark, timeout=timeout, 
> filename=filename)
>   File 
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 488, in watch_log_for
> raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " 
> [" + self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" + 
> reads[:50] + ".\nSee {} for remainder".format(filename))
> "15 Nov 2017 11:28:03 [node4] Missing: ['127.0.0.5.* now UP']:\n.\nSee 
> system.log for remainder\n >> begin captured logging << 
> \ndtest: DEBUG: cluster ccm directory: 
> /tmp/dtest-NZzhNb\ndtest: DEBUG: Done setting configuration options:\n{   
> 'initial_token': None,\n'num_tokens': '32',\n'phi_convict_threshold': 
> 5,\n'range_request_timeout_in_ms': 1,\n
> 'read_request_timeout_in_ms': 1,\n'request_timeout_in_ms': 1,\n   
>  'truncate_request_timeout_in_ms': 1,\n'write_request_timeout_in_ms': 
> 1}\n- >> end captured logging << 
> -"






[jira] [Assigned] (CASSANDRA-14030) disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: Missing: ['127.0.0.5.* now UP']:

2020-10-01 Thread Adam Holmberg (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Holmberg reassigned CASSANDRA-14030:
-

Assignee: Adam Holmberg

> disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: 
> Missing: ['127.0.0.5.* now UP']:
> ---
>
> Key: CASSANDRA-14030
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14030
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Testing
>Reporter: Michael Kjellman
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> disk_balance_bootstrap_test - disk_balance_test.TestDiskBalance fails: 
> Missing: ['127.0.0.5.* now UP']:
> 15 Nov 2017 11:28:03 [node4] Missing: ['127.0.0.5.* now UP']:
> .
> See system.log for remainder
>  >> begin captured logging << 
> dtest: DEBUG: cluster ccm directory: /tmp/dtest-NZzhNb
> dtest: DEBUG: Done setting configuration options:
> {   'initial_token': None,
> 'num_tokens': '32',
> 'phi_convict_threshold': 5,
> 'range_request_timeout_in_ms': 1,
> 'read_request_timeout_in_ms': 1,
> 'request_timeout_in_ms': 1,
> 'truncate_request_timeout_in_ms': 1,
> 'write_request_timeout_in_ms': 1}
> - >> end captured logging << -
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/cassandra/cassandra-dtest/disk_balance_test.py", line 44, in 
> disk_balance_bootstrap_test
> node5.start(wait_for_binary_proto=True)
>   File 
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 706, in start
> node.watch_log_for_alive(self, from_mark=mark)
>   File 
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 520, in watch_log_for_alive
> self.watch_log_for(tofind, from_mark=from_mark, timeout=timeout, 
> filename=filename)
>   File 
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/node.py", line 
> 488, in watch_log_for
> raise TimeoutError(time.strftime("%d %b %Y %H:%M:%S", time.gmtime()) + " 
> [" + self.name + "] Missing: " + str([e.pattern for e in tofind]) + ":\n" + 
> reads[:50] + ".\nSee {} for remainder".format(filename))
> "15 Nov 2017 11:28:03 [node4] Missing: ['127.0.0.5.* now UP']:\n.\nSee 
> system.log for remainder\n >> begin captured logging << 
> \ndtest: DEBUG: cluster ccm directory: 
> /tmp/dtest-NZzhNb\ndtest: DEBUG: Done setting configuration options:\n{   
> 'initial_token': None,\n'num_tokens': '32',\n'phi_convict_threshold': 
> 5,\n'range_request_timeout_in_ms': 1,\n
> 'read_request_timeout_in_ms': 1,\n'request_timeout_in_ms': 1,\n   
>  'truncate_request_timeout_in_ms': 1,\n'write_request_timeout_in_ms': 
> 1}\n- >> end captured logging << 
> -"






[jira] [Commented] (CASSANDRA-16148) Test failures caused by merging CASSANDRA-15833

2020-10-01 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205774#comment-17205774
 ] 

Andres de la Peña commented on CASSANDRA-16148:
---

bq. I'd move {{SimpleReadWriteTest#test15833()}} to {{ReadRepairTest}}, get rid 
of the commented out cluster init, and throw a TODO onto it that indicates the 
method should be removed when CASSANDRA-15977 merges. (CC [~adelapena])
Agree, we'll remove it in CASSANDRA-15977. We could also name it {{test16148}}, 
like this ticket.

> Test failures caused by merging CASSANDRA-15833
> ---
>
> Key: CASSANDRA-16148
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16148
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
>
> Three issues were caused by merging CASSANDRA-15833:
> 1. `GossiperTest#testHaveAnyVersion3Nodes` was failing on trunk: 
> https://app.circleci.com/pipelines/github/jrwest/cassandra/53/workflows/95f9f401-1ef8-4b8d-9c64-3703d9669d95/jobs/771
> 2. python dtest ReadRepairTest#test_atomic_writes[blocking] was failing
> 3. In-jvm dtests being worked on as part of CASSANDRA-15977 uncovered an 
> issue with how CASSANDRA-15833 changes interacted with in-jvm dtests running 
> without {{Feature.GOSSIP}}






[jira] [Commented] (CASSANDRA-16152) In-JVM dtest - modify schema with stopped nodes and use yaml fragments for config

2020-10-01 Thread Jon Meredith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205761#comment-17205761
 ] 

Jon Meredith commented on CASSANDRA-16152:
--

Branches / CircleCI

[2.2|https://github.com/jonmeredith/cassandra/tree/C16152-2.2] 
[CircleCI|https://app.circleci.com/pipelines/github/jonmeredith/cassandra?branch=C16152-2.2]
[3.0|https://github.com/jonmeredith/cassandra/tree/C16152-3.0] 
[CircleCI|https://app.circleci.com/pipelines/github/jonmeredith/cassandra?branch=C16152-3.0]
[3.11|https://github.com/jonmeredith/cassandra/tree/C16152-3.11] 
[CircleCI|https://app.circleci.com/pipelines/github/jonmeredith/cassandra?branch=C16152-3.11]
[trunk|https://github.com/jonmeredith/cassandra/tree/C16152-trunk] 
[CircleCI|https://app.circleci.com/pipelines/github/jonmeredith/cassandra?branch=C16152-trunk]

> In-JVM dtest - modify schema with stopped nodes and use yaml fragments for 
> config
> -
>
> Key: CASSANDRA-16152
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16152
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/java
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
>
> Some convenience improvements to in-JVM dtest that are useful across versions 
> that I needed while working on CASSANDRA-16144
> * Add support for changing schema with stopped nodes.
> * Make it simpler to modify nested configuration items by specifying Yaml 
> fragments 






[jira] [Updated] (CASSANDRA-16152) Minor in-jvm dtest improvements

2020-10-01 Thread Jon Meredith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Meredith updated CASSANDRA-16152:
-
Change Category: Quality Assurance
 Complexity: Normal
Component/s: Test/dtest/java
 Status: Open  (was: Triage Needed)

Some convenience improvements to in-JVM dtest that are useful across versions 
that I needed while working on CASSANDRA-16144

* Add support for changing schema with stopped nodes.

* Make it simpler to modify nested configuration items by specifying Yaml 
fragments 

> Minor in-jvm dtest improvements
> ---
>
> Key: CASSANDRA-16152
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16152
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/java
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
>
> Boring. Details to follow.






[jira] [Updated] (CASSANDRA-16152) In-JVM dtest - modify schema with stopped nodes and use yaml fragments for config

2020-10-01 Thread Jon Meredith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Meredith updated CASSANDRA-16152:
-
Description: 
Some convenience improvements to in-JVM dtest that are useful across versions 
that I needed while working on CASSANDRA-16144

* Add support for changing schema with stopped nodes.

* Make it simpler to modify nested configuration items by specifying Yaml 
fragments 

  was:Boring. Details to follow.

Summary: In-JVM dtest - modify schema with stopped nodes and use yaml 
fragments for config  (was: Minor in-jvm dtest improvements)

> In-JVM dtest - modify schema with stopped nodes and use yaml fragments for 
> config
> -
>
> Key: CASSANDRA-16152
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16152
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/java
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
>
> Some convenience improvements to in-JVM dtest that are useful across versions 
> that I needed while working on CASSANDRA-16144
> * Add support for changing schema with stopped nodes.
> * Make it simpler to modify nested configuration items by specifying Yaml 
> fragments 






[jira] [Updated] (CASSANDRA-16155) ByteBufferAccessor cast exceptions are thrown when trying to query a virtual table

2020-10-01 Thread Chris Lohfink (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-16155:
--
Reviewers: Chris Lohfink

> ByteBufferAccessor cast exceptions are thrown when trying to query a virtual 
> table
> --
>
> Key: CASSANDRA-16155
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16155
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Virtual Tables
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Normal
>
> Start a fresh trunk node, and try to run
> SELECT * FROM system_views.local_read_latency ;
> You’ll get: 
> {code:java}
> ERROR [Native-Transport-Requests-1] 2020-09-30 09:44:45,099 
> ErrorMessage.java:457 - Unexpected exception during request
>  java.lang.ClassCastException: 
> org.apache.cassandra.db.marshal.ByteBufferAccessor cannot be cast to 
> java.lang.String
>          at 
> org.apache.cassandra.serializers.AbstractTextSerializer.serialize(AbstractTextSerializer.java:29)
>          at 
> org.apache.cassandra.db.marshal.AbstractType.decompose(AbstractType.java:131) 
> {code}
>  






[jira] [Commented] (CASSANDRA-16148) Test failures caused by merging CASSANDRA-15833

2020-10-01 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205713#comment-17205713
 ] 

Caleb Rackliffe commented on CASSANDRA-16148:
-

bq. It would be great if we could get a baseline with and without memoize with 
a 50-100 node cluster with at least 1 node in 3.x.

Would a microbenchmark hammering {{haveMajorVersion3Nodes()}} over 100 
endpoints suffice to allay our fears? (i.e. One comparing trunk to a version of 
the patch w/ a {{volatile}} {{haveMajorVersion3Nodes}}?)

> Test failures caused by merging CASSANDRA-15833
> ---
>
> Key: CASSANDRA-16148
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16148
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
>
> Three issues were caused by merging CASSANDRA-15833:
> 1. `GossiperTest#testHaveAnyVersion3Nodes` was failing on trunk: 
> https://app.circleci.com/pipelines/github/jrwest/cassandra/53/workflows/95f9f401-1ef8-4b8d-9c64-3703d9669d95/jobs/771
> 2. python dtest ReadRepairTest#test_atomic_writes[blocking] was failing
> 3. In-jvm dtests being worked on as part of CASSANDRA-15977 uncovered an 
> issue with how CASSANDRA-15833 changes interacted with in-jvm dtests running 
> without {{Feature.GOSSIP}}






[jira] [Comment Edited] (CASSANDRA-16148) Test failures caused by merging CASSANDRA-15833

2020-10-01 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205708#comment-17205708
 ] 

Caleb Rackliffe edited comment on CASSANDRA-16148 at 10/1/20, 5:45 PM:
---

bq. This logic isn't correct as you need the release version of the node you 
are adding, this method takes the release version of the current node (example: 
node1 is 3.0, node2 is 4.0. node2 would update gossip to say node1 is 4.0).

Would the idea be to add something similar to {{getMessagingVersion()}} to the 
{{IInstance}} API?

Otherwise, just minor notes in addition to what's already been discussed above:

* I'd move {{SimpleReadWriteTest#test15833()}} to {{ReadRepairTest}}, get rid 
of the commented out cluster init, and throw a TODO onto it that indicates the 
method should be removed when CASSANDRA-15977 merges. (CC [~adelapena])
* nit: If I'm interpreting all this correctly, {{haveMajorVersion3Nodes()}} is 
really more like {{mightHaveMajorVersion3Nodes()}} ;)
* If we end up not needing Guava, can we switch to 
{{java.util.function.Supplier}}?
* {{import com.google.common.base.Suppliers}} is unused.
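
To illustrate the {{mightHaveMajorVersion3Nodes()}} nit: a minimal Python sketch of an expiring memoized supplier, roughly what Guava's {{Suppliers.memoizeWithExpiration}} provides. That the patch memoizes with an expiry is an assumption here, and {{memoize_with_expiration}}, {{endpoint_versions}} and the 60 second TTL are hypothetical names and values, not taken from the patch. The point is only that a cached answer can lag behind the real cluster state until it expires.

```python
import time
from typing import Callable, Dict


def memoize_with_expiration(compute: Callable[[], bool], ttl_seconds: float,
                            clock: Callable[[], float] = time.monotonic) -> Callable[[], bool]:
    """Cache compute() and recompute only once ttl_seconds have elapsed."""
    cached = {"value": None, "expires_at": None}

    def supplier() -> bool:
        now = clock()
        if cached["expires_at"] is None or now >= cached["expires_at"]:
            cached["value"] = compute()
            cached["expires_at"] = now + ttl_seconds
        return cached["value"]

    return supplier


# Hypothetical gossip state: endpoint -> release version string.
endpoint_versions: Dict[str, str] = {"127.0.0.1": "3.0.22", "127.0.0.2": "4.0.0"}


def scan_for_major_version_3() -> bool:
    """The expensive check: walk every endpoint and look for a 3.x release."""
    return any(v.startswith("3.") for v in endpoint_versions.values())


fake_now = [0.0]  # injectable clock so the example is deterministic
might_have_major_version_3_nodes = memoize_with_expiration(
    scan_for_major_version_3, ttl_seconds=60.0, clock=lambda: fake_now[0])

first = might_have_major_version_3_nodes()   # first call scans the endpoints
endpoint_versions["127.0.0.1"] = "4.0.0"     # the 3.x node is upgraded
stale = might_have_major_version_3_nodes()   # cached answer, not rescanned
fake_now[0] = 61.0                           # the TTL has now passed
fresh = might_have_major_version_3_nodes()   # rescans the endpoints
```

Between the upgrade and the expiry the supplier still reports a 3.x node, which is why "might have" describes the semantics better than "have".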


was (Author: maedhroz):
bq. This logic isn't correct as you need the release version of the node you 
are adding, this method takes the release version of the current node (example: 
node1 is 3.0, node2 is 4.0. node2 would update gossip to say node1 is 4.0).

Would the idea be to add something similar to {{getMessagingVersion()}} to the 
{{IInstance}} API?

Otherwise, just minor notes in addition to what's already been discussed above:

* I'd move {{SimpleReadWriteTest#test15833()}} to {{ReadRepairTest}}, get rid 
of the commented out cluster init, and throw a TODO onto it that indicates the 
method should be removed when CASSANDRA-15977 merges. (CC [~adelapena])
* nit: If I'm interpreting all this correctly, {{haveMajorVersion3Nodes()}} is 
really more like {{mightHaveMajorVersion3Nodes()}} ;)
* If we end up not needing Guava, can we switch to 
{{java.util.function.Supplier}}?

> Test failures caused by merging CASSANDRA-15833
> ---
>
> Key: CASSANDRA-16148
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16148
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
>
> Three issues were caused by merging CASSANDRA-15833:
> 1. `GossiperTest#testHaveAnyVersion3Nodes` was failing on trunk: 
> https://app.circleci.com/pipelines/github/jrwest/cassandra/53/workflows/95f9f401-1ef8-4b8d-9c64-3703d9669d95/jobs/771
> 2. python dtest ReadRepairTest#test_atomic_writes[blocking] was failing
> 3. In-jvm dtests being worked on as part of CASSANDRA-15977 uncovered an 
> issue with how CASSANDRA-15833 changes interacted with in-jvm dtests running 
> without {{Feature.GOSSIP}}






[jira] [Commented] (CASSANDRA-16148) Test failures caused by merging CASSANDRA-15833

2020-10-01 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205708#comment-17205708
 ] 

Caleb Rackliffe commented on CASSANDRA-16148:
-

bq. This logic isn't correct as you need the release version of the node you 
are adding, this method takes the release version of the current node (example: 
node1 is 3.0, node2 is 4.0. node2 would update gossip to say node1 is 4.0).

Would the idea be to add something similar to {{getMessagingVersion()}} to the 
{{IInstance}} API?

Otherwise, just minor notes in addition to what's already been discussed above:

* I'd move {{SimpleReadWriteTest#test15833()}} to {{ReadRepairTest}}, get rid 
of the commented out cluster init, and throw a TODO onto it that indicates the 
method should be removed when CASSANDRA-15977 merges. (CC [~adelapena])
* nit: If I'm interpreting all this correctly, {{haveMajorVersion3Nodes()}} is 
really more like {{mightHaveMajorVersion3Nodes()}} ;)
* If we end up not needing Guava, can we switch to 
{{java.util.function.Supplier}}?

> Test failures caused by merging CASSANDRA-15833
> ---
>
> Key: CASSANDRA-16148
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16148
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
>
> Three issues were caused by merging CASSANDRA-15833:
> 1. `GossiperTest#testHaveAnyVersion3Nodes` was failing on trunk: 
> https://app.circleci.com/pipelines/github/jrwest/cassandra/53/workflows/95f9f401-1ef8-4b8d-9c64-3703d9669d95/jobs/771
> 2. python dtest ReadRepairTest#test_atomic_writes[blocking] was failing
> 3. In-jvm dtests being worked on as part of CASSANDRA-15977 uncovered an 
> issue with how CASSANDRA-15833 changes interacted with in-jvm dtests running 
> without {{Feature.GOSSIP}}






[jira] [Assigned] (CASSANDRA-16152) Minor in-jvm dtest improvements

2020-10-01 Thread Jon Meredith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Meredith reassigned CASSANDRA-16152:


Assignee: Jon Meredith

> Minor in-jvm dtest improvements
> ---
>
> Key: CASSANDRA-16152
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16152
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jon Meredith
>Assignee: Jon Meredith
>Priority: Normal
>
> Boring. Details to follow.






[jira] [Commented] (CASSANDRA-15538) 4.0 quality testing: Local Read/Write Path: Other Areas

2020-10-01 Thread Aleksey Yeschenko (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205703#comment-17205703
 ] 

Aleksey Yeschenko commented on CASSANDRA-15538:
---

bq. you still planning on shepherding this?

If someone else is willing to take this on, I'm happy to let it go. Also, I am in 
agreement with Sylvain, except that I've seen Harry and believe it should be a core 
vector here.

> 4.0 quality testing: Local Read/Write Path: Other Areas
> ---
>
> Key: CASSANDRA-15538
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15538
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/java, Test/dtest/python
>Reporter: Josh McKenzie
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> *Shepherd: Aleksey Yeschenko*
> Testing in this area refers to the local read/write path (StorageProxy, 
> ColumnFamilyStore, Memtable, SSTable reading/writing, etc). We are still 
> finding numerous bugs and issues with the 3.0 storage engine rewrite 
> (CASSANDRA-8099). For 4.0 we want to ensure that we thoroughly cover the 
> local read/write path with techniques such as property-based testing, fuzzing 
> ([example|http://cassandra.apache.org/blog/2018/10/17/finding_bugs_with_property_based_testing.html]),
>  and a source audit.






[jira] [Updated] (CASSANDRA-15229) Segregate Network and Chunk Cache BufferPools and Recirculate Partially Freed Chunks

2020-10-01 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-15229:

Summary: Segregate Network and Chunk Cache BufferPools and Recirculate 
Partially Freed Chunks  (was: BufferPool Regression)

> Segregate Network and Chunk Cache BufferPools and Recirculate Partially Freed 
> Chunks
> 
>
> Key: CASSANDRA-15229
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15229
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Caching
>Reporter: Benedict Elliott Smith
>Assignee: Zhao Yang
>Priority: Normal
> Fix For: 4.0, 4.0-beta
>
> Attachments: 15229-count.png, 15229-direct.png, 15229-hit-rate.png, 
> 15229-recirculate-count.png, 15229-recirculate-hit-rate.png, 
> 15229-recirculate-size.png, 15229-recirculate.png, 15229-size.png, 
> 15229-unsafe.png
>
>
> The BufferPool was never intended to be used for a {{ChunkCache}}, and we 
> need to either change our behaviour to handle uncorrelated lifetimes or use 
> something else.  This is particularly important with the default chunk size 
> for compressed sstables being reduced.  If we address the problem, we should 
> also utilise the BufferPool for native transport connections like we do for 
> internode messaging, and reduce the number of pooling solutions we employ.
> Probably the best thing to do is to improve BufferPool’s behaviour when used 
> for things with uncorrelated lifetimes, which essentially boils down to 
> tracking those chunks that have not been freed and re-circulating them when 
> we run out of completely free blocks.  We should probably also permit 
> instantiating separate {{BufferPool}}, so that we can insulate internode 
> messaging from the {{ChunkCache}}, or at least have separate memory bounds 
> for each, and only share fully-freed chunks.
> With these improvements we can also safely increase the {{BufferPool}} chunk 
> size to 128KiB or 256KiB, to guarantee we can fit compressed pages and reduce 
> the amount of global coordination and per-allocation overhead.  We don’t need 
> 1KiB granularity for allocations, nor 16 byte granularity for tiny 
> allocations.
> -
> Since CASSANDRA-5863, the chunk cache has been implemented on top of the buffer pool. When 
> the local pool is full, one of its chunks is evicted and only put back into the 
> global pool once all buffers in the evicted chunk are released. But because of the 
> chunk cache, buffers can be held for long periods of time, preventing the evicted 
> chunk from being recycled even though most of the space in it is free.
> Two things need to be improved:
> 1. An evicted chunk with free space should be recycled to the global pool, even if 
> it's not fully free. This is doable in 4.0.
> 2. Reduce fragmentation caused by differing buffer sizes. With #1, partially 
> freed chunks will be available for allocation, but the "holes" in a partially 
> freed chunk have different sizes. We should consider allocating a fixed 
> buffer size, which is unlikely to fit in 4.0.
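
Item 1 above can be pictured with a toy Python model. This is a sketch under stated assumptions, not the actual {{BufferPool}} code: {{Chunk}}, {{GlobalPool}} and the {{recirculate_partial}} flag are hypothetical names, and chunks are divided into equal-sized slots, which sidesteps the fragmentation concern in item 2.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Chunk:
    """A fixed number of equal-sized slots; `held` tracks slots still in use."""
    slots: int
    held: Set[int] = field(default_factory=set)

    def allocate(self) -> int:
        for slot in range(self.slots):
            if slot not in self.held:
                self.held.add(slot)
                return slot
        raise MemoryError("chunk is full")

    def release(self, slot: int) -> None:
        self.held.discard(slot)

    @property
    def free_slots(self) -> int:
        return self.slots - len(self.held)


@dataclass
class GlobalPool:
    recirculate_partial: bool
    available: List[Chunk] = field(default_factory=list)

    def offer(self, evicted: Chunk) -> bool:
        """Accept an evicted chunk; return True if it can be reused right away."""
        fully_free = evicted.free_slots == evicted.slots
        if fully_free or (self.recirculate_partial and evicted.free_slots > 0):
            self.available.append(evicted)
            return True
        return False


def evict_with_one_long_lived_buffer(pool: GlobalPool) -> bool:
    chunk = Chunk(slots=8)
    slots = [chunk.allocate() for _ in range(8)]
    for slot in slots[1:]:       # seven buffers are released quickly
        chunk.release(slot)
    return pool.offer(chunk)     # slot 0 is still held, e.g. by a cache entry


strict = GlobalPool(recirculate_partial=False)   # reuse only fully free chunks
relaxed = GlobalPool(recirculate_partial=True)   # also reuse partially free chunks
strict_reused = evict_with_one_long_lived_buffer(strict)
relaxed_reused = evict_with_one_long_lived_buffer(relaxed)
```

With the strict policy a single long-lived buffer strands the other seven slots; with recirculation the chunk goes back into service with seven free slots.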






[jira] [Updated] (CASSANDRA-15229) BufferPool Regression

2020-10-01 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-15229:

Reviewers: Aleksey Yeschenko, Caleb Rackliffe, Caleb Rackliffe  (was: 
Aleksey Yeschenko, Caleb Rackliffe)
   Status: Review In Progress  (was: Patch Available)

> BufferPool Regression
> -
>
> Key: CASSANDRA-15229
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15229
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Caching
>Reporter: Benedict Elliott Smith
>Assignee: Zhao Yang
>Priority: Normal
> Fix For: 4.0, 4.0-beta
>
> Attachments: 15229-count.png, 15229-direct.png, 15229-hit-rate.png, 
> 15229-recirculate-count.png, 15229-recirculate-hit-rate.png, 
> 15229-recirculate-size.png, 15229-recirculate.png, 15229-size.png, 
> 15229-unsafe.png
>
>
> The BufferPool was never intended to be used for a {{ChunkCache}}, and we 
> need to either change our behaviour to handle uncorrelated lifetimes or use 
> something else.  This is particularly important with the default chunk size 
> for compressed sstables being reduced.  If we address the problem, we should 
> also utilise the BufferPool for native transport connections like we do for 
> internode messaging, and reduce the number of pooling solutions we employ.
> Probably the best thing to do is to improve BufferPool’s behaviour when used 
> for things with uncorrelated lifetimes, which essentially boils down to 
> tracking those chunks that have not been freed and re-circulating them when 
> we run out of completely free blocks.  We should probably also permit 
> instantiating separate {{BufferPool}}, so that we can insulate internode 
> messaging from the {{ChunkCache}}, or at least have separate memory bounds 
> for each, and only share fully-freed chunks.
> With these improvements we can also safely increase the {{BufferPool}} chunk 
> size to 128KiB or 256KiB, to guarantee we can fit compressed pages and reduce 
> the amount of global coordination and per-allocation overhead.  We don’t need 
> 1KiB granularity for allocations, nor 16 byte granularity for tiny 
> allocations.
> -
> Since CASSANDRA-5863, the chunk cache has been implemented on top of the buffer pool. When 
> the local pool is full, one of its chunks is evicted and only put back into the 
> global pool once all buffers in the evicted chunk are released. But because of the 
> chunk cache, buffers can be held for long periods of time, preventing the evicted 
> chunk from being recycled even though most of the space in it is free.
> Two things need to be improved:
> 1. An evicted chunk with free space should be recycled to the global pool, even if 
> it's not fully free. This is doable in 4.0.
> 2. Reduce fragmentation caused by differing buffer sizes. With #1, partially 
> freed chunks will be available for allocation, but the "holes" in a partially 
> freed chunk have different sizes. We should consider allocating a fixed 
> buffer size, which is unlikely to fit in 4.0.






[jira] [Commented] (CASSANDRA-15865) Flaky dtest hintedhandoff_test.py::TestHintedHandoffConfig::test_hintedhandoff_setmaxwindow

2020-10-01 Thread Adam Holmberg (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205663#comment-17205663
 ] 

Adam Holmberg commented on CASSANDRA-15865:
---

[~Ottermad] are you still looking at this?

> Flaky dtest 
> hintedhandoff_test.py::TestHintedHandoffConfig::test_hintedhandoff_setmaxwindow
> ---
>
> Key: CASSANDRA-15865
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15865
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Sam Tunnicliffe
>Assignee: Charles Attwood Thomas
>Priority: Normal
> Fix For: 4.0-beta
>
>
> I've seen this fail a couple of times under JDK11, when it doesn't appear to 
> be related to the changes under test.
>  
>  






[jira] [Commented] (CASSANDRA-15585) 4.0 quality testing: Test Frameworks, Tooling, Infra / Automation

2020-10-01 Thread Jordan West (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205628#comment-17205628
 ] 

Jordan West commented on CASSANDRA-15585:
-

Thanks for summarizing the current status and the previous discussions 
[~jmckenzie]. We've made a lot of progress on this ticket even if it's not well 
reflected. In-jvm dtests have made it much easier to write and debug complex, 
multi-node scenarios. Harry (and to a much lesser degree QuickTheories) is now 
available and has been used to write some pretty complex tests as well. And 
cassandra-diff, which other folks are starting to pick up and use, for example, 
in the project you mentioned, is available as well. 

Regarding what's left, the biggest area of focus for this ticket, I think, 
should be automation -- are we running tests regularly with the tools we built 
so that they have the potential to find things:

- For in-jvm dtests, that is already true (at least in circle ci but I believe 
that is also the case in Jenkins -- I will check).
- [~ifesdjeen] can you speak to the current level of automation (regular 
running) of Harry tests in OSS? I'm not familiar (but can also check if you are 
busy)
- We don't have much automation around cassandra-diff but it does sound like 
there are some potential projects on the horizon

I think a great stretch goal (or goal for subsequent releases otherwise) would 
be the sort of automated cassandra-diff framework mentioned, especially one that 
allows others to contribute schemas, etc. Or maybe breaking it up so we 
start it now and add to it in subsequent releases would be a good middle 
ground.

> 4.0 quality testing: Test Frameworks, Tooling, Infra / Automation
> -
>
> Key: CASSANDRA-15585
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15585
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/python
>Reporter: Josh McKenzie
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> *Shepherd: Jordan West*
> This area refers to contributions to test frameworks/tooling (e.g., dtests, 
> QuickTheories, CASSANDRA-14821), and automation enabling those tools to be 
> applied at scale (e.g., replay testing via Spark-based replay of captured FQL 
> logs).






[jira] [Commented] (CASSANDRA-16063) Fix user experience when upgrading to 4.0 with compact tables

2020-10-01 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205623#comment-17205623
 ] 

Andres de la Peña commented on CASSANDRA-16063:
---

I have left a few very minor suggestions on the PRs. The only thing that 
worries me is that, if I'm understanding it correctly, the verification of the 
SSTables before dropping `COMPACT STORAGE` at the cluster-level is done only in 
the local node, while there could be other nodes with old sstable versions.
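
A small Python sketch of the concern, with made-up data: checking only the local node's sstables can pass while another node still blocks the drop. {{nodes_blocking_drop}} and {{MIN_SSTABLE_VERSION}} are hypothetical names, and both the lexicographic comparison of two-letter format versions and the choice of "ma" as the oldest acceptable format are assumptions for illustration, not what the PRs do.

```python
from typing import Dict, List, Set

# Assumption: two-letter sstable format versions compare lexicographically,
# and "ma" is treated as the oldest acceptable format for this illustration.
MIN_SSTABLE_VERSION = "ma"


def nodes_blocking_drop(versions_by_node: Dict[str, Set[str]],
                        minimum: str = MIN_SSTABLE_VERSION) -> List[str]:
    """Return the nodes holding at least one sstable older than `minimum`."""
    return sorted(node for node, versions in versions_by_node.items()
                  if any(v < minimum for v in versions))


# Illustrative cluster state: node address -> sstable format versions on disk.
cluster = {
    "127.0.0.1": {"mc", "md"},   # local node: nothing older than the minimum
    "127.0.0.2": {"ka", "mc"},   # remote node still holds an older-format sstable
}

local_only = nodes_blocking_drop({"127.0.0.1": cluster["127.0.0.1"]})
cluster_wide = nodes_blocking_drop(cluster)
```

The local-only check returns an empty list, while the cluster-wide check reports 127.0.0.2.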

> Fix user experience when upgrading to 4.0 with compact tables
> -
>
> Key: CASSANDRA-16063
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16063
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/CQL
>Reporter: Sylvain Lebresne
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: Compact_storage_upgrade_tests.txt
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The code to handle compact tables has been removed from 4.0, and the intended 
> upgrade path to 4.0 for users having compact tables on 3.x is that they must 
> execute {{ALTER ... DROP COMPACT STORAGE}} on all of their compact tables 
> *before* attempting the upgrade.
> Obviously, some users won't read the upgrade instructions (or miss a table) 
> and may try upgrading despite still having compact tables. If they do so, the 
> intent is that the node will _not_ start, with a message clearly indicating 
> the pre-upgrade step the user has missed. The user will then downgrade back 
> the node(s) to 3.x, run the proper {{ALTER ... DROP COMPACT STORAGE}}, and 
> then upgrade again.
> But while 4.0 does currently fail startup when finding any compact tables 
> with a decent message, I believe the check is done too late during startup.
> Namely, that check is done as we read the tables schema, so within 
> [{{Schema.instance.loadFromDisk()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L241].
>   But by then, we've _at least_ called 
> {{SystemKeyspace.persistLocalMetadata()}} and 
> {{SystemKeyspaceMigrator40.migrate()}}, which will get into the commit log, 
> and even possibly flush new {{na}} format sstables. As a result, a user 
> might not be able to seamlessly restart the node on 3.x (to drop compact 
> storage on the appropriate tables).
> Basically, we should make sure the check for compact tables done at 4.0 
> startup is done as a {{StartupCheck}}, before the node does anything.
> We should also add a test for this (checking that if you try upgrading to 4.0 
> with compact storage, you can downgrade back with no intervention whatsoever).
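
A minimal sketch of the ordering proposed above, in Python rather than Cassandra's Java and with hypothetical names throughout ({{start_node}}, {{check_no_compact_tables}}, the step names). Every check runs before any step that writes state, so a failed check leaves nothing behind that could hinder a restart on 3.x.

```python
from typing import Callable, Dict, List


class StartupError(Exception):
    """Raised when a pre-start check fails."""


def check_no_compact_tables(schema: Dict[str, bool]) -> None:
    """Fail fast if any table is still flagged as compact storage."""
    compact = sorted(name for name, is_compact in schema.items() if is_compact)
    if compact:
        raise StartupError(
            "Compact tables found; run ALTER ... DROP COMPACT STORAGE on 3.x "
            "first: " + ", ".join(compact))


def start_node(schema: Dict[str, bool],
               checks: List[Callable[[Dict[str, bool]], None]],
               mutating_steps: List[str],
               journal: List[str]) -> None:
    for check in checks:            # read-only checks come first
        check(schema)
    for step in mutating_steps:     # only reached when every check passed
        journal.append(step)        # stands in for commit log / sstable writes


journal: List[str] = []
schema = {"ks.events": False, "ks.legacy": True}   # table name -> is compact?
try:
    start_node(schema, [check_no_compact_tables],
               ["persist_local_metadata", "migrate_system_keyspace"], journal)
    started = True
except StartupError as exc:
    started = False
    message = str(exc)
```

Here startup is refused and {{journal}} stays empty, which is the property the proposed test would verify: the node can go back to 3.x with no intervention.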






[jira] [Updated] (CASSANDRA-15993) Fix flaky python dtest test_view_metadata_cleanup - materialized_views_test.TestMaterializedViews

2020-10-01 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-15993:
-
Reviewers: Berenguer Blasi, Brandon Williams  (was: Berenguer Blasi)

> Fix flaky python dtest test_view_metadata_cleanup - 
> materialized_views_test.TestMaterializedViews
> -
>
> Key: CASSANDRA-15993
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15993
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta3
>
>
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/355/workflows/7b8df61d-706f-4094-a206-7cdc6b4e0451/jobs/1818
> {code}
> E   cassandra.OperationTimedOut: errors={'127.0.0.2': 'Client request 
> timeout. See Session.execute[_async](timeout)'}, last_host=127.0.0.2
> cassandra/cluster.py:4026: OperationTimedOut
> {code}






[jira] [Updated] (CASSANDRA-15993) Fix flaky python dtest test_view_metadata_cleanup - materialized_views_test.TestMaterializedViews

2020-10-01 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-15993:
-
  Fix Version/s: (was: 4.0-beta)
 4.0-beta3
  Since Version: NA
Source Control Link: 
https://github.com/apache/cassandra-dtest/commit/1789213ee00a05b3686858b3a22dc8c2d26fc837
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

Committed, thanks!

> Fix flaky python dtest test_view_metadata_cleanup - 
> materialized_views_test.TestMaterializedViews
> -
>
> Key: CASSANDRA-15993
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15993
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta3
>
>
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/355/workflows/7b8df61d-706f-4094-a206-7cdc6b4e0451/jobs/1818
> {code}
> E   cassandra.OperationTimedOut: errors={'127.0.0.2': 'Client request 
> timeout. See Session.execute[_async](timeout)'}, last_host=127.0.0.2
> cassandra/cluster.py:4026: OperationTimedOut
> {code}






[jira] [Updated] (CASSANDRA-15993) Fix flaky python dtest test_view_metadata_cleanup - materialized_views_test.TestMaterializedViews

2020-10-01 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-15993:
-
Status: Ready to Commit  (was: Review In Progress)

> Fix flaky python dtest test_view_metadata_cleanup - 
> materialized_views_test.TestMaterializedViews
> -
>
> Key: CASSANDRA-15993
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15993
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/355/workflows/7b8df61d-706f-4094-a206-7cdc6b4e0451/jobs/1818
> {code}
> E   cassandra.OperationTimedOut: errors={'127.0.0.2': 'Client request 
> timeout. See Session.execute[_async](timeout)'}, last_host=127.0.0.2
> cassandra/cluster.py:4026: OperationTimedOut
> {code}






[cassandra-dtest] branch master updated: Fix flaky timeouts coming from concurrent view builds and schema modification

2020-10-01 Thread brandonwilliams
This is an automated email from the ASF dual-hosted git repository.

brandonwilliams pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/cassandra-dtest.git


The following commit(s) were added to refs/heads/master by this push:
 new 1789213  Fix flaky timeouts coming from concurrent view builds and 
schema modification
1789213 is described below

commit 1789213ee00a05b3686858b3a22dc8c2d26fc837
Author: Adam Holmberg 
AuthorDate: Wed Sep 30 16:40:41 2020 -0500

Fix flaky timeouts coming from concurrent view builds and schema 
modification

Patch by Adam Holmberg, reviewed by Berenguer Blasi and brandonwilliams
for CASSANDRA-15993
---
 materialized_views_test.py | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/materialized_views_test.py b/materialized_views_test.py
index 625a697..b1cd79e 100644
--- a/materialized_views_test.py
+++ b/materialized_views_test.py
@@ -209,13 +209,14 @@ class TestMaterializedViews(Tester):
             logger.debug("create view")
             for view in range(views):
                 session.execute("CREATE MATERIALIZED VIEW mv{} AS SELECT * FROM t "
-                                "WHERE k IS NOT NULL AND c IS NOT NULL PRIMARY KEY (c,k)".format(view))
-            for view in range(views):
+                                "WHERE k IS NOT NULL AND c IS NOT NULL PRIMARY KEY (c,k)".format(view),
+                                timeout=60)
                 self._wait_for_view(keyspace, "mv{}".format(view))
 
         def drop_keyspace(session, keyspace="ks1"):
             logger.debug("drop keyspace {}".format(keyspace))
-            session.execute("DROP KEYSPACE IF EXISTS {}".format(keyspace))
+            session.execute("DROP KEYSPACE IF EXISTS {}".format(keyspace),
+                            timeout=60)
 
         def drop_views(session, views):
             logger.debug("drop all views")





[jira] [Commented] (CASSANDRA-14793) Improve system table handling when losing a disk when using JBOD

2020-10-01 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205588#comment-17205588
 ] 

Benjamin Lerer commented on CASSANDRA-14793:


[PR|https://github.com/apache/cassandra/compare/trunk...blerer:CASSANDRA-14793] 
[CI|https://app.circleci.com/pipelines/github/blerer/cassandra/34/workflows/34ff6aa9-2ee9-4d3e-8929-3c9f048ce357]

> Improve system table handling when losing a disk when using JBOD
> 
>
> Key: CASSANDRA-14793
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14793
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Marcus Eriksson
>Assignee: Benjamin Lerer
>Priority: Normal
> Fix For: 4.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We should improve the way we handle disk failures when losing a disk in a 
> JBOD setup
>  One way could be to pin the system tables to a special data directory.






[jira] [Updated] (CASSANDRA-14793) Improve system table handling when losing a disk when using JBOD

2020-10-01 Thread Benjamin Lerer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-14793:
---
Status: Patch Available  (was: In Progress)

> Improve system table handling when losing a disk when using JBOD
> 
>
> Key: CASSANDRA-14793
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14793
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Marcus Eriksson
>Assignee: Benjamin Lerer
>Priority: Normal
> Fix For: 4.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We should improve the way we handle disk failures when losing a disk in a 
> JBOD setup
>  One way could be to pin the system tables to a special data directory.






[jira] [Created] (CASSANDRA-16157)

2020-10-01 Thread Alex Petrov (Jira)
Alex Petrov created CASSANDRA-16157:
---

 Summary: 
 Key: CASSANDRA-16157
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16157
 Project: Cassandra
  Issue Type: Bug
Reporter: Alex Petrov









[jira] [Updated] (CASSANDRA-16156) Decomissioned nodes are picked for gossip when unreachable nodes are considered for gossiping

2020-10-01 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-16156:

 Bug Category: Parent values: Code(13163)Level 1 values: Bug - Unclear 
Impact(13164)
   Complexity: Normal
  Component/s: Cluster/Gossip
Discovered By: User Report
 Severity: Normal
   Status: Open  (was: Triage Needed)

> Decomissioned nodes are picked for gossip when unreachable nodes are 
> considered for gossiping 
> --
>
> Key: CASSANDRA-16156
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16156
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Normal
>
> After a node is decommissioned, it is still considered for gossip via 
> “unreachable” nodes, which results in the following exceptions:
>  
> {code}
> INFO  [node4_Messaging-EventLoop-3-3] node4 2020-09-29 16:37:37,527 
> NoSpamLogger.java:91 - 
> /127.0.0.4:7012->/127.0.0.1:7012-URGENT_MESSAGES-[no-channel] failed to 
> connect
> io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection 
> refused: /127.0.0.1:7012
> Caused by: java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>   at 
> io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
>   at 
> io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
>   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>   at 
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
>  {code}
> Trace of the method that attempts to establish connection:
> {code} 
> org.apache.cassandra.net.MessagingService.getOutbound(MessagingService.java:492)
>   at 
> org.apache.cassandra.net.MessagingService.doSend(MessagingService.java:335)
>   at 
> org.apache.cassandra.net.OutboundSink$Filtered.accept(OutboundSink.java:55)
>   at org.apache.cassandra.net.OutboundSink.accept(OutboundSink.java:70)
>   at 
> org.apache.cassandra.net.MessagingService.send(MessagingService.java:327)
>   at 
> org.apache.cassandra.net.MessagingService.send(MessagingService.java:314)
>   at org.apache.cassandra.gms.Gossiper.sendGossip(Gossiper.java:813)
>   at 
> org.apache.cassandra.gms.Gossiper.maybeGossipToUnreachableMember(Gossiper.java:840)
>   at org.apache.cassandra.gms.Gossiper.access$400(Gossiper.java:86)
>  {code}
> LEFT and other nodes that are considered dead should not be picked for gossip 
> with unreachable nodes.






[jira] [Updated] (CASSANDRA-16156) Decomissioned nodes are picked for gossip when unreachable nodes are considered for gossiping

2020-10-01 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-16156:

Reviewers: Marcus Eriksson

> Decomissioned nodes are picked for gossip when unreachable nodes are 
> considered for gossiping 
> --
>
> Key: CASSANDRA-16156
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16156
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Normal
>
> After a node is decommissioned, it is still considered for gossip via 
> “unreachable” nodes, which results in the following exceptions:
>  
> {code}
> INFO  [node4_Messaging-EventLoop-3-3] node4 2020-09-29 16:37:37,527 
> NoSpamLogger.java:91 - 
> /127.0.0.4:7012->/127.0.0.1:7012-URGENT_MESSAGES-[no-channel] failed to 
> connect
> io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection 
> refused: /127.0.0.1:7012
> Caused by: java.net.ConnectException: Connection refused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>   at 
> io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
>   at 
> io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
>   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
>   at 
> io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.lang.Thread.run(Thread.java:748)
>  {code}
> Trace of the method that attempts to establish connection:
> {code} 
> org.apache.cassandra.net.MessagingService.getOutbound(MessagingService.java:492)
>   at 
> org.apache.cassandra.net.MessagingService.doSend(MessagingService.java:335)
>   at 
> org.apache.cassandra.net.OutboundSink$Filtered.accept(OutboundSink.java:55)
>   at org.apache.cassandra.net.OutboundSink.accept(OutboundSink.java:70)
>   at 
> org.apache.cassandra.net.MessagingService.send(MessagingService.java:327)
>   at 
> org.apache.cassandra.net.MessagingService.send(MessagingService.java:314)
>   at org.apache.cassandra.gms.Gossiper.sendGossip(Gossiper.java:813)
>   at 
> org.apache.cassandra.gms.Gossiper.maybeGossipToUnreachableMember(Gossiper.java:840)
>   at org.apache.cassandra.gms.Gossiper.access$400(Gossiper.java:86)
>  {code}
> LEFT and other nodes that are considered dead should not be picked for gossip 
> with unreachable nodes.






[jira] [Updated] (CASSANDRA-16155) ByteBufferAccessor cast exceptions are thrown when trying to query a virtual table

2020-10-01 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-16155:

 Bug Category: Parent values: Availability(12983)Level 1 values: Response 
Crash(12991)
   Complexity: Normal
  Component/s: Feature/Virtual Tables
Discovered By: User Report
 Severity: Critical
   Status: Open  (was: Triage Needed)

> ByteBufferAccessor cast exceptions are thrown when trying to query a virtual 
> table
> --
>
> Key: CASSANDRA-16155
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16155
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Virtual Tables
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Normal
>
> Start a fresh trunk node, and try to run
> SELECT * FROM system_views.local_read_latency ;
> You’ll get: 
> {code:java}
> ERROR [Native-Transport-Requests-1] 2020-09-30 09:44:45,099 
> ErrorMessage.java:457 - Unexpected exception during request
>  java.lang.ClassCastException: 
> org.apache.cassandra.db.marshal.ByteBufferAccessor cannot be cast to 
> java.lang.String
>          at 
> org.apache.cassandra.serializers.AbstractTextSerializer.serialize(AbstractTextSerializer.java:29)
>          at 
> org.apache.cassandra.db.marshal.AbstractType.decompose(AbstractType.java:131) 
> {code}
>  






[jira] [Created] (CASSANDRA-16156) Decomissioned nodes are picked for gossip when unreachable nodes are considered for gossiping

2020-10-01 Thread Alex Petrov (Jira)
Alex Petrov created CASSANDRA-16156:
---

 Summary: Decomissioned nodes are picked for gossip when 
unreachable nodes are considered for gossiping 
 Key: CASSANDRA-16156
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16156
 Project: Cassandra
  Issue Type: Bug
Reporter: Alex Petrov
Assignee: Alex Petrov


After a node is decommissioned, it is still considered for gossip via 
“unreachable” nodes, which results in the following exceptions:
 
INFO  [node4_Messaging-EventLoop-3-3] node4 2020-09-29 16:37:37,527 
NoSpamLogger.java:91 - 
/127.0.0.4:7012->/127.0.0.1:7012-URGENT_MESSAGES-[no-channel] failed to connect
io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: 
/127.0.0.1:7012
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at 
io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
at 
io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at 
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at 
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
 
Trace of the method that attempts to establish connection:
 
org.apache.cassandra.net.MessagingService.getOutbound(MessagingService.java:492)
at 
org.apache.cassandra.net.MessagingService.doSend(MessagingService.java:335)
at 
org.apache.cassandra.net.OutboundSink$Filtered.accept(OutboundSink.java:55)
at org.apache.cassandra.net.OutboundSink.accept(OutboundSink.java:70)
at 
org.apache.cassandra.net.MessagingService.send(MessagingService.java:327)
at 
org.apache.cassandra.net.MessagingService.send(MessagingService.java:314)
at org.apache.cassandra.gms.Gossiper.sendGossip(Gossiper.java:813)
at 
org.apache.cassandra.gms.Gossiper.maybeGossipToUnreachableMember(Gossiper.java:840)
at org.apache.cassandra.gms.Gossiper.access$400(Gossiper.java:86)
 
LEFT and other nodes that are considered dead should not be picked for gossip 
with unreachable nodes.
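
The requested behaviour can be illustrated with a minimal sketch. It assumes a simplified model in which each unreachable endpoint carries a status; the `Status` enum, the `DEAD` set, and `gossipCandidates` are hypothetical names for illustration and are not Cassandra's actual Gossiper API.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class UnreachableGossipSketch {
    // Hypothetical stand-in for per-endpoint gossip status (not Cassandra's real types).
    enum Status { NORMAL, LEAVING, LEFT, REMOVED }

    // Statuses treated as dead in this sketch; the exact set is an assumption.
    static final Set<Status> DEAD = EnumSet.of(Status.LEFT, Status.REMOVED);

    /** Returns the unreachable endpoints that are still worth gossiping to, sorted by address. */
    static List<String> gossipCandidates(Map<String, Status> unreachable) {
        return unreachable.entrySet().stream()
                .filter(e -> !DEAD.contains(e.getValue())) // skip LEFT and other dead states
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Status> unreachable = new HashMap<>();
        unreachable.put("127.0.0.1", Status.LEFT);    // decommissioned node
        unreachable.put("127.0.0.2", Status.NORMAL);  // temporarily down
        unreachable.put("127.0.0.3", Status.LEAVING); // still a cluster member
        System.out.println(gossipCandidates(unreachable)); // [127.0.0.2, 127.0.0.3]
    }
}
```

Filtering before picking a random unreachable member means a decommissioned node such as 127.0.0.1 in the log above would never be selected, so no connection attempt would be made to it.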






[jira] [Updated] (CASSANDRA-16156) Decomissioned nodes are picked for gossip when unreachable nodes are considered for gossiping

2020-10-01 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-16156:

Description: 
After a node is decommissioned, it is still considered for gossip via 
“unreachable” nodes, which results in the following exceptions:
 
{code}
INFO  [node4_Messaging-EventLoop-3-3] node4 2020-09-29 16:37:37,527 
NoSpamLogger.java:91 - 
/127.0.0.4:7012->/127.0.0.1:7012-URGENT_MESSAGES-[no-channel] failed to connect
io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: 
/127.0.0.1:7012
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at 
io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
at 
io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at 
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at 
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
 {code}

Trace of the method that attempts to establish connection:

{code} 
org.apache.cassandra.net.MessagingService.getOutbound(MessagingService.java:492)
at 
org.apache.cassandra.net.MessagingService.doSend(MessagingService.java:335)
at 
org.apache.cassandra.net.OutboundSink$Filtered.accept(OutboundSink.java:55)
at org.apache.cassandra.net.OutboundSink.accept(OutboundSink.java:70)
at 
org.apache.cassandra.net.MessagingService.send(MessagingService.java:327)
at 
org.apache.cassandra.net.MessagingService.send(MessagingService.java:314)
at org.apache.cassandra.gms.Gossiper.sendGossip(Gossiper.java:813)
at 
org.apache.cassandra.gms.Gossiper.maybeGossipToUnreachableMember(Gossiper.java:840)
at org.apache.cassandra.gms.Gossiper.access$400(Gossiper.java:86)
 {code}

LEFT and other nodes that are considered dead should not be picked for gossip 
with unreachable nodes.

  was:
After a node is decommissioned, it is still considered for gossip via 
“unreachable” nodes, which results in the following exceptions:
 
INFO  [node4_Messaging-EventLoop-3-3] node4 2020-09-29 16:37:37,527 
NoSpamLogger.java:91 - 
/127.0.0.4:7012->/127.0.0.1:7012-URGENT_MESSAGES-[no-channel] failed to connect
io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: 
/127.0.0.1:7012
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at 
io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
at 
io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
at 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at 
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at 
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:748)
 
Trace of the method that attempts to establish connection:
 
org.apache.cassandra.net.MessagingService.getOutbound(MessagingService.java:492)
at 
org.apache.cassandra.net.MessagingService.doSend(MessagingService.java:335)
at 
org.apache.cassandra.net.OutboundSink$Filtered.accept(OutboundSink.java:55)
at org.apache.cassandra.net.OutboundSink.accept(OutboundSink.java:70)
at 
org.apache.cassandra.net.MessagingService.send(MessagingService.java:327)
at 
org.apache.cassandra.net.MessagingService.send(MessagingService.java:314)
at org.apache.cassandra.gms.Gossiper.sendGossip(Gossiper.java:813)
at 
org.apache.cassandra.gms.Gossiper.maybeGossipToUnreachableMember(Gossiper.java:840)
at 

[jira] [Updated] (CASSANDRA-16155) ByteBufferAccessor cast exceptions are thrown when trying to query a virtual table

2020-10-01 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-16155:

Description: 
Start a fresh trunk node, and try to run

SELECT * FROM system_views.local_read_latency ;

You’ll get: 
{code:java}
ERROR [Native-Transport-Requests-1] 2020-09-30 09:44:45,099 
ErrorMessage.java:457 - Unexpected exception during request
 java.lang.ClassCastException: 
org.apache.cassandra.db.marshal.ByteBufferAccessor cannot be cast to 
java.lang.String
         at 
org.apache.cassandra.serializers.AbstractTextSerializer.serialize(AbstractTextSerializer.java:29)
         at 
org.apache.cassandra.db.marshal.AbstractType.decompose(AbstractType.java:131) 
{code}
 

  was:
Start a fresh trunk node, and try to run

SELECT * FROM system_views.local_read_latency ;

You’ll get: 

ERROR [Native-Transport-Requests-1] 2020-09-30 09:44:45,099 
ErrorMessage.java:457 - Unexpected exception during request
java.lang.ClassCastException: 
org.apache.cassandra.db.marshal.ByteBufferAccessor cannot be cast to 
java.lang.String
        at 
org.apache.cassandra.serializers.AbstractTextSerializer.serialize(AbstractTextSerializer.java:29)
        at 
org.apache.cassandra.db.marshal.AbstractType.decompose(AbstractType.java:131)


> ByteBufferAccessor cast exceptions are thrown when trying to query a virtual 
> table
> --
>
> Key: CASSANDRA-16155
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16155
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Normal
>
> Start a fresh trunk node, and try to run
> SELECT * FROM system_views.local_read_latency ;
> You’ll get: 
> {code:java}
> ERROR [Native-Transport-Requests-1] 2020-09-30 09:44:45,099 
> ErrorMessage.java:457 - Unexpected exception during request
>  java.lang.ClassCastException: 
> org.apache.cassandra.db.marshal.ByteBufferAccessor cannot be cast to 
> java.lang.String
>          at 
> org.apache.cassandra.serializers.AbstractTextSerializer.serialize(AbstractTextSerializer.java:29)
>          at 
> org.apache.cassandra.db.marshal.AbstractType.decompose(AbstractType.java:131) 
> {code}
>  






[jira] [Created] (CASSANDRA-16155) ByteBufferAccessor cast exceptions are thrown when trying to query a virtual table

2020-10-01 Thread Alex Petrov (Jira)
Alex Petrov created CASSANDRA-16155:
---

 Summary: ByteBufferAccessor cast exceptions are thrown when trying 
to query a virtual table
 Key: CASSANDRA-16155
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16155
 Project: Cassandra
  Issue Type: Bug
Reporter: Alex Petrov
Assignee: Alex Petrov


Start a fresh trunk node, and try to run

SELECT * FROM system_views.local_read_latency ;

You’ll get: 

ERROR [Native-Transport-Requests-1] 2020-09-30 09:44:45,099 
ErrorMessage.java:457 - Unexpected exception during request
java.lang.ClassCastException: 
org.apache.cassandra.db.marshal.ByteBufferAccessor cannot be cast to 
java.lang.String
        at 
org.apache.cassandra.serializers.AbstractTextSerializer.serialize(AbstractTextSerializer.java:29)
        at 
org.apache.cassandra.db.marshal.AbstractType.decompose(AbstractType.java:131)
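
For readers unfamiliar with this kind of failure, the sketch below shows one generic way a typed text serializer can throw a ClassCastException at its entry point: a caller reaches it through a raw (unchecked) reference and passes a non-String object. This is only an illustration of the mechanism; it is not a diagnosis of this ticket, and `Serializer`, `TextSerializer`, and `Accessor` are hypothetical names, not Cassandra classes.

```java
import java.nio.charset.StandardCharsets;

class RawSerializerSketch {
    // Hypothetical generic serializer interface.
    interface Serializer<T> { byte[] serialize(T value); }

    // Serializer that only accepts Strings.
    static final class TextSerializer implements Serializer<String> {
        @Override
        public byte[] serialize(String value) {
            return value.getBytes(StandardCharsets.UTF_8);
        }
    }

    // Hypothetical placeholder for an object that is not a String.
    static final class Accessor {}

    // Calls the serializer through a raw type, so the compiler cannot reject a wrong argument.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static String tryRawSerialize(Object value) {
        Serializer raw = new TextSerializer();
        try {
            raw.serialize(value); // the compiler-generated bridge method casts value to String
            return "ok";
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryRawSerialize("text"));         // ok
        System.out.println(tryRawSerialize(new Accessor())); // ClassCastException
    }
}
```

The exception surfaces inside the serializer rather than at the call site that supplied the wrong object, which is why the top frame of such a trace points at `serialize` rather than at the caller.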






[jira] [Updated] (CASSANDRA-15584) 4.0 quality testing: Tooling - External Ecosystem

2020-10-01 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-15584:
--
Description: 
Reference [doc from 
NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
 for context.

*Shepherd: Benjamin Lerer*

Many users of Apache Cassandra employ open source tooling to automate Cassandra 
configuration, runtime management, and repair scheduling. Prior to release, we 
need to confirm that popular third-party tools function properly. 

Current list of tools:
|| Name || Status || Contact ||
| [Priam|http://netflix.github.io/Priam/] | *NOT STARTED* | 
[~sumanth.pasupuleti]| 
| [sstabletools|https://github.com/instaclustr/cassandra-sstable-tools] | *NOT 
STARTED* | [~stefan.miklosovic]| 
| [cassandra-exporter|https://github.com/instaclustr/cassandra-exporter]| *NOT 
STARTED* | [~stefan.miklosovic]|
| [Cassandra operator|https://github.com/instaclustr/cassandra-operator]|  
{color:#00875A}*DONE*{color} | [~stefan.miklosovic]|
| [Cassandra SSTable generator | 
https://github.com/instaclustr/cassandra-sstable-generator]|{color:#00875A}*DONE*{color}|
 [~stefan.miklosovic]|
| [Reaper|http://cassandra-reaper.io/]| {color:#00875A}*AUTOMATIC*{color} | 
[~adejanovski]|
| [Medusa|https://github.com/thelastpickle/cassandra-medusa]| *NOT STARTED*| 
[~adejanovski]|
| [Casskop|https://orange-opensource.github.io/casskop/]| *NOT STARTED*| Franck 
Dehay|
| 
[spark-cassandra-connector|https://github.com/datastax/spark-cassandra-connector]|
 {color:#00875A}*DONE*{color}| [~jtgrabowski]|
| [cass operator|https://github.com/datastax/cass-operator]| 
{color:#00875A}*DONE*{color}| [~jimdickinson]|
| [metric 
collector|https://github.com/datastax/metric-collector-for-apache-cassandra]| 
{color:#00875A}*DONE*{color}| [~tjake]|
| [management 
API|https://github.com/datastax/management-api-for-apache-cassandra]| 
{color:#00875A}*DONE*{color}| [~tjake]|  

Column descriptions:
* *Name*: Name and link to the tool's official page
* *Status*: {{NOT STARTED}}, {{IN PROGRESS}}, {{BLOCKED}} if you hit any issue 
and have to wait for it to be solved, {{DONE}}, {{AUTOMATIC}} if testing 4.0 is 
part of your CI process.
* *Contact*: The person acting as the contact point for that tool. 

  was:
Reference [doc from 
NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
 for context.

*Shepherd: Benjamin Lerer*

Many users of Apache Cassandra employ open source tooling to automate Cassandra 
configuration, runtime management, and repair scheduling. Prior to release, we 
need to confirm that popular third-party tools function properly. 

Current list of tools:
|| Name || Status || Contact ||
| [Priam|http://netflix.github.io/Priam/] | *NOT STARTED* | 
[~sumanth.pasupuleti]| 
|[sstabletools|https://github.com/instaclustr/cassandra-sstable-tools] | *NOT 
STARTED* | [~stefan.miklosovic]| 
| [cassandra-exporter|https://github.com/instaclustr/cassandra-exporter]| *NOT 
STARTED* | [~stefan.miklosovic]|
| [Cassandra operator|https://github.com/instaclustr/cassandra-operator]|  
{color:#00875A}*DONE*{color} | [~stefan.miklosovic]|
| [Reaper|http://cassandra-reaper.io/]| {color:#00875A}*AUTOMATIC*{color} | 
[~adejanovski]|
| [Medusa|https://github.com/thelastpickle/cassandra-medusa]| *NOT STARTED*| 
[~adejanovski]|
| [Casskop|https://orange-opensource.github.io/casskop/]| *NOT STARTED*| Franck 
Dehay|
| 
[spark-cassandra-connector|https://github.com/datastax/spark-cassandra-connector]|
 {color:#00875A}*DONE*{color}| [~jtgrabowski]|
| [cass operator|https://github.com/datastax/cass-operator]| 
{color:#00875A}*DONE*{color}| [~jimdickinson]|
| [metric 
collector|https://github.com/datastax/metric-collector-for-apache-cassandra]| 
{color:#00875A}*DONE*{color}| [~tjake]|
| [management 
API|https://github.com/datastax/management-api-for-apache-cassandra]| 
{color:#00875A}*DONE*{color}| [~tjake]|  

Column descriptions:
* *Name*: Name and link to the tool's official page
* *Status*: {{NOT STARTED}}, {{IN PROGRESS}}, {{BLOCKED}} if you hit any issue 
and have to wait for it to be solved, {{DONE}}, {{AUTOMATIC}} if testing 4.0 is 
part of your CI process.
* *Contact*: The person acting as the contact point for that tool. 


> 4.0 quality testing: Tooling - External Ecosystem
> -
>
> Key: CASSANDRA-15584
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15584
> Project: Cassandra
>  Issue Type: Task
>  Components: Tool/external
>Reporter: Josh McKenzie
>Assignee: Benjamin Lerer
>Priority: Normal
> Fix For: 4.0-rc
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> *Shepherd: Benjamin Lerer*
> Many users of 

[jira] [Commented] (CASSANDRA-15538) 4.0 quality testing: Local Read/Write Path: Other Areas

2020-10-01 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205512#comment-17205512
 ] 

Josh McKenzie commented on CASSANDRA-15538:
---

{quote}if you look at (true) unit testing for the classes that constitute the 
read/write path, there isn't much

those path are mostly covered, but by "integration/functional" tests.
{quote}
So there's an opportunity for us to test this claim with code coverage, 
assuming we can get the aggregate of unit + dtest + in-jvm dtest in a single 
report. It looks like our coverage / JaCoCo task in build.xml defaults to just 
the equivalent of 'ant test', which will tell us where our gaps are in unit 
testing but not much else.

[~aleksey] - are you still planning on shepherding this? If so, do you have a 
point of view on Sylvain's thoughts here?

Another interesting point:
{quote}biggest bucket (of bugs)... 'legacy layout conversions/handling'. And 
that was clearly under-tested, but it's also gone in 4.0
{quote}
While I think there's some merit in that observation, it also makes the case 
for us doing more robust testing and coverage analysis of how much our tests 
actually exercise the legacy layout code in the 3.0 and 3.11 line. I'd advocate 
for us opening another ticket for that rather than coupling it with the 4.0 
line, but certainly seems like it'd be valuable to at least know where our gaps 
and risk are.

> 4.0 quality testing: Local Read/Write Path: Other Areas
> ---
>
> Key: CASSANDRA-15538
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15538
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/java, Test/dtest/python
>Reporter: Josh McKenzie
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> *Shepherd: Aleksey Yeschenko*
> Testing in this area refers to the local read/write path (StorageProxy, 
> ColumnFamilyStore, Memtable, SSTable reading/writing, etc). We are still 
> finding numerous bugs and issues with the 3.0 storage engine rewrite 
> (CASSANDRA-8099). For 4.0 we want to ensure that we thoroughly cover the 
> local read/write path with techniques such as property-based testing, fuzzing 
> ([example|http://cassandra.apache.org/blog/2018/10/17/finding_bugs_with_property_based_testing.html]),
>  and a source audit.






[jira] [Commented] (CASSANDRA-15580) 4.0 quality testing: Repair

2020-10-01 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205502#comment-17205502
 ] 

Josh McKenzie commented on CASSANDRA-15580:
---

{quote}(full range, sub range, incremental)

4.0 only + mixed-version (3.11.x + 4.0)
{quote}
My bid is that we test (and automate and integrate in CI) the combination of 
the 6 states above where not yet covered. I'm pretty sure Reaper validation is 
covered by CASSANDRA-15584. I'd bid for building these tests in 
[fallout|https://github.com/datastax/fallout] specifically to get nemeses and 
adverse cluster states in play (packet loss, node down, etc.) during the repair 
process, to ensure the repair process works as expected. Fallout is currently 
ASLv2-licensed, though it has not been contributed to the project and ASF 
governance (much like ccm); that's more because of a lack of conversation / 
appetite to take it on than anything else. Happy to donate it to the project if 
other devs are interested.
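
The "6 states" above are the cross product of the three repair types and the two cluster version modes quoted from the ticket. A minimal sketch of enumerating that matrix is below; the enum and method names are hypothetical and are not tied to fallout or any existing test harness.

```java
import java.util.ArrayList;
import java.util.List;

class RepairMatrixSketch {
    // The three repair types named in the ticket description.
    enum RepairType { FULL_RANGE, SUB_RANGE, INCREMENTAL }

    // The two cluster version modes quoted above.
    enum ClusterMode { V4_0_ONLY, MIXED_3_11_AND_4_0 }

    /** Builds one label per (cluster mode, repair type) combination. */
    static List<String> combinations() {
        List<String> out = new ArrayList<>();
        for (ClusterMode mode : ClusterMode.values()) {
            for (RepairType type : RepairType.values()) {
                out.add(type + " on " + mode);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints six lines, one per combination to cover.
        combinations().forEach(System.out::println);
    }
}
```

Each generated combination would then be a candidate test scenario, with the adverse cluster states (packet loss, node down) layered on top.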

If we don't go the fallout route, we'd need to chew on another longer running 
test automation framework running real clusters w/gen + validation. Either way, 
I think either time-based (1 hour gen + validation) or size-based (xGB 
workload) would be appropriate to make sure we have confidence in the extent to 
which we exercise the work.

What do you think [~bdeggleston]?

> 4.0 quality testing: Repair
> ---
>
> Key: CASSANDRA-15580
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15580
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/python
>Reporter: Josh McKenzie
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> *Shepherd: Blake Eggleston*
> We aim for 4.0 to have the first fully functioning incremental repair 
> solution (CASSANDRA-9143)! Furthermore we aim to verify that all types of 
> repair: (full range, sub range, incremental) function as expected as well as 
> ensuring community tools such as Reaper work. CASSANDRA-3200 adds an 
> experimental option to reduce the amount of data streamed during repair, we 
> should write more tests and see how it works with big nodes.






[jira] [Commented] (CASSANDRA-15585) 4.0 quality testing: Test Frameworks, Tooling, Infra / Automation

2020-10-01 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205493#comment-17205493
 ] 

Josh McKenzie commented on CASSANDRA-15585:
---

Pulling directly from the NGCC discussion doc, I think these are the items that 
apply to this ticket:
{quote} # Targeting Fuzzing (Integration)
 # Have all the targeted fuzz test tooling executed as a pre-release task prior 
to all new builds being published.
 # Integration with Circle CI.
 # Ability to launch for execution on any platform (launch script) to enable 
at-scale execution on elastic compute infra.

 # Netflix Fuzz testing
 # Writes data with checksum, later reads out to validate.{quote}
Given the description: _and automation enabling those tools to be applied at 
scale (e.g., replay testing via Spark-based replay of captured FQL logs)._
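
The "writes data with checksum, later reads out to validate" item in the quoted list can be sketched as follows. This is an assumption-laden illustration, not the Netflix tooling: CRC32 is chosen only because it ships with the JDK, and `Row`, `write`, and `validate` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

class ChecksumValidationSketch {
    // Hypothetical stored record: the payload plus the checksum computed at write time.
    static final class Row {
        final byte[] payload;
        final long checksum;

        Row(byte[] payload, long checksum) {
            this.payload = payload;
            this.checksum = checksum;
        }
    }

    static long checksum(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload);
        return crc.getValue();
    }

    /** Write path: store the payload together with its checksum. */
    static Row write(String value) {
        byte[] payload = value.getBytes(StandardCharsets.UTF_8);
        return new Row(payload, checksum(payload));
    }

    /** Read path: recompute the checksum and compare it with the stored one. */
    static boolean validate(Row row) {
        return checksum(row.payload) == row.checksum;
    }

    public static void main(String[] args) {
        Row row = write("hello");
        System.out.println(validate(row)); // true: data read back matches what was written
        row.payload[0] ^= 1;               // simulate corruption of a single bit
        System.out.println(validate(row)); // false: corruption is detected on read
    }
}
```

Because the checksum travels with the data, the validating reader needs no separate record of what was written, which is what makes this approach easy to run at scale.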

As far as I know, there are no plans to open source anything wiring FQLTool and 
cassandra-diff together, though [~gianluca] has been working on a project we're 
going to be open sourcing within a couple of weeks that ties generative workload 
fuzzing with gemini and nosqlbench based on anonymized schemas, using 
[cassandra-diff|https://github.com/apache/cassandra-diff] to verify cluster 
state afterwards.

 

So [~jwest], any thoughts on how this ticket / effort may have evolved to today? 
In-jvm dtests are working well and, while FQLTool *exists*, I don't know that 
it's integrated with our primary CI/CD pipeline.

 

We don't have a corpus of real schemas and/or workloads publicly available to 
the project as yet to integrate as a pre-release checkpoint. Ultimately, I 
think the project I alluded to above checks the box on 1 of the above 4 items 
but I don't know how realistic it is for us to have fuzzing integrated as a 
pre-release action unless there are folks with workloads or schemas that are 
currently not open source they'll be able to bring forward. 
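
To illustrate the "writes data with checksum, later reads out to validate" item quoted above, here is a minimal Python sketch. It is not the Netflix tooling: the in-memory dict standing in for a cluster and the names make_record and validate_record are assumptions made purely for illustration.

```python
import hashlib


def make_record(key: str, payload: bytes) -> dict:
    """Build a row whose checksum column is derived from the key and payload."""
    digest = hashlib.sha256(key.encode() + payload).hexdigest()
    return {"key": key, "payload": payload.hex(), "checksum": digest}


def validate_record(record: dict) -> bool:
    """Recompute the checksum on read and compare it with the stored one."""
    payload = bytes.fromhex(record["payload"])
    expected = hashlib.sha256(record["key"].encode() + payload).hexdigest()
    return expected == record["checksum"]


# Hypothetical stand-in for the cluster under test: a plain dict.
store = {}
for i in range(100):
    rec = make_record(f"k{i}", f"value-{i}".encode())
    store[rec["key"]] = rec

# Simulate silent corruption of one row, keeping its old checksum.
store["k7"] = dict(store["k7"], payload=b"tampered".hex())

# The later read pass reports every key whose checksum no longer matches.
bad_keys = sorted(k for k, r in store.items() if not validate_record(r))
print(bad_keys)
```

Because each row carries its own checksum, the validation pass needs no separate oracle of expected values, which is what makes this style of test practical to run at scale.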

> 4.0 quality testing: Test Frameworks, Tooling, Infra / Automation
> -
>
> Key: CASSANDRA-15585
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15585
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/python
>Reporter: Josh McKenzie
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> *Shepherd: Jordan West*
> This area refers to contributions to test frameworks/tooling (e.g., dtests, 
> QuickTheories, CASSANDRA-14821), and automation enabling those tools to be 
> applied at scale (e.g., replay testing via Spark-based replay of captured FQL 
> logs).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[cassandra-builds] branch master updated: In Jenkins, don't do a full docker system prune if cassandra-artifact.sh or jenkinscommand.sh is running

2020-10-01 Thread mck
This is an automated email from the ASF dual-hosted git repository.

mck pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/cassandra-builds.git


The following commit(s) were added to refs/heads/master by this push:
 new 6b3c726  In Jenkins, don't do a full docker system prune if 
cassandra-artifact.sh or jenkinscommand.sh is running
6b3c726 is described below

commit 6b3c7266f340b7a80eda701a8cf43a0efe08ecab
Author: Mick Semb Wever 
AuthorDate: Thu Oct 1 11:07:42 2020 +0200

In Jenkins, don't do a full docker system prune if cassandra-artifact.sh or 
jenkinscommand.sh is running

The cassandra-artifact.sh recently added docker usage when adding 
cassandra-*-packaging.sh (deb|rpm).
This has infrequently crashed with `unknown parent image ID sha256:…`

 patch by Mick Semb Wever; reviewed by Berenguer Blasi
---
 jenkins-dsl/cassandra_job_dsl_seed.groovy | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/jenkins-dsl/cassandra_job_dsl_seed.groovy b/jenkins-dsl/cassandra_job_dsl_seed.groovy
index 7684f32..1927427 100644
--- a/jenkins-dsl/cassandra_job_dsl_seed.groovy
+++ b/jenkins-dsl/cassandra_job_dsl_seed.groovy
@@ -318,9 +318,10 @@ matrixJob('Cassandra-template-dtest-matrix') {
 }
 }
 postBuildTask {
+// the pgrep needs to catch any other build/process that is using docker
 task('.', """
 echo "Cleaning project…"; git clean -xdff ;
-echo "Pruning docker…" ; if pgrep -af jenkinscommand.sh; then docker system prune -f --filter 'until=${maxJobHours}h'; else docker system prune -f --volumes ; fi;
+echo "Pruning docker…" ; if pgrep -af "cassandra-artifacts.sh|jenkinscommand.sh"; then docker system prune -f --filter 'until=${maxJobHours}h'; else docker system prune -f --volumes ; fi;
 echo "Reporting disk usage…"; df -h ; du -hs ../* ; du -hs ../../* ;
 echo "Cleaning tmp…";
 find . -type d -name tmp -delete 2>/dev/null ;
@@ -824,9 +825,10 @@ dtestTargets.each {
 }
 archiveJunit('nosetests.xml')
 postBuildTask {
+// the pgrep needs to catch any other build/process that is using docker
 task('.', """
 echo "Cleaning project…" ; git clean -xdff ;
-echo "Pruning docker…" ; if pgrep -af jenkinscommand.sh; then docker system prune -f --filter "until=${maxJobHours}h"; else docker system prune -f --volumes ; fi;
+echo "Pruning docker…" ; if pgrep -af "cassandra-artifacts.sh|jenkinscommand.sh"; then docker system prune -f --filter "until=${maxJobHours}h"; else docker system prune -f --volumes ; fi;
 echo "Reporting disk usage…"; df -h ; du -hs ../* ; du -hs ../../* ;
 echo "Cleaning tmp…";
 find . -type d -name tmp -delete 2>/dev/null ;
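
The decision made by the changed shell line can be sketched in Python as follows. This is only an illustration of the logic, not part of the commit: pgrep -f matches an extended regular expression against each full command line, and Python's re module is used here as a stand-in, which behaves the same for this simple alternation. The function names, the sample command lines, and the value 12 for maxJobHours are assumptions.

```python
import re

# Same alternation as in the diff. Note the unescaped "." matches any
# character, which is harmless here but slightly broader than a literal dot.
PATTERN = re.compile(r"cassandra-artifacts.sh|jenkinscommand.sh")


def docker_in_use(command_lines):
    """Return True if any running command line matches the pattern."""
    return any(PATTERN.search(line) for line in command_lines)


def prune_command(command_lines, max_job_hours):
    """Pick the conservative prune while another docker-using build runs."""
    if docker_in_use(command_lines):
        return f"docker system prune -f --filter 'until={max_job_hours}h'"
    return "docker system prune -f --volumes"


# Hypothetical process listings for illustration.
busy = ["bash ./cassandra-builds/build-scripts/cassandra-artifacts.sh", "sshd"]
idle = ["sshd", "java -jar agent.jar"]

print(prune_command(busy, 12))
print(prune_command(idle, 12))
```

The pattern matches the script name as it appears in the diff, cassandra-artifacts.sh, so a process whose command line lacked the trailing "s" would not be caught.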





[jira] [Comment Edited] (CASSANDRA-15583) 4.0 quality testing: Tooling, Bundled and First Party

2020-10-01 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205409#comment-17205409
 ] 

Berenguer Blasi edited comment on CASSANDRA-15583 at 10/1/20, 10:35 AM:


[~samt] CASSANDRA-15991 is done thanks to David's review. As you're the shepherd 
of this ticket, I'd like to consult you on something. We have 2 options:
A. Call this ticket done. We have put in place testing utils to test tooling. 
We have also seeded all tools with initial user interface testing. Now it's a 
matter of completing the per-tool tickets. If somebody disagrees, it can be 
reopened down the line.
B. This ticket is not done. It will be done when all the specific tool tickets 
have been completed.

I don't know in what 'spirit' this ticket was raised, but I think 'A' is a 
reasonable approach as 'B' can't be done in a timely manner. What do you think?


was (Author: bereng):
[~samt] CASSANDRA-15991 thx to David's review. As you're the shepherd of this 
ticket I'd like to consult sthg with you. We have 2 options:
A. Call this ticket done. We have put in place testing utils to test tooling. 
Also we have seeded all tools with initial user interface testing. Now it's a 
matter of completing the per tool specific tickets. If sbdy disagrees it can be 
reopened down the line.
B. This ticket is not done. It will be done when all the specific tool tickets 
have been completed.

I don't know in what 'spirit' this ticket was raised. But I think 'A' is a 
reasonable approach as 'B' can't be done in a timely manner. wdyt?

> 4.0 quality testing: Tooling, Bundled and First Party
> -
>
> Key: CASSANDRA-15583
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15583
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/python, Test/unit
>Reporter: Josh McKenzie
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> *Shepherd: Sam Tunnicliffe*
> Test plans should cover bundled first-party tooling and CLIs such as 
> nodetool, cqlsh, and new tools supporting full query and audit logging 
> (CASSANDRA-13983, CASSANDRA-12151).
> *Progress as of Aug 2020*
> {{ToolRunner}} has been added, enabling us to test tools in Java unit tests. 
> This includes capturing their stdout/stderr and stdin. Most tools have a 
> starting unit test that covers the happy path of their command-line 
> arguments. Tickets have been created to improve coverage of those and flagged 
> LHF. Also, for tools too big to be addressed in a single ticket, such as 
> nodetool, a placeholder ticket for future improvements has been created as 
> well. Tickets and status are:
> ||Tool||UX test||UT coverage||dtest coverage||Comments||
> |Nodetool|(x)|(x) CASSANDRA-16026|(!)|Not all the sub commands are tested. Dtests also test nodetool as a side effect|
> |Cqlsh|(x)|(x) CASSANDRA-16025|(!)| |
> |Cassandra-stress|(x)|(x) CASSANDRA-16024|(x)| |
> |debug-cql|(x)|(x) CASSANDRA-16023|(x)| |
> |fqltool|(x)|(/) CASSANDRA-16022|(!)| |
> |auditlogviewer|(/) CASSANDRA-15991|(!) CASSANDRA-16021|(!)| |
> |*Sstable utilities*| | | | |
> |sstabledump|(/) CASSANDRA-15991|(/) CASSANDRA-16020|(!)| |
> |sstableexpiredblockers|(/) CASSANDRA-15991|(x) CASSANDRA-16019|(!)| |
> |sstablelevelreset|(/) CASSANDRA-15991|(x) CASSANDRA-16018|(!)| |
> |sstableloader|(x)|(x) CASSANDRA-16017|(!)| |
> |sstablemetadata|(/) CASSANDRA-15991|(x) CASSANDRA-16016|(x)|Run in dtests, no dedicated test|
> |sstableofflinerelevel|(/) CASSANDRA-15991|(x) CASSANDRA-16015|(!)| |
> |sstablerepairedset|(/) CASSANDRA-15991|(x) CASSANDRA-16014|(x)|Run in dtests, no dedicated test|
> |sstablescrub|(/) CASSANDRA-15991|(x) CASSANDRA-16013|(!)| |
> |sstablesplit|(/) CASSANDRA-15991|(x) CASSANDRA-16012|(!)| |
> |sstableupgrade|(/) CASSANDRA-15991|(x) CASSANDRA-16011|(!)| |
> |sstableutil|(/) CASSANDRA-15991|(x) CASSANDRA-16010|(!)| |
> |sstableverify|(/) CASSANDRA-15991|(x) CASSANDRA-16009|(!)| |
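
The general idea described in the ticket, running a command-line tool from a test while feeding stdin and capturing stdout/stderr, can be sketched in Python as below. This is not the ToolRunner API, which is Java and is not shown in this message. The helper name run_tool is hypothetical, and the "tool" is a one-line Python child process chosen so the sketch is self-contained.

```python
import subprocess
import sys


def run_tool(args, stdin_text=""):
    """Run a command, feed it stdin, and return (exit code, stdout, stderr)."""
    completed = subprocess.run(
        args,
        input=stdin_text,
        capture_output=True,  # collect both stdout and stderr
        text=True,            # work with str rather than bytes
        timeout=30,           # fail the test instead of hanging forever
    )
    return completed.returncode, completed.stdout, completed.stderr


# Stand-in "tool": upper-cases whatever arrives on stdin.
child = "import sys; print(sys.stdin.read().upper())"
code, out, err = run_tool([sys.executable, "-c", child], "ok")

print(code, out.strip())
```

A happy-path test then reduces to assertions on the exit code and the captured streams, which is the shape of the starting unit tests mentioned in the description.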






[jira] [Commented] (CASSANDRA-15583) 4.0 quality testing: Tooling, Bundled and First Party

2020-10-01 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205409#comment-17205409
 ] 

Berenguer Blasi commented on CASSANDRA-15583:
-

[~samt] CASSANDRA-15991 thx to David's review. As you're the shepherd of this 
ticket I'd like to consult sthg with you. We have 2 options:
A. Call this ticket done. We have put in place testing utils to test tooling. 
Also we have seeded all tools with initial user interface testing. Now it's a 
matter of completing the per tool specific tickets. If sbdy disagrees it can be 
reopened down the line.
B. This ticket is not done. It will be done when all the specific tool tickets 
have been completed.

I don't know in what 'spirit' this ticket was raised. But I think 'A' is a 
reasonable approach as 'B' can't be done in a timely manner. wdyt?




[jira] [Updated] (CASSANDRA-16012) sstablesplit unit test hardening

2020-10-01 Thread Berenguer Blasi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Berenguer Blasi updated CASSANDRA-16012:

Test and Documentation Plan: See PR
 Status: Patch Available  (was: In Progress)

> sstablesplit unit test hardening
> 
>
> Key: CASSANDRA-16012
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16012
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/sstable
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
>  Labels: low-hanging-fruit
> Fix For: 4.0-beta
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
>  
> During CASSANDRA-15883 / CASSANDRA-15991 it was found that unit test coverage 
> for this tool is minimal. There is an existing unit test to build on under 
> {{test/unit/org/apache/cassandra/tools}}.






[jira] [Commented] (CASSANDRA-16154) OOM Error (Direct buffer memory) during intensive reading from large SSTables

2020-10-01 Thread Vygantas Gedgaudas (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205331#comment-17205331
 ] 

Vygantas Gedgaudas commented on CASSANDRA-16154:


Cassandra version used: 3.11.0.

> OOM Error (Direct buffer memory) during intensive reading from large SSTables
> -
>
> Key: CASSANDRA-16154
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16154
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Vygantas Gedgaudas
>Priority: Normal
>
> Hello,
> We have a database that, when read from intensively, produces the 
> following OOM error:
> {noformat}
>  java.lang.OutOfMemoryError: Direct buffer memory
>  at java.nio.Bits.reserveMemory(Bits.java:694) ~[na:1.8.0_212]
>  at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) 
> ~[na:1.8.0_212]
>  at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311) ~[na:1.8.0_212]
>  at 
> org.apache.cassandra.utils.memory.BufferPool.allocate(BufferPool.java:110) 
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.utils.memory.BufferPool.access$1000(BufferPool.java:46) 
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.utils.memory.BufferPool$LocalPool.allocate(BufferPool.java:407)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.utils.memory.BufferPool$LocalPool.access$000(BufferPool.java:334)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.utils.memory.BufferPool.takeFromPool(BufferPool.java:122)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at org.apache.cassandra.utils.memory.BufferPool.get(BufferPool.java:94) 
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at org.apache.cassandra.cache.ChunkCache.load(ChunkCache.java:155) 
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at org.apache.cassandra.cache.ChunkCache.load(ChunkCache.java:39) 
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> com.github.benmanes.caffeine.cache.BoundedLocalCache$BoundedLocalLoadingCache.lambda$new$0(BoundedLocalCache.java:2949)
>  ~[caffeine-2.2.6.jar:na]
>  at 
> com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$15(BoundedLocalCache.java:1807)
>  ~[caffeine-2.2.6.jar:na]
>  at 
> java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853) 
> ~[na:1.8.0_212]
>  at 
> com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:1805)
>  ~[caffeine-2.2.6.jar:na]
>  at 
> com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1788)
>  ~[caffeine-2.2.6.jar:na]
>  at 
> com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
>  ~[caffeine-2.2.6.jar:na]
>  at 
> com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
>  ~[caffeine-2.2.6.jar:na]
>  at 
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:213)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:207)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at org.apache.cassandra.io.util.FileHandle.createReader(FileHandle.java:150) 
> ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.io.sstable.format.SSTableReader.getFileDataInput(SSTableReader.java:1767)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.db.columniterator.AbstractSSTableIterator.<init>(AbstractSSTableIterator.java:103)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.db.columniterator.SSTableIterator.<init>(SSTableIterator.java:49)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:72)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:65)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.initializeIterator(UnfilteredRowIteratorWithLowerBound.java:107)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.maybeInit(LazilyInitializedUnfilteredRowIterator.java:48)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.getPartitionIndexLowerBound(UnfilteredRowIteratorWithLowerBound.java:191)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>  at 
> 

[jira] [Created] (CASSANDRA-16154) OOM Error (Direct buffer memory) during intensive reading from large SSTables

2020-10-01 Thread Vygantas Gedgaudas (Jira)
Vygantas Gedgaudas created CASSANDRA-16154:
--

 Summary: OOM Error (Direct buffer memory) during intensive reading 
from large SSTables
 Key: CASSANDRA-16154
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16154
 Project: Cassandra
  Issue Type: Bug
Reporter: Vygantas Gedgaudas


Hello,

We have a database that, when read from intensively, produces the 
following OOM error:
{noformat}
 java.lang.OutOfMemoryError: Direct buffer memory
 at java.nio.Bits.reserveMemory(Bits.java:694) ~[na:1.8.0_212]
 at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) ~[na:1.8.0_212]
 at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311) ~[na:1.8.0_212]
 at org.apache.cassandra.utils.memory.BufferPool.allocate(BufferPool.java:110) 
~[apache-cassandra-3.11.0.jar:3.11.0]
 at 
org.apache.cassandra.utils.memory.BufferPool.access$1000(BufferPool.java:46) 
~[apache-cassandra-3.11.0.jar:3.11.0]
 at 
org.apache.cassandra.utils.memory.BufferPool$LocalPool.allocate(BufferPool.java:407)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
 at 
org.apache.cassandra.utils.memory.BufferPool$LocalPool.access$000(BufferPool.java:334)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
 at 
org.apache.cassandra.utils.memory.BufferPool.takeFromPool(BufferPool.java:122) 
~[apache-cassandra-3.11.0.jar:3.11.0]
 at org.apache.cassandra.utils.memory.BufferPool.get(BufferPool.java:94) 
~[apache-cassandra-3.11.0.jar:3.11.0]
 at org.apache.cassandra.cache.ChunkCache.load(ChunkCache.java:155) 
~[apache-cassandra-3.11.0.jar:3.11.0]
 at org.apache.cassandra.cache.ChunkCache.load(ChunkCache.java:39) 
~[apache-cassandra-3.11.0.jar:3.11.0]
 at 
com.github.benmanes.caffeine.cache.BoundedLocalCache$BoundedLocalLoadingCache.lambda$new$0(BoundedLocalCache.java:2949)
 ~[caffeine-2.2.6.jar:na]
 at 
com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$15(BoundedLocalCache.java:1807)
 ~[caffeine-2.2.6.jar:na]
 at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853) 
~[na:1.8.0_212]
 at 
com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:1805)
 ~[caffeine-2.2.6.jar:na]
 at 
com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1788)
 ~[caffeine-2.2.6.jar:na]
 at 
com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
 ~[caffeine-2.2.6.jar:na]
 at 
com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
 ~[caffeine-2.2.6.jar:na]
 at 
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
 at 
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:213)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
 at 
org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
 at 
org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:207)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
 at org.apache.cassandra.io.util.FileHandle.createReader(FileHandle.java:150) 
~[apache-cassandra-3.11.0.jar:3.11.0]
 at 
org.apache.cassandra.io.sstable.format.SSTableReader.getFileDataInput(SSTableReader.java:1767)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
 at 
org.apache.cassandra.db.columniterator.AbstractSSTableIterator.<init>(AbstractSSTableIterator.java:103)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
 at 
org.apache.cassandra.db.columniterator.SSTableIterator.<init>(SSTableIterator.java:49)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
 at 
org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:72)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
 at 
org.apache.cassandra.io.sstable.format.big.BigTableReader.iterator(BigTableReader.java:65)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
 at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.initializeIterator(UnfilteredRowIteratorWithLowerBound.java:107)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
 at 
org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.maybeInit(LazilyInitializedUnfilteredRowIterator.java:48)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
 at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.getPartitionIndexLowerBound(UnfilteredRowIteratorWithLowerBound.java:191)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
 at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.lowerBound(UnfilteredRowIteratorWithLowerBound.java:88)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
 at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorWithLowerBound.lowerBound(UnfilteredRowIteratorWithLowerBound.java:47)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
 at 
org.apache.cassandra.utils.MergeIterator$Candidate.<init>(MergeIterator.java:362)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
 at 
org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:147)
 

[jira] [Updated] (CASSANDRA-16105) InvalidQuery when datetime string format is not zero padded

2020-10-01 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-16105:

Description: 
With CASSANDRA-15976, Cassandra no longer accepts certain datetime string 
formats that it used to accept before:

{code:java}
Unable to parse a date/time from '2020-09-03 9:00:00+'
{code}

In this example, {{2020-09-03 9:00:00+}} is not accepted in 4.0-beta2 but 
it is accepted in previous versions (I tested this with 4.0-beta1 and 3.11.4). 
If I add a zero so that it becomes {{2020-09-03 09:00:00+}} then it is 
accepted in all of the 3 mentioned versions (note the zero padded time part - 
{{9:00:00}} vs {{09:00:00}})

  was:
+underlined text+With CASSANDRA-15976, Cassandra no longer accepts certain 
datetime string formats that it used to accept before:

{code:java}
Unable to parse a date/time from '2020-09-03 9:00:00+'
{code}

In this example, {{2020-09-03 9:00:00+}} is not accepted in 4.0-beta2 but 
it is accepted in previous versions (I tested this with 4.0-beta1 and 3.11.4). 
If I add a zero so that it becomes {{2020-09-03 09:00:00+}} then it is 
accepted in all of the 3 mentioned versions (note the zero padded time part - 
{{9:00:00}} vs {{09:00:00}})


> InvalidQuery when datetime string format is not zero padded
> ---
>
> Key: CASSANDRA-16105
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16105
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL/Interpreter
>Reporter: João Reis
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta3
>
>
> With CASSANDRA-15976, Cassandra no longer accepts certain datetime string 
> formats that it used to accept before:
> {code:java}
> Unable to parse a date/time from '2020-09-03 9:00:00+'
> {code}
> In this example, {{2020-09-03 9:00:00+}} is not accepted in 4.0-beta2 but 
> it is accepted in previous versions (I tested this with 4.0-beta1 and 
> 3.11.4). If I add a zero so that it becomes {{2020-09-03 09:00:00+}} then 
> it is accepted in all of the 3 mentioned versions (note the zero padded time 
> part - {{9:00:00}} vs {{09:00:00}})
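
The difference between the two behaviours can be illustrated with a small Python sketch. It does not reproduce Cassandra's parser: the two regular expressions are assumptions that merely model a strict two-digit hour field versus a lenient one-or-two-digit field, and the +0000 offset in the sample strings is an assumed value chosen for the example. The helper pad_hour shows one possible client-side normalisation.

```python
import re

# Strict: the hour must be exactly two digits (zero padded).
STRICT = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[+-]\d{4}$")
# Lenient: the hour may be one or two digits.
LENIENT = re.compile(r"^\d{4}-\d{2}-\d{2} \d{1,2}:\d{2}:\d{2}[+-]\d{4}$")


def accepted_by(pattern, text):
    """Return True if the whole string matches the given pattern."""
    return pattern.match(text) is not None


def pad_hour(text):
    """Zero-pad a single-digit hour that directly follows the date."""
    return re.sub(r"(?<= )(\d):", r"0\1:", text, count=1)


unpadded = "2020-09-03 9:00:00+0000"  # assumed offset, for illustration
padded = "2020-09-03 09:00:00+0000"

for name, pattern in (("strict", STRICT), ("lenient", LENIENT)):
    print(name, accepted_by(pattern, unpadded), accepted_by(pattern, padded))
print(pad_hour(unpadded))
```

Normalising on the client keeps the same literal acceptable under both the strict and the lenient model.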






[jira] [Updated] (CASSANDRA-16105) InvalidQuery when datetime string format is not zero padded

2020-10-01 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-16105:

Description: 
+underlined text+With CASSANDRA-15976, Cassandra no longer accepts certain 
datetime string formats that it used to accept before:

{code:java}
Unable to parse a date/time from '2020-09-03 9:00:00+'
{code}

In this example, {{2020-09-03 9:00:00+}} is not accepted in 4.0-beta2 but 
it is accepted in previous versions (I tested this with 4.0-beta1 and 3.11.4). 
If I add a zero so that it becomes {{2020-09-03 09:00:00+}} then it is 
accepted in all of the 3 mentioned versions (note the zero padded time part - 
{{9:00:00}} vs {{09:00:00}})

  was:
With CASSANDRA-15976, Cassandra no longer accepts certain datetime string 
formats that it used to accept before:

{code:java}
Unable to parse a date/time from '2020-09-03 9:00:00+'
{code}

In this example, {{2020-09-03 9:00:00+}} is not accepted in 4.0-beta2 but 
it is accepted in previous versions (I tested this with 4.0-beta1 and 3.11.4). 
If I add a zero so that it becomes {{2020-09-03 09:00:00+}} then it is 
accepted in all of the 3 mentioned versions (note the zero padded time part - 
{{9:00:00}} vs {{09:00:00}})


> InvalidQuery when datetime string format is not zero padded
> ---
>
> Key: CASSANDRA-16105
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16105
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL/Interpreter
>Reporter: João Reis
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta3
>
>
> +underlined text+With CASSANDRA-15976, Cassandra no longer accepts certain 
> datetime string formats that it used to accept before:
> {code:java}
> Unable to parse a date/time from '2020-09-03 9:00:00+'
> {code}
> In this example, {{2020-09-03 9:00:00+}} is not accepted in 4.0-beta2 but 
> it is accepted in previous versions (I tested this with 4.0-beta1 and 
> 3.11.4). If I add a zero so that it becomes {{2020-09-03 09:00:00+}} then 
> it is accepted in all of the 3 mentioned versions (note the zero padded time 
> part - {{9:00:00}} vs {{09:00:00}})






[jira] [Updated] (CASSANDRA-16153) Cassandra 4b2 - JVM options from *.options not read/set

2020-10-01 Thread Thomas Steinmaurer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Steinmaurer updated CASSANDRA-16153:
---
Description: 
Trying out Cassandra 4 beta 2 with Java 8 (AdoptOpenJDK) in AWS.
{noformat}
NAME="Amazon Linux AMI"
VERSION="2018.03"
ID="amzn"
ID_LIKE="rhel fedora"
VERSION_ID="2018.03"
PRETTY_NAME="Amazon Linux AMI 2018.03"
ANSI_COLOR="0;33"
CPE_NAME="cpe:/o:amazon:linux:2018.03:ga"
HOME_URL="http://aws.amazon.com/amazon-linux-ami/"
{noformat}

It seems the Cassandra JVM ends up using Parallel GC.
{noformat}
INFO  [Service Thread] 2020-10-01 00:00:56,233 GCInspector.java:299 - PS 
Scavenge GC in 541ms.  PS Old Gen: 5152844776 -> 5726724752;
WARN  [Service Thread] 2020-10-01 00:00:56,234 GCInspector.java:297 - PS 
MarkSweep GC in 1969ms.  PS Eden Space: 2111307776 -> 0; PS Old Gen: 5726724752 
-> 2581334376; PS Survivor Space: 363850224 -> 0
{noformat}

However, {{jvm8-server.options}} configures CMS.
{noformat}
#
#  GC SETTINGS  #
#

### CMS Settings
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=1
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSWaitDuration=1
-XX:+CMSParallelInitialMarkEnabled
-XX:+CMSEdenChunksRecordAlways
## some JVMs will fill up their heap when accessed via JMX, see CASSANDRA-6541
-XX:+CMSClassUnloadingEnabled
...
{noformat}

In Cassandra 3, the default has been CMS.

So, possibly there is something wrong in reading/processing 
{{jvm8-server.options}}?

  was:
Trying out Cassandra 4 beta 2 with Java 8 (AdoptOpenJDK) on Ubuntu 18.04 LTS. 
It seems the Cassandra JVM results in using Parallel GC. In Cassandra 3, 
default has been CMS.

Digging a bit further, it seems like the {{jvm8-server.options}} resp. 
{{jvm11-server.options}} files aren't used/processed in e.g. 
{{cassandra-env.sh}}.

E.g. in Cassandra 3.11, here we something like that in {{cassandra-env.sh}}.
{noformat}
# Read user-defined JVM options from jvm.options file
JVM_OPTS_FILE=$CASSANDRA_CONF/jvm.options
for opt in `grep "^-" $JVM_OPTS_FILE`
do
  JVM_OPTS="$JVM_OPTS $opt"
done
{noformat}

Can't find something similar in {{cassandra-env.sh}} for Cassandra 4 beta2.


> Cassandra 4b2 - JVM options from *.options not read/set
> ---
>
> Key: CASSANDRA-16153
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16153
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Scripts
>Reporter: Thomas Steinmaurer
>Priority: Normal
>
> Trying out Cassandra 4 beta 2 with Java 8 (AdoptOpenJDK) in AWS.
> {noformat}
> NAME="Amazon Linux AMI"
> VERSION="2018.03"
> ID="amzn"
> ID_LIKE="rhel fedora"
> VERSION_ID="2018.03"
> PRETTY_NAME="Amazon Linux AMI 2018.03"
> ANSI_COLOR="0;33"
> CPE_NAME="cpe:/o:amazon:linux:2018.03:ga"
> HOME_URL="http://aws.amazon.com/amazon-linux-ami/"
> {noformat}
> It seems the Cassandra JVM ends up using Parallel GC.
> {noformat}
> INFO  [Service Thread] 2020-10-01 00:00:56,233 GCInspector.java:299 - PS 
> Scavenge GC in 541ms.  PS Old Gen: 5152844776 -> 5726724752;
> WARN  [Service Thread] 2020-10-01 00:00:56,234 GCInspector.java:297 - PS 
> MarkSweep GC in 1969ms.  PS Eden Space: 2111307776 -> 0; PS Old Gen: 
> 5726724752 -> 2581334376; PS Survivor Space: 363850224 -> 0
> {noformat}
> However, {{jvm8-server.options}} configures CMS.
> {noformat}
> #
> #  GC SETTINGS  #
> #
> ### CMS Settings
> -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC
> -XX:+CMSParallelRemarkEnabled
> -XX:SurvivorRatio=8
> -XX:MaxTenuringThreshold=1
> -XX:CMSInitiatingOccupancyFraction=75
> -XX:+UseCMSInitiatingOccupancyOnly
> -XX:CMSWaitDuration=1
> -XX:+CMSParallelInitialMarkEnabled
> -XX:+CMSEdenChunksRecordAlways
> ## some JVMs will fill up their heap when accessed via JMX, see CASSANDRA-6541
> -XX:+CMSClassUnloadingEnabled
> ...
> {noformat}
> In Cassandra 3, the default has been CMS.
> So, possibly there is something wrong in reading/processing 
> {{jvm8-server.options}}?
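
One way to check for the mismatch described above is sketched below in Python. It is an illustration under stated assumptions, not Cassandra code: read_jvm_options mirrors the grep "^-" loop from the 3.11 cassandra-env.sh (keep only lines starting with "-"), and the mapping from collector names in the GCInspector log to a collector family relies on the usual HotSpot names (PS Scavenge and PS MarkSweep for Parallel GC, ParNew and ConcurrentMarkSweep for CMS). The sample text and all function names are hypothetical.

```python
SAMPLE_OPTIONS = """\
### CMS Settings
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
#-XX:+UseG1GC
"""

# Collector names as they appear in GC log lines, mapped to a family.
COLLECTOR_FAMILIES = {
    "PS Scavenge": "Parallel",
    "PS MarkSweep": "Parallel",
    "ParNew": "CMS",
    "ConcurrentMarkSweep": "CMS",
}


def read_jvm_options(text):
    """Keep only lines that start with '-', split on whitespace like the shell loop."""
    options = []
    for line in text.splitlines():
        if line.startswith("-"):
            options.extend(line.split())
    return options


def configured_family(options):
    """Infer the configured collector from the option list, if recognisable."""
    if "-XX:+UseConcMarkSweepGC" in options:
        return "CMS"
    if "-XX:+UseParallelGC" in options:
        return "Parallel"
    return None


def observed_family(gc_log_line):
    """Infer the running collector from a GCInspector-style log line."""
    for name, family in COLLECTOR_FAMILIES.items():
        if name in gc_log_line:
            return family
    return None


options = read_jvm_options(SAMPLE_OPTIONS)
log_line = "GCInspector.java:299 - PS Scavenge GC in 541ms."
mismatch = configured_family(options) != observed_family(log_line)
print(options, configured_family(options), observed_family(log_line), mismatch)
```

A mismatch here only says that the options file and the running JVM disagree. It does not say why, so checking the actual JVM arguments of the running process would be the natural next step.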


