[jira] [Updated] (CASSANDRASC-112) ClosedChannelException when downloading from S3
[ https://issues.apache.org/jira/browse/CASSANDRASC-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yifan Cai updated CASSANDRASC-112:
----------------------------------
                        Authors: Yifan Cai
    Test and Documentation Plan: unit; ci
                         Status: Patch Available  (was: Open)

> ClosedChannelException when downloading from S3
> -----------------------------------------------
>
>                 Key: CASSANDRASC-112
>                 URL: https://issues.apache.org/jira/browse/CASSANDRASC-112
>             Project: Sidecar for Apache Cassandra
>          Issue Type: Bug
>          Components: Rest API
>            Reporter: Yifan Cai
>            Assignee: Yifan Cai
>            Priority: Normal
>              Labels: pull-request-available
>
> {code:java}
> org.apache.cassandra.sidecar.exceptions.RestoreJobFatalException: Unrecoverable error when downloading object.
> Caused by: java.nio.channels.ClosedChannelException
>     at java.base/sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:150)
>     at java.base/sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:266)
>     at org.apache.cassandra.sidecar.restore.StorageClient.lambda$subscribeRateLimitedWrite$6(StorageClient.java:271)
>     ... 22 more
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRASC-112) ClosedChannelException when downloading from S3
[ https://issues.apache.org/jira/browse/CASSANDRASC-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated CASSANDRASC-112:
---------------------------------------
    Labels: pull-request-available  (was: )

> ClosedChannelException when downloading from S3
> -----------------------------------------------
>
>                 Key: CASSANDRASC-112
>                 URL: https://issues.apache.org/jira/browse/CASSANDRASC-112
>             Project: Sidecar for Apache Cassandra
>          Issue Type: Bug
>            Reporter: Yifan Cai
>            Assignee: Yifan Cai
>            Priority: Normal
>              Labels: pull-request-available
>
> {code:java}
> org.apache.cassandra.sidecar.exceptions.RestoreJobFatalException: Unrecoverable error when downloading object.
> Caused by: java.nio.channels.ClosedChannelException
>     at java.base/sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:150)
>     at java.base/sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:266)
>     at org.apache.cassandra.sidecar.restore.StorageClient.lambda$subscribeRateLimitedWrite$6(StorageClient.java:271)
>     ... 22 more
> {code}
[jira] [Updated] (CASSANDRASC-112) ClosedChannelException when downloading from S3
[ https://issues.apache.org/jira/browse/CASSANDRASC-112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yifan Cai updated CASSANDRASC-112:
----------------------------------
     Bug Category: Parent values: Availability(12983)Level 1 values: Response Crash(12991)
       Complexity: Normal
      Component/s: Rest API
    Discovered By: User Report
         Severity: Normal
           Status: Open  (was: Triage Needed)

PR: https://github.com/apache/cassandra-sidecar/pull/103
CI: https://app.circleci.com/pipelines/github/yifan-c/cassandra-sidecar?branch=CASSANDRASC-112%2Ftrunk

> ClosedChannelException when downloading from S3
> -----------------------------------------------
>
>                 Key: CASSANDRASC-112
>                 URL: https://issues.apache.org/jira/browse/CASSANDRASC-112
>             Project: Sidecar for Apache Cassandra
>          Issue Type: Bug
>          Components: Rest API
>            Reporter: Yifan Cai
>            Assignee: Yifan Cai
>            Priority: Normal
>              Labels: pull-request-available
>
> {code:java}
> org.apache.cassandra.sidecar.exceptions.RestoreJobFatalException: Unrecoverable error when downloading object.
> Caused by: java.nio.channels.ClosedChannelException
>     at java.base/sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:150)
>     at java.base/sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:266)
>     at org.apache.cassandra.sidecar.restore.StorageClient.lambda$subscribeRateLimitedWrite$6(StorageClient.java:271)
>     ... 22 more
> {code}
Re: [PR] CASSANDRA-19418 - Changes to report additional bulk analytics job stats for instrumentation [cassandra-analytics]
arjunashok commented on code in PR #41:
URL: https://github.com/apache/cassandra-analytics/pull/41#discussion_r1513798245

## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/CassandraBulkSourceRelation.java: ##

@@ -107,17 +112,25 @@ private void persist(@NotNull JavaPairRDD sortedRDD, Str
     {
         try
         {
-            sortedRDD.foreachPartition(writeRowsInPartition(broadcastContext, columnNames));
+            List results = sortedRDD
+                           .mapPartitions(partitionsFlatMapFunc(broadcastContext, columnNames))
+                           .collect();
+            long rowCount = results.stream().mapToLong(res -> res.rowCount).sum();
+            long totalBytesWritten = results.stream().mapToLong(res -> res.bytesWritten).sum();
+            LOGGER.info("Bulk writer has written {} rows and {} bytes", rowCount, totalBytesWritten);
+            recordSuccessfulJobStats(rowCount, totalBytesWritten);
         }
         catch (Throwable throwable)
         {
+            recordFailureStats(throwable.getMessage());
             LOGGER.error("Bulk Write Failed", throwable);
             throw new RuntimeException("Bulk Write to Cassandra has failed", throwable);
         }
         finally
         {
             try
             {
+                writerContext.publishJobStats();

Review Comment:
So, the change is not propagating data back to the driver, but publishes stats from the executors. I am assuming here that the context is made available to the executors so what you are saying does not need to happen. Let me know if that makes sense. I have been able to validate this using the in-jvm-dtests.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
Re: [PR] CASSANDRA-19418 - Changes to report additional bulk analytics job stats for instrumentation [cassandra-analytics]
arjunashok commented on code in PR #41:
URL: https://github.com/apache/cassandra-analytics/pull/41#discussion_r1513793354

## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/BulkWriterContext.java: ##

@@ -21,7 +21,9 @@
 import java.io.Serializable;
 
-public interface BulkWriterContext extends Serializable
+import org.apache.cassandra.spark.common.Reportable;
+
+public interface BulkWriterContext extends Serializable, Reportable

Review Comment:
So, the functionality provided by the new interface is replacing the existing `dialHome` method in the `CassandraBulkWriterContext`. The thinking is that this is tied to the "context" that is shared across executors, as we "record" initial stats and job status stats at the executor level and the "inflight" stats at the task level.
Re: [PR] CASSANDRA-19418 - Changes to report additional bulk analytics job stats for instrumentation [cassandra-analytics]
arjunashok commented on code in PR #41:
URL: https://github.com/apache/cassandra-analytics/pull/41#discussion_r1513793217

## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/RingInstance.java: ##

@@ -49,6 +49,7 @@ public RingInstance(ReplicaMetadata replica)
                          .datacenter(replica.datacenter())
                          .state(replica.state())
                          .status(replica.status())
+                         .token("")

Review Comment:
Answered in the response below
Re: [PR] CASSANDRA-19418 - Changes to report additional bulk analytics job stats for instrumentation [cassandra-analytics]
arjunashok commented on code in PR #41:
URL: https://github.com/apache/cassandra-analytics/pull/41#discussion_r1513793094

## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/RingInstance.java: ##

@@ -125,40 +126,28 @@ private void writeObject(ObjectOutputStream out) throws IOException
         out.writeUTF(ringEntry.address());
         out.writeInt(ringEntry.port());
         out.writeUTF(ringEntry.datacenter());
-        out.writeUTF(ringEntry.load());

Review Comment:
Since we are now returning the `StreamResult` back from the tasks, the existing implementation will result in NPEs while serializing the contained `RingInstance`, because many of these fields are not defined when we create the `RingInstance` from `ReplicaMetadata`. The change removes the unused `RingInstance` fields from the serialization context. Likewise, the change also explicitly sets the `token` field to a default for the same reason, since `token` is part of the equals/hashcode validations.

Stack trace:
```
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 40) (172.20.10.7 executor driver): com.esotericsoftware.kryo.KryoException: Error during Java serialization.
Serialization trace:
instance (org.apache.cassandra.spark.bulkwriter.CommitResult)
commitResults (org.apache.cassandra.spark.bulkwriter.StreamResult)
    at org.apache.cassandra.spark.bulkwriter.util.SbwJavaSerializer.write(SbwJavaSerializer.java:58)
    at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:575)
```
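The failure mode described in the review comment above (a hand-written `writeObject` calling `writeUTF` on a field that was never populated) can be reproduced outside Spark. The following is only a minimal sketch under stated assumptions: `NullSafeSerializationSketch` and its nested `Instance` class are hypothetical stand-ins, not the project's `RingInstance`, and the fix shown (leaving the unused field out of the serialized form) mirrors the approach the comment describes rather than the actual patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a class with hand-written Java serialization.
final class NullSafeSerializationSketch
{
    static final class Instance implements Serializable
    {
        private static final long serialVersionUID = 1L;

        transient String address; // always populated
        transient String load;    // may be null when built from partial metadata

        Instance(String address, String load)
        {
            this.address = address;
            this.load = load;
        }

        private void writeObject(ObjectOutputStream out) throws IOException
        {
            out.defaultWriteObject();
            out.writeUTF(address);
            // out.writeUTF(load) would throw NullPointerException when load is null,
            // so the unused field is deliberately left out of the serialized form.
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException
        {
            in.defaultReadObject();
            address = in.readUTF();
            // load stays null: it was never written.
        }
    }

    /** Serializes and deserializes the instance in memory. */
    static Instance roundTrip(Instance instance) throws IOException, ClassNotFoundException
    {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes))
        {
            out.writeObject(instance);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray())))
        {
            return (Instance) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception
    {
        Instance copy = roundTrip(new Instance("127.0.0.1", null));
        System.out.println(copy.address + " load=" + copy.load);
    }
}
```

The trade-off is that fields dropped from the serialized form come back as null on the receiving side, which is only acceptable when, as the comment states, nothing downstream reads them.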
Re: [PR] CASSANDRA-19418 - Changes to report additional bulk analytics job stats for instrumentation [cassandra-analytics]
arjunashok commented on code in PR #41:
URL: https://github.com/apache/cassandra-analytics/pull/41#discussion_r1513792433

## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/common/Reportable.java: ##

@@ -0,0 +1,47 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.cassandra.spark.common;
+
+import java.util.Map;
+
+/**
+ * Interface to provide functionality to report Spark Job Statistics and/or properties
+ * that can optionally be instrumented. The default implementation merely logs these
+ * stats at the end of the job.
+ */
+public interface Reportable

Review Comment:
This is not meant to be specific to the writer, so can be renamed to `JobStats` maybe?
Re: [PR] CASSANDRA-19418 - Changes to report additional bulk analytics job stats for instrumentation [cassandra-analytics]
arjunashok commented on code in PR #41:
URL: https://github.com/apache/cassandra-analytics/pull/41#discussion_r1513792138

## cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/common/Reportable.java: ##

@@ -0,0 +1,47 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.cassandra.spark.common;

Review Comment:
Makes sense
[jira] [Updated] (CASSANDRA-18879) Modernize CQLSH datetime conversions
[ https://issues.apache.org/jira/browse/CASSANDRA-18879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-18879:
-----------------------------------------
    Reviewers: Brandon Williams

> Modernize CQLSH datetime conversions
> ------------------------------------
>
>                 Key: CASSANDRA-18879
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18879
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CQL/Interpreter
>            Reporter: Brad Schoening
>            Assignee: Arun Ganesh
>            Priority: Low
>         Attachments: cassandra-cqlsh-stdout
>
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> Python 3.x introduced many updates to datetime conversion which allow simplified conversions.
> 1. For example, tracing.py defines a function datetime_from_utc_to_local(), but datetime now has a native method astimezone() which converts UTC to local time.
> Review the following users of datetime which apply conversions:
> * cqlshmain.py
> * formatting.py
> * tracing.py
> Example:
> {code:python}
> >>> from dateutil import tz
> >>> import datetime
> >>> a = datetime.datetime.now().astimezone(tz.tzutc())
> >>> a
> datetime.datetime(2023, 9, 25, 11, 22, 36, 251705, tzinfo=tzutc())
> >>> b = a.astimezone()
> >>> b
> datetime.datetime(2023, 9, 25, 14, 22, 36, 251705, tzinfo=datetime.timezone(datetime.timedelta(seconds=10800), 'EST'))
> {code}
> See [PEP 495|http://example.com]
[jira] [Created] (CASSANDRASC-112) ClosedChannelException when downloading from S3
Yifan Cai created CASSANDRASC-112:
-------------------------------------

             Summary: ClosedChannelException when downloading from S3
                 Key: CASSANDRASC-112
                 URL: https://issues.apache.org/jira/browse/CASSANDRASC-112
             Project: Sidecar for Apache Cassandra
          Issue Type: Bug
            Reporter: Yifan Cai
            Assignee: Yifan Cai

{code:java}
org.apache.cassandra.sidecar.exceptions.RestoreJobFatalException: Unrecoverable error when downloading object.
Caused by: java.nio.channels.ClosedChannelException
    at java.base/sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:150)
    at java.base/sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:266)
    at org.apache.cassandra.sidecar.restore.StorageClient.lambda$subscribeRateLimitedWrite$6(StorageClient.java:271)
    ... 22 more
{code}
[jira] [Commented] (CASSANDRA-19454) Revert switch to approximate time in Dispatcher to avoid mixing with nanoTime() in downstream timeout calculations
[ https://issues.apache.org/jira/browse/CASSANDRA-19454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823818#comment-17823818 ]

Arun Ganesh commented on CASSANDRA-19454:
-----------------------------------------

Thanks! Meanwhile, if you have any issues lying around that can help me understand the project better, I'd like to work on them.

> Revert switch to approximate time in Dispatcher to avoid mixing with nanoTime() in downstream timeout calculations
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19454
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19454
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Messaging/Client
>            Reporter: Caleb Rackliffe
>            Assignee: Arun Ganesh
>            Priority: Normal
>             Fix For: 5.0.x, 5.x
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> CASSANDRA-15241 changed {{Dispatcher}} to use the {{approxTime}} implementation of {{MonotonicClock}} rather than {{nanoTime()}}, but clock drift between the two can potentially cause queries to time out more quickly. We should be able to revert the {{Dispatcher}} to use {{nanoTime()}} again and similarly change {{QueriesTable}} to {{nanoTime()}} as well for consistency.
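To illustrate why mixing two clocks can shorten a timeout, as the issue description above states, here is a minimal sketch. It assumes the approximate clock trails the precise one by a fixed 2 ms; the class, method, and variable names are hypothetical and are not taken from the Cassandra code base, and the clocks are modelled as plain `LongSupplier` values rather than the real `MonotonicClock`.

```java
import java.util.function.LongSupplier;

// Hypothetical illustration; names are not taken from the Cassandra code base.
final class MixedClockSketch
{
    /** Remaining time budget, given a start stamp and the clock used to measure elapsed time. */
    static long remainingNanos(long startNanos, long timeoutNanos, LongSupplier clock)
    {
        long elapsed = clock.getAsLong() - startNanos;
        return timeoutNanos - elapsed;
    }

    public static void main(String[] args)
    {
        final long now = 1_000_000_000L; // reading of the precise clock
        final long lag = 2_000_000L;     // assumption: the approximate clock trails by 2 ms
        LongSupplier precise = () -> now;
        LongSupplier approximate = () -> now - lag;
        long timeout = 10_000_000L;      // 10 ms budget

        // Start stamped with the approximate clock, elapsed time measured with the precise one.
        long mixed = remainingNanos(approximate.getAsLong(), timeout, precise);
        // Start and elapsed time both taken from the same clock.
        long consistent = remainingNanos(precise.getAsLong(), timeout, precise);

        System.out.println("mixed clocks:     " + mixed + " ns remaining");      // 8000000
        System.out.println("consistent clock: " + consistent + " ns remaining"); // 10000000
    }
}
```

In this sketch the request has not actually consumed any time, yet the mixed calculation already charges it the 2 ms of lag, which is the kind of premature timeout the ticket wants to avoid by using one clock throughout.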
(cassandra-website) branch asf-staging updated (e86c7f87c -> 5dca216ee)
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git

 discard e86c7f87c generate docs for fd550e9c
     new 5dca216ee generate docs for fd550e9c

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version. This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (e86c7f87c)
            \
             N -- N -- N   refs/heads/asf-staging (5dca216ee)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them. Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 content/search-index.js     |   2 +-
 site-ui/build/ui-bundle.zip | Bin 4883646 -> 4883646 bytes
 2 files changed, 1 insertion(+), 1 deletion(-)
[jira] [Commented] (CASSANDRA-19454) Revert switch to approximate time in Dispatcher to avoid mixing with nanoTime() in downstream timeout calculations
[ https://issues.apache.org/jira/browse/CASSANDRA-19454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823788#comment-17823788 ]

Caleb Rackliffe commented on CASSANDRA-19454:
---------------------------------------------

The draft PR looks good, +1

I'm running the tests, and I'll post a summary for you here soon...

> Revert switch to approximate time in Dispatcher to avoid mixing with nanoTime() in downstream timeout calculations
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-19454
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19454
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Messaging/Client
>            Reporter: Caleb Rackliffe
>            Assignee: Arun Ganesh
>            Priority: Normal
>             Fix For: 5.0.x, 5.x
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> CASSANDRA-15241 changed {{Dispatcher}} to use the {{approxTime}} implementation of {{MonotonicClock}} rather than {{nanoTime()}}, but clock drift between the two can potentially cause queries to time out more quickly. We should be able to revert the {{Dispatcher}} to use {{nanoTime()}} again and similarly change {{QueriesTable}} to {{nanoTime()}} as well for consistency.
(cassandra-website) branch asf-staging updated (b178e036c -> e86c7f87c)
git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git

 discard b178e036c generate docs for fd550e9c
     new e86c7f87c generate docs for fd550e9c

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version. This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (b178e036c)
            \
             N -- N -- N   refs/heads/asf-staging (e86c7f87c)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them. Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 content/search-index.js     |   2 +-
 site-ui/build/ui-bundle.zip | Bin 4883646 -> 4883646 bytes
 2 files changed, 1 insertion(+), 1 deletion(-)
(cassandra-website) branch asf-staging updated (4edcc0e8c -> b178e036c)
git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git

 discard 4edcc0e8c generate docs for fd550e9c
     new b178e036c generate docs for fd550e9c

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version. This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (4edcc0e8c)
            \
             N -- N -- N   refs/heads/asf-staging (b178e036c)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them. Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 content/search-index.js     |   2 +-
 site-ui/build/ui-bundle.zip | Bin 4883646 -> 4883646 bytes
 2 files changed, 1 insertion(+), 1 deletion(-)
[jira] [Updated] (CASSANDRA-19452) [Analytics] Use constant reference time during bulk read process
[ https://issues.apache.org/jira/browse/CASSANDRA-19452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yifan Cai updated CASSANDRA-19452:
----------------------------------
          Fix Version/s: NA
          Since Version: NA
    Source Control Link: https://github.com/apache/cassandra-analytics/commit/a13532272051d4e4608f92d53bdd997103e8ea19
             Resolution: Fixed
                 Status: Resolved  (was: Ready to Commit)

> [Analytics] Use constant reference time during bulk read process
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-19452
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19452
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Analytics Library
>            Reporter: Yifan Cai
>            Assignee: Yifan Cai
>            Priority: Normal
>             Fix For: NA
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The bulk reader relies on a time provider that returns the current time during the read to guide compaction and validation.
> Because the current time differs across Spark executors, rows/cells can expire inconsistently. In addition, the validation of non-expired rows/cells after compaction might fail, since they could expire during the read, which can take minutes or even hours.
> This can lead to false data omission and job failure.
> The fix is to use a constant reference time that is decided by the Spark driver and distributed to all executors. The reference time is then used for compaction and validation.
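The idea in the description above, fixing one reference time up front and evaluating every expiry check against it, can be sketched as follows. This is an illustrative sketch only: `ReferenceTimeSketch` and `FixedReferenceTime` are hypothetical names and do not reproduce the `TimeProvider` or `ReaderTimeProvider` API from the patch, and the expiry rule (expired when the expiration time is at or before the reference) is an assumption made for the example.

```java
import java.io.Serializable;

// Hypothetical sketch; not the TimeProvider/ReaderTimeProvider API from the patch.
final class ReferenceTimeSketch
{
    /** Captured once (for example on the driver) and shipped unchanged to every worker. */
    static final class FixedReferenceTime implements Serializable
    {
        private static final long serialVersionUID = 1L;

        private final int referenceEpochSeconds;

        FixedReferenceTime(int referenceEpochSeconds)
        {
            this.referenceEpochSeconds = referenceEpochSeconds;
        }

        int referenceEpochSeconds()
        {
            return referenceEpochSeconds;
        }

        /** Assumed rule: a cell is expired when its expiration time is at or before the reference. */
        boolean isExpired(int expiresAtEpochSeconds)
        {
            return expiresAtEpochSeconds <= referenceEpochSeconds;
        }
    }

    public static void main(String[] args)
    {
        FixedReferenceTime reference = new FixedReferenceTime((int) (System.currentTimeMillis() / 1000L));
        int expiresSoon = reference.referenceEpochSeconds() + 5;
        int alreadyExpired = reference.referenceEpochSeconds() - 1;

        // The answers depend only on the captured reference, not on when or where they are asked.
        System.out.println(reference.isExpired(expiresSoon));    // false
        System.out.println(reference.isExpired(alreadyExpired)); // true
    }
}
```

Because the answer no longer depends on the wall clock of the machine doing the check, a cell that is live during compaction is still live during the later validation, however long the read takes.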
(cassandra-analytics) branch trunk updated: CASSANDRA-19452 Use constant reference time during bulk read process (#44)
ycai pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra-analytics.git

The following commit(s) were added to refs/heads/trunk by this push:
     new a135322 CASSANDRA-19452 Use constant reference time during bulk read process (#44)
a135322 is described below

commit a13532272051d4e4608f92d53bdd997103e8ea19
Author: Yifan Cai <52585731+yifa...@users.noreply.github.com>
AuthorDate: Tue Mar 5 11:06:36 2024 -0800

    CASSANDRA-19452 Use constant reference time during bulk read process (#44)

    patch by Yifan Cai; reviewed by Francisco Guerrero, James Berragan for CASSANDRA-19452
---
 CHANGES.txt                                        |   1 +
 .../cassandra/spark/data/CassandraDataLayer.java   |  28 -
 .../cassandra/spark/data/LocalDataLayer.java       |   7 ++
 .../org/apache/cassandra/spark/TestDataLayer.java  |   7 ++
 .../data/partitioner/JDKSerializationTests.java    |   7 ++
 .../apache/cassandra/bridge/CassandraBridge.java   |   2 -
 .../org/apache/cassandra/spark/data/DataLayer.java |  13 +--
 .../{TimeProvider.java => ReaderTimeProvider.java} |  28 +++--
 .../apache/cassandra/spark/utils/TimeProvider.java |  41 ++-
 .../cassandra/spark/utils/test/TestSchema.java     |  29 -
 .../bridge/CassandraBridgeImplementation.java      |   7 --
 .../spark/reader/AbstractStreamScanner.java        |  50 ++---
 .../spark/reader/CompactionStreamScanner.java      |  32 --
 .../cassandra/spark/reader/SSTableReaderTests.java | 120 +
 14 files changed, 311 insertions(+), 61 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index 8215822..92620a9 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 1.0.0
+ * Use constant reference time during bulk read process (CASSANDRA-19452)
  * Update access of ClearSnapshotStrategy (CASSANDRA-19442)
  * Bulk reader fails to produce a row when regular column values are null (CASSANDRA-19411)
  * Use XXHash32 for digest calculation of SSTables (CASSANDRA-19369)
diff --git a/cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/CassandraDataLayer.java b/cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/CassandraDataLayer.java
index 40e0436..8ab1dd6 100644
--- a/cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/CassandraDataLayer.java
+++ b/cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/CassandraDataLayer.java
@@ -87,8 +87,10 @@ import org.apache.cassandra.spark.sparksql.LastModifiedTimestampDecorator;
 import org.apache.cassandra.spark.sparksql.RowBuilder;
 import org.apache.cassandra.spark.stats.Stats;
 import org.apache.cassandra.spark.utils.CqlUtils;
+import org.apache.cassandra.spark.utils.ReaderTimeProvider;
 import org.apache.cassandra.spark.utils.ScalaFunctions;
 import org.apache.cassandra.spark.utils.ThrowableUtils;
+import org.apache.cassandra.spark.utils.TimeProvider;
 import org.apache.cassandra.spark.validation.CassandraValidation;
 import org.apache.cassandra.spark.validation.SidecarValidation;
 import org.apache.cassandra.spark.validation.StartupValidatable;
@@ -122,7 +124,6 @@ public class CassandraDataLayer extends PartitionedDataLayer implements StartupV
     protected TokenPartitioner tokenPartitioner;
     protected Map availabilityHints;
     protected Sidecar.ClientConfig sidecarClientConfig;
-    private SslConfig sslConfig;
     protected Map bigNumberConfigMap;
     protected boolean enableStats;
     protected boolean readIndexOffset;
@@ -133,7 +134,11 @@ public class CassandraDataLayer extends PartitionedDataLayer implements StartupV
     protected String lastModifiedTimestampField;
     // volatile in order to publish the reference for visibility
     protected volatile CqlTable cqlTable;
+    protected transient TimeProvider timeProvider;
     protected transient SidecarClient sidecar;
+
+    private SslConfig sslConfig;
+
     @VisibleForTesting
     transient Map instanceMap;
 
@@ -178,7 +183,8 @@ public class CassandraDataLayer extends PartitionedDataLayer implements StartupV
                               boolean useIncrementalRepair,
                               @Nullable String lastModifiedTimestampField,
                               List requestedFeatures,
-                              @NotNull Map rfMap)
+                              @NotNull Map rfMap,
+                              TimeProvider timeProvider)
     {
         super(consistencyLevel, datacenter);
         this.snapshotName = snapshotName;
@@ -203,6 +209,7 @@ public class CassandraDataLayer extends PartitionedDataLayer implements StartupV
             aliasLastModifiedTimestamp(this.requestedFeatures, this.lastModifiedTimestampField);
         }
         this.rfMap = rfMap;
+        this.timeProvider = timeProvider;
         this.maybeQuoteKeyspaceAndTable();
         this.initInstanceMap();
 t
Re: [PR] CASSANDRA-19452 Use constant reference time during bulk read process [cassandra-analytics]
yifan-c merged PR #44:
URL: https://github.com/apache/cassandra-analytics/pull/44
Re: [PR] CASSANDRA-19452 Use constant reference time during bulk read process [cassandra-analytics]
yifan-c commented on code in PR #44:
URL: https://github.com/apache/cassandra-analytics/pull/44#discussion_r1513261310

## cassandra-bridge/src/main/java/org/apache/cassandra/spark/data/DataLayer.java: ##

@@ -164,6 +164,11 @@ public CassandraVersion version()
 
     public abstract boolean isInPartition(int partitionId, BigInteger token, ByteBuffer key);
 
+    /**
+     * @return a TimeProvider
+     */
+    public abstract TimeProvider timeProvider();

Review Comment:
Do not return the DEFAULT time provider. The concrete implementation should return the correct value.
[jira] [Updated] (CASSANDRA-15452) Improve disk access patterns during compaction and streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jon Haddad updated CASSANDRA-15452: --- Description: On read heavy workloads Cassandra performs much better when using a low read ahead setting. In my tests I've seen a 5x improvement in throughput and more than a 50% reduction in latency. However, I've also observed that it can have a negative impact on compaction and streaming throughput. It especially negatively impacts cloud environments where small reads incur high costs in IOPS due to tiny requests. # We should investigate using POSIX_FADV_DONTNEED on files we're compacting to see if we can improve performance and reduce page faults. # This should be combined with an internal read ahead style buffer that Cassandra manages, similar to a BufferedInputStream but with our own machinery. This buffer should read fairly large blocks of data off disk at a time. EBS, for example, allows 1 IOP to be up to 256KB. A considerable amount of time is spent in blocking I/O during compaction and streaming. Reducing how often we read from disk should speed up all sequential I/O operations. # We can reduce system calls by buffering writes as well, but I think it will have less of an impact than the reads. was: On read heavy workloads Cassandra performs much better when using a low read ahead setting. In my tests I've seen an 5x improvement in throughput and more than a 50% reduction in latency. However, I've also observed that it can have a negative impact on compaction and streaming throughput. It especially negatively impacts cloud environments where small reads incur high costs in IOPS due to tiny requests. # We should investigate using POSIX_FADV_DONTNEED on files we're compacting to see if we can improve performance and reduce page faults. # This should be combined with an internal read ahead style buffer that Cassandra manages, similar to a BufferedInputStream but with our own machinery. 
This buffer should read fairly large blocks of data off disk at at time. EBS, for example, allows 1 IOP to be up to 256KB. A considerable amount of time is spent in blocking I/O during compaction and streaming. Reducing the frequency we read from disk should speed up all sequential I/O operations. > Improve disk access patterns during compaction and streaming > > > Key: CASSANDRA-15452 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15452 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Local Write-Read Paths, Local/Compaction >Reporter: Jon Haddad >Assignee: Jordan West >Priority: Normal > Attachments: everyfs.txt, results.txt, sequential.fio > > > On read heavy workloads Cassandra performs much better when using a low read > ahead setting. In my tests I've seen a 5x improvement in throughput and > more than a 50% reduction in latency. However, I've also observed that it > can have a negative impact on compaction and streaming throughput. It > especially negatively impacts cloud environments where small reads incur high > costs in IOPS due to tiny requests. > # We should investigate using POSIX_FADV_DONTNEED on files we're compacting > to see if we can improve performance and reduce page faults. > # This should be combined with an internal read ahead style buffer that > Cassandra manages, similar to a BufferedInputStream but with our own > machinery. This buffer should read fairly large blocks of data off disk at > a time. EBS, for example, allows 1 IOP to be up to 256KB. A considerable > amount of time is spent in blocking I/O during compaction and streaming. > Reducing how often we read from disk should speed up all sequential I/O > operations. 
> # We can reduce system calls by buffering writes as well, but I think it > will have less of an impact than the reads.
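Item 2 of the description proposes a Cassandra-managed read-ahead buffer that pulls large blocks per request. A minimal sketch of the effect, assuming an in-memory stream as a stand-in for disk and the stock `BufferedInputStream` as a stand-in for the custom machinery the ticket has in mind; `ReadAheadSketch`, `CountingStream`, and `underlyingReads` are hypothetical names and not part of Cassandra:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ReadAheadSketch {
    /** Counts bulk read calls that reach the wrapped stream (a stand-in for requests sent to disk). */
    static final class CountingStream extends FilterInputStream {
        long calls;

        CountingStream(InputStream in) {
            super(in);
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /** Consumes totalBytes in 64-byte pieces through a buffer of bufferBytes and reports source reads. */
    static long underlyingReads(int totalBytes, int bufferBytes) {
        CountingStream source = new CountingStream(new ByteArrayInputStream(new byte[totalBytes]));
        try (InputStream buffered = new BufferedInputStream(source, bufferBytes)) {
            byte[] piece = new byte[64]; // the consumer asks for small pieces, like a row-by-row reader
            while (buffered.read(piece) != -1) {
                // a real consumer would process the bytes here
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.calls;
    }
}
```

Comparing `ReadAheadSketch.underlyingReads(1 << 20, 4 * 1024)` with `ReadAheadSketch.underlyingReads(1 << 20, 256 * 1024)` shows how many requests reach the source for the same 1 MiB of data; the 256 KiB buffer, sized after the EBS figure quoted above, issues far fewer.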
[jira] [Assigned] (CASSANDRA-15452) Improve disk access patterns during compaction and streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jon Haddad reassigned CASSANDRA-15452: -- Assignee: Jordan West
[jira] [Commented] (CASSANDRA-19454) Revert switch to approximate time in Dispatcher to avoid mixing with nanoTime() in downstream timeout calculations
[ https://issues.apache.org/jira/browse/CASSANDRA-19454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823701#comment-17823701 ] Caleb Rackliffe commented on CASSANDRA-19454: - [~arkn98] I think a "bar" of not causing regressions in the existing test suite (which obviously includes tests from CASSANDRA-15241) is acceptable here. Once I have a chance to look at the PR, I can kick off a CI run myself (unless you have a paid CircleCI account and want to go that route). > Revert switch to approximate time in Dispatcher to avoid mixing with > nanoTime() in downstream timeout calculations > -- > > Key: CASSANDRA-19454 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19454 > Project: Cassandra > Issue Type: Bug > Components: Messaging/Client >Reporter: Caleb Rackliffe >Assignee: Arun Ganesh >Priority: Normal > Fix For: 5.0.x, 5.x > > Time Spent: 10m > Remaining Estimate: 0h > > CASSANDRA-15241 changed {{Dispatcher}} to use the {{approxTime}} > implementation of {{MonotonicClock}} rather than {{nanoTime()}}, but clock > drift between the two can potentially cause queries to time out more > quickly. We should be able to revert the {{Dispatcher}} to use {{nanoTime()}} > again and similarly change {{QueriesTable}} to use {{nanoTime()}} as well for > consistency.
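The concern in the description is that a start stamp taken from one clock and a "now" stamp taken from another can shrink the timeout budget. A small arithmetic sketch, assuming the approximate clock reads slightly behind the precise one; the lag, the timeout, and all names below are illustrative assumptions, not values or APIs from Cassandra:

```java
import java.util.concurrent.TimeUnit;

final class TimeoutClockSketch {
    /** Remaining budget in nanoseconds, given a start stamp, a "now" stamp, and the timeout. */
    static long remainingNanos(long startNanos, long nowNanos, long timeoutNanos) {
        return timeoutNanos - (nowNanos - startNanos);
    }

    /** Models a coarse clock whose reading trails the precise clock by lagNanos. */
    static long approxStamp(long preciseNanos, long lagNanos) {
        return preciseNanos - lagNanos;
    }

    /** True when the request still has budget if both stamps come from the precise clock. */
    static boolean aliveWithOneClock(long startNanos, long elapsedNanos, long timeoutNanos) {
        return remainingNanos(startNanos, startNanos + elapsedNanos, timeoutNanos) > 0;
    }

    /** True when the request still has budget if the start came from the lagging clock. */
    static boolean aliveWithMixedClocks(long startNanos, long elapsedNanos, long timeoutNanos, long lagNanos) {
        long approxStart = approxStamp(startNanos, lagNanos);
        return remainingNanos(approxStart, startNanos + elapsedNanos, timeoutNanos) > 0;
    }

    static long millis(long ms) {
        return TimeUnit.MILLISECONDS.toNanos(ms);
    }
}
```

With a 10 ms timeout, 9 ms of real elapsed time, and an assumed 2 ms lag, `aliveWithOneClock` still reports budget left while `aliveWithMixedClocks` does not, because the lag is counted as elapsed time. Taking both stamps from the same clock removes that term.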
[jira] [Commented] (CASSANDRA-18661) Update cassandra-stress to use Apache Commons CLI
[ https://issues.apache.org/jira/browse/CASSANDRA-18661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823687#comment-17823687 ] Brad Schoening commented on CASSANDRA-18661: [~claude] [~smiklosovic] it's great to hear your update on the success you've had with this. Stefan raises an important point about how to make this unifying change. There is so much legacy baggage in cassandra-stress that I think the change is very much warranted, but we may need to keep a cassandra-stress-old, create a cassandra-stress-new, or something similar, and this should be discussed on the ML. > Update cassandra-stress to use Apache Commons CLI > - > > Key: CASSANDRA-18661 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18661 > Project: Cassandra > Issue Type: Improvement > Components: Tool/stress >Reporter: Brad Schoening >Assignee: Claude Warren >Priority: Normal > Labels: lhf > > The Apache Commons CLI library provides an API for parsing command line > options with the package org.apache.commons.cli and this is already used by a > dozen existing Cassandra utilities including: > {quote}SSTableMetadataViewer, StandaloneScrubber, StandaloneSplitter, > SSTableExport, BulkLoader, and others. > {quote} > However, cassandra-stress is an outlier which uses its own custom classes to > parse command line options with classes such as OptionsSimple. In addition, > the options syntax for username, password, and others is not aligned with > the format used by CQLSH. > Currently, there are > 5K lines of code in 'settings' which appear to just > process command line args. 
> This suggestion is to: > > a) Upgrade cassandra-stress to use Apache Commons CLI (no new dependencies > are required as this library is already used by the project) > > b) Align the cassandra-stress CLI options with those in CQLSH, > > {quote}For example, using the new syntax like CQLSH: > {quote} > > cassandra-stress -username foo -password bar > {quote}and replacing the old syntax: > {quote} > cassandra-stress -mode username=foo and password=bar > > This will simplify and unify the code base, eliminate code and reduce the > confusion between similarly named classes such as > org.apache.cassandra.stress.settings.{Option, OptionsMulti, OptionsSimple} > and org.apache.commons.cli.{Option, OptionGroup, Options} > > Note: documentation will need to be updated as well
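The ticket contrasts the old `-mode username=foo and password=bar` form with the CQLSH-style `-username foo -password bar`. One way to keep old command lines working during a transition is to rewrite legacy tokens before handing them to the new parser. A minimal sketch of that idea using only the JDK; `LegacyStressArgs` and `translate` are hypothetical names, this is not how the patch under discussion works, and a real shim would also need to map legacy tokens that carry no `=`:

```java
import java.util.ArrayList;
import java.util.List;

final class LegacyStressArgs {
    /**
     * Rewrites key=value tokens that follow a legacy "-mode" group into "-key value" pairs.
     * Tokens inside the group without '=' (for example the literal word "and") are dropped
     * in this sketch. Everything outside the group is passed through unchanged.
     */
    static List<String> translate(List<String> args) {
        List<String> out = new ArrayList<>();
        boolean inModeGroup = false;
        for (String arg : args) {
            if (arg.equals("-mode")) {
                inModeGroup = true;           // start of the legacy group; the flag itself is consumed
            } else if (arg.startsWith("-")) {
                inModeGroup = false;          // any other flag ends the legacy group
                out.add(arg);
            } else if (inModeGroup) {
                int eq = arg.indexOf('=');
                if (eq > 0) {
                    out.add("-" + arg.substring(0, eq));
                    out.add(arg.substring(eq + 1));
                }
            } else {
                out.add(arg);
            }
        }
        return out;
    }
}
```

Arguments already in the new form pass through untouched, so both syntaxes could be accepted by the same entry point while the documentation catches up.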
[jira] [Comment Edited] (CASSANDRA-18661) Update cassandra-stress to use Apache Commons CLI
[ https://issues.apache.org/jira/browse/CASSANDRA-18661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823683#comment-17823683 ] Stefan Miklosovic edited comment on CASSANDRA-18661 at 3/5/24 3:46 PM: --- While I definitely appreciate the effort in this ticket to bring it on par with other CLI tools, I would bring this to the ML to see what a broader audience thinks about it. There is a ton of legacy material online covering all the options, all the docs etc., so I wonder if we are not doing more harm than good (even with very good intentions). Maybe supporting the old and the new way at the same time would be nice to have? Not sure what that would look like, I am just trying to figure out how to minimize the disruption. was (Author: smiklosovic): While I definitely appreciate the effort in this ticket to make it on par with other CLI tools, I would bring this to ML to see what broader audience thinks about this. There is a ton of legacy online with all options, all the docs etc so I wonder if we are not making more harm than good (even with very good intentions). Maybe supporting the old and the new way at the same time would be nice to have? Not sure how that would look like, I am just trying to figure out how to be at least disruptive towards users as possible.
[jira] [Commented] (CASSANDRA-18661) Update cassandra-stress to use Apache Commons CLI
[ https://issues.apache.org/jira/browse/CASSANDRA-18661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823683#comment-17823683 ] Stefan Miklosovic commented on CASSANDRA-18661: --- While I definitely appreciate the effort in this ticket to bring it on par with other CLI tools, I would bring this to the ML to see what a broader audience thinks about it. There is a ton of legacy material online covering all the options, all the docs etc., so I wonder if we are not doing more harm than good (even with very good intentions). Maybe supporting the old and the new way at the same time would be nice to have? Not sure what that would look like, I am just trying to figure out how to be as non-disruptive towards users as possible.
[jira] [Commented] (CASSANDRA-18661) Update cassandra-stress to use Apache Commons CLI
[ https://issues.apache.org/jira/browse/CASSANDRA-18661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823674#comment-17823674 ] Brandon Williams commented on CASSANDRA-18661: -- I *think* that's a vestige of the old daemon mode for stress that was removed for security concerns in CASSANDRA-17535.
[jira] [Commented] (CASSANDRA-18661) Update cassandra-stress to use Apache Commons CLI
[ https://issues.apache.org/jira/browse/CASSANDRA-18661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823670#comment-17823670 ] Claude Warren commented on CASSANDRA-18661: --- [~bschoeni] I have managed to get Stress to use the commons-cli code (after adding some more functionality to commons-cli). So we will have to wait for commons-cli 1.7.0 to be released. However, there is a requirement in the code for the StressSettings to be serializable. Is this an old Thrift requirement and can it be removed? As I recall serialization is fraught with security issues, though this is only a test tool. I just don't see how it would be used in the tool. Any suggestions?
[jira] [Updated] (CASSANDRA-19398) Test Failure: org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading
[ https://issues.apache.org/jira/browse/CASSANDRA-19398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-19398: - Status: Ready to Commit (was: Review In Progress) Great job, +1 > Test Failure: > org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading > -- > > Key: CASSANDRA-19398 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19398 > Project: Cassandra > Issue Type: Bug > Components: CI >Reporter: Ekaterina Dimitrova >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 5.0-rc, 5.x > > > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/2646/workflows/bc2bba74-9e56-4bea-8de7-4ff840c4f450/jobs/56028/tests#failed-test-0] > {code:java} > junit.framework.AssertionFailedError at > org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading(UpgradeSSTablesTest.java:220) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43){code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19398) Test Failure: org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading
[ https://issues.apache.org/jira/browse/CASSANDRA-19398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-19398: - Reviewers: Brandon Williams, Brandon Williams (was: Brandon Williams) Brandon Williams, Brandon Williams (was: Brandon Williams) Status: Review In Progress (was: Patch Available) > Test Failure: > org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading > -- > > Key: CASSANDRA-19398 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19398 > Project: Cassandra > Issue Type: Bug > Components: CI >Reporter: Ekaterina Dimitrova >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 5.0-rc, 5.x > > > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/2646/workflows/bc2bba74-9e56-4bea-8de7-4ff840c4f450/jobs/56028/tests#failed-test-0] > {code:java} > junit.framework.AssertionFailedError at > org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading(UpgradeSSTablesTest.java:220) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43){code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19398) Test Failure: org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading
[ https://issues.apache.org/jira/browse/CASSANDRA-19398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823622#comment-17823622 ] Berenguer Blasi commented on CASSANDRA-19398: - Added trunk which turned out green as well. > Test Failure: > org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading > -- > > Key: CASSANDRA-19398 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19398 > Project: Cassandra > Issue Type: Bug > Components: CI >Reporter: Ekaterina Dimitrova >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 5.0-rc, 5.x > > > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/2646/workflows/bc2bba74-9e56-4bea-8de7-4ff840c4f450/jobs/56028/tests#failed-test-0] > {code:java} > junit.framework.AssertionFailedError at > org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading(UpgradeSSTablesTest.java:220) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43){code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-19398) Test Failure: org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading
[ https://issues.apache.org/jira/browse/CASSANDRA-19398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823587#comment-17823587 ] Brandon Williams edited comment on CASSANDRA-19398 at 3/5/24 1:23 PM: -- This looks good to me if you want to start on trunk. As you say you've only made it more deterministic so I don't think there should be any problem with the approach. was (Author: brandon.williams): This looks good to me if you want to start on trunk. As you say you've only made it more deterministic so I don't there should be any problem with the approach. > Test Failure: > org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading > -- > > Key: CASSANDRA-19398 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19398 > Project: Cassandra > Issue Type: Bug > Components: CI >Reporter: Ekaterina Dimitrova >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 5.0-rc, 5.x > > > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/2646/workflows/bc2bba74-9e56-4bea-8de7-4ff840c4f450/jobs/56028/tests#failed-test-0] > {code:java} > junit.framework.AssertionFailedError at > org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading(UpgradeSSTablesTest.java:220) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43){code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19391) Flush metadata snapshot table on every write
[ https://issues.apache.org/jira/browse/CASSANDRA-19391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823600#comment-17823600 ] Marcus Eriksson commented on CASSANDRA-19391: - attaching ci results for this + 19390, rebased on fairly current trunk > Flush metadata snapshot table on every write > > > Key: CASSANDRA-19391 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19391 > Project: Cassandra > Issue Type: Improvement > Components: Transactional Cluster Metadata >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Low > Fix For: 5.x > > Attachments: ci_summary.html, result_details.tar.gz > > > We depend on the latest snapshot when starting up, flushing avoids gaps > between latest snapshot and the most recent local log entry -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19390) Transformation.Kind should contain an explicit integer id
[ https://issues.apache.org/jira/browse/CASSANDRA-19390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823599#comment-17823599 ] Marcus Eriksson commented on CASSANDRA-19390: - attaching ci results for this + 19391, rebased on fairly current trunk > Transformation.Kind should contain an explicit integer id > - > > Key: CASSANDRA-19390 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19390 > Project: Cassandra > Issue Type: Improvement > Components: Transactional Cluster Metadata >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Low > Fix For: 5.x > > Attachments: ci_summary.html, result_details.tar.gz > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
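The ticket title asks for an explicit integer id on `Transformation.Kind`. The thread does not state the motivation; a common reason for this pattern is that `ordinal()` shifts whenever constants are reordered or inserted, which is unsafe for anything persisted or sent over the wire. A minimal sketch of the general technique, assuming that motivation; `SketchKind`, its constants, and its ids are hypothetical and do not reflect the real enum:

```java
import java.util.HashMap;
import java.util.Map;

enum SketchKind {
    // Hypothetical constants and ids; the real Transformation.Kind values are not shown in this thread.
    FIRST_KIND(0),
    SECOND_KIND(1),
    THIRD_KIND(5); // ids need not be contiguous or follow declaration order

    private static final Map<Integer, SketchKind> BY_ID = new HashMap<>();

    static {
        // Build the reverse lookup once and fail fast if two constants share an id.
        for (SketchKind kind : values()) {
            if (BY_ID.put(kind.id, kind) != null) {
                throw new IllegalStateException("Duplicate id " + kind.id);
            }
        }
    }

    final int id;

    SketchKind(int id) {
        this.id = id;
    }

    /** Resolves a stored id back to its constant, independent of declaration order. */
    static SketchKind fromId(int id) {
        SketchKind kind = BY_ID.get(id);
        if (kind == null) {
            throw new IllegalArgumentException("Unknown id " + id);
        }
        return kind;
    }
}
```

Serializing `kind.id` and reading it back with `fromId` keeps stored values stable even if a new constant is later declared between existing ones.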
[jira] [Updated] (CASSANDRA-19391) Flush metadata snapshot table on every write
[ https://issues.apache.org/jira/browse/CASSANDRA-19391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-19391: Attachment: (was: ci_summary.html) > Flush metadata snapshot table on every write > > > Key: CASSANDRA-19391 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19391 > Project: Cassandra > Issue Type: Improvement > Components: Transactional Cluster Metadata >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Low > Fix For: 5.x > > Attachments: ci_summary.html, result_details.tar.gz > > > We depend on the latest snapshot when starting up, flushing avoids gaps > between latest snapshot and the most recent local log entry -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19391) Flush metadata snapshot table on every write
[ https://issues.apache.org/jira/browse/CASSANDRA-19391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-19391: Attachment: ci_summary.html result_details.tar.gz > Flush metadata snapshot table on every write > > > Key: CASSANDRA-19391 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19391 > Project: Cassandra > Issue Type: Improvement > Components: Transactional Cluster Metadata >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Low > Fix For: 5.x > > Attachments: ci_summary.html, result_details.tar.gz > > > We depend on the latest snapshot when starting up, flushing avoids gaps > between latest snapshot and the most recent local log entry -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19391) Flush metadata snapshot table on every write
[ https://issues.apache.org/jira/browse/CASSANDRA-19391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-19391: Attachment: (was: result_details.tar.gz) > Flush metadata snapshot table on every write > > > Key: CASSANDRA-19391 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19391 > Project: Cassandra > Issue Type: Improvement > Components: Transactional Cluster Metadata >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Low > Fix For: 5.x > > Attachments: ci_summary.html, result_details.tar.gz > > > We depend on the latest snapshot when starting up, flushing avoids gaps > between latest snapshot and the most recent local log entry -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19390) Transformation.Kind should contain an explicit integer id
[ https://issues.apache.org/jira/browse/CASSANDRA-19390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-19390: Attachment: (was: ci_summary.html) > Transformation.Kind should contain an explicit integer id > - > > Key: CASSANDRA-19390 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19390 > Project: Cassandra > Issue Type: Improvement > Components: Transactional Cluster Metadata >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Low > Fix For: 5.x > > Attachments: ci_summary.html, result_details.tar.gz > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19390) Transformation.Kind should contain an explicit integer id
[ https://issues.apache.org/jira/browse/CASSANDRA-19390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-19390: Attachment: ci_summary.html result_details.tar.gz > Transformation.Kind should contain an explicit integer id > - > > Key: CASSANDRA-19390 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19390 > Project: Cassandra > Issue Type: Improvement > Components: Transactional Cluster Metadata >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Low > Fix For: 5.x > > Attachments: ci_summary.html, result_details.tar.gz > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19390) Transformation.Kind should contain an explicit integer id
[ https://issues.apache.org/jira/browse/CASSANDRA-19390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-19390: Attachment: (was: result_details.tar.gz) > Transformation.Kind should contain an explicit integer id > - > > Key: CASSANDRA-19390 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19390 > Project: Cassandra > Issue Type: Improvement > Components: Transactional Cluster Metadata >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Low > Fix For: 5.x > > Attachments: ci_summary.html, result_details.tar.gz > > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
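The CASSANDRA-19390 title asks for an explicit integer id on each `Transformation.Kind` constant. The patch itself is not quoted in these notifications, so the following is only a minimal sketch of the general technique, written in Python rather than Cassandra's Java. The member names and id values are hypothetical and are not taken from the actual change. The point it illustrates is that a serialized id declared next to each constant stays stable when constants are reordered or new ones are inserted, which an implicit ordinal does not.

```python
from enum import Enum


class Kind(Enum):
    """Hypothetical kinds; each one carries an explicit, stable integer id."""

    ADD_NODE = 0
    REMOVE_NODE = 2      # ids need not be contiguous
    ALTER_SCHEMA = 1     # ...nor follow declaration order

    @classmethod
    def from_id(cls, wire_id: int) -> "Kind":
        # Enum lookup by value; raises ValueError for an unknown id.
        return cls(wire_id)


def serialize(kind: Kind) -> int:
    # Persist the explicit id, never the position in the declaration list.
    return kind.value
```

Reordering the three members above would leave `serialize` and `Kind.from_id` unchanged, which is the property an ordinal-based encoding lacks.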
[jira] [Comment Edited] (CASSANDRA-15452) Improve disk access patterns during compaction and streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823391#comment-17823391 ] Jon Haddad edited comment on CASSANDRA-15452 at 3/5/24 12:43 PM: - I took another look at this. This lets us extract every read operation against a single data file: {noformat} awk '$4 == "R" { print $0 }' everyfs.txt | grep '30-bti-Data.db' > 30-bti-data.txt{noformat} If you glance at the end of the data, the last entry is this: {noformat} 23:47:12 CompactionExec 44651 R 2699 12483 0.00 da-30-bti-Data.db{noformat} {-}The data file is only 15KB{-}. But we're doing over 6 thousand reads {noformat} wc -l ../research/30-bti-data.txt 6420 ../research/30-bti-data.txt{noformat} The 5th column is the number of bytes read. Summing this: {noformat} awk '{ sum += $5; } END {print sum}' ../research/30-bti-data.txt 25571844{noformat} = 25MB -which is a lot to pull through the filesystem when in an optimal situation we would have done a single 16KB read.- Since these numbers are really, really weird, I'm going back through and verifying there's not a bug in the tools, or my understanding of them. Edit: I just realized the offset is expressed in KB, not bytes, my math was off. I'm going to redo this test as I lost the instance. I'm now trying to figure out if we're double reading. The last offset is at 12MB, and each read is recorded 2x. was (Author: rustyrazorblade): I took another look at this. This lets us extract every read operation against a single data file: {noformat} awk '$4 == "R" { print $0 }' everyfs.txt | grep '30-bti-Data.db' > 30-bti-data.txt{noformat} If you glance at the end of the data, the last entry is this: {noformat} 23:47:12 CompactionExec 44651 R 2699 12483 0.00 da-30-bti-Data.db{noformat} The data file is only 15KB. But we're doing over 6 thousand reads {noformat} wc -l ../research/30-bti-data.txt 6420 ../research/30-bti-data.txt{noformat} The 5th column is the number of bytes read. 
Summing this: {noformat} awk '{ sum += $5; } END {print sum}' ../research/30-bti-data.txt 25571844{noformat} = 25MB which is a lot to pull through the filesystem when in an optimal situation we would have done a single 16KB read. Since these numbers are really, really weird, I'm going back through and verifying there's not a bug in the tools, or my understanding of them. > Improve disk access patterns during compaction and streaming > > > Key: CASSANDRA-15452 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15452 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Local Write-Read Paths, Local/Compaction >Reporter: Jon Haddad >Priority: Normal > Attachments: everyfs.txt, results.txt, sequential.fio > > > On read-heavy workloads Cassandra performs much better when using a low read > ahead setting. In my tests I've seen a 5x improvement in throughput and > more than a 50% reduction in latency. However, I've also observed that it > can have a negative impact on compaction and streaming throughput. It > especially negatively impacts cloud environments where small reads incur high > costs in IOPS due to tiny requests. > # We should investigate using POSIX_FADV_DONTNEED on files we're compacting > to see if we can improve performance and reduce page faults. > # This should be combined with an internal read ahead style buffer that > Cassandra manages, similar to a BufferedInputStream but with our own > machinery. This buffer should read fairly large blocks of data off disk at > a time. EBS, for example, allows 1 IOP to be up to 256KB. A considerable > amount of time is spent in blocking I/O during compaction and streaming. > Reducing the frequency we read from disk should speed up all sequential I/O > operations. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
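The awk steps in the CASSANDRA-15452 comment above (filter read records for one data file, count them, sum the bytes column) can be sketched as a single pass that also checks the two things the edited comment raises: that the offset column is in KB, and that reads may be recorded twice. This is an illustrative sketch, not a tool from the ticket. The column layout (time, command, PID, type, bytes, offset in KB, latency, filename) is an assumption inferred from the one sample line quoted in the comment, and `summarize_reads` and `ReadStats` are hypothetical names.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class ReadStats(NamedTuple):
    reads: int           # number of read records seen for the file
    total_bytes: int     # sum of the bytes column
    max_offset_kb: int   # largest offset seen, in KB (not bytes)
    repeated: int        # records whose (offset, bytes) pair was already seen


def summarize_reads(lines: Iterable[str], filename_part: str) -> ReadStats:
    """Summarize read records for files whose name contains filename_part."""
    seen = Counter()
    reads = total = max_off = 0
    for line in lines:
        fields = line.split()
        # Assumed layout: TIME COMM PID T BYTES OFF_KB LAT FILENAME
        if len(fields) < 8 or fields[3] != "R" or filename_part not in fields[7]:
            continue
        nbytes, off_kb = int(fields[4]), int(fields[5])
        reads += 1
        total += nbytes
        max_off = max(max_off, off_kb)
        seen[(off_kb, nbytes)] += 1
    # Every occurrence beyond the first of the same (offset, size) is a repeat.
    repeated = sum(count - 1 for count in seen.values())
    return ReadStats(reads, total, max_off, repeated)
```

Passing the lines of a trace file and a fragment such as `'30-bti-Data.db'` would return the read count and byte total that the awk commands compute, plus the largest offset and a count of repeated `(offset, bytes)` pairs. A repeat count close to half the read count would be consistent with each read being recorded twice, although genuinely repeated reads of the same range would look the same.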
[jira] [Updated] (CASSANDRA-19398) Test Failure: org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading
[ https://issues.apache.org/jira/browse/CASSANDRA-19398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-19398: - Reviewers: Brandon Williams > Test Failure: > org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading > -- > > Key: CASSANDRA-19398 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19398 > Project: Cassandra > Issue Type: Bug > Components: CI >Reporter: Ekaterina Dimitrova >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 5.0-rc, 5.x > > > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/2646/workflows/bc2bba74-9e56-4bea-8de7-4ff840c4f450/jobs/56028/tests#failed-test-0] > {code:java} > junit.framework.AssertionFailedError at > org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading(UpgradeSSTablesTest.java:220) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43){code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19398) Test Failure: org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading
[ https://issues.apache.org/jira/browse/CASSANDRA-19398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823587#comment-17823587 ] Brandon Williams commented on CASSANDRA-19398: -- This looks good to me if you want to start on trunk. As you say, you've only made it more deterministic, so I don't think there should be any problem with the approach. > Test Failure: > org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading > -- > > Key: CASSANDRA-19398 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19398 > Project: Cassandra > Issue Type: Bug > Components: CI >Reporter: Ekaterina Dimitrova >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 5.0-rc, 5.x > > > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/2646/workflows/bc2bba74-9e56-4bea-8de7-4ff840c4f450/jobs/56028/tests#failed-test-0] > {code:java} > junit.framework.AssertionFailedError at > org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading(UpgradeSSTablesTest.java:220) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43){code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19398) Test Failure: org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading
[ https://issues.apache.org/jira/browse/CASSANDRA-19398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823568#comment-17823568 ] Berenguer Blasi commented on CASSANDRA-19398: - [~brandon.williams] I stole this one from you as agreed. I have submitted byteman latches to make the test's behavior more deterministic. I _think_ that is correct unless the original authors say otherwise. > Test Failure: > org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading > -- > > Key: CASSANDRA-19398 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19398 > Project: Cassandra > Issue Type: Bug > Components: CI >Reporter: Ekaterina Dimitrova >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 5.0-rc, 5.x > > > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/2646/workflows/bc2bba74-9e56-4bea-8de7-4ff840c4f450/jobs/56028/tests#failed-test-0] > {code:java} > junit.framework.AssertionFailedError at > org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading(UpgradeSSTablesTest.java:220) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43){code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
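The comment above mentions Byteman latches used to make the test's behavior more deterministic. The actual rules are in the PR and are not reproduced in this notification, so the following is only a generic illustration of the latch idea, written in Python with `threading.Event` rather than Byteman. The function names and the two steps are hypothetical. It shows one operation being held back until another has reached a known point, instead of relying on timing.

```python
import threading


def run_ordered() -> list:
    """Run two threads whose relative order is fixed by a latch."""
    events = []
    upgrade_started = threading.Event()   # latch: set once the upgrade is underway

    def upgrade():
        events.append("upgrade started")
        upgrade_started.set()             # release anyone waiting on the latch

    def truncate():
        upgrade_started.wait(timeout=5)   # block until the upgrade has begun
        events.append("truncate")

    # Start the waiter first to show that start order does not decide the outcome.
    waiter = threading.Thread(target=truncate)
    worker = threading.Thread(target=upgrade)
    waiter.start()
    worker.start()
    waiter.join()
    worker.join()
    return events
```

Without the latch, the two appends could land in either order from run to run. With it, the "truncate" step is always recorded after the upgrade has started.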
[jira] [Assigned] (CASSANDRA-19398) Test Failure: org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading
[ https://issues.apache.org/jira/browse/CASSANDRA-19398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Berenguer Blasi reassigned CASSANDRA-19398: --- Assignee: Berenguer Blasi (was: Brandon Williams) > Test Failure: > org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading > -- > > Key: CASSANDRA-19398 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19398 > Project: Cassandra > Issue Type: Bug > Components: CI >Reporter: Ekaterina Dimitrova >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 5.0-rc, 5.x > > > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/2646/workflows/bc2bba74-9e56-4bea-8de7-4ff840c4f450/jobs/56028/tests#failed-test-0] > {code:java} > junit.framework.AssertionFailedError at > org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading(UpgradeSSTablesTest.java:220) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43){code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19398) Test Failure: org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading
[ https://issues.apache.org/jira/browse/CASSANDRA-19398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Berenguer Blasi updated CASSANDRA-19398: Test and Documentation Plan: See PR Status: Patch Available (was: Open) > Test Failure: > org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading > -- > > Key: CASSANDRA-19398 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19398 > Project: Cassandra > Issue Type: Bug > Components: CI >Reporter: Ekaterina Dimitrova >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 5.0-rc, 5.x > > > [https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/2646/workflows/bc2bba74-9e56-4bea-8de7-4ff840c4f450/jobs/56028/tests#failed-test-0] > {code:java} > junit.framework.AssertionFailedError at > org.apache.cassandra.distributed.test.UpgradeSSTablesTest.truncateWhileUpgrading(UpgradeSSTablesTest.java:220) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43){code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
(cassandra-builds) branch trunk updated: ninja-fix – temporarily disabled arm building in cassandra-builds/jenkins-dsl/cassandra_job_dsl_seed.groovy
This is an automated email from the ASF dual-hosted git repository. mck pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra-builds.git The following commit(s) were added to refs/heads/trunk by this push: new f253806 ninja-fix – temporarily disabled arm building in cassandra-builds/jenkins-dsl/cassandra_job_dsl_seed.groovy f253806 is described below commit f2538069436c0e2a35c087671a5b11d85fecef70 Author: Mick Semb Wever AuthorDate: Tue Mar 5 09:42:42 2024 +0100 ninja-fix – temporarily disabled arm building in cassandra-builds/jenkins-dsl/cassandra_job_dsl_seed.groovy ref: CASSANDRA-19241 – Upgrade ci-cassandra.a.o agents to Ubuntu 22.04.3 --- jenkins-dsl/cassandra_job_dsl_seed.groovy | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/jenkins-dsl/cassandra_job_dsl_seed.groovy b/jenkins-dsl/cassandra_job_dsl_seed.groovy index fb46360..9251db9 100755 --- a/jenkins-dsl/cassandra_job_dsl_seed.groovy +++ b/jenkins-dsl/cassandra_job_dsl_seed.groovy @@ -19,7 +19,7 @@ def jobDescription = ''' // architectures. blank is amd64 def archs = ['', '-arm64'] -arm64_enabled = true +arm64_enabled = false // TODO waiting on CASSANDRA-19241 arm64_test_label_enabled = false def use_arm64_test_label() { return arm64_enabled && arm64_test_label_enabled } - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18934) Downgrade to 4.1 fails due to schema changes
[ https://issues.apache.org/jira/browse/CASSANDRA-18934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823490#comment-17823490 ] Maxwell Guo commented on CASSANDRA-18934: - Hi [~claude], what about through Slack? > Downgrade to 4.1 fails due to schema changes > > > Key: CASSANDRA-18934 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18934 > Project: Cassandra > Issue Type: Bug > Components: Local/Startup and Shutdown >Reporter: David Capwell >Assignee: Maxwell Guo >Priority: Normal > Fix For: 5.x > > > We are required to support 5.0 downgrading to 4.1 as a migration step, but we > don’t have tests to show this is working… I wrote a quick test to make sure a > change we needed in Accord wouldn’t block the downgrade and see that we fail > right now. > {code} > ERROR 20:56:39 Exiting due to error while processing commit log during > initialization. > org.apache.cassandra.db.commitlog.CommitLogReadHandler$CommitLogReadException: > Unexpected error deserializing mutation; saved to > /var/folders/h1/s_3p1x3s3hl0hltbpck67m0hgn/T/mutation418421767150092dat. > This may be caused by replaying a mutation against a table with the same > name but incompatible schema. 
Exception follows: java.lang.RuntimeException: > Unknown column compaction_properties during deserialization > at > org.apache.cassandra.db.commitlog.CommitLogReader.readMutation(CommitLogReader.java:464) > at > org.apache.cassandra.db.commitlog.CommitLogReader.readSection(CommitLogReader.java:397) > at > org.apache.cassandra.db.commitlog.CommitLogReader.readCommitLogSegment(CommitLogReader.java:244) > at > org.apache.cassandra.db.commitlog.CommitLogReader.readCommitLogSegment(CommitLogReader.java:147) > at > org.apache.cassandra.db.commitlog.CommitLogReplayer.replayFiles(CommitLogReplayer.java:191) > at > org.apache.cassandra.db.commitlog.CommitLog.recoverFiles(CommitLog.java:223) > at > org.apache.cassandra.db.commitlog.CommitLog.recoverSegmentsOnDisk(CommitLog.java:204) > {code} > This was caused by a schema change in CASSANDRA-18061 > {code} > /* > * Licensed to the Apache Software Foundation (ASF) under one > * or more contributor license agreements. See the NOTICE file > * distributed with this work for additional information > * regarding copyright ownership. The ASF licenses this file > * to you under the Apache License, Version 2.0 (the > * "License"); you may not use this file except in compliance > * with the License. You may obtain a copy of the License at > * > * http://www.apache.org/licenses/LICENSE-2.0 > * > * Unless required by applicable law or agreed to in writing, software > * distributed under the License is distributed on an "AS IS" BASIS, > * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. > * See the License for the specific language governing permissions and > * limitations under the License. 
> */ > package org.apache.cassandra.distributed.upgrade; > import java.io.IOException; > import java.io.File; > import java.util.concurrent.atomic.AtomicBoolean; > import org.junit.Test; > import org.apache.cassandra.distributed.api.IUpgradeableInstance; > public class DowngradeTest extends UpgradeTestBase > { > @Test > public void test() throws Throwable > { > AtomicBoolean first = new AtomicBoolean(true); > new TestCase() > .nodes(1) > .withConfig(c -> { > if (first.compareAndSet(true, false)) > c.set("storage_compatibility_mode", "CASSANDRA_4"); > }) > .downgradeTo(v41) > .setup(cluster -> {}) > // Uncomment if you want to test what happens after reading the commit log, > which fails right now > //.runBeforeNodeRestart((cluster, nodeId) -> { > //IUpgradeableInstance inst = cluster.get(nodeId); > //File f = new File((String) > inst.config().get("commitlog_directory")); > //deleteRecursive(f); > //}) > .runAfterClusterUpgrade(cluster -> {}) > .run(); > } > private void deleteRecursive(File f) > { > if (f.isDirectory()) > { > File[] children = f.listFiles(); > if (children != null) > { > for (File c : children) > deleteRecursive(c); > } > } > f.delete(); > } > } > {code} > {code} > diff --git > a/test/distributed/org/apache/cassandra/distributed/upgrade/UpgradeTestBase.java > > b/test/distributed/org/apache/cassandra/distributed/upgrade/UpgradeTestBase.java > index 5ee8780204..b4111e3b44 100644 > --- > a/test/distributed/org/apache/cassandra/distributed/upgrade/UpgradeTestBase.jav