[jira] [Comment Edited] (CASSANDRA-19448) CommitlogArchiver only has granularity to seconds for restore_point_in_time
[ https://issues.apache.org/jira/browse/CASSANDRA-19448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17822210#comment-17822210 ] Jeremy Hanna edited comment on CASSANDRA-19448 at 2/29/24 4:28 PM: --- It's currently unassigned so feel free to take a look. Thanks! I've thought about it as a bug just because Cassandra stores update times as milliseconds or microseconds and there is nothing in the description that says that you can't use that granularity. It's just that the example is in seconds. It's not clear, and there's no warning or error if you give it something with a granularity finer than seconds; it just ignores it. What to do about that could be either to: # be clearer in the docs and have a warning/error when users try to use a granularity finer than seconds. # make it respect finer granularities, which aligns better with the C* write timestamp formats. I think 2 is the better outcome. So I think it could be argued as a bug or an improvement. [~brandon.williams] do you have any thoughts on bug or improvement designation? was (Author: jeromatron): It's currently unassigned so feel free to take a look. Thanks! I've thought about it as a bug just because Cassandra stores update times as milliseconds or microseconds and there is nothing in the description that says that you can't use that granularity. It's just that the example is in seconds. Since it's not clear and there's no warning or error if you give it something with a granularity finer than seconds, it just ignores it. What to do about that could be either to: # be clearer in the docs and have a warning/error when users try to use a granularity finer than seconds. # make it respect finer granularities, which aligns better with the C* write timestamp formats. I think 2 is the better outcome. So I think it could be argued as a bug or an improvement. [~brandon.williams] do you have any thoughts on bug or improvement designation? 
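Option 2 could, for instance, be built on java.time instead of SimpleDateFormat. The sketch below is illustrative only (not the actual patch, and the class name is made up): a formatter that accepts restore_point_in_time with or without a fractional-seconds part, preserving up to microsecond precision while still accepting existing seconds-only values.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

public class FlexibleRestoreTime {
    // Sketch only: same base pattern as the current restore_point_in_time
    // format, plus an optional fraction of up to six digits (microseconds).
    static final DateTimeFormatter FORMAT = new DateTimeFormatterBuilder()
            .appendPattern("yyyy:MM:dd HH:mm:ss")
            .optionalStart()
            .appendFraction(ChronoField.NANO_OF_SECOND, 0, 6, true)
            .optionalEnd()
            .toFormatter();

    public static void main(String[] args) {
        LocalDateTime t = LocalDateTime.parse("2024:01:18 17:01:01.623392", FORMAT);
        // The fractional part is preserved instead of being silently dropped
        System.out.println(t.getNano()); // 623392000
        // Seconds-only input still parses, so existing configs keep working
        System.out.println(LocalDateTime.parse("2024:01:18 17:01:01", FORMAT));
    }
}
```

Because the fraction is optional, this direction would not break users who already specify seconds-granularity restore points.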
> CommitlogArchiver only has granularity to seconds for restore_point_in_time > --- > > Key: CASSANDRA-19448 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19448 > Project: Cassandra > Issue Type: Bug > Components: Local/Commit Log >Reporter: Jeremy Hanna >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x > > > Commitlog archiver allows users to back up commitlog files for the purpose of > doing point-in-time restores. The [configuration > file|https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties] > gives an example down to the seconds granularity but then asks > whether the timestamps are microseconds or milliseconds - defaulting to > microseconds. Because the [CommitLogArchiver uses a second-based date > format|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java#L52], > if a user specifies a restore point at a finer granularity like > milliseconds or microseconds, it will truncate everything > after the second and restore to that second. So say you specify a > restore_point_in_time like this: > restore_point_in_time=2024:01:18 17:01:01.623392 > it will silently truncate everything after the 01 seconds. So effectively to > the user, it is missing updates between 01 and 01.623392. > This appears to be a bug in the intent. We should allow users to specify > down to the millisecond or even microsecond level. If we allow them to > specify down to microseconds for the restore point in time, then it may > internally need to change from a long. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
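The silent truncation described above is easy to reproduce with a seconds-only SimpleDateFormat pattern like the one the archiver uses. This is a standalone sketch, not the Cassandra code itself:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class RestorePointTruncation {
    public static void main(String[] args) throws ParseException {
        // Seconds-granularity pattern, matching the restore_point_in_time example
        SimpleDateFormat format = new SimpleDateFormat("yyyy:MM:dd HH:mm:ss");
        Date restorePoint = format.parse("2024:01:18 17:01:01.623392");
        // parse() stops after the seconds field: ".623392" is silently ignored,
        // so any updates in that sub-second window fall outside the restore.
        System.out.println(format.format(restorePoint)); // 2024:01:18 17:01:01
        System.out.println(restorePoint.getTime() % 1000); // 0
    }
}
```

No exception or warning is raised for the extra input, which is exactly why the user only discovers the truncation by noticing missing updates.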
[jira] [Commented] (CASSANDRA-19448) CommitlogArchiver only has granularity to seconds for restore_point_in_time
[ https://issues.apache.org/jira/browse/CASSANDRA-19448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17822210#comment-17822210 ] Jeremy Hanna commented on CASSANDRA-19448: -- It's currently unassigned so feel free to take a look. Thanks! I've thought about it as a bug just because Cassandra stores update times as milliseconds or microseconds and there is nothing in the description that says that you can't use that granularity. It's just that the example is in seconds. It's not clear, and there's no warning or error if you give it something with a granularity finer than seconds; it just ignores it. What to do about that could be either to: # be clearer in the docs and have a warning/error when users try to use a granularity finer than seconds. # make it respect finer granularities, which aligns better with the C* write timestamp formats. I think 2 is the better outcome. So I think it could be argued as a bug or an improvement. [~brandon.williams] do you have any thoughts on bug or improvement designation? > CommitlogArchiver only has granularity to seconds for restore_point_in_time > --- > > Key: CASSANDRA-19448 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19448 > Project: Cassandra > Issue Type: Bug > Components: Local/Commit Log >Reporter: Jeremy Hanna >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x > > > Commitlog archiver allows users to back up commitlog files for the purpose of > doing point-in-time restores. The [configuration > file|https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties] > gives an example down to the seconds granularity but then asks > whether the timestamps are microseconds or milliseconds - defaulting to > microseconds. 
Because the [CommitLogArchiver uses a second-based date > format|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java#L52], > if a user specifies a restore point at a finer granularity like > milliseconds or microseconds, it will truncate everything > after the second and restore to that second. So say you specify a > restore_point_in_time like this: > restore_point_in_time=2024:01:18 17:01:01.623392 > it will silently truncate everything after the 01 seconds. So effectively to > the user, it is missing updates between 01 and 01.623392. > This appears to be a bug in the intent. We should allow users to specify > down to the millisecond or even microsecond level. If we allow them to > specify down to microseconds for the restore point in time, then it may > internally need to change from a long.
[jira] [Updated] (CASSANDRA-19448) CommitlogArchiver only has granularity to seconds for restore_point_in_time
[ https://issues.apache.org/jira/browse/CASSANDRA-19448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-19448: - Description: Commitlog archiver allows users to back up commitlog files for the purpose of doing point-in-time restores. The [configuration file|https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties] gives an example down to the seconds granularity but then asks whether the timestamps are microseconds or milliseconds - defaulting to microseconds. Because the [CommitLogArchiver uses a second-based date format|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java#L52], if a user specifies a restore point at a finer granularity like milliseconds or microseconds, it will truncate everything after the second and restore to that second. So say you specify a restore_point_in_time like this: restore_point_in_time=2024:01:18 17:01:01.623392 it will silently truncate everything after the 01 seconds. So effectively to the user, it is missing updates between 01 and 01.623392. This appears to be a bug in the intent. We should allow users to specify down to the millisecond or even microsecond level. If we allow them to specify down to microseconds for the restore point in time, then it may internally need to change from a long. was: Commitlog archiver allows users to back up commitlog files for the purpose of doing point-in-time restores. The [configuration file|https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties] gives an example down to the seconds granularity but then asks whether the timestamps are microseconds or milliseconds - defaulting to microseconds. 
Because the [CommitLogArchiver uses a second-based date format|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java#L52], if a user specifies a restore point at a finer granularity like milliseconds or microseconds, it will truncate everything after the second and restore to that second. So say you specify a restore_point_in_time like this: restore_point_in_time=2024:01:18 17:01:01.623392 it will silently truncate everything after the 01 seconds. So effectively to the user, it is missing updates between 01 and 01.623392. This appears to be a bug in the intent. We should allow users to specify down to the millisecond or even microsecond level. If we allow them to specify down to microseconds for the restore point in time, then it may internally need to change from a long. > CommitlogArchiver only has granularity to seconds for restore_point_in_time > --- > > Key: CASSANDRA-19448 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19448 > Project: Cassandra > Issue Type: Bug > Components: Local/Commit Log >Reporter: Jeremy Hanna >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x > > > Commitlog archiver allows users to back up commitlog files for the purpose of > doing point-in-time restores. The [configuration > file|https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties] > gives an example down to the seconds granularity but then asks > whether the timestamps are microseconds or milliseconds - defaulting to > microseconds. Because the [CommitLogArchiver uses a second-based date > format|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java#L52], > if a user specifies a restore point at a finer granularity like > milliseconds or microseconds, it will truncate everything > after the second and restore to that second. 
So say you specify a > restore_point_in_time like this: > restore_point_in_time=2024:01:18 17:01:01.623392 > it will silently truncate everything after the 01 seconds. So effectively to > the user, it is missing updates between 01 and 01.623392. > This appears to be a bug in the intent. We should allow users to specify > down to the millisecond or even microsecond level. If we allow them to > specify down to microseconds for the restore point in time, then it may > internally need to change from a long.
[jira] [Created] (CASSANDRA-19448) CommitlogArchiver only has granularity to seconds for restore_point_in_time
Jeremy Hanna created CASSANDRA-19448: Summary: CommitlogArchiver only has granularity to seconds for restore_point_in_time Key: CASSANDRA-19448 URL: https://issues.apache.org/jira/browse/CASSANDRA-19448 Project: Cassandra Issue Type: Bug Reporter: Jeremy Hanna Commitlog archiver allows users to back up commitlog files for the purpose of doing point-in-time restores. The [configuration file|https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties] gives an example down to the seconds granularity but then asks whether the timestamps are microseconds or milliseconds - defaulting to microseconds. Because the [CommitLogArchiver uses a second-based date format|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java#L52], if a user specifies a restore point at a finer granularity like milliseconds or microseconds, it will truncate everything after the second and restore to that second. So say you specify a restore_point_in_time like this: restore_point_in_time=2024:01:18 17:01:01.623392 it will silently truncate everything after the 01 seconds. So effectively to the user, it is missing updates between 01 and 01.623392. This appears to be a bug in the intent. We should allow users to specify down to the millisecond or even microsecond level. If we allow them to specify down to microseconds for the restore point in time, then it may internally need to change from a long.
[jira] [Created] (CASSANDRA-19362) An "include" is broken on the Storage Engine documentation page
Jeremy Hanna created CASSANDRA-19362: Summary: An "include" is broken on the Storage Engine documentation page Key: CASSANDRA-19362 URL: https://issues.apache.org/jira/browse/CASSANDRA-19362 Project: Cassandra Issue Type: Bug Components: Documentation Reporter: Jeremy Hanna The example code at the bottom of the "Storage Engine" page doesn't appear to be including the code properly. See https://cassandra.apache.org/doc/stable/cassandra/architecture/storage_engine.html#example-code
[jira] [Commented] (CASSANDRA-9328) WriteTimeoutException thrown when LWT concurrency > 1, despite the query duration taking MUCH less than cas_contention_timeout_in_ms
[ https://issues.apache.org/jira/browse/CASSANDRA-9328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17812359#comment-17812359 ] Jeremy Hanna commented on CASSANDRA-9328: - See CASSANDRA-15350 for a separated out exception type in native protocol v5. > WriteTimeoutException thrown when LWT concurrency > 1, despite the query > duration taking MUCH less than cas_contention_timeout_in_ms > > > Key: CASSANDRA-9328 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9328 > Project: Cassandra > Issue Type: Bug > Components: Feature/Lightweight Transactions, Legacy/Coordination >Reporter: Aaron Whiteside >Priority: Normal > Labels: LWT > Attachments: CassandraLWTTest.java, CassandraLWTTest2.java > > > WriteTimeoutException thrown when LWT concurrency > 1, despite the query > duration taking MUCH less than cas_contention_timeout_in_ms. > Unit test attached, run against a 3 node cluster running 2.1.5. > If you reduce the threadCount to 1, you never see a WriteTimeoutException. If > the WTE is due to not being able to communicate with other nodes, why does > the concurrency >1 cause inter-node communication to fail?
[jira] [Commented] (CASSANDRA-8110) Make streaming forward & backwards compatible
[ https://issues.apache.org/jira/browse/CASSANDRA-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17770086#comment-17770086 ] Jeremy Hanna commented on CASSANDRA-8110: - [~yukim] is this done with what [~Bereng] did on CASSANDRA-14227? Specifically, the {{storage_compatibility_mode}} as described in the latest cassandra.yaml ([here|https://github.com/apache/cassandra/blob/trunk/conf/cassandra.yaml#L2090-L2115] is the block from cassandra.yaml): {code:java} # This property indicates what Cassandra major version the storage format will be compatible with. # # The chosen storage compatibility mode will determine the versions of the written sstables, commitlogs, hints, # etc. Those storage elements will use the higher minor versions of the major version that corresponds to the # Cassandra version we want to stay compatible with. For example, if we want to stay compatible with Cassandra 4.0 # or 4.1, the value of this property should be 4, and that will make us use 'nc' sstables. # # This will also determine if certain features depending on newer formats are available. For example, extended TTLs # up to 2106 depend on the sstable, commitlog, hints and messaging versions that were introduced by Cassandra 5.0, # so that feature won't be available if this property is set to CASSANDRA_4. See upgrade guides for details. Currently # the only supported major is CASSANDRA_4. # # Possible values are in the StorageCompatibilityMode.java file accessible online. At the time of writing these are: # - CASSANDRA_4: Stays compatible with the 4.x line in features, formats and component versions. # - UPGRADING: The cluster monitors node versions during this interim stage. _This has a cost_ but ensures any new features, # formats, versions, etc are enabled safely. # - NONE: Start with all the new features and formats enabled. # # A typical upgrade would be: # - Do a rolling upgrade starting all nodes in CASSANDRA_Y compatibility mode. 
# - Once the new binary is rendered stable do a rolling restart with UPGRADING. The cluster will enable new features in a safe way # until all nodes are started in UPGRADING, then all new features are enabled. # - Do a rolling restart with all nodes starting with NONE. This sheds the extra cost of checking node versions and ensures # a stable cluster. If a node from a previous version is started by accident, we will no longer toggle behaviors as when UPGRADING. # storage_compatibility_mode: CASSANDRA_4 {code} > Make streaming forward & backwards compatible > - > > Key: CASSANDRA-8110 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8110 > Project: Cassandra > Issue Type: New Feature > Components: Legacy/Streaming and Messaging >Reporter: Marcus Eriksson >Priority: Normal > Labels: gsoc2016, mentor > > To be able to seamlessly upgrade clusters we need to make it possible to > stream files between nodes with different StreamMessage.CURRENT_VERSION
[jira] [Updated] (CASSANDRA-18837) Tab complete datacenter values in cqlsh
[ https://issues.apache.org/jira/browse/CASSANDRA-18837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-18837: - Complexity: Low Hanging Fruit > Tab complete datacenter values in cqlsh > --- > > Key: CASSANDRA-18837 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18837 > Project: Cassandra > Issue Type: Task > Components: CQL/Interpreter >Reporter: Jeremy Hanna >Priority: Normal > > cqlsh has a number of great tab completions. For example, when creating a > keyspace it will tab complete the syntax for options and give you options for > the replication strategy. It doesn't show options for the data centers, > which would be nice to have. The server has access to the list of data > centers in the cluster. So there shouldn't be a reason why that couldn't tab > complete.
[jira] [Created] (CASSANDRA-18837) Tab complete datacenter values in cqlsh
Jeremy Hanna created CASSANDRA-18837: Summary: Tab complete datacenter values in cqlsh Key: CASSANDRA-18837 URL: https://issues.apache.org/jira/browse/CASSANDRA-18837 Project: Cassandra Issue Type: Task Components: CQL/Interpreter Reporter: Jeremy Hanna cqlsh has a number of great tab completions. For example, when creating a keyspace it will tab complete the syntax for options and give you options for the replication strategy. It doesn't show options for the data centers, which would be nice to have. The server has access to the list of data centers in the cluster. So there shouldn't be a reason why that couldn't tab complete.
[jira] [Updated] (CASSANDRA-18473) Storage Attached Indexes (Phase 2)
[ https://issues.apache.org/jira/browse/CASSANDRA-18473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-18473: - Labels: SAI (was: ) > Storage Attached Indexes (Phase 2) > -- > > Key: CASSANDRA-18473 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18473 > Project: Cassandra > Issue Type: Epic > Components: Feature/2i Index >Reporter: Caleb Rackliffe >Assignee: Caleb Rackliffe >Priority: Normal > Labels: SAI > > At the completion of CASSANDRA-16052, we should be able to release the core > capabilities of SAI in a stable, production-ready package. Once that begins > to gain traction, we'll be able to make improvements and add features for the > next major release. The major initial theme of this epic is likely to be > performance, but it will likely expand to include features like basic text > analysis, etc.
[jira] [Commented] (CASSANDRA-8110) Make streaming forward & backwards compatible
[ https://issues.apache.org/jira/browse/CASSANDRA-8110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17754185#comment-17754185 ] Jeremy Hanna commented on CASSANDRA-8110: - Is there any update on the approach or implementation, as we're getting closer to finalizing 5.0? > Make streaming forward & backwards compatible > - > > Key: CASSANDRA-8110 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8110 > Project: Cassandra > Issue Type: New Feature > Components: Legacy/Streaming and Messaging >Reporter: Marcus Eriksson >Priority: Normal > Labels: gsoc2016, mentor > > To be able to seamlessly upgrade clusters we need to make it possible to > stream files between nodes with different StreamMessage.CURRENT_VERSION
[jira] [Created] (CASSANDRA-18269) Update the client drivers list
Jeremy Hanna created CASSANDRA-18269: Summary: Update the client drivers list Key: CASSANDRA-18269 URL: https://issues.apache.org/jira/browse/CASSANDRA-18269 Project: Cassandra Issue Type: Task Components: Documentation Reporter: Jeremy Hanna Currently, the docs have a page that lists client drivers by language. It's got a lot of entries that, on further investigation, haven't been updated in several years. It would be good to either indicate the activity on the driver/project or remove the older ones so that people don't get the wrong impression and use something that won't serve them well. https://cassandra.apache.org/doc/latest/cassandra/getting_started/drivers.html
[jira] [Comment Edited] (CASSANDRA-11721) Have a per operation truncate ddl "no snapshot" option
[ https://issues.apache.org/jira/browse/CASSANDRA-11721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17644885#comment-17644885 ] Jeremy Hanna edited comment on CASSANDRA-11721 at 12/8/22 4:06 PM: --- I think CASSANDRA-10383 solves the production use cases for this and I'm very happy that it got implemented there. There are cases in test and dev environments where I could still see a per operation setting being useful, but the majority of the use cases are covered by a table level setting. I'm happy to "won't do" this one as updating CQL is a pain for just those use cases. was (Author: jeromatron): I think CASSANDRA-10383 solves the production use cases for this and I'm very happy that it got implemented there. There are cases in test and dev environments where I could still see a per operation setting being useful, but the majority of the use cases are covered by a table level setting. I'm happy to "won't fix" this one as updating CQL is a pain for just those use cases. > Have a per operation truncate ddl "no snapshot" option > -- > > Key: CASSANDRA-11721 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11721 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/CQL, Local/Snapshots >Reporter: Jeremy Hanna >Priority: Low > Labels: AdventCalendar2021 > > Right now with truncate, it will always create a snapshot. That is the right > thing to do most of the time. 'auto_snapshot' exists as an option to disable > that but it is server wide and requires a restart to change. There are data > models, however, that require rotating through a handful of tables and > periodically truncating them. Currently you either have to operate with no > safety net (some actually do this) or manually clear those snapshots out > periodically. Both are less than optimal. > In HDFS, when you delete something it generally goes to the trash. 
If you > don't want that safety net, you can do something like 'rm -rf -skiptrash > /jeremy/stuff' in one command. > It would be nice to have something in the truncate ddl to skip the snapshot > on a per operation basis. Perhaps 'TRUNCATE solarsystem.earth NO SNAPSHOT'. > This might also be useful in those situations where you're just playing with > data and you don't want something to take a snapshot in a development system. > If that's the case, this would also be useful for the DROP operation, but > that convenience is not the main reason for this option. > +Additional information for newcomers:+ > This test is a bit more complex than normal LHF tickets but is still > reasonably easy. > The idea is to support disabling snapshots when performing a Truncate as > follows: > {code}TRUNCATE x WITH OPTIONS = { 'snapshot' : false }{code} > In order to implement that feature several changes are required: > * A new class {{TruncateAttributes}} inheriting from {{PropertyDefinitions}} > must be created in a similar way to {{KeyspaceAttributes}} or > {{TableAttributes}} > * This class should be passed to the {{TruncateStatement}} constructor and > stored as a field > * The ANTLR parser logic should be changed to retrieve the options and pass > them to the constructor (see {{createKeyspaceStatement}} for an example) > * The {{TruncateStatement}} will then need to be modified to take into > account the new option. Locally it will need to call > {{ColumnFamilyStore#truncateBlockingWithoutSnapshot}} if no snapshot should > be done instead of {{ColumnFamilyStore#truncateBlocking}}. For non-local > calls it will need to pass a new parameter to > {{StorageProxy#truncateBlocking}}. That parameter will then need to be passed > to the other nodes through the {{TruncateRequest}}. 
> * As a new field needs to be added to {{TruncateRequest}}, this field will need > to be serialized and deserialized, and a new {{MessagingService.Version}} will > need to be created and set as the current version; the new version should be > 50 (and yes, it means that the next release will be a major one, 5.0) > * In {{TruncateVerbHandler}} the new field should be used to determine if > {{ColumnFamilyStore#truncateBlockingWithoutSnapshot}} or > {{ColumnFamilyStore#truncateBlocking}} should be called. > * An in-jvm test should be added in > {{test/distributed/org/apache/cassandra/distributed/test}} to test that > truncate does not generate snapshots when the new option is specified. > Do not hesitate to ping the mentor for more information.
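As a toy illustration of the options-map handling described in the steps above (the class and method here are hypothetical, not Cassandra's actual {{TruncateAttributes}} API), the snapshot flag would sensibly default to true so the safety net stays on unless the user explicitly opts out:

```java
import java.util.Map;

public class TruncateOptionsSketch {
    // Hypothetical helper: mirrors how a 'snapshot' option from
    // TRUNCATE x WITH OPTIONS = { 'snapshot' : false } might be read.
    public static boolean snapshotEnabled(Map<String, String> options) {
        // Default to true: truncate keeps its snapshot safety net
        // unless the user explicitly disables it.
        return Boolean.parseBoolean(options.getOrDefault("snapshot", "true"));
    }

    public static void main(String[] args) {
        System.out.println(snapshotEnabled(Map.of()));                    // true
        System.out.println(snapshotEnabled(Map.of("snapshot", "false"))); // false
    }
}
```

Defaulting to the safe behavior matches the existing 'auto_snapshot' semantics, where skipping the snapshot is the exceptional case.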
[jira] [Updated] (CASSANDRA-11721) Have a per operation truncate ddl "no snapshot" option
[ https://issues.apache.org/jira/browse/CASSANDRA-11721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-11721: - Resolution: Won't Do Status: Resolved (was: Open) As discussed previously, CASSANDRA-10383 solves the majority of what this covers. > Have a per operation truncate ddl "no snapshot" option > -- > > Key: CASSANDRA-11721 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11721 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/CQL, Local/Snapshots >Reporter: Jeremy Hanna >Priority: Low > Labels: AdventCalendar2021 > > Right now with truncate, it will always create a snapshot. That is the right > thing to do most of the time. 'auto_snapshot' exists as an option to disable > that but it is server wide and requires a restart to change. There are data > models, however, that require rotating through a handful of tables and > periodically truncating them. Currently you either have to operate with no > safety net (some actually do this) or manually clear those snapshots out > periodically. Both are less than optimal. > In HDFS, when you delete something it generally goes to the trash. If you > don't want that safety net, you can do something like 'rm -rf -skiptrash > /jeremy/stuff' in one command. > It would be nice to have something in the truncate ddl to skip the snapshot > on a per operation basis. Perhaps 'TRUNCATE solarsystem.earth NO SNAPSHOT'. > This might also be useful in those situations where you're just playing with > data and you don't want something to take a snapshot in a development system. > If that's the case, this would also be useful for the DROP operation, but > that convenience is not the main reason for this option. > +Additional information for newcomers:+ > This test is a bit more complex than normal LHF tickets but is still > reasonably easy. 
> The idea is to support disabling snapshots when performing a Truncate as > follows: > {code}TRUNCATE x WITH OPTIONS = { 'snapshot' : false }{code} > In order to implement that feature several changes are required: > * A new class {{TruncateAttributes}} inheriting from {{PropertyDefinitions}} > must be created in a similar way to {{KeyspaceAttributes}} or > {{TableAttributes}} > * This class should be passed to the {{TruncateStatement}} constructor and > stored as a field > * The ANTLR parser logic should be changed to retrieve the options and pass > them to the constructor (see {{createKeyspaceStatement}} for an example) > * The {{TruncateStatement}} will then need to be modified to take into > account the new option. Locally it will need to call > {{ColumnFamilyStore#truncateBlockingWithoutSnapshot}} if no snapshot should > be done instead of {{ColumnFamilyStore#truncateBlocking}}. For non-local > calls it will need to pass a new parameter to > {{StorageProxy#truncateBlocking}}. That parameter will then need to be passed > to the other nodes through the {{TruncateRequest}}. > * As a new field needs to be added to {{TruncateRequest}}, this field will need > to be serialized and deserialized, and a new {{MessagingService.Version}} will > need to be created and set as the current version; the new version should be > 50 (and yes, it means that the next release will be a major one, 5.0) > * In {{TruncateVerbHandler}} the new field should be used to determine if > {{ColumnFamilyStore#truncateBlockingWithoutSnapshot}} or > {{ColumnFamilyStore#truncateBlocking}} should be called. > * An in-jvm test should be added in > {{test/distributed/org/apache/cassandra/distributed/test}} to test that > truncate does not generate snapshots when the new option is specified. > Do not hesitate to ping the mentor for more information. 
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-11721) Have a per operation truncate ddl "no snapshot" option
[ https://issues.apache.org/jira/browse/CASSANDRA-11721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17644885#comment-17644885 ] Jeremy Hanna commented on CASSANDRA-11721: -- I think CASSANDRA-10383 solves the production use cases for this and I'm very happy that it got implemented there. There are cases in test and dev environments where I could still see a per operation setting being useful, but the majority of the use cases are covered by a table level setting. I'm happy to "won't fix" this one as updating CQL is a pain for just those use cases. > Have a per operation truncate ddl "no snapshot" option > -- > > Key: CASSANDRA-11721 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11721 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/CQL, Local/Snapshots >Reporter: Jeremy Hanna >Priority: Low > Labels: AdventCalendar2021 > > Right now with truncate, it will always create a snapshot. That is the right > thing to do most of the time. 'auto_snapshot' exists as an option to disable > that but it is server wide and requires a restart to change. There are data > models, however, that require rotating through a handful of tables and > periodically truncating them. Currently you either have to operate with no > safety net (some actually do this) or manually clear those snapshots out > periodically. Both are less than optimal. > In HDFS, you generally delete something where it goes to the trash. If you > don't want that safety net, you can do something like 'rm -rf -skiptrash > /jeremy/stuff' in one command. > It would be nice to have something in the truncate ddl to skip the snapshot > on a per operation basis. Perhaps 'TRUNCATE solarsystem.earth NO SNAPSHOT'. > This might also be useful in those situations where you're just playing with > data and you don't want something to take a snapshot in a development system. 
> If that's the case, this would also be useful for the DROP operation, but > that convenience is not the main reason for this option. > +Additional information for newcomers:+ > This task is a bit more complex than normal LHF tickets but is still > reasonably easy. > The idea is to support disabling snapshots when performing a Truncate as > follows: > {code}TRUNCATE x WITH OPTIONS = { 'snapshot' : false }{code} > In order to implement that feature, several changes are required: > * A new class {{TruncateAttributes}} inheriting from {{PropertyDefinitions}} > must be created in a similar way to {{KeyspaceAttributes}} or > {{TableAttributes}} > * This class should be passed to the {{TruncateStatement}} constructor and > stored as a field > * The ANTLR parser logic should be changed to retrieve the options and pass > them to the constructor (see {{createKeyspaceStatement}} for an example) > * The {{TruncateStatement}} will then need to be modified to take into > account the new option. Locally it will need to call > {{ColumnFamilyStore#truncateBlockingWithoutSnapshot}} if no snapshot should > be taken instead of {{ColumnFamilyStore#truncateBlocking}}. For a non-local > call it will need to pass a new parameter to > {{StorageProxy#truncateBlocking}}. That parameter will then need to be passed > to the other nodes through the {{TruncateRequest}}. > * As a new field needs to be added to {{TruncateRequest}}, this field will need > to be serialized and deserialized, and a new {{MessagingService.Version}} will > need to be created and set as the current version; the new version should be > 50 (and yes, it means that the next release will be a major one, 5.0) > * In {{TruncateVerbHandler}} the new field should be used to determine whether > {{ColumnFamilyStore#truncateBlockingWithoutSnapshot}} or > {{ColumnFamilyStore#truncateBlocking}} should be called. 
> * An in-jvm test should be added in > {{test/distributed/org/apache/cassandra/distributed/test}} to test that > truncate does not generate snapshots when the new option is specified. > Do not hesitate to ping the mentor for more information. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-17352) CVE-2021-44521: Apache Cassandra: Remote code execution for scripted UDFs
[ https://issues.apache.org/jira/browse/CASSANDRA-17352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17610206#comment-17610206 ] Jeremy Hanna commented on CASSANDRA-17352: -- [~marcuse] do you have any thoughts on the flags that were used here? Am I misunderstanding intent of having two flags? > CVE-2021-44521: Apache Cassandra: Remote code execution for scripted UDFs > - > > Key: CASSANDRA-17352 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17352 > Project: Cassandra > Issue Type: Bug > Components: Feature/UDF >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Normal > Fix For: 3.0.26, 3.11.12, 4.0.2 > > > When running Apache Cassandra with the following configuration: > enable_user_defined_functions: true > enable_scripted_user_defined_functions: true > enable_user_defined_functions_threads: false > it is possible for an attacker to execute arbitrary code on the host. The > attacker would need to have enough permissions to create user defined > functions in the cluster to be able to exploit this. Note that this > configuration is documented as unsafe, and will continue to be considered > unsafe after this CVE. > This issue is being tracked as CASSANDRA-17352 > Mitigation: > Set `enable_user_defined_functions_threads: true` (this is default) > or > 3.0 users should upgrade to 3.0.26 > 3.11 users should upgrade to 3.11.12 > 4.0 users should upgrade to 4.0.2 > Credit: > This issue was discovered by Omer Kaspi of the JFrog Security vulnerability > research team. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-17352) CVE-2021-44521: Apache Cassandra: Remote code execution for scripted UDFs
[ https://issues.apache.org/jira/browse/CASSANDRA-17352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17601997#comment-17601997 ] Jeremy Hanna commented on CASSANDRA-17352: -- I just want to make sure the settings have the practical outcomes that are intended. I can use UDFs with just the following setting: {{enable_user_defined_functions: true}} However, if I want to enable multi-threaded behavior in the UDFs, I would need to set: {{enable_user_defined_functions: true}} {{enable_user_defined_functions_threads: false}} {{allow_insecure_udfs: true}} If I don't do the last one, {{allow_insecure_udfs: true}}, then the server doesn't start and it gives the warning/recommendation but also says that it would require that field to be set to true to continue. Once these fields are set, I can start the server (in my case 3.11.13). However, according to the [code|https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/security/ThreadAwareSecurityManager.java#L186], it looks like the {{allow_extra_insecure_udfs}} setting should also be set to true for the server to start up. Otherwise it should throw an AccessDenied exception. So my question is: is there a bug in the implementation where we allow it to start without setting {{allow_extra_insecure_udfs: true}}? Also, if it does throw an AccessDenied exception, shouldn't it fail earlier when parsing the configuration with a log message that it is required? That leads to another question: if it does require both flags to start the server, why do we have two flags? Why not just {{allow_insecure_udfs}}, if there is no effective difference between setting {{allow_insecure_udfs}} and setting both of them? 
I know the intent from the ticket was that the {{allow_extra_insecure_udfs}} was to further relax security for those wanting to use the java.lang.System package in the UDF, but the line of code from the ThreadAwareSecurityManager seems to suggest that there is no difference. > CVE-2021-44521: Apache Cassandra: Remote code execution for scripted UDFs > - > > Key: CASSANDRA-17352 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17352 > Project: Cassandra > Issue Type: Bug > Components: Feature/UDF >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Normal > Fix For: 3.0.26, 3.11.12, 4.0.2 > > > When running Apache Cassandra with the following configuration: > enable_user_defined_functions: true > enable_scripted_user_defined_functions: true > enable_user_defined_functions_threads: false > it is possible for an attacker to execute arbitrary code on the host. The > attacker would need to have enough permissions to create user defined > functions in the cluster to be able to exploit this. Note that this > configuration is documented as unsafe, and will continue to be considered > unsafe after this CVE. > This issue is being tracked as CASSANDRA-17352 > Mitigation: > Set `enable_user_defined_functions_threads: true` (this is default) > or > 3.0 users should upgrade to 3.0.26 > 3.11 users should upgrade to 3.11.12 > 4.0 users should upgrade to 4.0.2 > Credit: > This issue was discovered by Omer Kaspi of the JFrog Security vulnerability > research team. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
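The gating the comment reasons about can be written out as a small decision function. This is an illustrative sketch of the commenter's reading of the startup checks (function name and return strings are invented; it is not the actual Cassandra configuration or ThreadAwareSecurityManager code), showing where the two flags would each come into play if both were enforced:

```python
# Sketch of the flag gating under discussion; assumptions, not actual behavior.
def check_udf_startup(conf):
    """Model the startup checks for UDF-related flags as the comment reads them."""
    if not conf.get("enable_user_defined_functions", False):
        return "udfs disabled"
    if conf.get("enable_user_defined_functions_threads", True):
        # Default, safe path: UDFs run in isolated threads.
        return "ok: udfs run in isolated threads"
    # Threads disabled: the config check requires allow_insecure_udfs...
    if not conf.get("allow_insecure_udfs", False):
        raise RuntimeError("cannot start: allow_insecure_udfs must be true")
    # ...while the security-manager reading suggests this flag is also required,
    # which is the apparent redundancy the comment is asking about.
    if not conf.get("allow_extra_insecure_udfs", False):
        raise PermissionError("AccessDenied: allow_extra_insecure_udfs must be true")
    return "ok: insecure udfs"
```

If both branches really are required to start, the two flags collapse into one effective switch, which is exactly the question posed to the ticket authors.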
[jira] [Commented] (CASSANDRA-15803) Separate out allow filtering scanning through a partition versus scanning over the table
[ https://issues.apache.org/jira/browse/CASSANDRA-15803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17568052#comment-17568052 ] Jeremy Hanna commented on CASSANDRA-15803: -- I could see this getting added to the guardrails framework - separating out cluster scanning from partition scanning as two separate guardrails. > Separate out allow filtering scanning through a partition versus scanning > over the table > > > Key: CASSANDRA-15803 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15803 > Project: Cassandra > Issue Type: Improvement > Components: CQL/Syntax >Reporter: Jeremy Hanna >Priority: Normal > > Currently allow filtering can mean two things in the spirit of "avoid > operations that don't seek to a specific row or sequential rows of data." > First, it can mean scanning across the entire table to meet the criteria of > the query. That's almost always a bad thing and should be discouraged or > disabled (see CASSANDRA-8303). Second, it can mean filtering within a > specific partition. For example, in a query you could specify the full > partition key and if you specify a criterion on a non-key field, it requires > allow filtering. > The second reason to require allow filtering is significantly less work to > scan through a partition. It is still extra work over seeking to a specific > row and getting N sequential rows though. So while an application developer > and/or operator needs to be cautious about this second type, it's not > necessarily a bad thing, depending on the table and the use case. > I propose that we separate the way to specify allow filtering across an > entire table from specifying allow filtering across a partition in a > backwards compatible way. One idea that was brought up in Slack in the > cassandra-dev room was to have allow filtering mean the superset - scanning > across the table. 
Then if you want to specify that you *only* want to scan > within a partition you would use something like > {{ALLOW FILTERING [WITHIN PARTITION]}} > So it will succeed if you specify non-key criteria within a single partition, > but fail with a message to say it requires the full allow filtering. This > would allow for a backwards compatible full allow filtering while allowing a > user to specify that they want to just scan within a partition, but error out > if trying to scan a full table. > This is potentially also related to the capability limitation framework by > which operators could more granularly specify what features are allowed or > disallowed per user, discussed in CASSANDRA-8303. This way an operator could > disallow the more general allow filtering while allowing the partition scan > (or disallow them both at their discretion). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
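The proposed superset/subset semantics can be sketched as a validation rule. This is a hypothetical illustration of the idea in the ticket (the function, its arguments, and the rejection messages are invented): plain {{ALLOW FILTERING}} permits everything, while {{ALLOW FILTERING WITHIN PARTITION}} only permits filtering when the full partition key is bound, and errors out on a would-be table scan.

```python
# Illustrative sketch of the proposed ALLOW FILTERING semantics; not real CQL validation.
def validate_filtering(full_partition_key_bound, clause=None):
    """Decide whether a query with a non-key predicate is allowed to execute.

    clause is None, "ALLOW FILTERING", or "ALLOW FILTERING WITHIN PARTITION".
    """
    if clause == "ALLOW FILTERING":
        return "allowed"  # superset: table scan or partition scan
    if clause == "ALLOW FILTERING WITHIN PARTITION":
        if full_partition_key_bound:
            return "allowed"
        # Backwards-compatible failure mode described in the ticket.
        return "rejected: full-table scan requires plain ALLOW FILTERING"
    return "rejected: query requires ALLOW FILTERING"
```

This keeps existing queries working unchanged while letting an application opt into the cheaper, bounded form and fail fast if it accidentally requires a table scan.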
[jira] [Updated] (CASSANDRA-17707) Clarify intent when replaying hint files "partially"
[ https://issues.apache.org/jira/browse/CASSANDRA-17707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-17707: - Description: As part of CASSANDRA-6230, hints were redesigned to come from files. As part of this, we log when the hint files are dispatched. See https://github.com/apache/cassandra/blob/cassandra-4.1/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java#L318 {code} logger.info("Finished hinted handoff of file {} to endpoint {}: {}, partially", descriptor.fileName(), address, hostId); {code} This has caused some confusion among some users who wonder whether their files were only partially replayed and whether data is consistent. This ticket is to clarify in the log statement itself or document in the official docs what is meant by {{partially}}. My understanding is that it's really that sometimes when shutting down, all of the file metadata isn't written so it replays the file anyway. Is that right? I wasn't sure about the dispatch failure and what that means in practice. CC [~aleksey] was: As part of CASSANDRA-6230, hints were redesigned to come from files. As part of this, we log when the hint files are dispatched. See https://github.com/apache/cassandra/blob/cassandra-4.1/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java#L318 {code} logger.info("Finished hinted handoff of file {} to endpoint {}: {}, partially", descriptor.fileName(), address, hostId); {code} This has caused some confusion among some users who wonder whether their files were only partially replayed and whether data is consistent. This ticket is to clarify in the log statement itself or document in the official docs what is meant by `partially`. My understanding is that it's really that sometimes when shutting down, all of the file metadata isn't written so it replays the file anyway. Is that right? I wasn't sure about the dispatch failure and what that means in practice. 
CC [~aleksey] > Clarify intent when replaying hint files "partially" > > > Key: CASSANDRA-17707 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17707 > Project: Cassandra > Issue Type: Task >Reporter: Jeremy Hanna >Priority: Normal > > As part of CASSANDRA-6230, hints were redesigned to come from files. As part > of this, we log when the hint files are dispatched. > See > https://github.com/apache/cassandra/blob/cassandra-4.1/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java#L318 > {code} > logger.info("Finished hinted handoff of file {} to endpoint {}: {}, > partially", descriptor.fileName(), address, hostId); > {code} > This has caused some confusion among some users who wonder whether their > files were only partially replayed and whether data is consistent. > This ticket is to clarify in the log statement itself or document in the > official docs what is meant by {{partially}}. > My understanding is that it's really that sometimes when shutting down, all > of the file metadata isn't written so it replays the file anyway. Is that > right? I wasn't sure about the dispatch failure and what that means in > practice. > CC [~aleksey] -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-17707) Clarify intent when replaying hint files "partially"
Jeremy Hanna created CASSANDRA-17707: Summary: Clarify intent when replaying hint files "partially" Key: CASSANDRA-17707 URL: https://issues.apache.org/jira/browse/CASSANDRA-17707 Project: Cassandra Issue Type: Task Reporter: Jeremy Hanna As part of CASSANDRA-6230, hints were redesigned to come from files. As part of this, we log when the hint files are dispatched. See https://github.com/apache/cassandra/blob/cassandra-4.1/src/java/org/apache/cassandra/hints/HintsDispatchExecutor.java#L318 {code} logger.info("Finished hinted handoff of file {} to endpoint {}: {}, partially", descriptor.fileName(), address, hostId); {code} This has caused some confusion among some users who wonder whether their files were only partially replayed and whether data is consistent. This ticket is to clarify in the log statement itself or document in the official docs what is meant by `partially`. My understanding is that it's really that sometimes when shutting down, all of the file metadata isn't written so it replays the file anyway. Is that right? I wasn't sure about the dispatch failure and what that means in practice. CC [~aleksey] -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-9753) LOCAL_QUORUM reads can block cross-DC if there is a digest mismatch
[ https://issues.apache.org/jira/browse/CASSANDRA-9753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17412304#comment-17412304 ] Jeremy Hanna commented on CASSANDRA-9753: - Is it fair to say that temporarily disabling (dc_local_)read_repair_chance and speculative retry while adding a new data center will mean that all LOCAL_* consistency level based queries will stay in the origin data center? > LOCAL_QUORUM reads can block cross-DC if there is a digest mismatch > --- > > Key: CASSANDRA-9753 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9753 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Coordination >Reporter: Richard Low >Priority: Normal > > When there is a digest mismatch during the initial read, a data read request > is sent to all replicas involved in the initial read. This can be more than > the initial blockFor if read repair was done and if speculative retry kicked > in. E.g. for RF 3 in two DCs, the number of reads could be 4: 2 for > LOCAL_QUORUM, 1 for read repair and 1 for speculative read if one replica was > slow. If there is then a digest mismatch, Cassandra will issue the data read > to all 4 and set blockFor=4. Now the read query is blocked on cross-DC > latency. The digest mismatch read blockFor should be capped at RF for the > local DC when using CL.LOCAL_*. > You can reproduce this behaviour by creating a keyspace with > NetworkTopologyStrategy, RF 3 per DC, dc_local_read_repair=1.0 and ALWAYS for > speculative read. If you force a digest mismatch (e.g. by deleting a replicas > SSTables and restarting) you can see in tracing that it is blocking for 4 > responses. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
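The fix the ticket asks for amounts to capping the digest-mismatch blockFor. The sketch below is illustrative (the function is invented, not the coordinator's actual read path), but it captures the arithmetic in the description: RF 3 per DC with LOCAL_QUORUM (2) plus read repair (1) plus a speculative read (1) contacts 4 replicas, and the cap keeps the blocking count within the local DC.

```python
# Illustrative sketch of the proposed blockFor cap; not Cassandra's coordinator code.
def digest_mismatch_block_for(contacted_replicas, local_dc_rf, consistency):
    """After a digest mismatch, don't block on more replicas than the local DC holds
    when a LOCAL_* consistency level is in use."""
    if consistency.startswith("LOCAL_"):
        return min(contacted_replicas, local_dc_rf)
    return contacted_replicas
```

With the cap, the scenario from the description blocks on at most 3 (the local RF) rather than 4, so the read no longer waits on cross-DC latency.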
[jira] [Updated] (CASSANDRA-16730) Describe audit log categories in documentation
[ https://issues.apache.org/jira/browse/CASSANDRA-16730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-16730: - Resolution: (was: Fixed) Status: Open (was: Resolved) > Describe audit log categories in documentation > -- > > Key: CASSANDRA-16730 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16730 > Project: Cassandra > Issue Type: Improvement > Components: Documentation/Website >Reporter: Jeremy Hanna >Priority: Normal > > With CASSANDRA-12151 we have a nice audit log functionality for the database > and it's [described in the > docs|https://cassandra.apache.org/doc/latest/operating/audit_logging.html] > with the associated options. One thing that's missing is a description of > the categories that can be enabled and disabled. The categories are found in > the code > [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/audit/AuditLogEntryCategory.java#L26]: > {{QUERY, DML, DDL, DCL, OTHER, AUTH, ERROR, PREPARE}} > So it would just be good to have those and a brief description in the docs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16730) Describe audit log categories in documentation
[ https://issues.apache.org/jira/browse/CASSANDRA-16730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-16730: - Resolution: Not A Problem Status: Resolved (was: Open) > Describe audit log categories in documentation > -- > > Key: CASSANDRA-16730 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16730 > Project: Cassandra > Issue Type: Improvement > Components: Documentation/Website >Reporter: Jeremy Hanna >Priority: Normal > > With CASSANDRA-12151 we have a nice audit log functionality for the database > and it's [described in the > docs|https://cassandra.apache.org/doc/latest/operating/audit_logging.html] > with the associated options. One thing that's missing is a description of > the categories that can be enabled and disabled. The categories are found in > the code > [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/audit/AuditLogEntryCategory.java#L26]: > {{QUERY, DML, DDL, DCL, OTHER, AUTH, ERROR, PREPARE}} > So it would just be good to have those and a brief description in the docs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16730) Describe audit log categories in documentation
[ https://issues.apache.org/jira/browse/CASSANDRA-16730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-16730: - Resolution: Fixed Status: Resolved (was: Open) Will be unified in the updated docs with the more comprehensive explanation from the What's New in C* 4 section. > Describe audit log categories in documentation > -- > > Key: CASSANDRA-16730 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16730 > Project: Cassandra > Issue Type: Improvement > Components: Documentation/Website >Reporter: Jeremy Hanna >Priority: Normal > > With CASSANDRA-12151 we have a nice audit log functionality for the database > and it's [described in the > docs|https://cassandra.apache.org/doc/latest/operating/audit_logging.html] > with the associated options. One thing that's missing is a description of > the categories that can be enabled and disabled. The categories are found in > the code > [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/audit/AuditLogEntryCategory.java#L26]: > {{QUERY, DML, DDL, DCL, OTHER, AUTH, ERROR, PREPARE}} > So it would just be good to have those and a brief description in the docs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16730) Describe audit log categories in documentation
[ https://issues.apache.org/jira/browse/CASSANDRA-16730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17362672#comment-17362672 ] Jeremy Hanna commented on CASSANDRA-16730: -- Ah - I didn't see the other section. I'm glad we're putting them together to have a more comprehensive page. Thanks Ekaterina! > Describe audit log categories in documentation > -- > > Key: CASSANDRA-16730 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16730 > Project: Cassandra > Issue Type: Improvement > Components: Documentation/Website >Reporter: Jeremy Hanna >Priority: Normal > > With CASSANDRA-12151 we have a nice audit log functionality for the database > and it's [described in the > docs|https://cassandra.apache.org/doc/latest/operating/audit_logging.html] > with the associated options. One thing that's missing is a description of > the categories that can be enabled and disabled. The categories are found in > the code > [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/audit/AuditLogEntryCategory.java#L26]: > {{QUERY, DML, DDL, DCL, OTHER, AUTH, ERROR, PREPARE}} > So it would just be good to have those and a brief description in the docs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16730) Describe audit log categories in documentation
[ https://issues.apache.org/jira/browse/CASSANDRA-16730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-16730: - Description: With CASSANDRA-12151 we have a nice audit log functionality for the database and it's [described in the docs|https://cassandra.apache.org/doc/latest/operating/audit_logging.html] with the associated options. One thing that's missing is a description of the categories that can be enabled and disabled. The categories are found in the code [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/audit/AuditLogEntryCategory.java#L26]: {{QUERY, DML, DDL, DCL, OTHER, AUTH, ERROR, PREPARE}} So it would just be good to have those and a brief description in the docs. was: With CASSANDRA-12151 we have a nice audit log functionality for the database and it's described in the docs with the associated options. One thing that's missing is a description of the categories that can be enabled and disabled. The categories are found in the code [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/audit/AuditLogEntryCategory.java#L26]: {{QUERY, DML, DDL, DCL, OTHER, AUTH, ERROR, PREPARE}} So it would just be good to have those and a brief description in the docs. > Describe audit log categories in documentation > -- > > Key: CASSANDRA-16730 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16730 > Project: Cassandra > Issue Type: Improvement > Components: Documentation/Website >Reporter: Jeremy Hanna >Priority: Normal > > With CASSANDRA-12151 we have a nice audit log functionality for the database > and it's [described in the > docs|https://cassandra.apache.org/doc/latest/operating/audit_logging.html] > with the associated options. One thing that's missing is a description of > the categories that can be enabled and disabled. 
The categories are found in > the code > [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/audit/AuditLogEntryCategory.java#L26]: > {{QUERY, DML, DDL, DCL, OTHER, AUTH, ERROR, PREPARE}} > So it would just be good to have those and a brief description in the docs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16730) Describe audit log categories in documentation
[ https://issues.apache.org/jira/browse/CASSANDRA-16730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-16730: - Complexity: Low Hanging Fruit > Describe audit log categories in documentation > -- > > Key: CASSANDRA-16730 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16730 > Project: Cassandra > Issue Type: Improvement > Components: Documentation/Website >Reporter: Jeremy Hanna >Priority: Normal > > With CASSANDRA-12151 we have a nice audit log functionality for the database > and it's described in the docs with the associated options. One thing that's > missing is a description of the categories that can be enabled and disabled. > The categories are found in the code > [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/audit/AuditLogEntryCategory.java#L26]: > {{QUERY, DML, DDL, DCL, OTHER, AUTH, ERROR, PREPARE}} > So it would just be good to have those and a brief description in the docs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-16730) Describe audit log categories in documentation
Jeremy Hanna created CASSANDRA-16730: Summary: Describe audit log categories in documentation Key: CASSANDRA-16730 URL: https://issues.apache.org/jira/browse/CASSANDRA-16730 Project: Cassandra Issue Type: Improvement Components: Documentation/Website Reporter: Jeremy Hanna With CASSANDRA-12151 we have a nice audit log functionality for the database and it's described in the docs with the associated options. One thing that's missing is a description of the categories that can be enabled and disabled. The categories are found in the code [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/audit/AuditLogEntryCategory.java#L26]: {{QUERY, DML, DDL, DCL, OTHER, AUTH, ERROR, PREPARE}} So it would just be good to have those and a brief description in the docs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
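To give a feel for what the requested docs would describe, here is an illustrative mapping of statement kinds to the categories listed in {{AuditLogEntryCategory}}. The classification rules below are assumptions for the sketch (they are not taken from the Cassandra source; the real assignment lives in the audit-log code), but the category names themselves are the ones the ticket quotes.

```python
# Illustrative classifier; the verb-to-category rules are assumed, not authoritative.
CATEGORY_BY_VERB = {
    "SELECT": "QUERY",
    "INSERT": "DML", "UPDATE": "DML", "DELETE": "DML",
    "CREATE": "DDL", "ALTER": "DDL", "DROP": "DDL",
    "GRANT": "DCL", "REVOKE": "DCL",
    "LOGIN": "AUTH",
}

def audit_category(statement):
    """Map a CQL statement to one of QUERY, DML, DDL, DCL, OTHER, AUTH, ERROR, PREPARE."""
    verb = statement.split(None, 1)[0].upper()
    return CATEGORY_BY_VERB.get(verb, "OTHER")
```

A table of exactly this shape (category, example statements, one-line description) is what would make the audit logging page self-contained.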
[jira] [Updated] (CASSANDRA-16391) Migrate use of maven-ant-tasks to resolver-ant-tasks
[ https://issues.apache.org/jira/browse/CASSANDRA-16391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-16391: - Complexity: Low Hanging Fruit (was: Normal) > Migrate use of maven-ant-tasks to resolver-ant-tasks > > > Key: CASSANDRA-16391 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16391 > Project: Cassandra > Issue Type: Task > Components: Build, Dependencies >Reporter: Michael Semb Wever >Priority: High > Labels: gsoc2021, lhf, mentor > > Cassandra resolves dependencies and generates maven pom files through the use > of [maven-ant-tasks|http://maven.apache.org/ant-tasks/]. This is no longer a > supported project. > The recommended upgrade is to > [resolver-ant-tasks|http://maven.apache.org/resolver-ant-tasks/]. It follows > similar APIs so shouldn't be too impactful a change. > The existing maven-ant-tasks has caused [some headaches > already|https://issues.apache.org/jira/browse/CASSANDRA-16359] with internal > super poms referencing insecure http:// central maven repository URLs that > are no longer supported. > We should also take the opportunity to > - define the "test" scope (classpath) for those dependencies only used for > tests (currently we are packaging test dependencies into the release binary > artefact), > - remove the jar files stored in the git repo under the "lib/" folder. > These two above points have to happen in tandem, as the jar files under > {{lib/}} are those that get bundled into the {{build/dist/lib/}} and hence > the binary artefact. That is, all jar files under {{lib/}} are the project's > "compile" scope, and all other dependencies defined in build.xml are either > "provided" or "test" scope. These different scopes for dependencies are > currently configured in different maven-ant-tasks poms. 
See > https://github.com/apache/cassandra/commit/d43b9ce5092f8879a1a66afebab74d86e9e127fb#r45659668 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
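As a rough sketch of the shape the migration takes, assuming the resolve task documented by resolver-ant-tasks (the antlib namespace and task names follow its docs; the dependency coordinates below are placeholders, not Cassandra's real dependency list):

```xml
<!-- Illustrative build.xml fragment using resolver-ant-tasks in place
     of maven-ant-tasks. Coordinates are placeholders; the point is that
     scopes such as "compile" and "test" become separate classpaths. -->
<project xmlns:resolver="antlib:org.apache.maven.resolver.ant">
  <resolver:resolve>
    <resolver:dependencies>
      <dependency coords="org.example:example-lib:1.0.0" scope="test"/>
    </resolver:dependencies>
    <!-- One path per scope, so test-only jars stay out of the binary artefact -->
    <resolver:path refid="test.classpath" classpath="test"/>
  </resolver:resolve>
</project>
```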
[jira] [Comment Edited] (CASSANDRA-16429) cqlsh garbles column names with Japanese characters
[ https://issues.apache.org/jira/browse/CASSANDRA-16429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17280703#comment-17280703 ] Jeremy Hanna edited comment on CASSANDRA-16429 at 2/8/21, 12:38 AM: Could this be related to the new code that exposes table schema directly to the drivers? CASSANDRA-14825 was (Author: jeromatron): Could this be related to the new code that exposes table schema directly to the drivers? > cqlsh garbles column names with Japanese characters > --- > > Key: CASSANDRA-16429 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16429 > Project: Cassandra > Issue Type: Bug >Reporter: Yoshi Kimoto >Priority: Normal > Attachments: jptest.cql > > > Tables created with Japanese character name columns are working well in C* > 3.11.10 when doing a SELECT * in cqlsh but will show as garbled (shown as > "?") in 4.0-beta4. DESCRIBE shows the column names correctly in both cases. > Run the attached jptest.cql script in both envs with cqlsh -f. They will > yield different results. > My test env (MacOS 10.15.7): > C* 3.11.10 with > - OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_252-b09) > - Python 2.7.16 > C* 4.0-beta4 > - OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.9.1+1) > - Python 3.8.2 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16429) cqlsh garbles column names with Japanese characters
[ https://issues.apache.org/jira/browse/CASSANDRA-16429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17280703#comment-17280703 ] Jeremy Hanna commented on CASSANDRA-16429: -- Could this be related to the new code that exposes table schema directly to the drivers? > cqlsh garbles column names with Japanese characters > --- > > Key: CASSANDRA-16429 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16429 > Project: Cassandra > Issue Type: Bug >Reporter: Yoshi Kimoto >Priority: Normal > Attachments: jptest.cql > > > Tables created with Japanese character name columns are working well in C* > 3.11.10 when doing a SELECT * in cqlsh but will show as garbled (shown as > "?") in 4.0-beta4. DESCRIBE shows the column names correctly in both cases. > Run the attached jptest.cql script in both envs with cqlsh -f. They will > yield different results. > My test env (MacOS 10.15.7): > C* 3.11.10 with > - OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_252-b09) > - Python 2.7.16 > C* 4.0-beta4 > - OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.9.1+1) > - Python 3.8.2 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
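Whatever the root cause turns out to be, the symptom (column names rendered as "?") is characteristic of text being pushed through an encoding step that cannot represent the characters. A tiny illustration of that failure mode only, not a claim about where cqlsh actually does this:

```python
# Illustration of the symptom: encoding text through a codec that
# cannot represent it, with errors="replace", turns every such
# character into "?", matching the garbled output reported here.
column_name = "日本語"  # hypothetical column name with Japanese characters

garbled = column_name.encode("ascii", errors="replace")
print(garbled)  # b'???'
```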
[jira] [Updated] (CASSANDRA-16315) Remove bad advice on concurrent compactors from cassandra.yaml
[ https://issues.apache.org/jira/browse/CASSANDRA-16315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-16315: - Description: Since CASSANDRA-7551, we gave the following advice for setting {{concurrent_compactors}}: {code} # If your data directories are backed by SSD, you should increase this # to the number of cores. {code} However in practice there are a number of problems with this. While it's true that one can increase {{concurrent_compactors}} to improve efficiency of compactions on machines with more cpu cores, the context switching with random IO and GC associated with bringing compaction data into the heap will work against the additional parallelism. This has caused problems for those who have taken this advice literally. I propose that we adjust this language to give a limit on number of {{concurrent_compactors}} for this setting both in the 3.x line and in trunk so that new users do not stumble when reviewing whether to change defaults. See also CASSANDRA-7139 for a discussion on considerations. I see two short-term options to avoid new user pain: 1. Change the language to say something like this: {quote} When using SSD based storage, you can increase the number of {{concurrent_compactors}}. However be aware that using too many concurrent compactors can have a detrimental effect such as GC pressure, more context switching among compactors and realtime operations, and more random IO pulling data for different compactions. It's best to test and measure with your workload and hardware. {quote} 2. Do some significant testing of compaction efficient and read/write latency/throughput targets to see where the tipping point is - considering some constants around memory and heap size and configuration to keep it simple. was: Since CASSANDRA-7551, we gave the following advice for setting concurrent_compactors: {code} # If your data directories are backed by SSD, you should increase this # to the number of cores. 
{code} However in practice there are a number of problems with this. While it's true that one can increase {{concurrent_compactors}} to improve efficiency of compactions on machines with more cpu cores, the context switching with random IO and GC associated with bringing compaction data into the heap will work against the additional parallelism. This has caused problems for those who have taken this advice literally. I propose that we adjust this language to give a limit on number of {{concurrent_compactors}} for this setting both in the 3.x line and in trunk so that new users do not stumble when reviewing whether to change defaults. See also CASSANDRA-7139 for a discussion on considerations. I see two short-term options to avoid new user pain: 1. Change the language to say something like this: {quote} When using SSD based storage, you can increase the number of {{concurrent_compactors}}. However be aware that using too many concurrent compactors can have a detrimental effect such as GC pressure, more context switching among compactors and realtime operations, and more random IO pulling data for different compactions. It's best to test and measure with your workload and hardware. {quote} 2. Do some significant testing of compaction efficient and read/write latency/throughput targets to see where the tipping point is - considering some constants around memory and heap size and configuration to keep it simple. > Remove bad advice on concurrent compactors from cassandra.yaml > -- > > Key: CASSANDRA-16315 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16315 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Jeremy Hanna >Priority: Normal > > Since CASSANDRA-7551, we gave the following advice for setting > {{concurrent_compactors}}: > {code} > # If your data directories are backed by SSD, you should increase this > # to the number of cores. > {code} > However in practice there are a number of problems with this. 
While it's > true that one can increase {{concurrent_compactors}} to improve efficiency of > compactions on machines with more cpu cores, the context switching with > random IO and GC associated with bringing compaction data into the heap will > work against the additional parallelism. > This has caused problems for those who have taken this advice literally. > I propose that we adjust this language to give a limit on number of > {{concurrent_compactors}} for this setting both in the 3.x line and in trunk > so that new users do not stumble when reviewing whether to change defaults. > See also CASSANDRA-7139 for a discussion on considerations. > I see two short-term options to avoid new user pain: > 1. Change the language to say something like
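Option 1's "give a limit" idea can be sketched as a capped heuristic. The specific floor and cap below are illustrative numbers for discussion, not Cassandra's actual defaulting logic:

```python
def capped_concurrent_compactors(cores: int, data_dirs: int, cap: int = 8) -> int:
    # Start from the smaller of CPU cores and data directories, keep a
    # floor of 2 so compaction can still make progress, and cap the
    # result so extra parallelism doesn't just become GC pressure and
    # random IO. Floor and cap values here are illustrative only.
    return min(cap, max(2, min(cores, data_dirs)))

print(capped_concurrent_compactors(32, 1))   # 2
print(capped_concurrent_compactors(32, 12))  # 8
```

With a cap like this, a 32-core box with one data directory no longer gets 32 compactors just because the yaml comment said "increase to the number of cores".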
[jira] [Updated] (CASSANDRA-16315) Remove bad advice on concurrent compactors from cassandra.yaml
[ https://issues.apache.org/jira/browse/CASSANDRA-16315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-16315: - Description: Since CASSANDRA-7551, we gave the following advice for setting concurrent_compactors: {code} # If your data directories are backed by SSD, you should increase this # to the number of cores. {code} However in practice there are a number of problems with this. While it's true that one can increase {{concurrent_compactors}} to improve efficiency of compactions on machines with more cpu cores, the context switching with random IO and GC associated with bringing compaction data into the heap will work against the additional parallelism. This has caused problems for those who have taken this advice literally. I propose that we adjust this language to give a limit on number of {{concurrent_compactors}} for this setting both in the 3.x line and in trunk so that new users do not stumble when reviewing whether to change defaults. See also CASSANDRA-7139 for a discussion on considerations. I see two short-term options to avoid new user pain: 1. Change the language to say something like this: {quote} When using fast SSD, you can increase the number of {{concurrent_compactors}}. However be aware that using too many concurrent compactors can have a detrimental effect such as GC pressure, more context switching among compactors and realtime operations, and more random IO pulling data for different compactions. It's best to test and measure with your workload and hardware. {quote} 2. Do some significant testing of compaction efficient and read/write latency/throughput targets to see where the tipping point is - considering some constants around memory and heap size and configuration to keep it simple. was: Since CASSANDRA-7551, we gave the following advice for setting concurrent_compactors: {code} # If your data directories are backed by SSD, you should increase this # to the number of cores. 
{code} However in practice there are a number of problems with this. While it's true that one can increase concurrent_compactors to improve efficiency of compactions on machines with more cpu cores, the context switching with random IO and GC associated with bringing compaction data into the heap will work against the additional parallelism. This has caused problems for those who have taken this advice literally. I propose that we adjust this language to give a limit on number of concurrent_compactors for this setting both in the 3.x line and in trunk so that new users do not stumble when reviewing whether to change defaults. See also CASSANDRA-7139 for a discussion on considerations. I see two short-term options to avoid new user pain: 1. Change the language to say something like this: {quote} When using fast SSD, you can increase the number of {{concurrent_compactors}}. However be aware that using too many concurrent compactors can have a detrimental effect such as GC pressure, more context switching among compactors and realtime operations, and more random IO pulling data for different compactions. It's best to test and measure with your workload and hardware. {quote} 2. Do some significant testing of compaction efficient and read/write latency/throughput targets to see where the tipping point is - considering some constants around memory and heap size and configuration to keep it simple. > Remove bad advice on concurrent compactors from cassandra.yaml > -- > > Key: CASSANDRA-16315 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16315 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Jeremy Hanna >Priority: Normal > > Since CASSANDRA-7551, we gave the following advice for setting > concurrent_compactors: > {code} > # If your data directories are backed by SSD, you should increase this > # to the number of cores. > {code} > However in practice there are a number of problems with this. 
While it's > true that one can increase {{concurrent_compactors}} to improve efficiency of > compactions on machines with more cpu cores, the context switching with > random IO and GC associated with bringing compaction data into the heap will > work against the additional parallelism. > This has caused problems for those who have taken this advice literally. > I propose that we adjust this language to give a limit on number of > {{concurrent_compactors}} for this setting both in the 3.x line and in trunk > so that new users do not stumble when reviewing whether to change defaults. > See also CASSANDRA-7139 for a discussion on considerations. > I see two short-term options to avoid new user pain: > 1. Change the language to say something like this: > {quote} > When using fast SSD,
[jira] [Updated] (CASSANDRA-16315) Remove bad advice on concurrent compactors from cassandra.yaml
[ https://issues.apache.org/jira/browse/CASSANDRA-16315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-16315: - Description: Since CASSANDRA-7551, we gave the following advice for setting concurrent_compactors: {code} # If your data directories are backed by SSD, you should increase this # to the number of cores. {code} However in practice there are a number of problems with this. While it's true that one can increase {{concurrent_compactors}} to improve efficiency of compactions on machines with more cpu cores, the context switching with random IO and GC associated with bringing compaction data into the heap will work against the additional parallelism. This has caused problems for those who have taken this advice literally. I propose that we adjust this language to give a limit on number of {{concurrent_compactors}} for this setting both in the 3.x line and in trunk so that new users do not stumble when reviewing whether to change defaults. See also CASSANDRA-7139 for a discussion on considerations. I see two short-term options to avoid new user pain: 1. Change the language to say something like this: {quote} When using SSD based storage, you can increase the number of {{concurrent_compactors}}. However be aware that using too many concurrent compactors can have a detrimental effect such as GC pressure, more context switching among compactors and realtime operations, and more random IO pulling data for different compactions. It's best to test and measure with your workload and hardware. {quote} 2. Do some significant testing of compaction efficient and read/write latency/throughput targets to see where the tipping point is - considering some constants around memory and heap size and configuration to keep it simple. was: Since CASSANDRA-7551, we gave the following advice for setting concurrent_compactors: {code} # If your data directories are backed by SSD, you should increase this # to the number of cores. 
{code} However in practice there are a number of problems with this. While it's true that one can increase {{concurrent_compactors}} to improve efficiency of compactions on machines with more cpu cores, the context switching with random IO and GC associated with bringing compaction data into the heap will work against the additional parallelism. This has caused problems for those who have taken this advice literally. I propose that we adjust this language to give a limit on number of {{concurrent_compactors}} for this setting both in the 3.x line and in trunk so that new users do not stumble when reviewing whether to change defaults. See also CASSANDRA-7139 for a discussion on considerations. I see two short-term options to avoid new user pain: 1. Change the language to say something like this: {quote} When using fast SSD, you can increase the number of {{concurrent_compactors}}. However be aware that using too many concurrent compactors can have a detrimental effect such as GC pressure, more context switching among compactors and realtime operations, and more random IO pulling data for different compactions. It's best to test and measure with your workload and hardware. {quote} 2. Do some significant testing of compaction efficient and read/write latency/throughput targets to see where the tipping point is - considering some constants around memory and heap size and configuration to keep it simple. > Remove bad advice on concurrent compactors from cassandra.yaml > -- > > Key: CASSANDRA-16315 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16315 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Jeremy Hanna >Priority: Normal > > Since CASSANDRA-7551, we gave the following advice for setting > concurrent_compactors: > {code} > # If your data directories are backed by SSD, you should increase this > # to the number of cores. > {code} > However in practice there are a number of problems with this. 
While it's > true that one can increase {{concurrent_compactors}} to improve efficiency of > compactions on machines with more cpu cores, the context switching with > random IO and GC associated with bringing compaction data into the heap will > work against the additional parallelism. > This has caused problems for those who have taken this advice literally. > I propose that we adjust this language to give a limit on number of > {{concurrent_compactors}} for this setting both in the 3.x line and in trunk > so that new users do not stumble when reviewing whether to change defaults. > See also CASSANDRA-7139 for a discussion on considerations. > I see two short-term options to avoid new user pain: > 1. Change the language to say something like this: > {quote} >
[jira] [Created] (CASSANDRA-16315) Remove bad advice on concurrent compactors from cassandra.yaml
Jeremy Hanna created CASSANDRA-16315: Summary: Remove bad advice on concurrent compactors from cassandra.yaml Key: CASSANDRA-16315 URL: https://issues.apache.org/jira/browse/CASSANDRA-16315 Project: Cassandra Issue Type: Improvement Components: Local/Config Reporter: Jeremy Hanna Since CASSANDRA-7551, we gave the following advice for setting concurrent_compactors: {code} # If your data directories are backed by SSD, you should increase this # to the number of cores. {code} However in practice there are a number of problems with this. While it's true that one can increase concurrent_compactors to improve efficiency of compactions on machines with more cpu cores, the context switching with random IO and GC associated with bringing compaction data into the heap will work against the additional parallelism. This has caused problems for those who have taken this advice literally. I propose that we adjust this language to give a limit on the number of concurrent_compactors for this setting both in the 3.x line and in trunk so that new users do not stumble when reviewing whether to change defaults. See also CASSANDRA-7139 for a discussion on considerations. I see two short-term options to avoid new user pain: 1. Change the language to say something like this: {quote} When using fast SSD, you can increase the number of {{concurrent_compactors}}. However be aware that using too many concurrent compactors can have a detrimental effect such as GC pressure, more context switching among compactors and realtime operations, and more random IO pulling data for different compactions. It's best to test and measure with your workload and hardware. {quote} 2. Do some significant testing of compaction efficiency and read/write latency/throughput targets to see where the tipping point is - considering some constants around memory and heap size and configuration to keep it simple. 
[jira] [Commented] (CASSANDRA-16205) Offline token allocation strategy generator tool
[ https://issues.apache.org/jira/browse/CASSANDRA-16205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17222696#comment-17222696 ] Jeremy Hanna commented on CASSANDRA-16205: -- I think we still want to run the algorithm in the dtests if possible, at least the ones that have to do with cluster membership and consistency like bootstrap, replace, decommission, and tests involving range movements in general. Could we run with the new algorithm at least for those tests? Is the thought to use the algorithm to do that and then for the other tests use this script to pre-allocate the tokens? > Offline token allocation strategy generator tool > > > Key: CASSANDRA-16205 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16205 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config, Local/Scripts >Reporter: Michael Semb Wever >Assignee: Michael Semb Wever >Priority: Normal > > A command line tool to generate tokens (using the > allocate_tokens_for_local_replication_factor algorithm) for pre-configuration > of {{initial_tokens}} in cassandra.yaml. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
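For comparison with what the tool computes: the naive offline approach for Murmur3Partitioner is simply to space tokens evenly across the token range. The allocate_tokens_for_local_replication_factor algorithm this ticket wraps is smarter (replication-aware), but the even-split version shows the idea of pre-computing values for {{initial_token}}:

```python
MIN_TOKEN = -2**63  # Murmur3Partitioner token range is [-2**63, 2**63 - 1]

def evenly_spaced_tokens(num_nodes: int) -> list:
    # One token per node, spaced evenly across the full Murmur3 range.
    # This is the naive even-split scheme, not the replication-aware
    # allocation algorithm the ticket's tool implements.
    step = 2**64 // num_nodes
    return [MIN_TOKEN + i * step for i in range(num_nodes)]

print(evenly_spaced_tokens(4))
```

For four nodes this yields tokens a quarter of the range apart, starting at -2**63.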
[jira] [Updated] (CASSANDRA-13701) Lower default num_tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-13701: - Fix Version/s: (was: 4.0-triage) > Lower default num_tokens > > > Key: CASSANDRA-13701 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13701 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Chris Lohfink >Assignee: Alexander Dejanovski >Priority: Low > Fix For: 4.0-alpha > > > For reasons highlighted in CASSANDRA-7032, the high number of vnodes is not > necessary. It is very expensive for operational processes and scanning. It's > come up a lot, and it's now standard practice in the community to reduce > num_tokens. We should just lower the defaults. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-16079) Improve dtest runtime
[ https://issues.apache.org/jira/browse/CASSANDRA-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197300#comment-17197300 ] Jeremy Hanna edited comment on CASSANDRA-16079 at 9/17/20, 12:22 AM: - Is it possible to implement some sort of "reset" operation in CCM so that it drops all non-system keyspaces so that the clusters that don't explicitly test cluster membership operations can just be reused as has been said? We could disable snapshotting on them as well so they wouldn't build up state over time too. In other words, it sounds like if we made the time for starting single node clusters essentially instant, that's 171 * single node startup time that we've reduced for the overall dtests. was (Author: jeromatron): Is it possible to implement some sort of "reset" operation in CCM so that it drops all non-system keyspaces so that the clusters that don't explicitly test cluster membership operations can just be reused as has been said? We could disable snapshotting on them as well so they wouldn't build up state over time too. > Improve dtest runtime > - > > Key: CASSANDRA-16079 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16079 > Project: Cassandra > Issue Type: Improvement > Components: CI >Reporter: Adam Holmberg >Priority: Normal > Fix For: 4.0-beta > > > A recent ticket, CASSANDRA-13701, changed the way dtests run, resulting in a > [30% increase in run > time|https://www.mail-archive.com/dev@cassandra.apache.org/msg15606.html]. > While that change was accepted, we wanted to spin out a ticket to optimize > dtests in an attempt to gain back some of that runtime. > At this time we don't have concrete improvements in mind, so the first order > of this ticket will be to analyze the state of things currently, and try to > ascertain some valuable optimizations. Once the problems are understood, we > will break down subtasks to divide the work. 
> Some areas to consider: > * cluster reuse > * C* startup optimizations > * Tests that should be ported to in-JVM dtest or even unit tests -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
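The proposed CCM "reset" boils down to enumerating keyspaces (e.g. from system_schema.keyspaces) and dropping the non-system ones before handing the cluster to the next test. A sketch of the selection step; the system-keyspace list below is an assumption and varies by Cassandra version:

```python
SYSTEM_KEYSPACES = {
    # Assumed set; the exact system keyspaces differ between versions.
    "system", "system_schema", "system_auth",
    "system_distributed", "system_traces",
    "system_views", "system_virtual_schema",
}

def keyspaces_to_drop(all_keyspaces):
    # Given keyspace names (e.g. queried from system_schema.keyspaces),
    # return the non-system keyspaces a "reset" would drop before reuse.
    return [ks for ks in all_keyspaces if ks not in SYSTEM_KEYSPACES]

print(keyspaces_to_drop(["system", "ks1", "system_auth", "ks2"]))  # ['ks1', 'ks2']
```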
[jira] [Commented] (CASSANDRA-16079) Improve dtest runtime
[ https://issues.apache.org/jira/browse/CASSANDRA-16079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17197300#comment-17197300 ] Jeremy Hanna commented on CASSANDRA-16079: -- Is it possible to implement some sort of "reset" operation in CCM so that it drops all non-system keyspaces so that the clusters that don't explicitly test cluster membership operations can just be reused as has been said? We could disable snapshotting on them as well so they wouldn't build up state over time too. > Improve dtest runtime > - > > Key: CASSANDRA-16079 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16079 > Project: Cassandra > Issue Type: Improvement > Components: CI >Reporter: Adam Holmberg >Priority: Normal > Fix For: 4.0-beta > > > A recent ticket, CASSANDRA-13701, changed the way dtests run, resulting in a > [30% increase in run > time|https://www.mail-archive.com/dev@cassandra.apache.org/msg15606.html]. > While that change was accepted, we wanted to spin out a ticket to optimize > dtests in an attempt to gain back some of that runtime. > At this time we don't have concrete improvements in mind, so the first order > of this ticket will be to analyze the state of things currently, and try to > ascertain some valuable optimizations. Once the problems are understood, we > will break down subtasks to divide the work. > Some areas to consider: > * cluster reuse > * C* startup optimizations > * Tests that should be ported to in-JVM dtest or even unit tests -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-8720) Provide tools for finding wide row/partition keys
[ https://issues.apache.org/jira/browse/CASSANDRA-8720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17195146#comment-17195146 ] Jeremy Hanna edited comment on CASSANDRA-8720 at 9/14/20, 1:50 AM: --- We've had this in DataStax Enterprise's version of Cassandra for a couple of years now. Any chance we could just port that over to Cassandra at this point? It's an offline tool called sstablepartitions that gets a variety of information about partitions in an sstable or directory of sstables using the methodology discussed in this ticket. See https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/tools/toolsSStables/toolsSSTablepartitions.html. [~snazy] do you think it's a straightforward port at this point? was (Author: jeromatron): We've had this in DataStax Enterprise's version of Cassandra for a couple of years now. Any chance we could just port that over to Cassandra at this point? It's an offline tool called sstablepartitions that gets a variety of information about partitions in an sstable or directory sstables using the methodology discussed in this ticket. See https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/tools/toolsSStables/toolsSSTablepartitions.html. [~snazy] do you think it's a straightforward port at this point? > Provide tools for finding wide row/partition keys > - > > Key: CASSANDRA-8720 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8720 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Tools >Reporter: J.B. Langston >Priority: Normal > Fix For: 2.1.x, 2.2.x > > Attachments: 8720.txt > > > Multiple users have requested some sort of tool to help identify wide row > keys. They get into a situation where they know a wide row/partition has been > inserted and it's causing problems for them but they have no idea what the > row key is in order to remove it. > Maintaining the widest row key currently encountered and displaying it in > cfstats would be one possible approach. 
> Another would be an offline tool (possibly an enhancement to sstablekeys) to > show the number of columns/bytes per key in each sstable. If a tool to > aggregate the information at a CF-level could be provided that would be a > bonus, but it shouldn't be too hard to write a script wrapper to aggregate > them if not. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
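The CF-level aggregation mentioned as a "bonus" is straightforward once a per-sstable tool emits (key, size) pairs; a hedged sketch of that wrapper logic, with the input format assumed rather than taken from the actual sstablepartitions output:

```python
from collections import defaultdict

def widest_partitions(per_sstable_sizes, top_n=5):
    # per_sstable_sizes: iterable of (partition_key, bytes_for_key)
    # pairs, one per key per sstable (assumed input format). Aggregate
    # across sstables and return the largest partitions, i.e. the
    # CF-level rollup described above.
    totals = defaultdict(int)
    for key, size in per_sstable_sizes:
        totals[key] += size
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

print(widest_partitions([("a", 10), ("b", 5), ("a", 7)], top_n=1))  # [('a', 17)]
```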
[jira] [Commented] (CASSANDRA-8720) Provide tools for finding wide row/partition keys
[ https://issues.apache.org/jira/browse/CASSANDRA-8720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17195146#comment-17195146 ] Jeremy Hanna commented on CASSANDRA-8720: - We've had this in DataStax Enterprise's version of Cassandra for a couple of years now. Any chance we could just port that over to Cassandra at this point? It's an offline tool called sstablepartitions that gets a variety of information about partitions in an sstable or directory sstables using the methodology discussed in this ticket. See https://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/tools/toolsSStables/toolsSSTablepartitions.html. [~snazy] do you think it's a straightforward port at this point? > Provide tools for finding wide row/partition keys > - > > Key: CASSANDRA-8720 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8720 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Tools >Reporter: J.B. Langston >Priority: Normal > Fix For: 2.1.x, 2.2.x > > Attachments: 8720.txt > > > Multiple users have requested some sort of tool to help identify wide row > keys. They get into a situation where they know a wide row/partition has been > inserted and it's causing problems for them but they have no idea what the > row key is in order to remove it. > Maintaining the widest row key currently encountered and displaying it in > cfstats would be one possible approach. > Another would be an offline tool (possibly an enhancement to sstablekeys) to > show the number of columns/bytes per key in each sstable. If a tool to > aggregate the information at a CF-level could be provided that would be a > bonus, but it shouldn't be too hard to write a script wrapper to aggregate > them if not. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-13701) Lower default num_tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171556#comment-17171556 ] Jeremy Hanna edited comment on CASSANDRA-13701 at 8/5/20, 3:29 PM: --- I started down this road but don't think I can get through fixing all of the dtests. On this line in bootstrap_test.py I changed the time.sleep to 10 seconds and it appears to solve that problem - https://github.com/apache/cassandra-dtest/blob/master/bootstrap_test.py#L485 However there were many tests with replace_address that I'm not sure about. I don't know how or why replace address would be affected by the new token allocation algorithm. Dimitar said something about parallel bootstrap but I don't see that - sometimes no_wait or wait_other_notice is true or false so I thought it was that, but perhaps someone more familiar with ccm could see. I'm sorry - I really want this to get in for the release but I don't have the time to dedicate to learning dtest at a deeper level to fix all of these in time. was (Author: jeromatron): I started down this road but don't think I can get through fixing all of the dtests. On this line in bootstrap_test.py I changed the time.sleep to 10 seconds and it appears to solve that problem - https://github.com/apache/cassandra-dtest/blob/master/bootstrap_test.py#L485 However there were many tests with replace_address that I'm not sure about. I don't know how or why replace address would be affected by the new token allocation algorithm. Dmitri said something about parallel bootstrap but I don't see that - sometimes no_wait or wait_other_notice is true or false so I thought it was that, but perhaps someone more familiar with ccm could see. I'm sorry - I really want this to get in for the release but I don't have the time to dedicate to learning dtest at a deeper level to fix all of these in time. 
> Lower default num_tokens > > > Key: CASSANDRA-13701 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13701 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Chris Lohfink >Priority: Low > Fix For: 4.0-alpha > > > For reasons highlighted in CASSANDRA-7032, the high number of vnodes is not > necessary. It is very expensive for operations processes and scanning. Its > come up a lot and its pretty standard and known now to always reduce the > num_tokens within the community. We should just lower the defaults. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13701) Lower default num_tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171556#comment-17171556 ] Jeremy Hanna commented on CASSANDRA-13701: -- I started down this road but don't think I can get through fixing all of the dtests. On this line in bootstrap-test.py I changed the time.sleep to 10 seconds and it appears to solve that problem - https://github.com/apache/cassandra-dtest/blob/master/bootstrap_test.py#L485 However there were many tests with replace_address that I'm not sure about. I don't know how or why replace address would be affected by the new token allocation algorithm. Dmitri said something about parallel bootstrap but I don't see that - sometimes no_wait or wait_other_notice is true or false so I thought it was that, but perhaps someone more familiar with ccm could see. I'm sorry - I really want this to get in for the release but I don't have the time to dedicate to learning dtest at a deeper level to fix all of these in time. > Lower default num_tokens > > > Key: CASSANDRA-13701 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13701 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Chris Lohfink >Priority: Low > Fix For: 4.0-alpha > > > For reasons highlighted in CASSANDRA-7032, the high number of vnodes is not > necessary. It is very expensive for operations processes and scanning. Its > come up a lot and its pretty standard and known now to always reduce the > num_tokens within the community. We should just lower the defaults. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
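Bumping a fixed {{time.sleep}} as described above trades one race for another; a sturdier pattern in tests is to poll for the awaited condition under an overall timeout. The sketch below is generic shell, not the actual dtest code, and the marker file is a hypothetical stand-in for "the node finished joining":

```shell
# Generic sketch: replace a fixed sleep with a bounded poll.
# The marker file simulates the event the test is really waiting on.
marker="/tmp/join_done.$$"
rm -f "$marker"
( sleep 1; touch "$marker" ) &        # background job: event arrives later

deadline=$(( $(date +%s) + 10 ))      # overall timeout: 10 seconds
until [ -e "$marker" ]; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
        echo "timed out waiting for join"
        exit 1
    fi
    sleep 1                           # poll interval, not a guess at total time
done
echo "join observed"
rm -f "$marker"
```

The timeout keeps a hung bootstrap from stalling the whole suite, while a fast join no longer pays the full fixed sleep.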
[jira] [Assigned] (CASSANDRA-13701) Lower default num_tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna reassigned CASSANDRA-13701: Assignee: (was: Jeremy Hanna) > Lower default num_tokens > > > Key: CASSANDRA-13701 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13701 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Chris Lohfink >Priority: Low > Fix For: 4.0-alpha > > > For reasons highlighted in CASSANDRA-7032, the high number of vnodes is not > necessary. It is very expensive for operations processes and scanning. Its > come up a lot and its pretty standard and known now to always reduce the > num_tokens within the community. We should just lower the defaults. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15961) Reference CASSANDRA-12607
[ https://issues.apache.org/jira/browse/CASSANDRA-15961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15961: - Resolution: Duplicate Status: Resolved (was: Triage Needed) > Reference CASSANDRA-12607 > - > > Key: CASSANDRA-15961 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15961 > Project: Cassandra > Issue Type: Bug >Reporter: Kapil Shewate >Assignee: Mattias W >Priority: Normal > > In cassandra 3.11.0 , the issue of commit logs being corrupted is still > observed. Will this be fixed in higher versions of Cassandra? > > 02 19:58:33,677 JVMStabilityInspector.java:82 - Exiting due to error while > processing commit log during > initialization.org.apache.cassandra.db.commitlog.CommitLogReadHandler$CommitLogReadException: > Mutation checksum failure at 191598541 in Next section at 191590263 in > CommitLog-6-1592895482005.log at > org.apache.cassandra.db.commitlog.CommitLogReader.readSection(CommitLogReader.java:344) > [apache-cassandra-3.11.0.jar:3.11.0] at > org.apache.cassandra.db.commitlog.CommitLogReader.readCommitLogSegment(CommitLogReader.java:201) > [apache-cassandra-3.11.0.jar:3.11.0] at > org.apache.cassandra.db.commitlog.CommitLogReader.readAllFiles(CommitLogReader.java:84) > [apache-cassandra-3.11.0.jar:3.11.0] at > org.apache.cassandra.db.commitlog.CommitLogReplayer.replayFiles(CommitLogReplayer.java:140) > [apache-cassandra-3.11.0.jar:3.11.0] at > org.apache.cassandra.db.commitlog.CommitLog.recoverFiles(CommitLog.java:177) > [apache-cassandra-3.11.0.jar:3.11.0] at > org.apache.cassandra.db.commitlog.CommitLog.recoverSegmentsOnDisk(CommitLog.java:158) > [apache-cassandra-3.11.0.jar:3.11.0] at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:325) > [apache-cassandra-3.11.0.jar:3.11.0] at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:600) > [apache-cassandra-3.11.0.jar:3.11.0] at > 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:689) > [apache-cassandra-3.11.0.jar:3.11.0][WARN] [main] 2020-07-02 20:31:30,334 > DatabaseDescriptor.java:540 - Only 2.339GiB free across all data volumes. > Consider adding more capacity to your cluster or removing obsolete > snapshots[WARN] [main] 2020-07-02 20:31:30,763 NativeLibrary.java:187 - > Unable to lock JVM memory (ENOMEM). This can result in part of the JVM being > swapped out, especially with mmapped I/O enabled. Increase RLIMIT_MEMLOCK or > run Cassandra as root.[WARN] [main] 2020-07-02 20:31:30,764 > StartupChecks.java:127 - jemalloc shared library could not be preloaded to > speed up memory allocations[WARN] [main] 2020-07-02 20:31:30,764 > StartupChecks.java:201 - Non-Oracle JVM detected. Some features, such as > immediate unmap of compacted SSTables, may not work as intended[WARN] [main] > 2020-07-02 20:31:30,786 SigarLibrary.java:174 - Cassandra server running in > degraded mode. Is swap disabled? : false, Address space adequate? : true, > nofile limit adequate? : false, nproc limit adequate? : true [WARN] [main] > 2020-07-02 20:31:30,789 StartupChecks.java:265 - Maximum number of memory map > areas per process (vm.max_map_count) 65530 is too low, recommended value: > 1048575, you can change it with sysctl. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15947) nodetool gossipinfo doc does not document the output
[ https://issues.apache.org/jira/browse/CASSANDRA-15947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15947: - Complexity: Low Hanging Fruit > nodetool gossipinfo doc does not document the output > > > Key: CASSANDRA-15947 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15947 > Project: Cassandra > Issue Type: Improvement >Reporter: Jens Rantil >Priority: Low > > [https://cassandra.apache.org/doc/latest/tools/nodetool/gossipinfo.html] does > not contain any sample output, nor does it explain what the fields mean. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15947) nodetool gossipinfo doc does not document the output
[ https://issues.apache.org/jira/browse/CASSANDRA-15947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15947: - Priority: Normal (was: Low) > nodetool gossipinfo doc does not document the output > > > Key: CASSANDRA-15947 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15947 > Project: Cassandra > Issue Type: Improvement >Reporter: Jens Rantil >Priority: Normal > > [https://cassandra.apache.org/doc/latest/tools/nodetool/gossipinfo.html] does > not contain any sample output, nor does it explain what the fields mean. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15947) nodetool gossipinfo doc does not document the output
[ https://issues.apache.org/jira/browse/CASSANDRA-15947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15947: - Priority: Low (was: Normal) > nodetool gossipinfo doc does not document the output > > > Key: CASSANDRA-15947 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15947 > Project: Cassandra > Issue Type: Improvement >Reporter: Jens Rantil >Priority: Low > > [https://cassandra.apache.org/doc/latest/tools/nodetool/gossipinfo.html] does > not contain any sample output, nor does it explain what the fields mean. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13701) Lower default num_tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168411#comment-17168411 ] Jeremy Hanna commented on CASSANDRA-13701: -- I'm finally getting things going with dtests on a server (after spending hours trying to run them on my laptop). One of the failures with the num_tokens update with bootstrap.py just needed to have a little more time.sleep - from 5 to 10 seconds. I'm going through all of them to see if I can fix them in some way and will then see if I can work with someone to get an updated set of dtests running on the jenkins server. > Lower default num_tokens > > > Key: CASSANDRA-13701 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13701 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Chris Lohfink >Assignee: Jeremy Hanna >Priority: Low > Fix For: 4.0-alpha > > > For reasons highlighted in CASSANDRA-7032, the high number of vnodes is not > necessary. It is very expensive for operations processes and scanning. Its > come up a lot and its pretty standard and known now to always reduce the > num_tokens within the community. We should just lower the defaults. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13701) Lower default num_tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155036#comment-17155036 ] Jeremy Hanna commented on CASSANDRA-13701: -- So if it's a matter of, as Dimitar says, making the bootstraps sequential, then that isn't strictly an error so much as an unfortunate side effect of the new algorithm with dtest parallelism. So there appear to be two paths forward: 1) Use the randomized algorithm both in tests and in the defaults with a higher num_tokens count 2) Change the dtests with bootstrapping/joining to be sequential with the new defaults Is it possible to start by trying option 2 and see where that gets us in terms of dtest runtimes and errors? I don't want to go down a rabbit hole but it would be nice to quantify the trade-offs. > Lower default num_tokens > > > Key: CASSANDRA-13701 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13701 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Chris Lohfink >Assignee: Jeremy Hanna >Priority: Low > > For reasons highlighted in CASSANDRA-7032, the high number of vnodes is not > necessary. It is very expensive for operations processes and scanning. Its > come up a lot and its pretty standard and known now to always reduce the > num_tokens within the community. We should just lower the defaults. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14902) Update the default for compaction_throughput_mb_per_sec
[ https://issues.apache.org/jira/browse/CASSANDRA-14902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153426#comment-17153426 ] Jeremy Hanna commented on CASSANDRA-14902: -- Dev mailing list discussion on this and {{num_tokens}} update to the defaults: https://lists.apache.org/thread.html/r3cdf12db175c3f49a7ecda7632c821c5ef37fd0d95ffdc0e28e2d120%40%3Cdev.cassandra.apache.org%3E > Update the default for compaction_throughput_mb_per_sec > --- > > Key: CASSANDRA-14902 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14902 > Project: Cassandra > Issue Type: Task > Components: Local/Compaction, Local/Config >Reporter: Jeremy Hanna >Assignee: Jeremy Hanna >Priority: Low > > compaction_throughput_mb_per_sec has been at 16 since probably 0.6 or 0.7 > back when a lot of people had to deploy on spinning disks. It seems like it > would make sense to update the default to something more reasonable - > assuming a reasonably decent SSD and competing IO. One idea that could be > bikeshedded to death could be to just default it to 64 - simply to avoid > people from having to always change that any time they download a new version > as well as avoid problems with new users thinking that the defaults are sane. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
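As a purely illustrative sketch of what the proposed default change amounts to, here is the one-line cassandra.yaml edit applied to a scratch copy; 64 is the value floated in the ticket description, not a settled default, and the path is a throwaway file:

```shell
# Sketch: bump compaction_throughput_mb_per_sec from 16 to 64 in a scratch
# copy of cassandra.yaml (value and path are illustrative only).
yaml=/tmp/cassandra-throughput.yaml
cat > "$yaml" <<'EOF'
compaction_throughput_mb_per_sec: 16
EOF
sed -i 's/^compaction_throughput_mb_per_sec:.*/compaction_throughput_mb_per_sec: 64/' "$yaml"
grep '^compaction_throughput_mb_per_sec' "$yaml"
```

On a running node the same setting can also be adjusted without a restart via `nodetool setcompactionthroughput`, which is handy for experimenting before committing a new default.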
[jira] [Updated] (CASSANDRA-15931) USING_G1 is incorrectly set in cassandra-env.sh if G1 is explicitly disabled
[ https://issues.apache.org/jira/browse/CASSANDRA-15931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15931: - Component/s: Local/Startup and Shutdown > USING_G1 is incorrectly set in cassandra-env.sh if G1 is explicitly disabled > > > Key: CASSANDRA-15931 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15931 > Project: Cassandra > Issue Type: Bug > Components: Local/Startup and Shutdown >Reporter: Jeremy Hanna >Assignee: Jeremy Hanna >Priority: Normal > > {code} > echo $JVM_OPTS | grep -q UseG1GC > USING_G1=$? > {code} > This code will set {{USING_G1}} to {{0}} if G1 is explicitly enabled > ({{+UseG1GC}}) *or* explicitly disabled ({{-UseG1GC}}), as found on > CASSANDRA-15839. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15931) USING_G1 is incorrectly set in cassandra-env.sh if G1 is explicitly disabled
[ https://issues.apache.org/jira/browse/CASSANDRA-15931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15931: - Status: Triage Needed (was: Awaiting Feedback) > USING_G1 is incorrectly set in cassandra-env.sh if G1 is explicitly disabled > > > Key: CASSANDRA-15931 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15931 > Project: Cassandra > Issue Type: Bug >Reporter: Jeremy Hanna >Assignee: Jeremy Hanna >Priority: Normal > > {code} > echo $JVM_OPTS | grep -q UseG1GC > USING_G1=$? > {code} > This code will set {{USING_G1}} to {{0}} if G1 is explicitly enabled > ({{+UseG1GC}}) *or* explicitly disabled ({{-UseG1GC}}), as found on > CASSANDRA-15839. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-15931) USING_G1 is incorrectly set in cassandra-env.sh if G1 is explicitly disabled
[ https://issues.apache.org/jira/browse/CASSANDRA-15931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna reassigned CASSANDRA-15931: Assignee: Jeremy Hanna > USING_G1 is incorrectly set in cassandra-env.sh if G1 is explicitly disabled > > > Key: CASSANDRA-15931 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15931 > Project: Cassandra > Issue Type: Bug >Reporter: Jeremy Hanna >Assignee: Jeremy Hanna >Priority: Normal > > {code} > echo $JVM_OPTS | grep -q UseG1GC > USING_G1=$? > {code} > This code will set {{USING_G1}} to {{0}} if G1 is explicitly enabled > ({{+UseG1GC}}) *or* explicitly disabled ({{-UseG1GC}}), as found on > CASSANDRA-15839. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15931) USING_G1 is incorrectly set in cassandra-env.sh if G1 is explicitly disabled
[ https://issues.apache.org/jira/browse/CASSANDRA-15931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15931: - Status: Awaiting Feedback (was: Triage Needed) > USING_G1 is incorrectly set in cassandra-env.sh if G1 is explicitly disabled > > > Key: CASSANDRA-15931 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15931 > Project: Cassandra > Issue Type: Bug >Reporter: Jeremy Hanna >Assignee: Jeremy Hanna >Priority: Normal > > {code} > echo $JVM_OPTS | grep -q UseG1GC > USING_G1=$? > {code} > This code will set {{USING_G1}} to {{0}} if G1 is explicitly enabled > ({{+UseG1GC}}) *or* explicitly disabled ({{-UseG1GC}}), as found on > CASSANDRA-15839. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15931) USING_G1 is incorrectly set in cassandra-env.sh if G1 is explicitly disabled
[ https://issues.apache.org/jira/browse/CASSANDRA-15931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153201#comment-17153201 ] Jeremy Hanna commented on CASSANDRA-15931: -- PR after testing that it matched {{+UseG1GC}} but not {{-UseG1GC}}: https://github.com/apache/cassandra/pull/667 > USING_G1 is incorrectly set in cassandra-env.sh if G1 is explicitly disabled > > > Key: CASSANDRA-15931 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15931 > Project: Cassandra > Issue Type: Bug > Components: Local/Startup and Shutdown >Reporter: Jeremy Hanna >Assignee: Jeremy Hanna >Priority: Normal > Time Spent: 10m > Remaining Estimate: 0h > > {code} > echo $JVM_OPTS | grep -q UseG1GC > USING_G1=$? > {code} > This code will set {{USING_G1}} to {{0}} if G1 is explicitly enabled > ({{+UseG1GC}}) *or* explicitly disabled ({{-UseG1GC}}), as found on > CASSANDRA-15839. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15931) USING_G1 is incorrectly set in cassandra-env.sh if G1 is explicitly disabled
[ https://issues.apache.org/jira/browse/CASSANDRA-15931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15931: - Description: {code} echo $JVM_OPTS | grep -q UseG1GC USING_G1=$? {code} This code will set {{USING_G1}} to {{0}} if G1 is explicitly enabled ({{+UseG1GC}}) *or* explicitly disabled ({{-UseG1GC}}), as found on CASSANDRA-15839. was: {code} echo $JVM_OPTS | grep -q UseG1GC USING_G1=$? {code} This code will set {{USING_G1}} to {{0}} if G1 is explicitly enabled *or* explicitly disabled, as found on CASSANDRA-15839. > USING_G1 is incorrectly set in cassandra-env.sh if G1 is explicitly disabled > > > Key: CASSANDRA-15931 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15931 > Project: Cassandra > Issue Type: Bug >Reporter: Jeremy Hanna >Priority: Normal > > {code} > echo $JVM_OPTS | grep -q UseG1GC > USING_G1=$? > {code} > This code will set {{USING_G1}} to {{0}} if G1 is explicitly enabled > ({{+UseG1GC}}) *or* explicitly disabled ({{-UseG1GC}}), as found on > CASSANDRA-15839. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-15839) Warn or fail to start server when G1 is used and Xmn is set
[ https://issues.apache.org/jira/browse/CASSANDRA-15839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153193#comment-17153193 ] Jeremy Hanna edited comment on CASSANDRA-15839 at 7/8/20, 2:49 AM: --- This is the code that *should* detect whether or not G1 is enabled. However you can either enable or disable it with {{+UseG1GC}} or {{-UseG1GC}} which are both matched by that {{grep}}. I have to admit I've never seen anyone in practice explicitly disable G1. {code:sh} echo $JVM_OPTS | grep -q UseG1GC USING_G1=$? {code} That also affects whether we calculate heap sizes automatically or whether we fail out when we don't set the heap and new size in pairs: {code:sh} # only calculate the size if it's not set manually if [ "x$MAX_HEAP_SIZE" = "x" ] && [ "x$HEAP_NEWSIZE" = "x" -o $USING_G1 -eq 0 ]; then calculate_heap_sizes elif [ "x$MAX_HEAP_SIZE" = "x" ] || [ "x$HEAP_NEWSIZE" = "x" -a $USING_G1 -ne 0 ]; then echo "please set or unset MAX_HEAP_SIZE and HEAP_NEWSIZE in pairs when using CMS GC (see cassandra-env.sh)" exit 1 fi {code} Created CASSANDRA-15931 to address this. was (Author: jeromatron): This is the code that *should* detect whether or not G1 is enabled. However since you can either enable or disable it with that string {{+UseG1GC}} or {{-UseG1GC}}. I have to admit I've never seen anyone in practice explicitly disable G1. {code:sh} echo $JVM_OPTS | grep -q UseG1GC USING_G1=$? {code} That also affects whether we calculate heap sizes automatically or whether we fail out when we don't set the heap and new size in pairs: {code:sh} # only calculate the size if it's not set manually if [ "x$MAX_HEAP_SIZE" = "x" ] && [ "x$HEAP_NEWSIZE" = "x" -o $USING_G1 -eq 0 ]; then calculate_heap_sizes elif [ "x$MAX_HEAP_SIZE" = "x" ] || [ "x$HEAP_NEWSIZE" = "x" -a $USING_G1 -ne 0 ]; then echo "please set or unset MAX_HEAP_SIZE and HEAP_NEWSIZE in pairs when using CMS GC (see cassandra-env.sh)" exit 1 fi {code} Created CASSANDRA-15931 to address this. 
> Warn or fail to start server when G1 is used and Xmn is set > --- > > Key: CASSANDRA-15839 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15839 > Project: Cassandra > Issue Type: Improvement > Components: Local/Startup and Shutdown >Reporter: Jeremy Hanna >Assignee: Anthony Grasso >Priority: Normal > Fix For: 4.0, 4.0-alpha5 > > > In jvm.options, we currently have a comment above where Xmn is set that says > that you shouldn't set Xmn with G1 GC. That isn't enough - we should either > warn in the logs or fail startup when they are set together. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15839) Warn or fail to start server when G1 is used and Xmn is set
[ https://issues.apache.org/jira/browse/CASSANDRA-15839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153193#comment-17153193 ] Jeremy Hanna commented on CASSANDRA-15839: -- This is the code that *should* detect whether or not G1 is enabled. However you can either enable or disable it with {{+UseG1GC}} or {{-UseG1GC}}, which are both matched by that {{grep}}. I have to admit I've never seen anyone in practice explicitly disable G1. {code:sh} echo $JVM_OPTS | grep -q UseG1GC USING_G1=$? {code} That also affects whether we calculate heap sizes automatically or whether we fail out when we don't set the heap and new size in pairs: {code:sh} # only calculate the size if it's not set manually if [ "x$MAX_HEAP_SIZE" = "x" ] && [ "x$HEAP_NEWSIZE" = "x" -o $USING_G1 -eq 0 ]; then calculate_heap_sizes elif [ "x$MAX_HEAP_SIZE" = "x" ] || [ "x$HEAP_NEWSIZE" = "x" -a $USING_G1 -ne 0 ]; then echo "please set or unset MAX_HEAP_SIZE and HEAP_NEWSIZE in pairs when using CMS GC (see cassandra-env.sh)" exit 1 fi {code} Created CASSANDRA-15931 to address this. > Warn or fail to start server when G1 is used and Xmn is set > --- > > Key: CASSANDRA-15839 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15839 > Project: Cassandra > Issue Type: Improvement > Components: Local/Startup and Shutdown >Reporter: Jeremy Hanna >Assignee: Anthony Grasso >Priority: Normal > Fix For: 4.0, 4.0-alpha5 > > > In jvm.options, we currently have a comment above where Xmn is set that says > that you shouldn't set Xmn with G1 GC. That isn't enough - we should either > warn in the logs or fail startup when they are set together. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
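A minimal sketch of a stricter detection, matching only the enabling flag; this illustrates the direction of the fix, not the actual patch attached to CASSANDRA-15931:

```shell
# Sketch: treat G1 as enabled only when "+UseG1GC" appears in the options,
# so an explicit "-XX:-UseG1GC" no longer trips the check. USING_G1=0 means
# "G1 in use", mirroring cassandra-env.sh's exit-status convention.
JVM_OPTS="-Xms4G -XX:-UseG1GC"          # G1 explicitly disabled here
if echo "$JVM_OPTS" | grep -qF '+UseG1GC'; then
    USING_G1=0
else
    USING_G1=1
fi
echo "USING_G1=$USING_G1"               # prints USING_G1=1
```

`grep -F` matches the `+` literally, sidestepping any regex-escaping surprises with `\+` in basic regular expressions.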
[jira] [Created] (CASSANDRA-15931) USING_G1 is incorrectly set in cassandra-env.sh if G1 is explicitly disabled
Jeremy Hanna created CASSANDRA-15931: Summary: USING_G1 is incorrectly set in cassandra-env.sh if G1 is explicitly disabled Key: CASSANDRA-15931 URL: https://issues.apache.org/jira/browse/CASSANDRA-15931 Project: Cassandra Issue Type: Bug Reporter: Jeremy Hanna {code} echo $JVM_OPTS | grep -q UseG1GC USING_G1=$? {code} This code will set {{USING_G1}} to {{0}} if G1 is explicitly enabled *or* explicitly disabled, as found on CASSANDRA-15839. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15803) Separate out allow filtering scanning through a partition versus scanning over the table
[ https://issues.apache.org/jira/browse/CASSANDRA-15803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15803: - Description: Currently allow filtering can mean two things in the spirit of "avoid operations that don't seek to a specific row or sequential rows of data." First, it can mean scanning across the entire table to meet the criteria of the query. That's almost always a bad thing and should be discouraged or disabled (see CASSANDRA-8303). Second, it can mean filtering within a specific partition. For example, in a query you could specify the full partition key and if you specify a criterion on a non-key field, it requires allow filtering. The second reason to require allow filtering is significantly less work to scan through a partition. It is still extra work over seeking to a specific row and getting N sequential rows though. So while an application developer and/or operator needs to be cautious about this second type, it's not necessarily a bad thing, depending on the table and the use case. I propose that we separate the way to specify allow filtering across an entire table from specifying allow filtering across a partition in a backwards compatible way. One idea that was brought up in Slack in the cassandra-dev room was to have allow filtering mean the superset - scanning across the table. Then if you want to specify that you *only* want to scan within a partition you would use something like {{ALLOW FILTERING [WITHIN PARTITION]}} So it will succeed if you specify non-key criteria within a single partition, but fail with a message to say it requires the full allow filtering. This would allow for a backwards compatible full allow filtering while allowing a user to specify that they want to just scan within a partition, but error out if trying to scan a full table. 
This is potentially also related to the capability limitation framework by which operators could more granularly specify what features are allowed or disallowed per user, discussed in CASSANDRA-8303. This way an operator could disallow the more general allow filtering while allowing the partition scan (or disallow them both at their discretion). was: Currently allow filtering can mean two things in the spirit of "avoid operations that don't seek to a specific row or sequential rows of data." First, it can mean scanning across the entire table to meet the criteria of the query. That's almost always a bad thing and should be discouraged or disabled (see CASSANDRA-8303). Second, it can mean filtering within a specific partition. For example, in a query you could specify the full partition key and if you specify a criterion on a non-key field, it requires allow filtering. The second reason to require allow filtering is significantly less work to scan through a partition. It is still extra work over seeking to a specific row and getting N sequential rows though. So while an application developer and/or operator needs to be cautious about this second type, it's not necessarily a bad thing, depending on the table and the use case. I propose that we separate the way to specify allow filtering across an entire table (involving a scatter gather) from specifying allow filtering across a partition in a backwards compatible way. One idea that was brought up in Slack in the cassandra-dev room was to have allow filtering mean the superset - scanning across the table. Then if you want to specify that you *only* want to scan within a partition you would use something like {{ALLOW FILTERING [WITHIN PARTITION]}} So it will succeed if you specify non-key criteria within a single partition, but fail with a message to say it requires the full allow filtering. 
This would allow for a backwards compatible full allow filtering while allowing a user to specify that they want to just scan within a partition, but error out if trying to scan a full table. This is potentially also related to the capability limitation framework by which operators could more granularly specify what features are allowed or disallowed per user, discussed in CASSANDRA-8303. This way an operator could disallow the more general allow filtering while allowing the partition scan (or disallow them both at their discretion). > Separate out allow filtering scanning through a partition versus scanning > over the table > > > Key: CASSANDRA-15803 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15803 > Project: Cassandra > Issue Type: Improvement > Components: CQL/Syntax >Reporter: Jeremy Hanna >Priority: Normal > > Currently allow filtering can mean two things in the spirit of "avoid > operations that don't seek to a specific row or
[jira] [Comment Edited] (CASSANDRA-13701) Lower default num_tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152460#comment-17152460 ] Jeremy Hanna edited comment on CASSANDRA-13701 at 7/7/20, 3:28 AM: --- Can we also standardize the tests to use the default values - that is, from 32 to the new defaults (16 {{num_tokens}} with {{allocate_tokens_for_local_replication_factor=3}} uncommented). was (Author: jeromatron): Can we also standardize the tests to use the default values - that is, from 32 to the new defaults (16 {{num_tokens}} with {{allocate_tokens_for_local_replication_factor=3}} uncommented. > Lower default num_tokens > > > Key: CASSANDRA-13701 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13701 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Chris Lohfink >Assignee: Jeremy Hanna >Priority: Low > > For reasons highlighted in CASSANDRA-7032, the high number of vnodes is not > necessary. It is very expensive for operations processes and scanning. Its > come up a lot and its pretty standard and known now to always reduce the > num_tokens within the community. We should just lower the defaults. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13701) Lower default num_tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17152460#comment-17152460 ] Jeremy Hanna commented on CASSANDRA-13701: -- Can we also standardize the tests to use the default values - that is, from 32 to the new defaults (16 {{num_tokens}} with {{allocate_tokens_for_local_replication_factor=3}} uncommented). > Lower default num_tokens > > > Key: CASSANDRA-13701 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13701 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Chris Lohfink >Assignee: Jeremy Hanna >Priority: Low > > For reasons highlighted in CASSANDRA-7032, the high number of vnodes is not > necessary. It is very expensive for operational processes and scanning. It's > come up a lot, and it's now standard and well known within the community to > always reduce num_tokens. We should just lower the defaults. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13701) Lower default num_tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-13701: - Test and Documentation Plan: Associated documentation about num_tokens is in [https://cassandra.apache.org/doc/latest/getting_started/production.html#tokens] as part of CASSANDRA-15618, with upgrading information in NEWS.txt. Status: Patch Available (was: In Progress) Pull request: https://github.com/apache/cassandra/pull/663 > Lower default num_tokens > > > Key: CASSANDRA-13701 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13701 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Chris Lohfink >Assignee: Jeremy Hanna >Priority: Low > > For reasons highlighted in CASSANDRA-7032, the high number of vnodes is not > necessary. It is very expensive for operational processes and scanning. It's > come up a lot, and it's now standard and well known within the community to > always reduce num_tokens. We should just lower the defaults. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
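For reference, the new defaults being standardized on in this thread would look roughly like the following cassandra.yaml fragment. This is a sketch only; the exact comment wording shipped in the yaml may differ:

```yaml
# Sketch of the proposed defaults discussed in this thread.
num_tokens: 16

# Uncommented by default so the improved token allocation algorithm is
# used, assuming the common replication factor of 3 in the local DC.
allocate_tokens_for_local_replication_factor: 3
```

Standardizing the tests on these same values means the defaults ship with the most test coverage behind them.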
[jira] [Commented] (CASSANDRA-14902) Update the default for compaction_throughput_mb_per_sec
[ https://issues.apache.org/jira/browse/CASSANDRA-14902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151826#comment-17151826 ] Jeremy Hanna commented on CASSANDRA-14902: -- Added a NEWS.txt entry in the upgrading section to the PR. > Update the default for compaction_throughput_mb_per_sec > --- > > Key: CASSANDRA-14902 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14902 > Project: Cassandra > Issue Type: Task > Components: Local/Compaction, Local/Config >Reporter: Jeremy Hanna >Assignee: Jeremy Hanna >Priority: Low > > compaction_throughput_mb_per_sec has been at 16 since probably 0.6 or 0.7 > back when a lot of people had to deploy on spinning disks. It seems like it > would make sense to update the default to something more reasonable - > assuming a reasonably decent SSD and competing IO. One idea that could be > bikeshedded to death could be to just default it to 64 - simply to avoid > people from having to always change that any time they download a new version > as well as avoid problems with new users thinking that the defaults are sane. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14902) Update the default for compaction_throughput_mb_per_sec
[ https://issues.apache.org/jira/browse/CASSANDRA-14902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17151696#comment-17151696 ] Jeremy Hanna commented on CASSANDRA-14902: -- I assumed that updating to 64 would be uncontroversial because that's what I know many change it to (including myself) as a first step/starting point. If we want to do more extensive comparison testing of different values, that's fine, but I think it would depend on the goal. IO is going to be different for every system and every workload/pattern is going to be somewhat unique. I thought 64 would at least make it not *required* to change it from the default as a first step. > Update the default for compaction_throughput_mb_per_sec > --- > > Key: CASSANDRA-14902 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14902 > Project: Cassandra > Issue Type: Task > Components: Local/Compaction, Local/Config >Reporter: Jeremy Hanna >Assignee: Jeremy Hanna >Priority: Low > > compaction_throughput_mb_per_sec has been at 16 since probably 0.6 or 0.7 > back when a lot of people had to deploy on spinning disks. It seems like it > would make sense to update the default to something more reasonable - > assuming a reasonably decent SSD and competing IO. One idea that could be > bikeshedded to death could be to just default it to 64 - simply to avoid > people from having to always change that any time they download a new version > as well as avoid problems with new users thinking that the defaults are sane. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14902) Update the default for compaction_throughput_mb_per_sec
[ https://issues.apache.org/jira/browse/CASSANDRA-14902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-14902: - Test and Documentation Plan: Just updated the default value and comments so don't need much. Status: Patch Available (was: In Progress) The pull request: https://github.com/apache/cassandra/pull/662 > Update the default for compaction_throughput_mb_per_sec > --- > > Key: CASSANDRA-14902 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14902 > Project: Cassandra > Issue Type: Task > Components: Local/Compaction, Local/Config >Reporter: Jeremy Hanna >Assignee: Jeremy Hanna >Priority: Low > > compaction_throughput_mb_per_sec has been at 16 since probably 0.6 or 0.7 > back when a lot of people had to deploy on spinning disks. It seems like it > would make sense to update the default to something more reasonable - > assuming a reasonably decent SSD and competing IO. One idea that could be > bikeshedded to death could be to just default it to 64 - simply to avoid > people from having to always change that any time they download a new version > as well as avoid problems with new users thinking that the defaults are sane. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
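For illustration, the change under discussion amounts to this cassandra.yaml fragment (a sketch; the shipped comment text may differ):

```yaml
# Proposed new default (previously 16), assuming a reasonably decent SSD
# with competing I/O; tune per hardware and workload.
compaction_throughput_mb_per_sec: 64
```

The value can also be changed at runtime on a live node with `nodetool setcompactionthroughput 64`, which is a convenient way to trial a higher cap before committing to a new default in the yaml.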
[jira] [Assigned] (CASSANDRA-14902) Update the default for compaction_throughput_mb_per_sec
[ https://issues.apache.org/jira/browse/CASSANDRA-14902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna reassigned CASSANDRA-14902: Assignee: Jeremy Hanna > Update the default for compaction_throughput_mb_per_sec > --- > > Key: CASSANDRA-14902 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14902 > Project: Cassandra > Issue Type: Task > Components: Local/Compaction, Local/Config >Reporter: Jeremy Hanna >Assignee: Jeremy Hanna >Priority: Low > > compaction_throughput_mb_per_sec has been at 16 since probably 0.6 or 0.7 > back when a lot of people had to deploy on spinning disks. It seems like it > would make sense to update the default to something more reasonable - > assuming a reasonably decent SSD and competing IO. One idea that could be > bikeshedded to death could be to just default it to 64 - simply to avoid > people from having to always change that any time they download a new version > as well as avoid problems with new users thinking that the defaults are sane. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15860) Cannot change the number of tokens from 512 to 256 Fatal configuration error; unable to start server.
[ https://issues.apache.org/jira/browse/CASSANDRA-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15860: - Resolution: Not A Bug Status: Resolved (was: Triage Needed) As mentioned previously, it's simply not possible to change num_tokens after data is written to a data center. > Cannot change the number of tokens from 512 to 256 Fatal configuration error; > unable to start server. > - > > Key: CASSANDRA-15860 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15860 > Project: Cassandra > Issue Type: Bug >Reporter: Krishnakumar Jinka >Priority: Normal > > Hello, I was following this issue from jira: > https://issues.apache.org/jira/browse/CASSANDRA-11811?jql=text%20~%20%22CassandraDaemon.java%20Cannot%20change%22 > . We are using 3.11.2 and i see this error in the log while starting the > cassandra, and it fails. I read the jira and understood that mutation > happens, thereby doubling the number of tokens, and hence due to mismatch > INFO [main] [2020-05-28 11:05:14] OutboundTcpConnection.java:108 - > OutboundTcpConnection using coalescing strategy DISABLED > INFO [HANDSHAKE-/192.168.5.53] [2020-05-28 11:05:14] > OutboundTcpConnection.java:560 - Handshaking version with /192.168.5.53 > INFO [main] [2020-05-28 11:05:15] StorageService.java:707 - Loading persisted > ring state > INFO [main] [2020-05-28 11:05:15] StorageService.java:825 - Starting up > server gossip > INFO [main] [2020-05-28 11:05:15] TokenMetadata.java:479 - Updating topology > for /192.168.5.52 > INFO [main] [2020-05-28 11:05:15] TokenMetadata.java:479 - Updating topology > for /192.168.5.52 > Cannot change the number of tokens from 512 to 256 > Fatal configuration error; unable to start server. See log for stacktrace. 
> ERROR [main] [2020-05-28 11:05:15] CassandraDaemon.java:708 - Fatal > configuration error > org.apache.cassandra.exceptions.ConfigurationException: Cannot change the > number of tokens from 512 to 256 > at > org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:989) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:682) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:613) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:379) > [apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:602) > [apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:691) > [apache-cassandra-3.11.2.jar:3.11.2] > INFO [StorageServiceShutdownHook] [2020-05-28 11:05:15] HintsService.java:220 > - Paused hints dispatch > INFO [StorageServiceShutdownHook] [2020-05-28 11:05:15] Gossiper.java:1540 - > Announcing shutdown > INFO [StorageServiceShutdownHook] [2020-05-28 11:05:15] > StorageService.java:2292 - Node /192.168.5.52 state jump to shutdown > INFO [HANDSHAKE-/192.168.5.53] [2020-05-28 11:05:15] > OutboundTcpConnection.java:560 - Handshaking version with /192.168.5.53 > I would like to know > # what would be the root cause of this error > # How to recover from this error. Because everytime i start the Cassandra, > it is blocked due to this. > /etc/cassandra/conf/cassandra.yaml > contains num_tokens as 256 , auto_bootstrap is not provided, i guess by > default it will be true. 
> INFO [main] [2020-05-28 11:05:13] StorageService.java:618 - Cassandra > version: 3.11.2 > INFO [main] [2020-05-28 11:05:13] StorageService.java:619 - Thrift API > version: 20.1.0 > INFO [main] [2020-05-28 11:05:13] StorageService.java:620 - CQL supported > versions: 3.4.4 (default: 3.4.4) > INFO [main] [2020-05-28 11:05:13] StorageService.java:622 - Native protocol > supported versions: 3/v3, 4/v4, 5/v5-beta (default: 4/v4) > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15860) Cannot change the number of tokens from 512 to 256 Fatal configuration error; unable to start server.
[ https://issues.apache.org/jira/browse/CASSANDRA-15860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17136328#comment-17136328 ] Jeremy Hanna commented on CASSANDRA-15860: -- You cannot change num_tokens on a running cluster. You have to either: # add another logical datacenter with a different num_tokens value (the normal add-datacenter procedure: add the nodes without replication, add replication, and then run a nodetool rebuild on each new node) or # create a new cluster with a different num_tokens value and load the data from the original cluster with sstableloader I would go with option 1 if you can. The reason you can't change it after data has been written is that data is already stored on the node for the 512 token ranges it has already claimed. You can't change that without rewriting the data around the cluster. So it's simplest to add a new DC where that data can be written again. See this [blog post|https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html] for setting up token allocation optimally. > Cannot change the number of tokens from 512 to 256 Fatal configuration error; > unable to start server. > - > > Key: CASSANDRA-15860 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15860 > Project: Cassandra > Issue Type: Bug >Reporter: Krishnakumar Jinka >Priority: Normal > > Hello, I was following this issue from jira: > https://issues.apache.org/jira/browse/CASSANDRA-11811?jql=text%20~%20%22CassandraDaemon.java%20Cannot%20change%22 > . We are using 3.11.2 and i see this error in the log while starting the > cassandra, and it fails. 
I read the jira and understood that mutation > happens, thereby doubling the number of tokens, and hence due to mismatch > INFO [main] [2020-05-28 11:05:14] OutboundTcpConnection.java:108 - > OutboundTcpConnection using coalescing strategy DISABLED > INFO [HANDSHAKE-/192.168.5.53] [2020-05-28 11:05:14] > OutboundTcpConnection.java:560 - Handshaking version with /192.168.5.53 > INFO [main] [2020-05-28 11:05:15] StorageService.java:707 - Loading persisted > ring state > INFO [main] [2020-05-28 11:05:15] StorageService.java:825 - Starting up > server gossip > INFO [main] [2020-05-28 11:05:15] TokenMetadata.java:479 - Updating topology > for /192.168.5.52 > INFO [main] [2020-05-28 11:05:15] TokenMetadata.java:479 - Updating topology > for /192.168.5.52 > Cannot change the number of tokens from 512 to 256 > Fatal configuration error; unable to start server. See log for stacktrace. > ERROR [main] [2020-05-28 11:05:15] CassandraDaemon.java:708 - Fatal > configuration error > org.apache.cassandra.exceptions.ConfigurationException: Cannot change the > number of tokens from 512 to 256 > at > org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:989) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:682) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:613) > ~[apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:379) > [apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:602) > [apache-cassandra-3.11.2.jar:3.11.2] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:691) > [apache-cassandra-3.11.2.jar:3.11.2] > INFO [StorageServiceShutdownHook] [2020-05-28 11:05:15] HintsService.java:220 > - Paused hints dispatch > INFO [StorageServiceShutdownHook] 
[2020-05-28 11:05:15] Gossiper.java:1540 - > Announcing shutdown > INFO [StorageServiceShutdownHook] [2020-05-28 11:05:15] > StorageService.java:2292 - Node /192.168.5.52 state jump to shutdown > INFO [HANDSHAKE-/192.168.5.53] [2020-05-28 11:05:15] > OutboundTcpConnection.java:560 - Handshaking version with /192.168.5.53 > I would like to know > # what would be the root cause of this error > # How to recover from this error. Because everytime i start the Cassandra, > it is blocked due to this. > /etc/cassandra/conf/cassandra.yaml > contains num_tokens as 256 , auto_bootstrap is not provided, i guess by > default it will be true. > INFO [main] [2020-05-28 11:05:13] StorageService.java:618 - Cassandra > version: 3.11.2 > INFO [main] [2020-05-28 11:05:13] StorageService.java:619 - Thrift API > version: 20.1.0 > INFO [main] [2020-05-28 11:05:13] StorageService.java:620 - CQL supported > versions: 3.4.4 (default: 3.4.4) > INFO [main] [2020-05-28 11:05:13] StorageService.java:622 - Native protocol > supported versions: 3/v3, 4/v4, 5/v5-beta (default: 4/v4) > -- This message was sent by
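The add-a-datacenter option recommended in the comment above can be outlined as follows. This is a pseudocode-style operational sketch, not a runnable script: the datacenter and keyspace names are hypothetical, and every step assumes a healthy running cluster.

```shell
# Sketch of option 1: moving to a different num_tokens via a new logical DC.
# "dc_old", "dc_new", and "my_ks" are hypothetical names.

# 1. On each new node, set the desired num_tokens in cassandra.yaml
#    before first start, then start the node (new DC, no replication yet).

# 2. Once the new DC is up, add replication for it, e.g. in cqlsh:
#      ALTER KEYSPACE my_ks WITH replication =
#        {'class': 'NetworkTopologyStrategy', 'dc_old': 3, 'dc_new': 3};

# 3. Stream the existing data to each new node from the old DC:
#      nodetool rebuild -- dc_old

# 4. After clients are repointed at dc_new, remove dc_old from the
#    replication settings and decommission its nodes.
```

The rebuild step is what rewrites the data for the new token ranges, which is exactly what cannot happen in place on an existing node.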
[jira] [Updated] (CASSANDRA-15839) Warn or fail to start server when G1 is used and Xmn is set
[ https://issues.apache.org/jira/browse/CASSANDRA-15839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15839: - Reviewers: Jon Haddad > Warn or fail to start server when G1 is used and Xmn is set > --- > > Key: CASSANDRA-15839 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15839 > Project: Cassandra > Issue Type: Improvement > Components: Local/Startup and Shutdown >Reporter: Jeremy Hanna >Assignee: Jeremy Hanna >Priority: Normal > > In jvm.options, we currently have a comment above where Xmn is set that says > that you shouldn't set Xmn with G1 GC. That isn't enough - we should either > warn in the logs or fail startup when they are set together. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15839) Warn or fail to start server when G1 is used and Xmn is set
[ https://issues.apache.org/jira/browse/CASSANDRA-15839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15839: - Test and Documentation Plan: I did some basic testing around the startup options and it works as expected. Status: Patch Available (was: Open) https://github.com/apache/cassandra/pull/607 > Warn or fail to start server when G1 is used and Xmn is set > --- > > Key: CASSANDRA-15839 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15839 > Project: Cassandra > Issue Type: Improvement > Components: Local/Startup and Shutdown >Reporter: Jeremy Hanna >Assignee: Jeremy Hanna >Priority: Normal > > In jvm.options, we currently have a comment above where Xmn is set that says > that you shouldn't set Xmn with G1 GC. That isn't enough - we should either > warn in the logs or fail startup when they are set together. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
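The startup check proposed in this ticket could be sketched in shell as below. The function name and warning text are assumptions for illustration, not the actual patch wired into cassandra-env.sh:

```shell
# Hypothetical sketch of the proposed check (names and wording assumed).
check_g1_xmn() {
  jvm_opts="$1"
  case "$jvm_opts" in
    *UseG1GC*)
      case "$jvm_opts" in
        *-Xmn*)
          # G1 sizes its own young generation; an explicit -Xmn defeats
          # its pause-time heuristics, so warn (or fail, per the ticket).
          echo "WARN: -Xmn should not be set when G1 GC is enabled" >&2
          return 1
          ;;
      esac
      ;;
  esac
  return 0
}
```

In cassandra-env.sh this could be invoked on the assembled JVM_OPTS string, warning by default and exiting non-zero if fail-fast behavior is chosen.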
[jira] [Updated] (CASSANDRA-15839) Warn or fail to start server when G1 is used and Xmn is set
[ https://issues.apache.org/jira/browse/CASSANDRA-15839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15839: - Change Category: Operability Complexity: Low Hanging Fruit Component/s: Local/Startup and Shutdown Status: Open (was: Triage Needed) > Warn or fail to start server when G1 is used and Xmn is set > --- > > Key: CASSANDRA-15839 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15839 > Project: Cassandra > Issue Type: Improvement > Components: Local/Startup and Shutdown >Reporter: Jeremy Hanna >Assignee: Jeremy Hanna >Priority: Normal > > In jvm.options, we currently have a comment above where Xmn is set that says > that you shouldn't set Xmn with G1 GC. That isn't enough - we should either > warn in the logs or fail startup when they are set together. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15839) Warn or fail to start server when G1 is used and Xmn is set
[ https://issues.apache.org/jira/browse/CASSANDRA-15839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15839: - Status: Triage Needed (was: Awaiting Feedback) > Warn or fail to start server when G1 is used and Xmn is set > --- > > Key: CASSANDRA-15839 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15839 > Project: Cassandra > Issue Type: Improvement >Reporter: Jeremy Hanna >Assignee: Jeremy Hanna >Priority: Normal > > In jvm.options, we currently have a comment above where Xmn is set that says > that you shouldn't set Xmn with G1 GC. That isn't enough - we should either > warn in the logs or fail startup when they are set together. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15839) Warn or fail to start server when G1 is used and Xmn is set
[ https://issues.apache.org/jira/browse/CASSANDRA-15839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15839: - Status: Awaiting Feedback (was: Triage Needed) > Warn or fail to start server when G1 is used and Xmn is set > --- > > Key: CASSANDRA-15839 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15839 > Project: Cassandra > Issue Type: Improvement >Reporter: Jeremy Hanna >Assignee: Jeremy Hanna >Priority: Normal > > In jvm.options, we currently have a comment above where Xmn is set that says > that you shouldn't set Xmn with G1 GC. That isn't enough - we should either > warn in the logs or fail startup when they are set together. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15521) Update default for num_tokens from 256 to something more reasonable
[ https://issues.apache.org/jira/browse/CASSANDRA-15521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15521: - Resolution: Duplicate Status: Resolved (was: Triage Needed) > Update default for num_tokens from 256 to something more reasonable > --- > > Key: CASSANDRA-15521 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15521 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Virtual Nodes >Reporter: Jeremy Hanna >Assignee: Jeremy Hanna >Priority: Normal > > The default for num_tokens or the number of token ranges assigned to a node > using virtual nodes is way too high. 256 token ranges makes repair painful. > Since it's a default, someone new to Cassandra won't know better and if left > unchanged, they will have to live with it or perform a migration to a new > datacenter with a lower number. > At the same time, going too low with the default allocation algorithm can > hotspot nodes to have more tokens assigned than others. There is a new token > allocation algorithm introduced but it's not default. > The proposal of this ticket is to set the default to something more > reasonable to align with best practices without using the new token algorithm > or giving it specific token values as some do. 32 is a good compromise and > is what the project uses in a lot of the tests that are done. > So generally it would be good to move to a more sane value and to align with > testing so users are more confident that the defaults have a lot of testing > behind them. > As discussed on the dev mailing list, we want to make sure this change to the > default doesn't come as an unpleasant surprise to cluster operators. For > num_tokens specifically, if you were to upgrade to a version with the new > default and the user didn't change it to the existing value, the node would > not start, saying you can't change the num_tokens on an existing node. 
So we > will want to put a release note to indicate that when upgrading, make a note > of the num_tokens change when looking at the new configuration. > Along with not being able to start nodes, which is fail-fast, there is the > matter of adding new nodes to the cluster. You can certainly add a new node > to a cluster or datacenter with a different number of token ranges assigned. > It will give that node a different amount of data to be responsible for. For > example, if the nodes in a datacenter all have num_tokens=256 (current > default) and you add a node to that datacenter with num_tokens=32 (new > default), it will only claim 1/8th of the token ranges and data as the other > nodes in that datacenter. Fortunately, this is a property that is explicitly > defined rather than implicit like some of the table settings. Also most if > not all operators will upgrade the existing nodes to that new version before > trying to add a node with that new version. So if there is a different > number for num_tokens on the existing nodes, they'll be aware of it > immediately. > In any case, this is a long proposal for what will be a small change in the > cassandra.yaml and something in the release notes, that is, changing the > default num_tokens value from 256 to 32. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-15839) Warn or fail to start server when G1 is used and Xmn is set
[ https://issues.apache.org/jira/browse/CASSANDRA-15839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna reassigned CASSANDRA-15839: Assignee: Jeremy Hanna > Warn or fail to start server when G1 is used and Xmn is set > --- > > Key: CASSANDRA-15839 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15839 > Project: Cassandra > Issue Type: Improvement >Reporter: Jeremy Hanna >Assignee: Jeremy Hanna >Priority: Normal > > In jvm.options, we currently have a comment above where Xmn is set that says > that you shouldn't set Xmn with G1 GC. That isn't enough - we should either > warn in the logs or fail startup when they are set together. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-15839) Warn or fail to start server when G1 is used and Xmn is set
Jeremy Hanna created CASSANDRA-15839: Summary: Warn or fail to start server when G1 is used and Xmn is set Key: CASSANDRA-15839 URL: https://issues.apache.org/jira/browse/CASSANDRA-15839 Project: Cassandra Issue Type: Improvement Reporter: Jeremy Hanna In jvm.options, we currently have a comment above where Xmn is set that says that you shouldn't set Xmn with G1 GC. That isn't enough - we should either warn in the logs or fail startup when they are set together. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-15823) Support for networking via identity instead of IP
[ https://issues.apache.org/jira/browse/CASSANDRA-15823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116489#comment-17116489 ] Jeremy Hanna edited comment on CASSANDRA-15823 at 5/26/20, 6:50 AM: Adding in related issues where host id (CASSANDRA-4120 for vnodes in 1.2) and previously token (CASSANDRA-1518 for 0.7) were put in place of IP address to identify nodes - for historical purposes. was (Author: jeromatron): Adding in related issues where host id (for vnodes in 1.2) and previously token (for 0.7) were put in place of IP address to identify nodes - for historical purposes. > Support for networking via identity instead of IP > - > > Key: CASSANDRA-15823 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15823 > Project: Cassandra > Issue Type: Improvement >Reporter: Christopher Bradford >Priority: Normal > Attachments: consul-mesh-gateways.png, > istio-multicluster-with-gateways.svg, linkerd-service-mirroring.svg > > > TL;DR: Instead of mapping host ids to IPs, use hostnames. This allows > resolution to different IP addresses per DC that may then be forwarded to > nodes on remote networks without requiring node to node IP connectivity for > cross-dc links. > > This approach should not affect existing deployments as those could continue > to use IPs as the hostname and skip resolution. > > With orchestration platforms like Kubernetes and the usage of ephemeral > containers in environments today we should consider some changes to how we > handle the tracking of nodes and their network location. Currently we > maintain a mapping between host ids and IP addresses. > > With traditional infrastructure, if a node goes down it, usually, comes back > up with the same IP. In some environments this contract may be explicit with > virtual IPs that may move between hosts. In newer deployments, like on > Kubernetes, this contract is not possible. Pods (analogous to nodes) are > assigned an IP address at start time. 
Should the pod be restarted or > scheduled on a different host there is no guarantee we would have the same > IP. Cassandra is protected here as we already have logic in place to update > peers when we come up with the same host id, but a different IP address. > > There are ways to get Kubernetes to assign a specific IP per Pod. Most > recommendations involve the use of a service per pod. Communication with the > fixed service IP would automatically forward to the associated pod, > regardless of address. We _could_ use this approach, but it seems like this > would needlessly create a number of extra resources in our k8s cluster to get > around the problem. Which, to be fair, doesn't seem like much of a problem > with the aforementioned mitigations built into C*. > > So what is the _actual_ problem? *Cross-region, cross-cloud, > hybrid-deployment connectivity between pods is a pain.* This can be solved > with significant investment by those who want to deploy these types of > topologies. You can definitely configure connectivity between clouds over > dedicated connections, or VPN tunnels. With a big chunk of time insuring that > pod to pod connectivity just works even if those pods are managed by separate > control planes, but that again requires time and talent. There are a number > of edge cases to support between the ever so slight, but very important, > differences in cloud vendor networks. > > Recently there have been a number of innovations that aid in the deployment > and operation of these types of applications on Kubernetes. Service meshes > support distributed microservices running across multiple k8s cluster control > planes in disparate networks. Instead of directly connecting to IP addresses > of remote services instead they use a hostname. 
With this approach, hostname > traffic may then be routed to a proxy that sends traffic over the WAN > (sometimes with mTLS) to another proxy pod in the remote cluster which then > forwards the data along to the correct pod in that network. (See attached > diagrams) > > Which brings us to the point of this ticket. Instead of mapping host ids to > IPs, use hostnames (and update the underlying address periodically instead of > caching indefinitely). This allows resolution to different IP addresses per > DC (k8s cluster) that may then be forwarded to nodes (pods) on remote > networks (k8s clusters) without requiring node to node (pod to pod) IP > connectivity between them. Traditional deployments can still function like > they do today (even if operators opt to keep using IPs as identifiers instead > of hostnames). This proxy approach is then enabled like those we see in > service meshes. > > _Notes_ > C* already has the concept of broadcast
[jira] [Commented] (CASSANDRA-15823) Support for networking via identity instead of IP
[ https://issues.apache.org/jira/browse/CASSANDRA-15823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116489#comment-17116489 ] Jeremy Hanna commented on CASSANDRA-15823: -- Adding in related issues where host id (for vnodes in 1.2) and previously token (for 0.7) were put in place of IP address to identify nodes - for historical purposes. > Support for networking via identity instead of IP > - > > Key: CASSANDRA-15823 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15823 > Project: Cassandra > Issue Type: Improvement >Reporter: Christopher Bradford >Priority: Normal > Attachments: consul-mesh-gateways.png, > istio-multicluster-with-gateways.svg, linkerd-service-mirroring.svg > > > TL;DR: Instead of mapping host ids to IPs, use hostnames. This allows > resolution to different IP addresses per DC that may then be forwarded to > nodes on remote networks without requiring node to node IP connectivity for > cross-dc links. > > This approach should not affect existing deployments as those could continue > to use IPs as the hostname and skip resolution. > > With orchestration platforms like Kubernetes and the usage of ephemeral > containers in environments today we should consider some changes to how we > handle the tracking of nodes and their network location. Currently we > maintain a mapping between host ids and IP addresses. > > With traditional infrastructure, if a node goes down, it usually comes back > up with the same IP. In some environments this contract may be explicit with > virtual IPs that may move between hosts. In newer deployments, like on > Kubernetes, this contract is not possible. Pods (analogous to nodes) are > assigned an IP address at start time. Should the pod be restarted or > scheduled on a different host there is no guarantee we would have the same > IP. Cassandra is protected here as we already have logic in place to update > peers when we come up with the same host id, but a different IP address. 
> > There are ways to get Kubernetes to assign a specific IP per Pod. Most > recommendations involve the use of a service per pod. Communication with the > fixed service IP would automatically forward to the associated pod, > regardless of address. We _could_ use this approach, but it seems like this > would needlessly create a number of extra resources in our k8s cluster to get > around the problem. Which, to be fair, doesn't seem like much of a problem > with the aforementioned mitigations built into C*. > > So what is the _actual_ problem? *Cross-region, cross-cloud, > hybrid-deployment connectivity between pods is a pain.* This can be solved > with significant investment by those who want to deploy these types of > topologies. You can definitely configure connectivity between clouds over > dedicated connections, or VPN tunnels, with a big chunk of time ensuring that > pod to pod connectivity just works even if those pods are managed by separate > control planes, but that again requires time and talent. There are a number > of edge cases to support between the ever so slight, but very important, > differences in cloud vendor networks. > > Recently there have been a number of innovations that aid in the deployment > and operation of these types of applications on Kubernetes. Service meshes > support distributed microservices running across multiple k8s cluster control > planes in disparate networks. Instead of directly connecting to IP addresses > of remote services, they use a hostname. With this approach, hostname > traffic may then be routed to a proxy that sends traffic over the WAN > (sometimes with mTLS) to another proxy pod in the remote cluster which then > forwards the data along to the correct pod in that network. (See attached > diagrams) > > Which brings us to the point of this ticket. Instead of mapping host ids to > IPs, use hostnames (and update the underlying address periodically instead of > caching indefinitely). 
This allows resolution to different IP addresses per > DC (k8s cluster) that may then be forwarded to nodes (pods) on remote > networks (k8s clusters) without requiring node to node (pod to pod) IP > connectivity between them. Traditional deployments can still function like > they do today (even if operators opt to keep using IPs as identifiers instead > of hostnames). This proxy approach is then enabled like those we see in > service meshes. > > _Notes_ > C* already has the concept of broadcast addresses vs those which are bound on > the node. This approach _could_ be leveraged to provide the behavior we're > looking for, but then the broadcast values would need to be pre-computed > _*and match*_ across all k8s control planes. By using hostnames the > underlying IP
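The "update the underlying address periodically instead of caching indefinitely" idea in the ticket above can be sketched in a few lines. This is a hypothetical illustration, not Cassandra's actual internode code; the class and parameter names are made up for the example.

```python
import socket
import time

class ResolvingEndpoint:
    """Identify a peer by hostname and re-resolve it once a TTL expires,
    rather than caching the resolved IP forever. Hypothetical sketch of
    the ticket's proposal; names are illustrative, not Cassandra's."""

    def __init__(self, hostname, ttl_seconds=30.0):
        self.hostname = hostname
        self.ttl = ttl_seconds
        self._cached_ip = None
        self._resolved_at = 0.0

    def address(self):
        # Re-resolve when the cached entry is older than the TTL, so a
        # pod that restarted with a new IP is picked up automatically.
        now = time.monotonic()
        if self._cached_ip is None or now - self._resolved_at > self.ttl:
            self._cached_ip = socket.gethostbyname(self.hostname)
            self._resolved_at = now
        return self._cached_ip

peer = ResolvingEndpoint("localhost", ttl_seconds=30.0)
print(peer.address())  # e.g. 127.0.0.1
```

Within the TTL the cached address is reused, so normal traffic pays no resolution cost; only after the TTL does the next call hit DNS again.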
[jira] [Updated] (CASSANDRA-15803) Separate out allow filtering scanning through a partition versus scanning over the table
[ https://issues.apache.org/jira/browse/CASSANDRA-15803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15803: - Description: Currently allow filtering can mean two things in the spirit of "avoid operations that don't seek to a specific row or sequential rows of data." First, it can mean scanning across the entire table to meet the criteria of the query. That's almost always a bad thing and should be discouraged or disabled (see CASSANDRA-8303). Second, it can mean filtering within a specific partition. For example, in a query you could specify the full partition key and if you specify a criterion on a non-key field, it requires allow filtering. The second reason to require allow filtering is significantly less work to scan through a partition. It is still extra work over seeking to a specific row and getting N sequential rows though. So while an application developer and/or operator needs to be cautious about this second type, it's not necessarily a bad thing, depending on the table and the use case. I propose that we separate the way to specify allow filtering across an entire table (involving a scatter gather) from specifying allow filtering across a partition in a backwards compatible way. One idea that was brought up in Slack in the cassandra-dev room was to have allow filtering mean the superset - scanning across the table. Then if you want to specify that you *only* want to scan within a partition you would use something like {{ALLOW FILTERING [WITHIN PARTITION]}} So it will succeed if you specify non-key criteria within a single partition, but fail with a message to say it requires the full allow filtering. This would allow for a backwards compatible full allow filtering while allowing a user to specify that they want to just scan within a partition, but error out if trying to scan a full table. 
This is potentially also related to the capability limitation framework by which operators could more granularly specify what features are allowed or disallowed per user, discussed in CASSANDRA-8303. This way an operator could disallow the more general allow filtering while allowing the partition scan (or disallow them both at their discretion). was: Currently allow filtering can mean two things in the spirit of "avoid operations that don't seek to a specific row or sequential rows of data." First, it can mean scanning across the entire table to meet the criteria of the query. That's almost always a bad thing and should be discouraged or disabled (see CASSANDRA-8303). Second, it can mean filtering within a specific partition. For example, in a query you could specify the full partition key and if you specify a criterion on a non-key field, it requires allow filtering. The second reason to require allow filtering is significantly less work to scan through a partition. It is still extra work over seeking to a specific row and getting N sequential rows though. So while an application developer and/or operator needs to be cautious about this second type, it's not necessarily a bad thing, depending on the table and the use case. I propose that we separate the way to specify allow filtering across an entire table (involving a scatter gather) from specifying allow filtering across a partition in a backwards compatible way. One idea that was brought up in Slack in the cassandra-dev room was to have allow filtering mean the superset - scanning across the table. Then if you want to specify that you *only* want to scan within a partition you would use something like {{ALLOW FILTERING [WITHIN PARTITION]}} So it will succeed if you specify non-key criteria within a single partition, but fail with a message to say it requires the full allow filtering. 
This would allow for a backwards compatible full allow filtering while allowing a user to specify that they want to just scan within a partition, but error out if trying to scan a full table. This is potentially also related to the capability limitation framework by which operators could more granularly specify what features are allowed or disallowed per user, discussed in CASSANDRA-8303. This way an operator could disallow the more general allow filtering while allowing the partition scan (or disallow them both at their discretion). > Separate out allow filtering scanning through a partition versus scanning > over the table > > > Key: CASSANDRA-15803 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15803 > Project: Cassandra > Issue Type: Improvement > Components: CQL/Syntax >Reporter: Jeremy Hanna >Priority: Normal > > Currently allow filtering can mean two things in the spirit of "avoid > operations that
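The two meanings of allow filtering described in the ticket above can be illustrated with a toy in-memory model (plain Python, not Cassandra code; the table layout and function are invented for the example): filtering within one partition only touches that partition's rows, while the same predicate without a partition key becomes a scan over every partition.

```python
# Toy model: a "table" maps partition keys to lists of clustered rows.
table = {
    "pk1": [{"ck": 1, "status": "open"}, {"ck": 2, "status": "closed"}],
    "pk2": [{"ck": 1, "status": "open"}],
}

def query(table, partition_key=None, predicate=None):
    """Within-partition filtering seeks to one partition's rows;
    omitting the partition key forces a scan of every partition."""
    if partition_key is not None:
        rows = table.get(partition_key, [])   # seek to one partition
    else:
        rows = [r for part in table.values() for r in part]  # full table scan
    if predicate is not None:
        rows = [r for r in rows if predicate(r)]  # the "filtering" part
    return rows

# Filtering inside one partition: bounded, per-partition work.
print(len(query(table, partition_key="pk1", predicate=lambda r: r["status"] == "open")))  # 1
# Same predicate over the whole table: scatter-gather across partitions.
print(len(query(table, predicate=lambda r: r["status"] == "open")))  # 2
```

Both paths do extra work beyond a direct row seek, which is why the ticket argues they deserve different spellings rather than one blanket ALLOW FILTERING.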
[jira] [Updated] (CASSANDRA-15803) Separate out allow filtering scanning through a partition versus scanning over the table
[ https://issues.apache.org/jira/browse/CASSANDRA-15803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15803: - Description: Currently allow filtering can mean two things in the spirit of "avoid operations that don't seek to a specific row or sequential rows of data." First, it can mean scanning across the entire table to meet the criteria of the query. That's almost always a bad thing and should be discouraged or disabled (see CASSANDRA-8303). Second, it can mean filtering within a specific partition. For example, in a query you could specify the full partition key and if you specify a criterion on a non-key field, it requires allow filtering. The second reason to require allow filtering is significantly less work to scan through a partition. It is still extra work over seeking to a specific row and getting N sequential rows though. So while an application developer and/or operator needs to be cautious about this second type, it's not necessarily a bad thing, depending on the table and the use case. I propose that we separate the way to specify allow filtering across an entire table (involving a scatter gather) from specifying allow filtering across a partition in a backwards compatible way. One idea that was brought up in Slack in the cassandra-dev room was to have allow filtering mean the superset - scanning across the table. Then if you want to specify that you *only* want to scan within a partition you would use something like {{ALLOW FILTERING [WITHIN PARTITION]}} So it will succeed if you specify non-key criteria within a single partition, but fail with a message to say it requires the full allow filtering. This would allow for a backwards compatible full allow filtering while allowing a user to specify that they want to just scan within a partition, but error out if trying to scan a full table. 
This is potentially also related to the capability limitation framework by which operators could more granularly specify what features are allowed or disallowed per user, discussed in CASSANDRA-8303. This way an operator could disallow the more general allow filtering while allowing the partition scan (or disallow them both at their discretion). was: Currently allow filtering can mean two things in the spirit of "avoid operations that don't seek to a specific row or sequential rows of data." First, it can mean scanning across the entire table to meet the criteria of the query. That's almost always a bad thing and should be discouraged or disabled (see CASSANDRA-8303). Second, it can mean filtering within a specific partition. For example, in a query you could specify the full partition key and if you specify a criterion on a non-key field, it requires allow filtering. The second reason to require allow filtering is significantly less work to scan through a partition. It is still extra work over seeking to a specific row and getting N sequential rows though. So while an application developer and/or operator needs to be cautious about this second type, it's not necessarily a bad thing, depending on the table and the use case. I propose that we separate the way to specify allow filtering across an entire table (involving a scatter gather) from specifying allow filtering across a partition in a backwards compatible way. One idea that was brought up in Slack in the cassandra-dev room was to have allow filtering mean the superset - scanning across the table. Then if you want to specify that you *only* want to scan within a partition. So it will succeed if you specify non-key criteria within a single partition, but fail with a message to say it requires the full allow filtering. 
One way would be to have it be {{ALLOW FILTERING [WITHIN PARTITION]}} This would allow for a backwards compatible full allow filtering while allowing a user to specify that they want to just scan within a partition, but error out if trying to scan a full table. This is potentially also related to the capability limitation framework by which operators could more granularly specify what features are allowed or disallowed per user, discussed in CASSANDRA-8303. This way an operator could disallow the more general allow filtering while allowing the partition scan (or disallow them both at their discretion). > Separate out allow filtering scanning through a partition versus scanning > over the table > > > Key: CASSANDRA-15803 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15803 > Project: Cassandra > Issue Type: Improvement > Components: CQL/Syntax >Reporter: Jeremy Hanna >Priority: Normal > > Currently allow filtering can mean two things in the spirit of "avoid > operations that
[jira] [Commented] (CASSANDRA-15775) Configuration to disallow queries with "allow filtering"
[ https://issues.apache.org/jira/browse/CASSANDRA-15775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105070#comment-17105070 ] Jeremy Hanna commented on CASSANDRA-15775: -- See CASSANDRA-8303 which generalizes this. The problem though is that allow filtering has two purposes - first, it allows you to scan over multiple partitions which is almost always bad. Second it is needed if you are scanning through a partition. See CASSANDRA-15803 for a proposal to separate out the first from the second case. > Configuration to disallow queries with "allow filtering" > > > Key: CASSANDRA-15775 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15775 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Local Write-Read Paths >Reporter: Christian Fredriksson >Priority: Normal > > Problem: We have inexperienced developers not following guidelines or best > practices who do queries with "allow filtering" which have a negative impact on > performance for other queries and developers. > It would be beneficial to have a (server side) configuration to disallow > these queries altogether. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-15803) Separate out allow filtering scanning through a partition versus scanning over the table
Jeremy Hanna created CASSANDRA-15803: Summary: Separate out allow filtering scanning through a partition versus scanning over the table Key: CASSANDRA-15803 URL: https://issues.apache.org/jira/browse/CASSANDRA-15803 Project: Cassandra Issue Type: Improvement Components: CQL/Syntax Reporter: Jeremy Hanna Currently allow filtering can mean two things in the spirit of "avoid operations that don't seek to a specific row or sequential rows of data." First, it can mean scanning across the entire table to meet the criteria of the query. That's almost always a bad thing and should be discouraged or disabled (see CASSANDRA-8303). Second, it can mean filtering within a specific partition. For example, in a query you could specify the full partition key and if you specify a criterion on a non-key field, it requires allow filtering. The second reason to require allow filtering is significantly less work to scan through a partition. It is still extra work over seeking to a specific row and getting N sequential rows though. So while an application developer and/or operator needs to be cautious about this second type, it's not necessarily a bad thing, depending on the table and the use case. I propose that we separate the way to specify allow filtering across an entire table (involving a scatter gather) from specifying allow filtering across a partition in a backwards compatible way. One idea that was brought up in Slack in the cassandra-dev room was to have allow filtering mean the superset - scanning across the table. Then if you want to specify that you *only* want to scan within a partition. So it will succeed if you specify non-key criteria within a single partition, but fail with a message to say it requires the full allow filtering. 
One way would be to have it be {{ALLOW FILTERING [WITHIN PARTITION]}} This would allow for a backwards compatible full allow filtering while allowing a user to specify that they want to just scan within a partition, but error out if trying to scan a full table. This is potentially also related to the capability limitation framework by which operators could more granularly specify what features are allowed or disallowed per user, discussed in CASSANDRA-8303. This way an operator could disallow the more general allow filtering while allowing the partition scan (or disallow them both at their discretion). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13701) Lower default num_tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072398#comment-17072398 ] Jeremy Hanna commented on CASSANDRA-13701: -- [~mshuler] What would we need to do to update testing to 16 so that it coincides with the new defaults? > Lower default num_tokens > > > Key: CASSANDRA-13701 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13701 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Chris Lohfink >Assignee: Jeremy Hanna >Priority: Low > > For reasons highlighted in CASSANDRA-7032, the high number of vnodes is not > necessary. It is very expensive for operations processes and scanning. It's > come up a lot and it's pretty standard and known now to always reduce the > num_tokens within the community. We should just lower the defaults. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13701) Lower default num_tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17072390#comment-17072390 ] Jeremy Hanna commented on CASSANDRA-13701: -- As discussed in [this thread|https://lists.apache.org/thread.html/r164d8a4143551b5ef774734afdce0ef31a0e461d71276f8446be%40%3Cdev.cassandra.apache.org%3E] the community decided on a default of 16 for now. I'll assign to myself and put in some release notes about it. At the same time I'll add some documentation for the topic. > Lower default num_tokens > > > Key: CASSANDRA-13701 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13701 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Chris Lohfink >Priority: Low > > For reasons highlighted in CASSANDRA-7032, the high number of vnodes is not > necessary. It is very expensive for operations processes and scanning. It's > come up a lot and it's pretty standard and known now to always reduce the > num_tokens within the community. We should just lower the defaults. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-13701) Lower default num_tokens
[ https://issues.apache.org/jira/browse/CASSANDRA-13701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna reassigned CASSANDRA-13701: Assignee: Jeremy Hanna > Lower default num_tokens > > > Key: CASSANDRA-13701 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13701 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Chris Lohfink >Assignee: Jeremy Hanna >Priority: Low > > For reasons highlighted in CASSANDRA-7032, the high number of vnodes is not > necessary. It is very expensive for operations processes and scanning. It's > come up a lot and it's pretty standard and known now to always reduce the > num_tokens within the community. We should just lower the defaults. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-13749) add documentation about upgrade process to docs
[ https://issues.apache.org/jira/browse/CASSANDRA-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17071481#comment-17071481 ] Jeremy Hanna commented on CASSANDRA-13749: -- I'm going to take a stab at this as it would be good to get in place with the upcoming 4.0 upgrade testing that people will be doing. > add documentation about upgrade process to docs > --- > > Key: CASSANDRA-13749 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13749 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Documentation and Website >Reporter: Jon Haddad >Assignee: Sumanth Pasupuleti >Priority: Normal > Labels: documentation > > The docs don't have any information on how to upgrade. This question gets > asked constantly on the mailing list. > Seems like it belongs under the "Operating Cassandra" section. > https://cassandra.apache.org/doc/latest/operating/index.html -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-13749) add documentation about upgrade process to docs
[ https://issues.apache.org/jira/browse/CASSANDRA-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna reassigned CASSANDRA-13749: Assignee: Jeremy Hanna (was: Sumanth Pasupuleti) > add documentation about upgrade process to docs > --- > > Key: CASSANDRA-13749 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13749 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Documentation and Website >Reporter: Jon Haddad >Assignee: Jeremy Hanna >Priority: Normal > Labels: documentation > > The docs don't have any information on how to upgrade. This question gets > asked constantly on the mailing list. > Seems like it belongs under the "Operating Cassandra" section. > https://cassandra.apache.org/doc/latest/operating/index.html -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15522) Update defaults for the validity timeframe of roles, permissions, and credentials for 4.0
[ https://issues.apache.org/jira/browse/CASSANDRA-15522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15522: - Summary: Update defaults for the validity timeframe of roles, permissions, and credentials for 4.0 (was: Update defaults for the validity timeframe of roles, permissions, and credentials) > Update defaults for the validity timeframe of roles, permissions, and > credentials for 4.0 > - > > Key: CASSANDRA-15522 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15522 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Authorization >Reporter: Jeremy Hanna >Priority: Normal > > It's been found that the defaults for {{roles_validity_in_ms}}, > {{permissions_validity_in_ms}}, and {{credentials_validity_in_ms}} have > been too low at 2000 ms or 2 seconds each. As [~alexott] put it in the dev > list discussion about defaults: > {quote}I have seen multiple times when authentication was failing under the > heavy load because queries to system tables were timing out - with these > defaults people may still have the possibility to get updates to > roles/credentials faster when specifying _update_interval_ variants of these > configurations. > {quote} > The suggestion is to set it to 60000 (1 minute) or 120000 (2 minutes). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15523) Update default snitch from SimpleSnitch to GossipingPropertyFileSnitch
[ https://issues.apache.org/jira/browse/CASSANDRA-15523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15523: - Description: Traditionally the project has had {{SimpleSnitch}} as the default primarily because it makes it easy for a user to download the software and start Cassandra without changing any defaults and it will just work. However, {{SimpleSnitch}} is not datacenter aware. {{GossipingPropertyFileSnitch}} could also be used as the default and would make the default snitch be datacenter aware. Out of the box it will work with the default configuration. This ticket would be to update the default from {{SimpleSnitch}} to {{GossipingPropertyFileSnitch}} to make the onboarding experience better for those who don't know to change this default if they intend to expand to multiple datacenters without too much hassle. was: Traditionally the project has had {{SimpleSnitch}} as the default primarily because it makes it easy for a user to download the software and start Cassandra without changing any defaults and it will just work. However, {{SimpleSnitch}} is not datacenter aware. {{GossipingPropertyFileSnitch}} could also be used as the default and would make the default snitch be datacenter aware. The user simply needs to update the local datacenter and rack names on each node and when it joins the cluster, it will propagate that topology information around the ring. This ticket would be to update the default from {{SimpleSnitch}} to {{GossipingPropertyFileSnitch}} to make the onboarding experience better for those who don't know to change this default if they intend to expand to multiple datacenters without too much hassle. 
> Update default snitch from SimpleSnitch to GossipingPropertyFileSnitch > -- > > Key: CASSANDRA-15523 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15523 > Project: Cassandra > Issue Type: Improvement > Components: Local/Config >Reporter: Jeremy Hanna >Priority: Normal > > Traditionally the project has had {{SimpleSnitch}} as the default primarily > because it makes it easy for a user to download the software and start > Cassandra without changing any defaults and it will just work. However, > {{SimpleSnitch}} is not datacenter aware. {{GossipingPropertyFileSnitch}} > could also be used as the default and would make the default snitch be > datacenter aware. Out of the box it will work with the default configuration. > This ticket would be to update the default from {{SimpleSnitch}} to > {{GossipingPropertyFileSnitch}} to make the onboarding experience better for > those who don't know to change this default if they intend to expand to > multiple datacenters without too much hassle. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
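For context on the ticket above: {{GossipingPropertyFileSnitch}} reads each node's own datacenter and rack from {{cassandra-rackdc.properties}} and lets gossip propagate that topology to the rest of the cluster. A typical file looks roughly like the following sketch (the {{dc1}}/{{rack1}} values are illustrative defaults):

```properties
# cassandra-rackdc.properties, read by GossipingPropertyFileSnitch.
# Each node declares only its own datacenter and rack; gossip
# propagates the topology to the other nodes in the cluster.
dc=dc1
rack=rack1
```

Because the file ships with working values, a single-node download-and-run experience still works unchanged, which is the "out of the box" point made in the updated description.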
[jira] [Created] (CASSANDRA-15523) Update default snitch from SimpleSnitch to GossipingPropertyFileSnitch
Jeremy Hanna created CASSANDRA-15523: Summary: Update default snitch from SimpleSnitch to GossipingPropertyFileSnitch Key: CASSANDRA-15523 URL: https://issues.apache.org/jira/browse/CASSANDRA-15523 Project: Cassandra Issue Type: Improvement Components: Local/Config Reporter: Jeremy Hanna Traditionally the project has had {{SimpleSnitch}} as the default primarily because it makes it easy for a user to download the software and start Cassandra without changing any defaults and it will just work. However, {{SimpleSnitch}} is not datacenter aware. {{GossipingPropertyFileSnitch}} could also be used as the default and would make the default snitch be datacenter aware. The user simply needs to update the local datacenter and rack names on each node and when it joins the cluster, it will propagate that topology information around the ring. This ticket would be to update the default from {{SimpleSnitch}} to {{GossipingPropertyFileSnitch}} to make the onboarding experience better for those who don't know to change this default if they intend to expand to multiple datacenters without too much hassle. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15522) Update defaults for the validity timeframe of roles, permissions, and credentials
[ https://issues.apache.org/jira/browse/CASSANDRA-15522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15522: - Description: It's been found that the defaults for {{roles_validity_in_ms}}, {{permissions_validity_in_ms}}, and {{credentials_validity_in_ms}} have been too low at 2000 ms or 2 seconds each. As [~alexott] put it in the dev list discussion about defaults: {quote}I have seen multiple times when authentication was failing under the heavy load because queries to system tables were timing out - with these defaults people may still have the possibility to get updates to roles/credentials faster when specifying _update_interval_ variants of these configurations. {quote} The suggestion is to set it to 60000 (1 minute) or 120000 (2 minutes). was: It's been found that the defaults for {roles_validity_in_ms}, {permissions_validity_in_ms}, and {credentials_validity_in_ms} have been too low at 2000 ms or 2 seconds each. As [~alexott] put it in the dev list discussion about defaults: {quote} I have seen multiple times when authentication was failing under the heavy load because queries to system tables were timing out - with these defaults people may still have the possibility to get updates to roles/credentials faster when specifying _update_interval_ variants of these configurations. {quote} The suggestion is to set it to 60000 (1 minute) or 120000 (2 minutes). > Update defaults for the validity timeframe of roles, permissions, and > credentials > - > > Key: CASSANDRA-15522 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15522 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Authorization >Reporter: Jeremy Hanna >Priority: Normal > > It's been found that the defaults for {{roles_validity_in_ms}}, > {{permissions_validity_in_ms}}, and {{credentials_validity_in_ms}} have > been too low at 2000 ms or 2 seconds each. 
As [~alexott] put it in the dev > list discussion about defaults: > {quote}I have seen multiple times when authentication was failing under the > heavy load because queries to system tables were timing out - with these > defaults people may still have the possibility to get updates to > roles/credentials faster when specifying _update_interval_ variants of these > configurations. > {quote} > The suggestion is to set it to 60000 (1 minute) or 120000 (2 minutes). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
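For reference, the proposed change would look something like the fragment below in cassandra.yaml. This is only a sketch: it uses the pre-4.1 `_in_ms` option names referenced in the ticket, assumes the 1-minute value is chosen, and the update-interval values are purely illustrative.

```yaml
# Sketch of the proposed validity defaults (assumes the 1-minute option).
roles_validity_in_ms: 60000
permissions_validity_in_ms: 60000
credentials_validity_in_ms: 60000

# The _update_interval_ variants mentioned in the quoted comment let the
# cache refresh in the background before an entry expires; these values
# are illustrative, not part of the proposal.
roles_update_interval_in_ms: 30000
permissions_update_interval_in_ms: 30000
credentials_update_interval_in_ms: 30000
```

With a longer validity and a shorter update interval, auth decisions keep being served from cache even when a refresh query to the system tables times out under load.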
[jira] [Created] (CASSANDRA-15522) Update defaults for the validity timeframe of roles, permissions, and credentials
Jeremy Hanna created CASSANDRA-15522: Summary: Update defaults for the validity timeframe of roles, permissions, and credentials Key: CASSANDRA-15522 URL: https://issues.apache.org/jira/browse/CASSANDRA-15522 Project: Cassandra Issue Type: Improvement Components: Feature/Authorization Reporter: Jeremy Hanna It's been found that the defaults for {roles_validity_in_ms}, {permissions_validity_in_ms}, and {credentials_validity_in_ms} have been too low at 2000 ms or 2 seconds each. As [~alexott] put it in the dev list discussion about defaults: {quote} I have seen multiple times when authentication was failing under the heavy load because queries to system tables were timing out - with these defaults people may still have the possibility to get updates to roles/credentials faster when specifying _update_interval_ variants of these configurations. {quote} The suggestion is to set it to 60000 (1 minute) or 120000 (2 minutes).
[jira] [Updated] (CASSANDRA-15521) Update default for num_tokens from 256 to something more reasonable
[ https://issues.apache.org/jira/browse/CASSANDRA-15521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15521: - Description: The default for num_tokens or the number of token ranges assigned to a node using virtual nodes is way too high. 256 token ranges makes repair painful. Since it's a default, someone new to Cassandra won't know better and if left unchanged, they will have to live with it or perform a migration to a new datacenter with a lower number. At the same time, going too low with the default allocation algorithm can hotspot nodes to have more tokens assigned than others. There is a new token allocation algorithm introduced but it's not default. The proposal of this ticket is to set the default to something more reasonable to align with best practices without using the new token algorithm or giving it specific token values as some do. 32 is a good compromise and is what the project uses in a lot of the tests that are done. So generally it would be good to move to a more sane value and to align with testing so users are more confident that the defaults have a lot of testing behind them. As discussed on the dev mailing list, we want to make sure this change to the default doesn't come as an unpleasant surprise to cluster operators. For num_tokens specifically, if you were to upgrade to a version with the new default and the user didn't change it to the existing value, the node would not start, saying you can't change the num_tokens on an existing node. So we will want to put a release note to indicate that when upgrading, make a note of the num_tokens change when looking at the new configuration. Along with not being able to start nodes, which is fail-fast, there is the matter of adding new nodes to the cluster. You can certainly add a new node to a cluster or datacenter with a different number of token ranges assigned. It will give that node a different amount of data to be responsible for. 
For example, if the nodes in a datacenter all have num_tokens=256 (current default) and you add a node to that datacenter with num_tokens=32 (new default), it will only claim 1/8th as many token ranges, and as much data, as the other nodes in that datacenter. Fortunately, this is a property that is explicitly defined rather than implicit like some of the table settings. Also most if not all operators will upgrade the existing nodes to that new version before trying to add a node with that new version. So if there is a different number for num_tokens on the existing nodes, they'll be aware of it immediately. In any case, this is a long proposal for what will be a small change in the cassandra.yaml and something in the release notes, that is, changing the default num_tokens value from 256 to 32. was: The default for num_tokens or the number of token ranges assigned to a node using virtual nodes is way too high. 256 token ranges makes repair painful. Since it's a default, someone new to Cassandra won't know better and will have to live with it or perform a migration to a new datacenter with a lower number. At the same time, going too low with the default allocation algorithm can hotspot nodes to have more tokens assigned than others. There is a new token allocation algorithm introduced but it's not default. The proposal of this ticket is to set the default to something more reasonable to align with best practices without using the new token algorithm or giving it specific token values as some do. 32 is a good compromise and is what the project uses in a lot of the tests that are done. So generally it would be good to move to a more sane value and to align with testing so users are more confident that the defaults have a lot of testing behind them. As discussed on the dev mailing list, we want to make sure this change to the default doesn't come as an unpleasant surprise to cluster operators.
For num_tokens specifically, if you were to upgrade to a version with the new default and the user didn't change it to the existing value, the node would not start, saying you can't change the num_tokens on an existing node. So we will want to put a release note to indicate that when upgrading, make a note of the num_tokens change when looking at the new configuration. Along with not being able to start nodes, which is fail-fast, there is the matter of adding new nodes to the cluster. You can certainly add a new node to a cluster or datacenter with a different number of token ranges assigned. It will give that node a different amount of data to be responsible for. For example, if the nodes in a datacenter all have num_tokens=256 (current default) and you add a node to that datacenter with num_tokens=32 (new default), it will only claim 1/8th as many token ranges, and as much data, as the other nodes in that datacenter. Fortunately, this is a property that is explicitly defined rather than implicit like some of the table settings. Also most if not all operators will upgrade the existing nodes to that new version before trying to add a node with that new version. So if there is a different number for num_tokens on the existing nodes, they'll be aware of it immediately. In any case, this is a long proposal for what will be a small change in the cassandra.yaml and something in the release notes, that is, changing the default num_tokens value from 256 to 32.
[jira] [Created] (CASSANDRA-15521) Update default for num_tokens from 256 to something more reasonable
Jeremy Hanna created CASSANDRA-15521: Summary: Update default for num_tokens from 256 to something more reasonable Key: CASSANDRA-15521 URL: https://issues.apache.org/jira/browse/CASSANDRA-15521 Project: Cassandra Issue Type: Improvement Components: Feature/Virtual Nodes Reporter: Jeremy Hanna Assignee: Jeremy Hanna The default for num_tokens or the number of token ranges assigned to a node using virtual nodes is way too high. 256 token ranges makes repair painful. Since it's a default, someone new to Cassandra won't know better and will have to live with it or perform a migration to a new datacenter with a lower number. At the same time, going too low with the default allocation algorithm can hotspot nodes to have more tokens assigned than others. There is a new token allocation algorithm introduced but it's not default. The proposal of this ticket is to set the default to something more reasonable to align with best practices without using the new token algorithm or giving it specific token values as some do. 32 is a good compromise and is what the project uses in a lot of the tests that are done. So generally it would be good to move to a more sane value and to align with testing so users are more confident that the defaults have a lot of testing behind them. As discussed on the dev mailing list, we want to make sure this change to the default doesn't come as an unpleasant surprise to cluster operators. For num_tokens specifically, if you were to upgrade to a version with the new default and the user didn't change it to the existing value, the node would not start, saying you can't change the num_tokens on an existing node. So we will want to put a release note to indicate that when upgrading, make a note of the num_tokens change when looking at the new configuration. Along with not being able to start nodes, which is fail-fast, there is the matter of adding new nodes to the cluster. 
You can certainly add a new node to a cluster or datacenter with a different number of token ranges assigned. It will give that node a different amount of data to be responsible for. For example, if the nodes in a datacenter all have num_tokens=256 (current default) and you add a node to that datacenter with num_tokens=32 (new default), it will only claim 1/8th as many token ranges, and as much data, as the other nodes in that datacenter. Fortunately, this is a property that is explicitly defined rather than implicit like some of the table settings. Also most if not all operators will upgrade the existing nodes to that new version before trying to add a node with that new version. So if there is a different number for num_tokens on the existing nodes, they'll be aware of it immediately. In any case, this is a long proposal for what will be a small change in the cassandra.yaml and something in the release notes, that is, changing the default num_tokens value from 256 to 32.
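The 1/8th figure above can be sanity-checked with a quick back-of-envelope calculation. The helper below is hypothetical (not Cassandra code) and assumes ring ownership is roughly proportional to a node's vnode count, which is the simplification the ticket's example relies on:

```python
def expected_share(node_tokens, other_nodes_tokens):
    """Approximate fraction of the ring a node owns, assuming ownership
    is proportional to its vnode count (a simplification)."""
    total = node_tokens + sum(other_nodes_tokens)
    return node_tokens / total

# A datacenter of three existing nodes at the old default (256 vnodes each),
# plus one new node at the proposed default (32 vnodes).
new_node = expected_share(32, [256, 256, 256])   # 32/800
old_node = expected_share(256, [256, 256, 32])   # 256/800
print(round(old_node / new_node))  # 8: the new node claims 1/8th as much
```

This is exactly why the mismatch is easy to spot but still worth a release note: the new node joins successfully, it just owns far less of the ring than its peers.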
[jira] [Comment Edited] (CASSANDRA-13019) Improve clearsnapshot to delete the snapshot files slowly
[ https://issues.apache.org/jira/browse/CASSANDRA-13019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006592#comment-17006592 ] Jeremy Hanna edited comment on CASSANDRA-13019 at 1/2/20 5:18 AM: -- I like the idea to reduce the effect on the regular server operations when performing the snapshot, especially if there is a coordinated snapshot across the cluster. Because it may affect the time it takes for operations that call {{snapshot}} indirectly, should we make a note of this in the NEWS.txt - both the availability of the throttle and that it may affect the time to run things like {{truncate}} and {{drop}}? was (Author: jeromatron): I like the idea to reduce the effect on the regular server operations when performing the snapshot, especially if there is a coordinated snapshot across the cluster. Because it may affect the time it takes for operations that call `snapshot` indirectly, should we make a note of this in the NEWS.txt - both the availability of the throttle and that it may affect the time to run things like `truncate` and `drop`? > Improve clearsnapshot to delete the snapshot files slowly > -- > > Key: CASSANDRA-13019 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13019 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Core >Reporter: Dikang Gu >Assignee: Jeff Jirsa >Priority: Normal > Labels: pull-request-available > Fix For: 4.x > > Time Spent: 2h 10m > Remaining Estimate: 0h > > In our environment, we are creating snapshots for backup, after we finish the > backup, we are running {{clearsnapshot}} to delete the snapshot files. At > that time we may have thousands of files to delete, and it's causing sudden > disk usage spike. As a result, we are experiencing a spike of drop messages > from Cassandra. > I think we should implement something like {{slowrm}} to delete the snapshot > files slowly, avoid the sudden disk usage spike.
[jira] [Commented] (CASSANDRA-13019) Improve clearsnapshot to delete the snapshot files slowly
[ https://issues.apache.org/jira/browse/CASSANDRA-13019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006592#comment-17006592 ] Jeremy Hanna commented on CASSANDRA-13019: -- I like the idea to reduce the effect on the regular server operations when performing the snapshot, especially if there is a coordinated snapshot across the cluster. Because it may affect the time it takes for operations that call `snapshot` indirectly, should we make a note of this in the NEWS.txt - both the availability of the throttle and that it may affect the time to run things like `truncate` and `drop`? > Improve clearsnapshot to delete the snapshot files slowly > -- > > Key: CASSANDRA-13019 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13019 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Core >Reporter: Dikang Gu >Assignee: Jeff Jirsa >Priority: Normal > Labels: pull-request-available > Fix For: 4.x > > Time Spent: 2h 10m > Remaining Estimate: 0h > > In our environment, we are creating snapshots for backup, after we finish the > backup, we are running {{clearsnapshot}} to delete the snapshot files. At > that time we may have thousands of files to delete, and it's causing sudden > disk usage spike. As a result, we are experiencing a spike of drop messages > from Cassandra. > I think we should implement something like {{slowrm}} to delete the snapshot > files slowly, avoid the sudden disk usage spike.
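The {{slowrm}} idea from the ticket could be sketched as follows. This is a hypothetical illustration of a batch-and-pause throttle, not the actual patch; the function name and parameters are invented for the example:

```python
import os
import time

def slow_rm(paths, files_per_batch=100, pause_seconds=0.5):
    """Delete files in small batches with a pause between batches,
    spreading the disk I/O out instead of issuing one large burst."""
    deleted = 0
    for path in paths:
        try:
            os.remove(path)
            deleted += 1
        except FileNotFoundError:
            pass  # already gone; concurrent cleanup can race with us
        if deleted and deleted % files_per_batch == 0:
            time.sleep(pause_seconds)
    return deleted
```

A rate-based throttle (files or hard links unlinked per second) would serve the same purpose and is easier to reason about for operators; the batch-and-pause form is just the simplest way to show the shape of the idea.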
[jira] [Updated] (CASSANDRA-15336) LegacyLayout RangeTombstoneList throws IndexOutOfBoundsException When Running Range Queries
[ https://issues.apache.org/jira/browse/CASSANDRA-15336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15336: - Description: Hi All, This bug is similar to CASSANDRA-15172 but relates specifically to range queries running over range tombstones. *+Steps to Reproduce: +* CREATE KEYSPACE ks1 WITH replication = \{'class': 'NetworkTopologyStrategy', 'DC1': '3'} AND durable_writes = true; +*TABLE:*+ CREATE TABLE ks1.table1 ( col1 text, col2 text, col3 text, col4 text, col5 text, col6 timestamp, data text, PRIMARY KEY ((col1, col2, col3), col4, col5, col6) ); Inserted ~4 million rows and created range tombstones by deleting ~1 million rows. +*Create Data*+ _insert into ks1.table1 (col1, col2 , col3 , col4 , col5 , col6 , data ) VALUES ( '1', '11', '21', '1', 'a', 1231231230, 'data');_ _insert into ks1.table1 (col1, col2 , col3 , col4 , col5 , col6 , data ) VALUES ( '1', '11', '21', '2', 'a', 1231231230, 'data');_ _insert into ks1.table1 (col1, col2 , col3 , col4 , col5 , col6 , data ) VALUES ( '1', '11', '21', '3', 'a', 1231231230, 'data');_ _insert into ks1.table1 (col1, col2 , col3 , col4 , col5 , col6 , data ) VALUES ( '1', '11', '21', '4', 'a', 1231231230, 'data');_ _insert into ks1.table1 (col1, col2 , col3 , col4 , col5 , col6 , data ) VALUES ( '1', '11', '21', '5', 'a', 1231231230, 'data');_ +*Create Range Tombstones*+ delete from ks1.table1 where col1='1' and col2='11' and col3='21' and col4='1'; +*Query Live Rows (no tombstones)*+ _select * from ks1.table1 where col1='1' and col2='201' and col3='21' and col4='1' and col5='a' and *col6>1231231230*;_ No issues found, everything is running properly. +*Query Range Tombstones*+ _select * from ks1.table1 where col1='1' and col2='11' and col3='21' and col4='1' and col5='a' and *col6=1231231230*;_ No issues found, everything is running properly. 
+BUT when running range queries:+ _select * from ks1.table1 where col1='1' and col2='11' and col3='21' and col4='1' and col5='a' and *col6>1231231220;*_
WARN [ReadStage-1] 2019-09-23 14:17:10,281 AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread Thread[ReadStage-1,5,main]: {}
java.lang.ArrayIndexOutOfBoundsException: 2
	at org.apache.cassandra.db.AbstractBufferClusteringPrefix.get(AbstractBufferClusteringPrefix.java:55)
	at org.apache.cassandra.db.LegacyLayout$LegacyRangeTombstoneList.serializedSizeCompound(LegacyLayout.java:2545)
	at org.apache.cassandra.db.LegacyLayout$LegacyRangeTombstoneList.serializedSize(LegacyLayout.java:2522)
	at org.apache.cassandra.db.LegacyLayout.serializedSizeAsLegacyPartition(LegacyLayout.java:565)
	at org.apache.cassandra.db.ReadResponse$Serializer.serializedSize(ReadResponse.java:446)
	at org.apache.cassandra.db.ReadResponse$Serializer.serializedSize(ReadResponse.java:352)
	at org.apache.cassandra.net.MessageOut.payloadSize(MessageOut.java:171)
	at org.apache.cassandra.net.OutboundTcpConnectionPool.getConnection(OutboundTcpConnectionPool.java:77)
	at org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:802)
	at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:953)
	at org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:929)
	at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:62)
	at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
	at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
	at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:114)
	at java.lang.Thread.run(Thread.java:745)
This WARN is
constantly generated until I stop the range queries script. Hope this helps.. Thanks! was: Hi All, This bug is similar to https://issues.apache.org/jira/browse/CASSANDRA-15172 but relates specifically to range queries running over range tombstones. *+Steps to Reproduce: +* CREATE KEYSPACE ks1 WITH replication = \{'class': 'NetworkTopologyStrategy', 'DC1': '3'} AND durable_writes = true; +*TABLE:*+ CREATE TABLE ks1.table1 ( col1 text, col2 text, col3 text, col4 text, col5 text, col6 timestamp, data text, PRIMARY KEY ((col1, col2, col3), col4, col5, col6) ); Inserted ~4 million rows and created range tombstones by deleting ~1 million rows. +*Create Data*+ _insert into ks1.table1 (col1, col2 , col3 , col4 , col5 , col6 , data ) VALUES ( '1', '11', '21', '1', 'a', 1231231230, 'data');_ _insert into ks1.table1 (col1, col2 , col3 , col4 , col5 , col6 , data ) VALUES ( '1', '11',
[jira] [Updated] (CASSANDRA-15322) Partition size virtual table
[ https://issues.apache.org/jira/browse/CASSANDRA-15322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15322: - Labels: virtual-tables (was: ) > Partition size virtual table > > > Key: CASSANDRA-15322 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15322 > Project: Cassandra > Issue Type: New Feature >Reporter: Chris Lohfink >Assignee: Chris Lohfink >Priority: Normal > Labels: virtual-tables > > Virtual table to provide on disk size (local) of a given partition. Useful > for checking for or verifying issues with wide partitions. This is dependent > on the lazy virtual table ticket.