[jira] [Comment Edited] (CASSANDRA-19448) CommitlogArchiver only has granularity to seconds for restore_point_in_time

2024-04-01 Thread Maxwell Guo (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832992#comment-17832992
 ] 

Maxwell Guo edited comment on CASSANDRA-19448 at 4/2/24 2:23 AM:
-

[~brandon.williams] [~tiagomlalves] Thank you very much for your replies.

Regarding the granularity of PIT restore, my main point is that it depends on 
the needs of the user. My patch allows the user to select seconds, milliseconds, 
or microseconds. All I need to do is ensure that it is consistent with 
Cassandra's own timestamp granularity.

bq. What I wonder is, in which scenarios would microsecond-level PIT restore 
would be useful?

[~brandon.williams] may have already described it clearly, but what I want to 
stress again is: if a batch of data is deleted by mistake, time granularity 
accurate to microseconds is the only way to ensure that all data can be 
restored in C*. Milliseconds and seconds are not enough; they may lose data.

bq. couldn't we detect automatically the granularity of PIT restore based on the 
value

I will update the PR again, and [~brandon.williams] [~tiagomlalves], if you are 
willing, could you be the reviewers? :)


And I will prepare PRs for the other branches if the fix for trunk is accepted.


was (Author: maxwellguo):
[~brandon.williams] [~tiagomlalves] Thank you very much for your replies.

Regarding the granularity of PIT restore, my main point is that it depends on 
the needs of the user. My patch allows the user to select seconds, milliseconds, 
or microseconds. All I need to do is ensure that it is consistent with 
Cassandra's own timestamp granularity.

bq. What I wonder is, in which scenarios would microsecond-level PIT restore 
would be useful?

[~brandon.williams] may have already described it clearly, but what I want to 
stress again is: if a batch of data is deleted by mistake, time granularity 
accurate to microseconds is the only way to ensure that all data can be 
restored in C*. Milliseconds and seconds are not enough; they may lose data.

bq. couldn't we detect automatically the granularity of PIT restore based on the 
value

I will update the PR again, and [~brandon.williams] [~tiagomlalves], if you are 
willing, could you be the reviewers? :)

I will prepare PRs for the other branches if the fix for trunk is accepted.

> CommitlogArchiver only has granularity to seconds for restore_point_in_time
> ---
>
> Key: CASSANDRA-19448
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19448
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Jeremy Hanna
>Assignee: Maxwell Guo
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Commitlog archiver allows users to backup commitlog files for the purpose of 
> doing point in time restores.  The [configuration 
> file|https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties]
>  gives an example down to seconds granularity, but then asks whether the 
> timestamps are in microseconds or milliseconds, defaulting to 
> microseconds.  Because the [CommitLogArchiver uses a second-based date 
> format|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java#L52],
>  if a user specifies a restore point at a finer granularity, such as 
> milliseconds or microseconds, it will truncate everything after the second 
> and restore to that second.  So say you specify a
> restore_point_in_time like this:
> restore_point_in_time=2024:01:18 17:01:01.623392
> it will silently truncate everything after the 01 seconds.  So effectively to 
> the user, it is missing updates between 01 and 01.623392.
> This appears to be a bug in the intent.  We should allow users to specify 
> down to the millisecond or even microsecond level. If we allow them to 
> specify down to microseconds for the restore point in time, the internal 
> representation may need to change from a long.
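The truncation described above can be sketched as follows. This is a Python emulation for illustration only (the real archiver uses Java's SimpleDateFormat), assuming the properties-file format yyyy:MM:dd HH:mm:ss:

```python
from datetime import datetime

# The archiver's pattern has no fractional-seconds field, so anything after
# the seconds is effectively dropped. Emulate that by parsing only the part
# of the string the second-based pattern can consume.
requested = "2024:01:18 17:01:01.623392"
restored_to = datetime.strptime(requested.split(".")[0], "%Y:%m:%d %H:%M:%S")

print(restored_to)              # 2024-01-18 17:01:01
print(restored_to.microsecond)  # 0 -- the .623392 was silently lost
```

Any mutations written between 17:01:01.000000 and 17:01:01.623392 fall outside the restored range, which is the silent data loss the report describes.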



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org






[jira] [Comment Edited] (CASSANDRA-19150) Align values in rows in CQLSH right for numbers, left for text

2024-04-01 Thread Arun Ganesh (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832655#comment-17832655
 ] 

Arun Ganesh edited comment on CASSANDRA-19150 at 4/2/24 1:08 AM:
-

[~bschoeni],

By the time we get to calling {{ljust\(\)}} or {{rjust\(\)}} at 
https://github.com/apache/cassandra/blob/trunk/pylib/cqlshlib/cqlshmain.py#L1025,
 all the values have already been converted to strings (e.g., 
[here|https://github.com/apache/cassandra/blob/trunk/pylib/cqlshlib/formatting.py:251],
 
[here|https://github.com/apache/cassandra/blob/trunk/pylib/cqlshlib/formatting.py:331],
 
[here|https://github.com/apache/cassandra/blob/trunk/pylib/cqlshlib/formatting.py:349],
 and so on).
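The two-pass approach described above can be sketched like this (toy values for illustration, not the actual cqlsh code): stringify everything first, measure each column, then pad to the column width:

```python
# Toy table: values are already stringified, as cqlsh's formatters do.
rows = [["id", "name"], ["1", "alice"], ["42", "bob"]]

# Pass 1: the column width is only known after every value is a string.
widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]

# Pass 2: pad manually -- right-justify the numeric column, left-justify text.
for row in rows:
    print(row[0].rjust(widths[0]), row[1].ljust(widths[1]))
```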


was (Author: JIRAUSER303038):
[~bschoeni],

It's a catch-22. To use {{print("\{0:x\}".format\(i\))}} to align values in the 
column, we must know {{x}}, the width of the column (the length of its longest 
value or of the column header). And you cannot know {{x}} without first 
converting all values in the column to strings.

I believe that's why we don't use {{"\{\}".format\(\)}} in cqlsh. We could do 
it, but it would make the code complex. Instead, we convert all values to 
strings first (e.g., 
[here|https://github.com/apache/cassandra/blob/trunk/pylib/cqlshlib/formatting.py:251],
 
[here|https://github.com/apache/cassandra/blob/trunk/pylib/cqlshlib/formatting.py:331],
 
[here|https://github.com/apache/cassandra/blob/trunk/pylib/cqlshlib/formatting.py:349],
 and so on), and later pad them manually to the max width of each column.

> Align values in rows in CQLSH right for numbers, left for text
> --
>
> Key: CASSANDRA-19150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19150
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL/Interpreter
>Reporter: Stefan Miklosovic
>Assignee: Arun Ganesh
>Priority: Low
> Fix For: 5.x
>
> Attachments: Screenshot 2023-12-04 at 00.38.16.png, Screenshot 
> 2023-12-09 at 16.58.25.png, signature.asc, test_output.txt, 
> test_output_old.txt
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> *Updated* Jan 17 2024 after dev discussion
> Change CQLSH to left-align text while continuing to right-align numbers.  This 
> will match how the Postgres shell and Excel align text and numbers.
> -
> *Original*
> We need to make this
> [https://github.com/apache/cassandra/blob/trunk/pylib/cqlshlib/cqlshmain.py#L1101]
> configurable so values in columns are either all on left or on right side of 
> the column (basically change col.rjust to col.ljust).
> By default, it would be like it is now but there would be configuration 
> property in cqlsh for that as well as a corresponding CQLSH command 
> (optional), something like
> {code:java}
> ALIGNMENT LEFT|RIGHT
> {code}
> cc [~bschoeni]






[jira] [Updated] (CASSANDRA-19340) Analytics writer should support writing UDTs

2024-04-01 Thread Francisco Guerrero (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francisco Guerrero updated CASSANDRA-19340:
---
Status: Ready to Commit  (was: Review In Progress)

> Analytics writer should support writing UDTs
> 
>
> Key: CASSANDRA-19340
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19340
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Doug Rohrer
>Assignee: Doug Rohrer
>Priority: Normal
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> {{CQLSSTableWriter}} supports writing UDTs. We should support converting from 
> Map-type fields or nested Spark structs to UDTs in the Analytics Bulk Writer, 
> similar to the way Bulk Reader converts UDTs to maps.






[jira] [Commented] (CASSANDRA-19340) Analytics writer should support writing UDTs

2024-04-01 Thread Francisco Guerrero (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832978#comment-17832978
 ] 

Francisco Guerrero commented on CASSANDRA-19340:


+1 thanks for the patch 

> Analytics writer should support writing UDTs
> 
>
> Key: CASSANDRA-19340
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19340
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Doug Rohrer
>Assignee: Doug Rohrer
>Priority: Normal
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> {{CQLSSTableWriter}} supports writing UDTs. We should support converting from 
> Map-type fields or nested Spark structs to UDTs in the Analytics Bulk Writer, 
> similar to the way Bulk Reader converts UDTs to maps.






[jira] [Comment Edited] (CASSANDRA-19508) Getting tons of msgs "Failed to get peer certificates for peer /x.x.x.x:45796" when require_client_auth is set to false

2024-04-01 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832935#comment-17832935
 ] 

Brandon Williams edited comment on CASSANDRA-19508 at 4/1/24 11:06 PM:
---

Looks good to me, let's check CI:

||Branch||CI||
|[4.0|https://github.com/driftx/cassandra/tree/CASSANDRA-19508-4.0]|[j8|https://app.circleci.com/pipelines/github/driftx/cassandra/1557/workflows/c1ae034b-e2e5-441a-9e5d-cfbfedf092fb],
 
[j11|https://app.circleci.com/pipelines/github/driftx/cassandra/1557/workflows/e537d55a-3cc7-482a-aaeb-51d804c7b6b5]|
|[4.1|https://github.com/driftx/cassandra/tree/CASSANDRA-19508-4.1]|[j8|https://app.circleci.com/pipelines/github/driftx/cassandra/1554/workflows/5a09fafb-756a-4c7f-af8b-8e8fc7707721],
 
[j11|https://app.circleci.com/pipelines/github/driftx/cassandra/1554/workflows/e6874734-01db-4eaa-867b-d38d9fdd6eeb]|
|[5.0|https://github.com/driftx/cassandra/tree/CASSANDRA-19508-5.0]|[j11|https://app.circleci.com/pipelines/github/driftx/cassandra/1555/workflows/d0c49666-4804-4e96-bce8-ff25945697ee],
 
[j17|https://app.circleci.com/pipelines/github/driftx/cassandra/1555/workflows/1a3573c1-2cc7-4276-bf45-ce6a39237f6f]|
|[trunk|https://github.com/driftx/cassandra/tree/CASSANDRA-19508-trunk]|[j11|https://app.circleci.com/pipelines/github/driftx/cassandra/1559/workflows/6a0dd911-3a09-459a-aae8-f88bde32d3b0],
 
[j17|https://app.circleci.com/pipelines/github/driftx/cassandra/1559/workflows/ad7ac503-2785-4f6d-9c41-ee5f00a7fdb2]|


was (Author: brandon.williams):
Looks good to me, let's check CI:

||Branch||CI||
|[4.0|https://github.com/driftx/cassandra/tree/CASSANDRA-19508-4.0]|[j8|https://app.circleci.com/pipelines/github/driftx/cassandra/1557/workflows/c1ae034b-e2e5-441a-9e5d-cfbfedf092fb],
 
[j11|https://app.circleci.com/pipelines/github/driftx/cassandra/1557/workflows/e537d55a-3cc7-482a-aaeb-51d804c7b6b5]|
|[4.1|https://github.com/driftx/cassandra/tree/CASSANDRA-19508-4.1]|[j8|https://app.circleci.com/pipelines/github/driftx/cassandra/1554/workflows/5a09fafb-756a-4c7f-af8b-8e8fc7707721],
 
[j11|https://app.circleci.com/pipelines/github/driftx/cassandra/1554/workflows/e6874734-01db-4eaa-867b-d38d9fdd6eeb]|
|[5.0|https://github.com/driftx/cassandra/tree/CASSANDRA-19508-5.0]|[j11|https://app.circleci.com/pipelines/github/driftx/cassandra/1555/workflows/d0c49666-4804-4e96-bce8-ff25945697ee],
 
[j17|https://app.circleci.com/pipelines/github/driftx/cassandra/1555/workflows/1a3573c1-2cc7-4276-bf45-ce6a39237f6f]|
|[trunk|https://github.com/driftx/cassandra/tree/CASSANDRA-19508-trunk]|[j11|https://app.circleci.com/pipelines/github/driftx/cassandra/1558/workflows/f240ca5d-5589-45f0-9118-42741a23dc7d],
 
[j17|https://app.circleci.com/pipelines/github/driftx/cassandra/1558/workflows/1dcd8409-5f26-4570-9ee0-024f1bee718b]|


> Getting tons of msgs "Failed to get peer certificates for peer 
> /x.x.x.x:45796" when require_client_auth is set to false
> ---
>
> Key: CASSANDRA-19508
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19508
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Encryption
>Reporter: Mohammad Aburadeh
>Assignee: Mohammad Aburadeh
>Priority: Urgent
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> We recently upgraded our production clusters from 3.11.15 to 4.1.4. We 
> started seeing thousands of messages like "Failed to get peer certificates 
> for peer /x.x.x.x:45796". SSL is enabled but require_client_auth is 
> disabled.  This is causing a huge problem for us because the Cassandra log 
> files grow very fast: our connections are short-lived; we open more than 1K 
> connections per second and they stay alive for 1-2 seconds. 
> {code:java}
> DEBUG [Native-Transport-Requests-2] 2024-03-31 21:26:38,026 
> ServerConnection.java:140 - Failed to get peer certificates for peer 
> /172.31.2.23:45796
> javax.net.ssl.SSLPeerUnverifiedException: peer not verified
>         at 
> io.netty.handler.ssl.ReferenceCountedOpenSslEngine$DefaultOpenSslSession.getPeerCertificateChain(ReferenceCountedOpenSslEngine.java:2414)
>         at 
> io.netty.handler.ssl.ExtendedOpenSslSession.getPeerCertificateChain(ExtendedOpenSslSession.java:140)
>         at 
> org.apache.cassandra.transport.ServerConnection.certificates(ServerConnection.java:136)
>         at 
> org.apache.cassandra.transport.ServerConnection.getSaslNegotiator(ServerConnection.java:120)
>         at 
> org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:76)
>         at 
> org.apache.cassandra.transport.Message$Request.execute(Message.java:255)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:166)
>         at 
> 


> Getting tons of msgs "Failed to get peer certificates for peer 
> /x.x.x.x:45796" when require_client_auth is set to false
> ---
>
> Key: CASSANDRA-19508
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19508
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Encryption
>Reporter: Mohammad Aburadeh
>Assignee: Mohammad Aburadeh
>Priority: Urgent
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> We recently upgraded our production clusters from 3.11.15 to 4.1.4. We 
> started seeing thousands of msgs "Failed to get peer certificates for peer 
> /x.x.x.x:45796". SSL is enabled but require_client_auth is disabled.  This is 
> causing a huge problem for us because cassandra log files are growing very 
> fast. Our connections are short-lived: we open more than 1K connections per 
> second and they stay alive for 1-2 seconds. 
> {code:java}
> DEBUG [Native-Transport-Requests-2] 2024-03-31 21:26:38,026 
> ServerConnection.java:140 - Failed to get peer certificates for peer 
> /172.31.2.23:45796
> javax.net.ssl.SSLPeerUnverifiedException: peer not verified
>         at 
> io.netty.handler.ssl.ReferenceCountedOpenSslEngine$DefaultOpenSslSession.getPeerCertificateChain(ReferenceCountedOpenSslEngine.java:2414)
>         at 
> io.netty.handler.ssl.ExtendedOpenSslSession.getPeerCertificateChain(ExtendedOpenSslSession.java:140)
>         at 
> org.apache.cassandra.transport.ServerConnection.certificates(ServerConnection.java:136)
>         at 
> org.apache.cassandra.transport.ServerConnection.getSaslNegotiator(ServerConnection.java:120)
>         at 
> org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:76)
>         at 
> org.apache.cassandra.transport.Message$Request.execute(Message.java:255)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:166)
>         at 
> 

[jira] [Commented] (CASSANDRA-19428) Clean up KeyRangeIterator classes

2024-04-01 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832956#comment-17832956
 ] 

Ekaterina Dimitrova commented on CASSANDRA-19428:
-

Latest branch that fixes the tests:

[https://github.com/ekaterinadimitrova2/cassandra/tree/19428-5.0-5]

Applied also to trunk:

[https://github.com/ekaterinadimitrova2/cassandra/pull/new/19428-trunk-2]

I am running CI now.

 

> Clean up KeyRangeIterator classes
> -
>
> Key: CASSANDRA-19428
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19428
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/2i Index
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
> Attachments: 
> Make_sure_the_builders_attach_the_onClose_hook_when_there_is_only_a_single_sub-iterator.patch
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Remove KeyRangeIterator.current and simplify



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19457: Object reference in Micrometer metrics prevent GC from reclaiming Session instances [cassandra-java-driver]

2024-04-01 Thread via GitHub


SiyaoIsHiding commented on PR #1916:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1916#issuecomment-2030578292

   Questions:
   1. I tried hard, but I could not find an example of how to use MicroProfile 
with our driver, especially how to instantiate a MicroProfile registry. May I 
confirm: is it the user's responsibility to implement and pass in a 
MicroProfile `MetricRegistry`? If so, is it also the user's responsibility to 
ensure there is no memory leak?
   2. Do we only clear the metrics when a session is closed? Or do we also 
clear the node metrics when a node goes down?
   
   Progress:
   1. Dropwizard is not leaking memory in the way that Micrometer does.
   2. I refactored `clearMetrics()` and moved it into `DefaultSession`'s 
`closeAsync` and `forceCloseAsync`. Micrometer no longer leaks memory. It is 
now awaiting your review. 
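The leak pattern under discussion can be sketched with plain collections and no Micrometer dependency; `SketchRegistry` and `SketchSession` below are illustrative stand-ins, not driver types. The point is that a registry holding strong references to per-session meters pins the session until those meters are removed on close:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for a meter registry that keeps strong references to meters.
final class SketchRegistry {
    final Map<String, Object> meters = new ConcurrentHashMap<>();
    void register(String id, Object meter) { meters.put(id, meter); }
    void removeByPrefix(String prefix) { meters.keySet().removeIf(k -> k.startsWith(prefix)); }
}

final class SketchSession implements AutoCloseable {
    private final String id;
    private final SketchRegistry registry;

    SketchSession(String id, SketchRegistry registry) {
        this.id = id;
        this.registry = registry;
        // The registered meter closes over `this`, so the registry keeps the
        // session strongly reachable until the meter is removed.
        registry.register(id + ".requests", (Runnable) this::hashCode);
    }

    // Analogous to invoking clearMetrics() from closeAsync()/forceCloseAsync().
    @Override
    public void close() {
        registry.removeByPrefix(id + ".");
    }
}

public class MeterCleanupSketch {
    public static void main(String[] args) {
        SketchRegistry registry = new SketchRegistry();
        SketchSession session = new SketchSession("s1", registry);
        System.out.println(registry.meters.size()); // 1 while the session is open
        session.close();
        System.out.println(registry.meters.size()); // 0 after close, so GC can reclaim the session
    }
}
```

Once the registry no longer references the meter, nothing else reachable from the registry refers back to the session, which is the property the `clearMetrics()` refactoring restores.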


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org





Re: [PR] CASSANDRA-19340 - Support writing UDTs [cassandra-analytics]

2024-04-01 Thread via GitHub


frankgh commented on code in PR #45:
URL: 
https://github.com/apache/cassandra-analytics/pull/45#discussion_r1546825765


##
cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/bulkwriter/CassandraBulkWriterContext.java:
##
@@ -92,11 +92,17 @@ protected CassandraBulkWriterContext(@NotNull BulkSparkConf 
conf,
 Set udts = CqlUtils.extractUdts(keyspaceSchema, keyspace);
 ReplicationFactor replicationFactor = 
CqlUtils.extractReplicationFactor(keyspaceSchema, keyspace);
 int indexCount = CqlUtils.extractIndexCount(keyspaceSchema, keyspace, 
table);
-CqlTable cqlTable = bridge.buildSchema(createTableSchema, keyspace, 
replicationFactor, partitioner, udts, null, indexCount);
+CqlTable cqlTable = bridge().buildSchema(createTableSchema, keyspace, 
replicationFactor, partitioner, udts, null, indexCount);
 
 TableInfoProvider tableInfoProvider = new 
CqlTableInfoProvider(createTableSchema, cqlTable);
 TableSchema tableSchema = initializeTableSchema(conf, dfSchema, 
tableInfoProvider, lowestCassandraVersion);
-schemaInfo = new CassandraSchemaInfo(tableSchema);
+schemaInfo = new CassandraSchemaInfo(tableSchema, udts, cqlTable);
+}
+
+@Override
+public CassandraBridge bridge()
+{
+return this.bridge;

Review Comment:
   If this is serialized to executors and it's declared as transient, we'll 
need to initialize it for the executors at some point. I suggest we add a field 
`private final String lowestCassandraVersion;` and initialize it in the 
constructor (line 72). Otherwise I think we'll see NPEs for the bridge at the 
executor level.
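The suggestion can be sketched as follows; `Bridge`, `BridgeFactory`, and `WriterContextSketch` are hypothetical stand-ins for the real cassandra-analytics types. A serializable version string survives the trip to the executor, and the transient bridge is rebuilt lazily from it after deserialization:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Stand-ins for the non-serializable bridge and its factory.
class Bridge {
    final String version;
    Bridge(String version) { this.version = version; }
}

class BridgeFactory {
    static Bridge get(String version) { return new Bridge(version); }
}

public class WriterContextSketch implements Serializable {
    private final String lowestCassandraVersion;  // survives serialization
    private transient volatile Bridge bridge;     // dropped on serialization

    public WriterContextSketch(String lowestCassandraVersion) {
        this.lowestCassandraVersion = lowestCassandraVersion;
        this.bridge = BridgeFactory.get(lowestCassandraVersion);
    }

    // Lazily re-create the bridge on the executor instead of hitting an NPE.
    public Bridge bridge() {
        Bridge b = bridge;
        if (b == null) {
            synchronized (this) {
                if (bridge == null) bridge = BridgeFactory.get(lowestCassandraVersion);
                b = bridge;
            }
        }
        return b;
    }

    // Simulate Spark shipping the context to an executor via Java serialization.
    @SuppressWarnings("unchecked")
    static <T> T roundTrip(T obj) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(obj);
        oos.flush();
        return (T) new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray())).readObject();
    }

    public static void main(String[] args) throws Exception {
        WriterContextSketch ctx = roundTrip(new WriterContextSketch("4.0"));
        // Without the lazy rebuild, bridge would be null here on the executor.
        System.out.println(ctx.bridge().version);
    }
}
```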






Re: [PR] CASSANDRA-19340 - Support writing UDTs [cassandra-analytics]

2024-04-01 Thread via GitHub


frankgh commented on code in PR #45:
URL: 
https://github.com/apache/cassandra-analytics/pull/45#discussion_r1541520438


##
cassandra-analytics-core/src/test/java/org/apache/cassandra/spark/bulkwriter/MockBulkWriterContext.java:
##
@@ -77,6 +79,7 @@ public class MockBulkWriterContext implements 
BulkWriterContext, ClusterInfo, Jo
 private ConsistencyLevel.CL consistencyLevel;
 private int sstableDataSizeInMB = 128;
 private int sstableWriteBatchSize = 2;
+private CassandraBridge bridge = 
CassandraBridgeFactory.get(CassandraVersion.FOURZERO);

Review Comment:
   Should this be passed to the mock BW context? I wonder what tests for 5.0 
and 5.1 will look like if we always use 4.0 here.



##
cassandra-bridge/src/main/java/org/apache/cassandra/spark/data/BridgeUdtValue.java:
##
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.cassandra.spark.data;
+
+import java.io.Serializable;
+import java.util.Map;
+import java.util.Objects;
+
+/**
+ * The BridgeUdtValue class exists because the Cassandra values produced 
(UDTValue) are not serializable
+ * because they come from the classloader inside the bridge, and therefore 
can't be passed around
+ * from one Spark phase to another. Therefore, we build a map of these 
instances (potentially nested)
+ * and return them from the conversion stage for later use when the writer 
actually writes them.
+ */
+public class BridgeUdtValue implements Serializable
+{
+public final String name;
+public final Map udtMap;
+
+public BridgeUdtValue(String name, Map valueMap)
+{
+this.name = name;
+this.udtMap = valueMap;
+}
+
+public boolean equals(Object o)
+{
+if (this == o)
+{
+return true;
+}
+if (o == null || getClass() != o.getClass())
+{
+return false;
+}
+BridgeUdtValue udtValue = (BridgeUdtValue) o;
+return Objects.equals(name, udtValue.name) && Objects.equals(udtMap, 
udtValue.udtMap);
+}
+
+public int hashCode()

Review Comment:
   Nit: add the `@Override` annotation.



[jira] [Updated] (CASSANDRA-19340) Analytics writer should support writing UDTs

2024-04-01 Thread Yifan Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-19340:
--
Reviewers: Yifan Cai  (was: Yifan Cai, Yifan Cai)

> Analytics writer should support writing UDTs
> 
>
> Key: CASSANDRA-19340
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19340
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Doug Rohrer
>Assignee: Doug Rohrer
>Priority: Normal
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> {{CQLSSTableWriter}} supports writing UDTs. We should support converting from 
> Map-type fields or nested Spark structs to UDTs in the Analytics Bulk Writer, 
> similar to the way Bulk Reader converts UDTs to maps.






Re: [PR] CASSANDRA-19340 - Support writing UDTs [cassandra-analytics]

2024-04-01 Thread via GitHub


JeetKunDoug commented on PR #45:
URL: 
https://github.com/apache/cassandra-analytics/pull/45#issuecomment-2030455448

   Rebased/squashed/fixed CHANGES.txt conflict (no other changes/conflicts)





[jira] [Updated] (CASSANDRA-19511) Update Install instructions with later Ubuntu & Debian versions

2024-04-01 Thread Lorina Poland (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lorina Poland updated CASSANDRA-19511:
--
Description: 
Update 
[https://cassandra.apache.org/doc/stable/cassandra/getting_started/installing.html]
 with later Ubuntu versions (20 and 22 possibly).

Also Debian versions should be updated, too.

  was:Update 
[https://cassandra.apache.org/doc/stable/cassandra/getting_started/installing.html]
 with later Ubuntu versions (20 and 22 possibly).


> Update Install instructions with later Ubuntu  & Debian versions
> 
>
> Key: CASSANDRA-19511
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19511
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation
>Reporter: Lorina Poland
>Assignee: Lorina Poland
>Priority: Normal
>
> Update 
> [https://cassandra.apache.org/doc/stable/cassandra/getting_started/installing.html]
>  with later Ubuntu versions (20 and 22 possibly).
> Also Debian versions should be updated, too.






[jira] [Updated] (CASSANDRA-19511) Update Install instructions with later Ubuntu & Debian versions

2024-04-01 Thread Lorina Poland (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lorina Poland updated CASSANDRA-19511:
--
Summary: Update Install instructions with later Ubuntu  & Debian versions  
(was: Update Install instructions with later Ubuntu versions)

> Update Install instructions with later Ubuntu  & Debian versions
> 
>
> Key: CASSANDRA-19511
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19511
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation
>Reporter: Lorina Poland
>Assignee: Lorina Poland
>Priority: Normal
>
> Update 
> [https://cassandra.apache.org/doc/stable/cassandra/getting_started/installing.html]
>  with later Ubuntu versions (20 and 22 possibly).






[jira] [Updated] (CASSANDRA-19489) Guardrail to warn clients about possible transient incorrect responses for filtering queries against multiple mutable columns

2024-04-01 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-19489:

Test and Documentation Plan: There are new tests here around the new 
guardrail. They borrow a bit from the existing in-JVM {{GuardrailTester}} work.
 Status: Patch Available  (was: In Progress)

5.0 patch: https://github.com/apache/cassandra/pull/3220

Results from first CI run coming soon...

> Guardrail to warn clients about possible transient incorrect responses for 
> filtering queries against multiple mutable columns
> -
>
> Key: CASSANDRA-19489
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19489
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination, CQL/Semantics, Messaging/Client
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 5.0-rc, 5.1
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Given we may not have time to fully resolve CASSANDRA-19007 before we release 
> 5.0, it would still be helpful to have, at the very minimum, a client warning 
> for cases where a user filters on two or more mutable (static or regular) 
> columns at consistency levels that require coordinator reconciliation. We may 
> also want the option to fail these queries outright, although that need not 
> be the default.
> The only art involved in this is deciding what we want to say in the 
> warning/error message. It's probably reasonable to mention there that this 
> only happens when we have unrepaired data. It's also worth noting that SAI 
> queries are no longer vulnerable to this after the resolution of 
> CASSANDRA-19018.






[jira] [Created] (CASSANDRA-19511) Update Install instructions with later Ubuntu versions

2024-04-01 Thread Lorina Poland (Jira)
Lorina Poland created CASSANDRA-19511:
-

 Summary: Update Install instructions with later Ubuntu versions
 Key: CASSANDRA-19511
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19511
 Project: Cassandra
  Issue Type: Task
  Components: Documentation
Reporter: Lorina Poland
Assignee: Lorina Poland


Update 
[https://cassandra.apache.org/doc/stable/cassandra/getting_started/installing.html]
 with later Ubuntu versions (20 and 22 possibly).






[jira] [Commented] (CASSANDRA-19508) Getting tons of msgs "Failed to get peer certificates for peer /x.x.x.x:45796" when require_client_auth is set to false

2024-04-01 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832935#comment-17832935
 ] 

Brandon Williams commented on CASSANDRA-19508:
--

Looks good to me, let's check CI:

||Branch||CI||
|[4.0|https://github.com/driftx/cassandra/tree/CASSANDRA-19508-4.0]|[j8|https://app.circleci.com/pipelines/github/driftx/cassandra/1557/workflows/c1ae034b-e2e5-441a-9e5d-cfbfedf092fb],
 
[j11|https://app.circleci.com/pipelines/github/driftx/cassandra/1557/workflows/e537d55a-3cc7-482a-aaeb-51d804c7b6b5]|
|[4.1|https://github.com/driftx/cassandra/tree/CASSANDRA-19508-4.1]|[j8|https://app.circleci.com/pipelines/github/driftx/cassandra/1554/workflows/5a09fafb-756a-4c7f-af8b-8e8fc7707721],
 
[j11|https://app.circleci.com/pipelines/github/driftx/cassandra/1554/workflows/e6874734-01db-4eaa-867b-d38d9fdd6eeb]|
|[5.0|https://github.com/driftx/cassandra/tree/CASSANDRA-19508-5.0]|[j11|https://app.circleci.com/pipelines/github/driftx/cassandra/1555/workflows/d0c49666-4804-4e96-bce8-ff25945697ee],
 
[j17|https://app.circleci.com/pipelines/github/driftx/cassandra/1555/workflows/1a3573c1-2cc7-4276-bf45-ce6a39237f6f]|
|[trunk|https://github.com/driftx/cassandra/tree/CASSANDRA-19508-trunk]|[j11|https://app.circleci.com/pipelines/github/driftx/cassandra/1556/workflows/ab0e8423-e1d0-480f-91e7-cad66509453c],
 
[j17|https://app.circleci.com/pipelines/github/driftx/cassandra/1556/workflows/380ed9fb-8c0f-4c24-9b66-d66442d84ad5]|

> Getting tons of msgs "Failed to get peer certificates for peer 
> /x.x.x.x:45796" when require_client_auth is set to false
> ---
>
> Key: CASSANDRA-19508
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19508
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Encryption
>Reporter: Mohammad Aburadeh
>Assignee: Mohammad Aburadeh
>Priority: Urgent
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> We recently upgraded our production clusters from 3.11.15 to 4.1.4. We 
> started seeing thousands of msgs "Failed to get peer certificates for peer 
> /x.x.x.x:45796". SSL is enabled but require_client_auth is disabled.  This is 
> causing a huge problem for us because cassandra log files are growing very 
> fast. Our connections are short-lived: we open more than 1K connections per 
> second and they stay alive for 1-2 seconds. 
> {code:java}
> DEBUG [Native-Transport-Requests-2] 2024-03-31 21:26:38,026 
> ServerConnection.java:140 - Failed to get peer certificates for peer 
> /172.31.2.23:45796
> javax.net.ssl.SSLPeerUnverifiedException: peer not verified
>         at 
> io.netty.handler.ssl.ReferenceCountedOpenSslEngine$DefaultOpenSslSession.getPeerCertificateChain(ReferenceCountedOpenSslEngine.java:2414)
>         at 
> io.netty.handler.ssl.ExtendedOpenSslSession.getPeerCertificateChain(ExtendedOpenSslSession.java:140)
>         at 
> org.apache.cassandra.transport.ServerConnection.certificates(ServerConnection.java:136)
>         at 
> org.apache.cassandra.transport.ServerConnection.getSaslNegotiator(ServerConnection.java:120)
>         at 
> org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:76)
>         at 
> org.apache.cassandra.transport.Message$Request.execute(Message.java:255)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:166)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:185)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:212)
>         at 
> org.apache.cassandra.transport.Dispatcher$RequestProcessor.run(Dispatcher.java:109)
>         at 
> org.apache.cassandra.concurrent.FutureTask$1.call(FutureTask.java:96)
>         at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61)
>         at org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71)
>         at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:142)
>         at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  {code}
> *Our SSL config:*
> {code:java}
> client_encryption_options:
>   enabled: true
>   keystore: /path/to/keystore
>   keystore_password: x
>   optional: false
>   require_client_auth: false {code}
>  
> We should stop throwing this msg when require_client_auth is set to false, or 
> at least it should be logged at TRACE, not DEBUG. 
> I'm working on preparing a PR. 
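A minimal sketch of the proposed behavior, with hypothetical method and parameter names rather than the actual Cassandra code path: when client certificate auth is not required, skip the peer-certificate lookup entirely, so nothing is logged per connection.

```java
public class PeerCertLogSketch {
    // requireClientAuth mirrors client_encryption_options.require_client_auth.
    static String certificatesOrNull(boolean requireClientAuth, boolean peerPresentedCert) {
        if (!requireClientAuth) {
            // mTLS was not requested, so a missing client certificate is the
            // expected case: skip the lookup instead of logging per connection.
            return null;
        }
        if (!peerPresentedCert) {
            System.out.println("DEBUG Failed to get peer certificates for peer ...");
            return null;
        }
        return "certificate-chain";
    }

    public static void main(String[] args) {
        System.out.println(certificatesOrNull(false, false)); // prints "null", no DEBUG line
        System.out.println(certificatesOrNull(true, false));  // DEBUG line, then "null"
    }
}
```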






[jira] [Updated] (CASSANDRA-19507) [Analytics] Fix bulk reads of multiple tables that potentially have the same data file name

2024-04-01 Thread Francisco Guerrero (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francisco Guerrero updated CASSANDRA-19507:
---
Status: Ready to Commit  (was: Review In Progress)

> [Analytics] Fix bulk reads of multiple tables that potentially have the same 
> data file name
> ---
>
> Key: CASSANDRA-19507
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19507
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Francisco Guerrero
>Assignee: Francisco Guerrero
>Priority: Normal
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When reading multiple data frames using bulk reader from different tables, it 
> is possible to encounter a data file name being retrieved from the same 
> Sidecar instance. Because the {{SSTable}}s are cached in the 
> {{SSTableCache}}, it is possible that the 
> {{org.apache.cassandra.spark.reader.SSTableReader}} is using the incorrect 
> {{SSTable}} if it was cached with the same {{#hashCode}}.
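The caching hazard described above can be reproduced with a plain `HashMap` (Java 16+ for the record); `Key` here is a stand-in, not the real `SSTable` type. A cache key derived only from the data file name collapses entries from different tables, while a key that also covers keyspace, table, and snapshot keeps them distinct:

```java
import java.util.HashMap;
import java.util.Map;

public class CacheKeySketch {
    // Records get equals/hashCode over all components for free.
    record Key(String keyspace, String table, String snapshot, String dataFileName) {}

    public static void main(String[] args) {
        // Key derived only from the file name: the second put overwrites the first.
        Map<String, String> byFileName = new HashMap<>();
        byFileName.put("nb-1-big-Data.db", "contents of ks1.t1");
        byFileName.put("nb-1-big-Data.db", "contents of ks2.t2"); // collision!
        System.out.println(byFileName.size()); // 1 -- the wrong sstable may be served

        // Key covering keyspace/table/snapshot as well: entries stay distinct.
        Map<Key, String> byFullKey = new HashMap<>();
        byFullKey.put(new Key("ks1", "t1", "snap", "nb-1-big-Data.db"), "contents of ks1.t1");
        byFullKey.put(new Key("ks2", "t2", "snap", "nb-1-big-Data.db"), "contents of ks2.t2");
        System.out.println(byFullKey.size()); // 2 -- each table gets its own entry
    }
}
```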






[jira] [Updated] (CASSANDRA-19507) [Analytics] Fix bulk reads of multiple tables that potentially have the same data file name

2024-04-01 Thread Francisco Guerrero (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francisco Guerrero updated CASSANDRA-19507:
---
Reviewers: Yifan Cai, Francisco Guerrero  (was: Francisco Guerrero, Yifan Cai)
   Status: Review In Progress  (was: Patch Available)

> [Analytics] Fix bulk reads of multiple tables that potentially have the same 
> data file name
> ---
>
> Key: CASSANDRA-19507
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19507
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Francisco Guerrero
>Assignee: Francisco Guerrero
>Priority: Normal
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When reading multiple data frames using bulk reader from different tables, it 
> is possible to encounter a data file name being retrieved from the same 
> Sidecar instance. Because the {{SSTable}}s are cached in the 
> {{SSTableCache}}, it is possible that the 
> {{org.apache.cassandra.spark.reader.SSTableReader}} is using the incorrect 
> {{SSTable}} if it was cached with the same {{#hashCode}}.






(cassandra-analytics) branch trunk updated: CASSANDRA-19507 Fix bulk reads of multiple tables that potentially have the same data file name (#47)

2024-04-01 Thread frankgh
This is an automated email from the ASF dual-hosted git repository.

frankgh pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra-analytics.git


The following commit(s) were added to refs/heads/trunk by this push:
 new c00c454  CASSANDRA-19507 Fix bulk reads of multiple tables that 
potentially have the same data file name (#47)
c00c454 is described below

commit c00c454d698e5a29caf58e61ed52ab48d08fd7fe
Author: Francisco Guerrero 
AuthorDate: Mon Apr 1 12:11:52 2024 -0700

CASSANDRA-19507 Fix bulk reads of multiple tables that potentially have the 
same data file name (#47)

When reading multiple data frames using bulk reader from different tables, 
it is possible to encounter a data
file name being retrieved from the same Sidecar instance. Because the 
`SSTable`s are cached in the `SSTableCache`,
it is possible that the `org.apache.cassandra.spark.reader.SSTableReader` 
uses the incorrect `SSTable` if it was
cached with the same `#hashCode`.

In this patch, the equality takes into account the keyspace, table, and 
snapshot name.

Additionally, we implement the `hashCode` and `equals` method in 
`org.apache.cassandra.clients.SidecarInstanceImpl` to utilize the 
`SSTableCache` correctly. Once the methods are implemented, the issue 
originally described in JIRA is surfaced.

Patch by Francisco Guerrero; Reviewed by Yifan Cai for CASSANDRA-19507
---
 CHANGES.txt|   1 +
 .../cassandra/clients/SidecarInstanceImpl.java |  21 +++
 .../spark/data/SidecarProvisionedSSTable.java  |  29 ++--
 .../spark/data/SidecarProvisionedSSTableTest.java  | 170 +
 .../analytics/QuoteIdentifiersReadTest.java|  26 +++-
 .../analytics/ReadDifferentTablesTest.java | 123 +++
 6 files changed, 354 insertions(+), 16 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index 718e1d4..914d933 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 1.0.0
+ * Fix bulk reads of multiple tables that potentially have the same data file 
name (CASSANDRA-19507)
  * Fix XXHash32Digest calculated digest value (CASSANDRA-19500)
  * Report additional bulk analytics job stats for instrumentation 
(CASSANDRA-19418)
  * Add certificate expiry check to start up validations done in Cassandra 
Analytics library (CASSANDRA-19424)
diff --git 
a/cassandra-analytics-core/src/main/java/org/apache/cassandra/clients/SidecarInstanceImpl.java
 
b/cassandra-analytics-core/src/main/java/org/apache/cassandra/clients/SidecarInstanceImpl.java
index bb2020a..d73dc2e 100644
--- 
a/cassandra-analytics-core/src/main/java/org/apache/cassandra/clients/SidecarInstanceImpl.java
+++ 
b/cassandra-analytics-core/src/main/java/org/apache/cassandra/clients/SidecarInstanceImpl.java
@@ -86,6 +86,27 @@ public class SidecarInstanceImpl implements Serializable, 
SidecarInstance
 return String.format("SidecarInstanceImpl{hostname='%s', port=%d}", 
hostname, port);
 }
 
+@Override
+public boolean equals(Object object)
+{
+if (this == object)
+{
+return true;
+}
+if (object == null || getClass() != object.getClass())
+{
+return false;
+}
+SidecarInstanceImpl that = (SidecarInstanceImpl) object;
+return port == that.port && Objects.equals(hostname, that.hostname);
+}
+
+@Override
+public int hashCode()
+{
+return Objects.hash(port, hostname);
+}
+
 // JDK Serialization
 
 private void readObject(ObjectInputStream in) throws IOException, 
ClassNotFoundException
diff --git 
a/cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/SidecarProvisionedSSTable.java
 
b/cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/SidecarProvisionedSSTable.java
index 648c74f..6e4ff0f 100644
--- 
a/cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/SidecarProvisionedSSTable.java
+++ 
b/cassandra-analytics-core/src/main/java/org/apache/cassandra/spark/data/SidecarProvisionedSSTable.java
@@ -243,30 +243,39 @@ public class SidecarProvisionedSSTable extends SSTable
 @Override
 public String toString()
 {
-return String.format("{\"hostname\"=\"%s\", \"port\"=\"%d\", 
\"dataFileName\"=\"%s\", \"partitionId\"=\"%d\"}",
- instance.hostname(), instance.port(), 
dataFileName, partitionId);
+return "SidecarProvisionedSSTable{" +
+   "hostname='" + instance.hostname() + '\'' +
+   ", port=" + instance.port() +
+   ", keyspace='" + keyspace + '\'' +
+   ", table='" + table + '\'' +
+   ", snapshotName='" + snapshotName + '\'' +
+   ", dataFileName='" + dataFileName + '\'' +
+   ", partitionId=" + partitionId +
+   '}';
 }
 
 @Override
 public int hashCode()
 {
-

[jira] [Updated] (CASSANDRA-19507) [Analytics] Fix bulk reads of multiple tables that potentially have the same data file name

2024-04-01 Thread Francisco Guerrero (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francisco Guerrero updated CASSANDRA-19507:
---
  Fix Version/s: NA
  Since Version: NA
Source Control Link: 
https://github.com/apache/cassandra-analytics/commit/c00c454d698e5a29caf58e61ed52ab48d08fd7fe
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> [Analytics] Fix bulk reads of multiple tables that potentially have the same 
> data file name
> ---
>
> Key: CASSANDRA-19507
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19507
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Francisco Guerrero
>Assignee: Francisco Guerrero
>Priority: Normal
> Fix For: NA
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When reading multiple data frames using bulk reader from different tables, it 
> is possible to encounter a data file name being retrieved from the same 
> Sidecar instance. Because the {{SSTable}}s are cached in the 
> {{SSTableCache}}, it is possible that the 
> {{org.apache.cassandra.spark.reader.SSTableReader}} is using the incorrect 
> {{SSTable}} if it was cached with the same {{#hashCode}}.
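The caching hazard described above can be sketched as follows. This is a hedged illustration, not the actual Cassandra Analytics code: the class and field names (`FileNameOnlyKey`, `CacheCollisionDemo`) are invented for the example. It shows how a cache key whose `equals`/`hashCode` cover only the data file name lets SSTables from two different tables alias to the same entry, which is what the patch's stronger `equals`/`hashCode` prevents.

```java
import java.util.HashMap;
import java.util.Map;

// Buggy key type: identity derived from the data file name alone,
// ignoring which keyspace/table the SSTable belongs to.
final class FileNameOnlyKey
{
    final String keyspace;
    final String table;
    final String dataFileName;

    FileNameOnlyKey(String keyspace, String table, String dataFileName)
    {
        this.keyspace = keyspace;
        this.table = table;
        this.dataFileName = dataFileName;
    }

    @Override
    public int hashCode()
    {
        return dataFileName.hashCode(); // keyspace and table are ignored
    }

    @Override
    public boolean equals(Object o)
    {
        return o instanceof FileNameOnlyKey
               && dataFileName.equals(((FileNameOnlyKey) o).dataFileName);
    }
}

public class CacheCollisionDemo
{
    public static void main(String[] args)
    {
        Map<FileNameOnlyKey, String> cache = new HashMap<>();
        cache.put(new FileNameOnlyKey("ks", "table_a", "nb-1-big-Data.db"), "sstable of table_a");

        // A lookup for table_b's file with the same name hits table_a's entry.
        String hit = cache.get(new FileNameOnlyKey("ks", "table_b", "nb-1-big-Data.db"));
        System.out.println(hit); // prints "sstable of table_a" -- the wrong table
    }
}
```

Including keyspace, table, and snapshot name in the key's identity (as the committed patch does for `SidecarProvisionedSSTable`) makes the two lookups distinct.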



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19507 Fix bulk reads of multiple tables that potentially have the same data file name [cassandra-analytics]

2024-04-01 Thread via GitHub


frankgh merged PR #47:
URL: https://github.com/apache/cassandra-analytics/pull/47


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19507) [Analytics] Fix bulk reads of multiple tables that potentially have the same data file name

2024-04-01 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832932#comment-17832932
 ] 

Yifan Cai commented on CASSANDRA-19507:
---

+1 on the patch!

> [Analytics] Fix bulk reads of multiple tables that potentially have the same 
> data file name
> ---
>
> Key: CASSANDRA-19507
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19507
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Francisco Guerrero
>Assignee: Francisco Guerrero
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When reading multiple data frames using bulk reader from different tables, it 
> is possible to encounter a data file name being retrieved from the same 
> Sidecar instance. Because the {{SSTable}}s are cached in the 
> {{SSTableCache}}, it is possible that the 
> {{org.apache.cassandra.spark.reader.SSTableReader}} is using the incorrect 
> {{SSTable}} if it was cached with the same {{#hashCode}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19508) Getting tons of msgs "Failed to get peer certificates for peer /x.x.x.x:45796" when require_client_auth is set to false

2024-04-01 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-19508:
-
Reviewers: Brandon Williams

> Getting tons of msgs "Failed to get peer certificates for peer 
> /x.x.x.x:45796" when require_client_auth is set to false
> ---
>
> Key: CASSANDRA-19508
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19508
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Encryption
>Reporter: Mohammad Aburadeh
>Assignee: Mohammad Aburadeh
>Priority: Urgent
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> We recently upgraded our production clusters from 3.11.15 to 4.1.4. We 
> started seeing thousands of msgs "Failed to get peer certificates for peer 
> /x.x.x.x:45796". SSL is enabled but require_client_auth is disabled. This is 
> causing a huge problem for us because Cassandra log files are growing very 
> fast: our connections are short-lived, we open more than 1K connections per 
> second, and each stays live for 1-2 seconds. 
> {code:java}
> DEBUG [Native-Transport-Requests-2] 2024-03-31 21:26:38,026 
> ServerConnection.java:140 - Failed to get peer certificates for peer 
> /172.31.2.23:45796
> javax.net.ssl.SSLPeerUnverifiedException: peer not verified
>         at 
> io.netty.handler.ssl.ReferenceCountedOpenSslEngine$DefaultOpenSslSession.getPeerCertificateChain(ReferenceCountedOpenSslEngine.java:2414)
>         at 
> io.netty.handler.ssl.ExtendedOpenSslSession.getPeerCertificateChain(ExtendedOpenSslSession.java:140)
>         at 
> org.apache.cassandra.transport.ServerConnection.certificates(ServerConnection.java:136)
>         at 
> org.apache.cassandra.transport.ServerConnection.getSaslNegotiator(ServerConnection.java:120)
>         at 
> org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:76)
>         at 
> org.apache.cassandra.transport.Message$Request.execute(Message.java:255)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:166)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:185)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:212)
>         at 
> org.apache.cassandra.transport.Dispatcher$RequestProcessor.run(Dispatcher.java:109)
>         at 
> org.apache.cassandra.concurrent.FutureTask$1.call(FutureTask.java:96)
>         at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61)
>         at org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71)
>         at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:142)
>         at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  {code}
> *Our SSL config:*
> {code:java}
> client_encryption_options:
>   enabled: true
>   keystore: /path/to/keystore
>   keystore_password: x
>   optional: false
>   require_client_auth: false {code}
>  
> We should stop logging this msg when require_client_auth is set to false, or 
> at least log it at TRACE instead of DEBUG. 
> I'm working on preparing a PR. 
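The proposed fix can be sketched as follows. This is a hedged illustration under assumed names (`PeerCertSketch`, `Session`, `certificates`), not the actual `ServerConnection` code: when client auth is not required, the peer-certificate lookup is skipped entirely, so the `SSLPeerUnverifiedException` is never raised (or logged) for anonymous peers.

```java
import java.security.cert.Certificate;
import javax.net.ssl.SSLPeerUnverifiedException;

public class PeerCertSketch
{
    // Minimal stand-in for the SSL session's certificate accessor.
    public interface Session
    {
        Certificate[] getPeerCertificates() throws SSLPeerUnverifiedException;
    }

    public static Certificate[] certificates(Session session, boolean requireClientAuth)
    {
        if (!requireClientAuth)
        {
            // The client was never asked for a certificate; nothing to fetch,
            // so no exception is thrown and nothing is logged.
            return null;
        }
        try
        {
            return session.getPeerCertificates();
        }
        catch (SSLPeerUnverifiedException e)
        {
            // Auth was required but the peer is genuinely unverified.
            return null;
        }
    }

    public static void main(String[] args)
    {
        Session anonymous = () -> {
            throw new SSLPeerUnverifiedException("peer not verified");
        };
        // With require_client_auth=false the exception path is never entered.
        System.out.println(certificates(anonymous, false)); // prints "null"
    }
}
```

The alternative mentioned in the report, logging at TRACE, keeps the lookup but reduces the log volume instead of eliminating it.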



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-19508) Getting tons of msgs "Failed to get peer certificates for peer /x.x.x.x:45796" when require_client_auth is set to false

2024-04-01 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reassigned CASSANDRA-19508:


Assignee: Mohammad Aburadeh

> Getting tons of msgs "Failed to get peer certificates for peer 
> /x.x.x.x:45796" when require_client_auth is set to false
> ---
>
> Key: CASSANDRA-19508
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19508
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Encryption
>Reporter: Mohammad Aburadeh
>Assignee: Mohammad Aburadeh
>Priority: Urgent
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> We recently upgraded our production clusters from 3.11.15 to 4.1.4. We 
> started seeing thousands of msgs "Failed to get peer certificates for peer 
> /x.x.x.x:45796". SSL is enabled but require_client_auth is disabled. This is 
> causing a huge problem for us because Cassandra log files are growing very 
> fast: our connections are short-lived, we open more than 1K connections per 
> second, and each stays live for 1-2 seconds. 
> {code:java}
> DEBUG [Native-Transport-Requests-2] 2024-03-31 21:26:38,026 
> ServerConnection.java:140 - Failed to get peer certificates for peer 
> /172.31.2.23:45796
> javax.net.ssl.SSLPeerUnverifiedException: peer not verified
>         at 
> io.netty.handler.ssl.ReferenceCountedOpenSslEngine$DefaultOpenSslSession.getPeerCertificateChain(ReferenceCountedOpenSslEngine.java:2414)
>         at 
> io.netty.handler.ssl.ExtendedOpenSslSession.getPeerCertificateChain(ExtendedOpenSslSession.java:140)
>         at 
> org.apache.cassandra.transport.ServerConnection.certificates(ServerConnection.java:136)
>         at 
> org.apache.cassandra.transport.ServerConnection.getSaslNegotiator(ServerConnection.java:120)
>         at 
> org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:76)
>         at 
> org.apache.cassandra.transport.Message$Request.execute(Message.java:255)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:166)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:185)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:212)
>         at 
> org.apache.cassandra.transport.Dispatcher$RequestProcessor.run(Dispatcher.java:109)
>         at 
> org.apache.cassandra.concurrent.FutureTask$1.call(FutureTask.java:96)
>         at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61)
>         at org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71)
>         at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:142)
>         at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  {code}
> *Our SSL config:*
> {code:java}
> client_encryption_options:
>   enabled: true
>   keystore: /path/to/keystore
>   keystore_password: x
>   optional: false
>   require_client_auth: false {code}
>  
> We should stop logging this msg when require_client_auth is set to false, or 
> at least log it at TRACE instead of DEBUG. 
> I'm working on preparing a PR. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19507) [Analytics] Fix bulk reads of multiple tables that potentially have the same data file name

2024-04-01 Thread Francisco Guerrero (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832930#comment-17832930
 ] 

Francisco Guerrero commented on CASSANDRA-19507:


Green CI after addressing comments: 
https://app.circleci.com/pipelines/github/frankgh/cassandra-analytics/153/workflows/4c5d820e-b453-4e9c-a2f3-40f190260a5c

> [Analytics] Fix bulk reads of multiple tables that potentially have the same 
> data file name
> ---
>
> Key: CASSANDRA-19507
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19507
> Project: Cassandra
>  Issue Type: Bug
>  Components: Analytics Library
>Reporter: Francisco Guerrero
>Assignee: Francisco Guerrero
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When reading multiple data frames using bulk reader from different tables, it 
> is possible to encounter a data file name being retrieved from the same 
> Sidecar instance. Because the {{SSTable}}s are cached in the 
> {{SSTableCache}}, it is possible that the 
> {{org.apache.cassandra.spark.reader.SSTableReader}} is using the incorrect 
> {{SSTable}} if it was cached with the same {{#hashCode}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19508) Getting tons of msgs "Failed to get peer certificates for peer /x.x.x.x:45796" when require_client_auth is set to false

2024-04-01 Thread Mohammad Aburadeh (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832929#comment-17832929
 ] 

Mohammad Aburadeh edited comment on CASSANDRA-19508 at 4/1/24 7:06 PM:
---

I submitted a PR for this issue. I tested it on our clusters and it worked well.
My fix is simple: it just skips checking peer certificates when 
require_client_auth is disabled. The other option is logging the msg at TRACE 
instead of DEBUG. 
||Branch||
|[4.0|https://github.com/apache/cassandra/pull/3219]|
|[4.1|https://github.com/apache/cassandra/pull/3216]|
|[5.0|https://github.com/apache/cassandra/pull/3217]|
|[Trunk|https://github.com/apache/cassandra/pull/3218]|

Please review and let me know. 

 


was (Author: JIRAUSER287854):
I submitted PR for this issue. I tested it on our clusters, it worked well.
My fix is simple, it just disabled checking peer certificates if 
require_client_auth is disabled. The other option is logging the msg in TRACE 
not DEBUG. 
||Branch||
|[4.0|https://github.com/apache/cassandra/pull/3219]|
|[4.1\|[https://github.com/apache/cassandra/pull/3216]]|
|[5.0\|[https://github.com/apache/cassandra/pull/3216]7]|
|[Trunk\|[https://github.com/apache/cassandra/pull/3218|https://github.com/apache/cassandra/pull/3216]]|
| |

Please review and let me know. 

 

> Getting tons of msgs "Failed to get peer certificates for peer 
> /x.x.x.x:45796" when require_client_auth is set to false
> ---
>
> Key: CASSANDRA-19508
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19508
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Encryption
>Reporter: Mohammad Aburadeh
>Priority: Urgent
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> We recently upgraded our production clusters from 3.11.15 to 4.1.4. We 
> started seeing thousands of msgs "Failed to get peer certificates for peer 
> /x.x.x.x:45796". SSL is enabled but require_client_auth is disabled. This is 
> causing a huge problem for us because Cassandra log files are growing very 
> fast: our connections are short-lived, we open more than 1K connections per 
> second, and each stays live for 1-2 seconds. 
> {code:java}
> DEBUG [Native-Transport-Requests-2] 2024-03-31 21:26:38,026 
> ServerConnection.java:140 - Failed to get peer certificates for peer 
> /172.31.2.23:45796
> javax.net.ssl.SSLPeerUnverifiedException: peer not verified
>         at 
> io.netty.handler.ssl.ReferenceCountedOpenSslEngine$DefaultOpenSslSession.getPeerCertificateChain(ReferenceCountedOpenSslEngine.java:2414)
>         at 
> io.netty.handler.ssl.ExtendedOpenSslSession.getPeerCertificateChain(ExtendedOpenSslSession.java:140)
>         at 
> org.apache.cassandra.transport.ServerConnection.certificates(ServerConnection.java:136)
>         at 
> org.apache.cassandra.transport.ServerConnection.getSaslNegotiator(ServerConnection.java:120)
>         at 
> org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:76)
>         at 
> org.apache.cassandra.transport.Message$Request.execute(Message.java:255)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:166)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:185)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:212)
>         at 
> org.apache.cassandra.transport.Dispatcher$RequestProcessor.run(Dispatcher.java:109)
>         at 
> org.apache.cassandra.concurrent.FutureTask$1.call(FutureTask.java:96)
>         at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61)
>         at org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71)
>         at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:142)
>         at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  {code}
> *Our SSL config:*
> {code:java}
> client_encryption_options:
>   enabled: true
>   keystore: /path/to/keystore
>   keystore_password: x
>   optional: false
>   require_client_auth: false {code}
>  
> We should stop logging this msg when require_client_auth is set to false, or 
> at least log it at TRACE instead of DEBUG. 
> I'm working on preparing a PR. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19508) Getting tons of msgs "Failed to get peer certificates for peer /x.x.x.x:45796" when require_client_auth is set to false

2024-04-01 Thread Mohammad Aburadeh (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832929#comment-17832929
 ] 

Mohammad Aburadeh edited comment on CASSANDRA-19508 at 4/1/24 7:05 PM:
---

I submitted a PR for this issue. I tested it on our clusters and it worked well.
My fix is simple: it just skips checking peer certificates when 
require_client_auth is disabled. The other option is logging the msg at TRACE 
instead of DEBUG. 
||Branch||
|[4.0|https://github.com/apache/cassandra/pull/3219]|
|[4.1|https://github.com/apache/cassandra/pull/3216]|
|[5.0|https://github.com/apache/cassandra/pull/3217]|
|[Trunk|https://github.com/apache/cassandra/pull/3218]|

Please review and let me know. 

 


was (Author: JIRAUSER287854):
I submitted PR for this issue. I tested it on our clusters, it worked well.
My fix is simple, it just disabled checking peer certificates if 
require_client_auth is disabled. The other option is logging the msg in TRACE 
not DEBUG. 
||Branch||
|[4.0\|https://github.com/apache/cassandra/pull/3219]|
|[4.1\|[https://github.com/apache/cassandra/pull/3216]]|
|[5.0\|[https://github.com/apache/cassandra/pull/3216]7]|
|[Trunk\|[https://github.com/apache/cassandra/pull/3218|https://github.com/apache/cassandra/pull/3216]]|
| |

Please review and let me know. 

 

> Getting tons of msgs "Failed to get peer certificates for peer 
> /x.x.x.x:45796" when require_client_auth is set to false
> ---
>
> Key: CASSANDRA-19508
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19508
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Encryption
>Reporter: Mohammad Aburadeh
>Priority: Urgent
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> We recently upgraded our production clusters from 3.11.15 to 4.1.4. We 
> started seeing thousands of msgs "Failed to get peer certificates for peer 
> /x.x.x.x:45796". SSL is enabled but require_client_auth is disabled. This is 
> causing a huge problem for us because Cassandra log files are growing very 
> fast: our connections are short-lived, we open more than 1K connections per 
> second, and each stays live for 1-2 seconds. 
> {code:java}
> DEBUG [Native-Transport-Requests-2] 2024-03-31 21:26:38,026 
> ServerConnection.java:140 - Failed to get peer certificates for peer 
> /172.31.2.23:45796
> javax.net.ssl.SSLPeerUnverifiedException: peer not verified
>         at 
> io.netty.handler.ssl.ReferenceCountedOpenSslEngine$DefaultOpenSslSession.getPeerCertificateChain(ReferenceCountedOpenSslEngine.java:2414)
>         at 
> io.netty.handler.ssl.ExtendedOpenSslSession.getPeerCertificateChain(ExtendedOpenSslSession.java:140)
>         at 
> org.apache.cassandra.transport.ServerConnection.certificates(ServerConnection.java:136)
>         at 
> org.apache.cassandra.transport.ServerConnection.getSaslNegotiator(ServerConnection.java:120)
>         at 
> org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:76)
>         at 
> org.apache.cassandra.transport.Message$Request.execute(Message.java:255)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:166)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:185)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:212)
>         at 
> org.apache.cassandra.transport.Dispatcher$RequestProcessor.run(Dispatcher.java:109)
>         at 
> org.apache.cassandra.concurrent.FutureTask$1.call(FutureTask.java:96)
>         at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61)
>         at org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71)
>         at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:142)
>         at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  {code}
> *Our SSL config:*
> {code:java}
> client_encryption_options:
>   enabled: true
>   keystore: /path/to/keystore
>   keystore_password: x
>   optional: false
>   require_client_auth: false {code}
>  
> We should stop logging this msg when require_client_auth is set to false, or 
> at least log it at TRACE instead of DEBUG. 
> I'm working on preparing a PR. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19508) Getting tons of msgs "Failed to get peer certificates for peer /x.x.x.x:45796" when require_client_auth is set to false

2024-04-01 Thread Mohammad Aburadeh (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832929#comment-17832929
 ] 

Mohammad Aburadeh edited comment on CASSANDRA-19508 at 4/1/24 7:05 PM:
---

I submitted a PR for this issue. I tested it on our clusters and it worked well.
My fix is simple: it just skips checking peer certificates when 
require_client_auth is disabled. The other option is logging the msg at TRACE 
instead of DEBUG. 
||Branch||
|[4.0|https://github.com/apache/cassandra/pull/3219]|
|[4.1|https://github.com/apache/cassandra/pull/3216]|
|[5.0|https://github.com/apache/cassandra/pull/3217]|
|[Trunk|https://github.com/apache/cassandra/pull/3218]|

Please review and let me know. 

 


was (Author: JIRAUSER287854):
I submitted PR for this issue. I tested it on our clusters, it worked well.
My fix is simple, it just disabled checking peer certificates if 
require_client_auth is disabled. The other option is logging the msg in TRACE 
not DEBUG. 
||Branch||
|[4.1\|[https://github.com/apache/cassandra/pull/3216]]|
|[5.0\|[https://github.com/apache/cassandra/pull/3216]7]|
|[Trunk\|[https://github.com/apache/cassandra/pull/3218|https://github.com/apache/cassandra/pull/3216]]|
| |

Please review and let me know. 

 

> Getting tons of msgs "Failed to get peer certificates for peer 
> /x.x.x.x:45796" when require_client_auth is set to false
> ---
>
> Key: CASSANDRA-19508
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19508
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Encryption
>Reporter: Mohammad Aburadeh
>Priority: Urgent
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> We recently upgraded our production clusters from 3.11.15 to 4.1.4. We 
> started seeing thousands of msgs "Failed to get peer certificates for peer 
> /x.x.x.x:45796". SSL is enabled but require_client_auth is disabled. This is 
> causing a huge problem for us because Cassandra log files are growing very 
> fast: our connections are short-lived, we open more than 1K connections per 
> second, and each stays live for 1-2 seconds. 
> {code:java}
> DEBUG [Native-Transport-Requests-2] 2024-03-31 21:26:38,026 
> ServerConnection.java:140 - Failed to get peer certificates for peer 
> /172.31.2.23:45796
> javax.net.ssl.SSLPeerUnverifiedException: peer not verified
>         at 
> io.netty.handler.ssl.ReferenceCountedOpenSslEngine$DefaultOpenSslSession.getPeerCertificateChain(ReferenceCountedOpenSslEngine.java:2414)
>         at 
> io.netty.handler.ssl.ExtendedOpenSslSession.getPeerCertificateChain(ExtendedOpenSslSession.java:140)
>         at 
> org.apache.cassandra.transport.ServerConnection.certificates(ServerConnection.java:136)
>         at 
> org.apache.cassandra.transport.ServerConnection.getSaslNegotiator(ServerConnection.java:120)
>         at 
> org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:76)
>         at 
> org.apache.cassandra.transport.Message$Request.execute(Message.java:255)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:166)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:185)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:212)
>         at 
> org.apache.cassandra.transport.Dispatcher$RequestProcessor.run(Dispatcher.java:109)
>         at 
> org.apache.cassandra.concurrent.FutureTask$1.call(FutureTask.java:96)
>         at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61)
>         at org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71)
>         at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:142)
>         at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  {code}
> *Our SSL config:*
> {code:java}
> client_encryption_options:
>   enabled: true
>   keystore: /path/to/keystore
>   keystore_password: x
>   optional: false
>   require_client_auth: false {code}
>  
> We should stop logging this msg when require_client_auth is set to false, or 
> at least log it at TRACE instead of DEBUG. 
> I'm working on preparing a PR. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-19508) Getting tons of msgs "Failed to get peer certificates for peer /x.x.x.x:45796" when require_client_auth is set to false

2024-04-01 Thread Mohammad Aburadeh (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832929#comment-17832929
 ] 

Mohammad Aburadeh edited comment on CASSANDRA-19508 at 4/1/24 7:00 PM:
---

I submitted a PR for this issue. I tested it on our clusters and it worked well.
My fix is simple: it just skips checking peer certificates when 
require_client_auth is disabled. The other option is logging the msg at TRACE 
instead of DEBUG. 
||Branch||
|[4.1|https://github.com/apache/cassandra/pull/3216]|
|[5.0|https://github.com/apache/cassandra/pull/3217]|
|[Trunk|https://github.com/apache/cassandra/pull/3218]|

Please review and let me know. 

 


was (Author: JIRAUSER287854):
I submitted PR for this issue. I tested it on our clusters, it worked well.
My fix is simple, it just disabled checking peer certificates if 
require_client_auth is disabled. The other option is logging the msg in TRACE 
not DEBUG. 


||Branch||
|[4.1\|[https://github.com/apache/cassandra/pull/3216]|
|[5.0\|https://github.com/apache/cassandra/pull/3217]|
|[Trunk\|https://github.com/apache/cassandra/pull/3218]|


Please review and let me know. 

 

> Getting tons of msgs "Failed to get peer certificates for peer 
> /x.x.x.x:45796" when require_client_auth is set to false
> ---
>
> Key: CASSANDRA-19508
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19508
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Encryption
>Reporter: Mohammad Aburadeh
>Priority: Urgent
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> We recently upgraded our production clusters from 3.11.15 to 4.1.4. We 
> started seeing thousands of msgs "Failed to get peer certificates for peer 
> /x.x.x.x:45796". SSL is enabled but require_client_auth is disabled. This is 
> causing a huge problem for us because Cassandra log files are growing very 
> fast: our connections are short-lived, we open more than 1K connections per 
> second, and each stays live for 1-2 seconds. 
> {code:java}
> DEBUG [Native-Transport-Requests-2] 2024-03-31 21:26:38,026 
> ServerConnection.java:140 - Failed to get peer certificates for peer 
> /172.31.2.23:45796
> javax.net.ssl.SSLPeerUnverifiedException: peer not verified
>         at 
> io.netty.handler.ssl.ReferenceCountedOpenSslEngine$DefaultOpenSslSession.getPeerCertificateChain(ReferenceCountedOpenSslEngine.java:2414)
>         at 
> io.netty.handler.ssl.ExtendedOpenSslSession.getPeerCertificateChain(ExtendedOpenSslSession.java:140)
>         at 
> org.apache.cassandra.transport.ServerConnection.certificates(ServerConnection.java:136)
>         at 
> org.apache.cassandra.transport.ServerConnection.getSaslNegotiator(ServerConnection.java:120)
>         at 
> org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:76)
>         at 
> org.apache.cassandra.transport.Message$Request.execute(Message.java:255)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:166)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:185)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:212)
>         at 
> org.apache.cassandra.transport.Dispatcher$RequestProcessor.run(Dispatcher.java:109)
>         at 
> org.apache.cassandra.concurrent.FutureTask$1.call(FutureTask.java:96)
>         at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61)
>         at org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71)
>         at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:142)
>         at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  {code}
> *Our SSL config:*
> {code:java}
> client_encryption_options:
>   enabled: true
>   keystore: /path/to/keystore
>   keystore_password: x
>   optional: false
>   require_client_auth: false {code}
>  
> We should stop logging this msg when require_client_auth is set to false, or 
> at least log it at TRACE instead of DEBUG. 
> I'm working on preparing a PR. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19508) Getting tons of msgs "Failed to get peer certificates for peer /x.x.x.x:45796" when require_client_auth is set to false

2024-04-01 Thread Mohammad Aburadeh (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832929#comment-17832929
 ] 

Mohammad Aburadeh commented on CASSANDRA-19508:
---

I submitted a PR for this issue. I tested it on our clusters and it worked well.
The fix is simple: it skips checking peer certificates when 
require_client_auth is disabled. The other option is logging the message at 
TRACE instead of DEBUG. 


||Branch||
|[4.1\|https://github.com/apache/cassandra/pull/3216]|
|[5.0\|https://github.com/apache/cassandra/pull/3217]|
|[Trunk\|https://github.com/apache/cassandra/pull/3218]|


Please review and let me know. 
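The guard described above can be sketched as follows. This is a hypothetical illustration of the approach, not the actual patch; the class name and the `fetch` callable (standing in for `SSLSession#getPeerCertificates()`) are assumptions:

```java
import java.util.concurrent.Callable;

// Sketch of the fix described above: skip the peer certificate lookup
// entirely when client auth was never requested, so no
// SSLPeerUnverifiedException is raised (and logged) for normal clients.
// Names are illustrative; this is not the actual Cassandra patch.
public final class PeerCertificates
{
    // fetch stands in for SSLSession#getPeerCertificates(), which throws
    // when the client presented no certificate.
    public static Object[] certificates(boolean requireClientAuth, Callable<Object[]> fetch)
    {
        if (!requireClientAuth)
            return null; // no client certificate expected; nothing to look up or log

        try
        {
            return fetch.call();
        }
        catch (Exception e)
        {
            return null; // peer genuinely unverified although client auth was requested
        }
    }
}
```

With this shape, short-lived connections from clients that present no certificate never reach the throwing code path at all.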

 

> Getting tons of msgs "Failed to get peer certificates for peer 
> /x.x.x.x:45796" when require_client_auth is set to false
> ---
>
> Key: CASSANDRA-19508
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19508
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Encryption
>Reporter: Mohammad Aburadeh
>Priority: Urgent
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> We recently upgraded our production clusters from 3.11.15 to 4.1.4. We 
> started seeing thousands of msgs "Failed to get peer certificates for peer 
> /x.x.x.x:45796". SSL is enabled but require_client_auth is disabled.  This is 
> causing a huge problem for us because the Cassandra log files are growing very 
> fast: our connections are short-lived, we open more than 1K 
> connections per second, and each stays alive for 1-2 seconds. 
> {code:java}
> DEBUG [Native-Transport-Requests-2] 2024-03-31 21:26:38,026 
> ServerConnection.java:140 - Failed to get peer certificates for peer 
> /172.31.2.23:45796
> javax.net.ssl.SSLPeerUnverifiedException: peer not verified
>         at 
> io.netty.handler.ssl.ReferenceCountedOpenSslEngine$DefaultOpenSslSession.getPeerCertificateChain(ReferenceCountedOpenSslEngine.java:2414)
>         at 
> io.netty.handler.ssl.ExtendedOpenSslSession.getPeerCertificateChain(ExtendedOpenSslSession.java:140)
>         at 
> org.apache.cassandra.transport.ServerConnection.certificates(ServerConnection.java:136)
>         at 
> org.apache.cassandra.transport.ServerConnection.getSaslNegotiator(ServerConnection.java:120)
>         at 
> org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:76)
>         at 
> org.apache.cassandra.transport.Message$Request.execute(Message.java:255)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:166)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:185)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:212)
>         at 
> org.apache.cassandra.transport.Dispatcher$RequestProcessor.run(Dispatcher.java:109)
>         at 
> org.apache.cassandra.concurrent.FutureTask$1.call(FutureTask.java:96)
>         at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61)
>         at org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71)
>         at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:142)
>         at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  {code}
> *Our SSL config:*
> {code:java}
> client_encryption_options:
>   enabled: true
>   keystore: /path/to/keystore
>   keystore_password: x
>   optional: false
>   require_client_auth: false {code}
>  
> We should stop logging this message when require_client_auth is set to false, 
> or at least log it at TRACE instead of DEBUG. 
> I'm working on preparing a PR. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



Re: [PR] CASSANDRA-19457: Object reference in Micrometer metrics prevent GC from reclaiming Session instances [cassandra-java-driver]

2024-04-01 Thread via GitHub


SiyaoIsHiding commented on PR #1916:
URL: 
https://github.com/apache/cassandra-java-driver/pull/1916#issuecomment-2030236325

   I just realized there is already such a method doing the same thing:
   
https://github.com/apache/cassandra-java-driver/blob/98e25040f5e69db1092ccafb6665d8e92779cc46/metrics/micrometer/src/main/java/com/datastax/oss/driver/internal/metrics/micrometer/MicrometerMetricUpdater.java#L86-L91
   I will try to use this instead.
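   For context, the underlying idea of the fix — the metrics machinery should not pin the `Session` — can be sketched with a plain `WeakReference`-backed gauge. This is an illustrative sketch only, not the driver's or Micrometer's actual code:

   ```java
   import java.lang.ref.WeakReference;
   import java.util.function.ToDoubleFunction;

   // Illustrative sketch (not driver code): a gauge that reads its target
   // through a WeakReference, so the metrics registry never keeps the
   // measured object (e.g. a Session) alive after the application drops it.
   final class WeakGauge<T>
   {
       private final WeakReference<T> target;
       private final ToDoubleFunction<T> measure;

       WeakGauge(T target, ToDoubleFunction<T> measure)
       {
           this.target = new WeakReference<>(target);
           this.measure = measure;
       }

       // Returns NaN once the target has been garbage collected.
       double value()
       {
           T t = target.get();
           return t == null ? Double.NaN : measure.applyAsDouble(t);
       }
   }
   ```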


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra) branch cep-15-accord updated: AccordGens.rangeDeps did not enforce unique ranges, which caused tests to fail

2024-04-01 Thread dcapwell
This is an automated email from the ASF dual-hosted git repository.

dcapwell pushed a commit to branch cep-15-accord
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/cep-15-accord by this push:
 new 9c35158f6a AccordGens.rangeDeps did not enforce unique ranges, which 
caused tests to fail
9c35158f6a is described below

commit 9c35158f6a82ddda5bb9a2297855aaf03abe99bd
Author: David Capwell 
AuthorDate: Mon Apr 1 10:31:57 2024 -0700

AccordGens.rangeDeps did not enforce unique ranges, which caused tests to 
fail
---
 modules/accord | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/modules/accord b/modules/accord
index 1a5cb4f100..8b4f3895cb 16
--- a/modules/accord
+++ b/modules/accord
@@ -1 +1 @@
-Subproject commit 1a5cb4f10002fb3650ad464b3a77664f18e2a901
+Subproject commit 8b4f3895cb926f937450676b1db2e23d01a8b820


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra-accord) branch trunk updated: AccordGens.rangeDeps did not enforce unique ranges, which caused tests to fail

2024-04-01 Thread dcapwell
This is an automated email from the ASF dual-hosted git repository.

dcapwell pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra-accord.git


The following commit(s) were added to refs/heads/trunk by this push:
 new 8b4f389  AccordGens.rangeDeps did not enforce unique ranges, which 
caused tests to fail
8b4f389 is described below

commit 8b4f3895cb926f937450676b1db2e23d01a8b820
Author: David Capwell 
AuthorDate: Mon Apr 1 10:31:33 2024 -0700

AccordGens.rangeDeps did not enforce unique ranges, which caused tests to 
fail
---
 accord-core/src/test/java/accord/utils/AccordGens.java | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/accord-core/src/test/java/accord/utils/AccordGens.java 
b/accord-core/src/test/java/accord/utils/AccordGens.java
index d87c2d5..971c70e 100644
--- a/accord-core/src/test/java/accord/utils/AccordGens.java
+++ b/accord-core/src/test/java/accord/utils/AccordGens.java
@@ -403,9 +403,10 @@ public class AccordGens
 return rs -> {
 if (rs.decide(emptyProb)) return RangeDeps.NONE;
 RangeDeps.Builder builder = RangeDeps.builder();
-for (int i = 0, numKeys = rs.nextInt(1, 10); i < numKeys; i++)
+List<Range> uniqRanges = 
Gens.lists(rangeGen).uniqueBestEffort().ofSize(rs.nextInt(1, 10)).next(rs);
+for (Range range : uniqRanges)
 {
-builder.nextKey(rangeGen.next(rs));
+builder.nextKey(range);
 for (int j = 0, numTxn = rs.nextInt(1, 10); j < numTxn; j++)
 builder.add(idGen.next(rs));
 }


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19503) (Accord) Cassandra bootstrap no longer using the range txn and instead uses the sync point empty txn for reads

2024-04-01 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-19503:
--
  Fix Version/s: 5.1
 (was: 5.x)
  Since Version: NA
Source Control Link: 
https://github.com/apache/cassandra/commit/63f39562ec2f1da182034c24eeb1e7bef29749ec
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> (Accord) Cassandra bootstrap no longer using the range txn and instead uses 
> the sync point empty txn for reads
> --
>
> Key: CASSANDRA-19503
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19503
> Project: Cassandra
>  Issue Type: Bug
>  Components: Accord
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 5.1
>
>
> Ephemeral reads made a change to ReadData which caused Bootstrap to no longer 
> use the range txn it generates and instead uses the empty txn from the sync 
> point.  This was not detected in Accord due to ListRead supporting ranges, 
> and failed in Cassandra as we don’t support range reads.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



(cassandra) branch cep-15-accord updated: (Accord) Cassandra bootstrap no longer using the range txn and instead uses the sync point empty txn for reads

2024-04-01 Thread dcapwell
This is an automated email from the ASF dual-hosted git repository.

dcapwell pushed a commit to branch cep-15-accord
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/cep-15-accord by this push:
 new 63f39562ec (Accord) Cassandra bootstrap no longer using the range txn 
and instead uses the sync point empty txn for reads
63f39562ec is described below

commit 63f39562ec2f1da182034c24eeb1e7bef29749ec
Author: David Capwell 
AuthorDate: Mon Apr 1 10:16:27 2024 -0700

(Accord) Cassandra bootstrap no longer using the range txn and instead uses 
the sync point empty txn for reads

patch by David Capwell; reviewed by Blake Eggleston for CASSANDRA-19503
---
 modules/accord |  2 +-
 .../cassandra/service/accord/AccordJournal.java| 10 ++
 .../service/accord/AccordMessageSink.java  |  2 ++
 .../test/accord/AccordBootstrapTest.java   |  2 +-
 .../cassandra/service/accord/MockJournal.java  | 39 --
 5 files changed, 42 insertions(+), 13 deletions(-)

diff --git a/modules/accord b/modules/accord
index f78d1da27b..1a5cb4f100 16
--- a/modules/accord
+++ b/modules/accord
@@ -1 +1 @@
-Subproject commit f78d1da27b09f89417dd29bde0529f12cd744e3d
+Subproject commit 1a5cb4f10002fb3650ad464b3a77664f18e2a901
diff --git a/src/java/org/apache/cassandra/service/accord/AccordJournal.java 
b/src/java/org/apache/cassandra/service/accord/AccordJournal.java
index 0562da1139..b659cf4733 100644
--- a/src/java/org/apache/cassandra/service/accord/AccordJournal.java
+++ b/src/java/org/apache/cassandra/service/accord/AccordJournal.java
@@ -1408,6 +1408,7 @@ public class AccordJournal implements IJournal, 
Shutdownable
 return presentMessages;
 }
 
+@Override
 public Set all()
 {
 Set types = EnumSet.allOf(Type.class);
@@ -1514,6 +1515,15 @@ public class AccordJournal implements IJournal, 
Shutdownable
 return confirmed;
 }
 
+@Override
+public Set all()
+{
+logger.debug("Checking all messages for {}", txnId);
+Set confirmed = provider.all();
+logger.debug("Confirmed {} messages for {}", confirmed, txnId);
+return confirmed;
+}
+
 @Override
 public PreAccept preAccept()
 {
diff --git 
a/src/java/org/apache/cassandra/service/accord/AccordMessageSink.java 
b/src/java/org/apache/cassandra/service/accord/AccordMessageSink.java
index d72644811a..5a514219e3 100644
--- a/src/java/org/apache/cassandra/service/accord/AccordMessageSink.java
+++ b/src/java/org/apache/cassandra/service/accord/AccordMessageSink.java
@@ -126,6 +126,8 @@ public class AccordMessageSink implements MessageSink
 builder.put(MessageType.GET_DEPS_RSP, 
Verb.ACCORD_GET_DEPS_RSP);
 builder.put(MessageType.GET_EPHEMERAL_READ_DEPS_REQ,  
Verb.ACCORD_GET_EPHMRL_READ_DEPS_REQ);
 builder.put(MessageType.GET_EPHEMERAL_READ_DEPS_RSP,  
Verb.ACCORD_GET_EPHMRL_READ_DEPS_RSP);
+builder.put(MessageType.GET_MAX_CONFLICT_REQ, 
Verb.ACCORD_GET_MAX_CONFLICT_REQ);
+builder.put(MessageType.GET_MAX_CONFLICT_RSP, 
Verb.ACCORD_GET_MAX_CONFLICT_RSP);
 builder.put(MessageType.COMMIT_SLOW_PATH_REQ, 
Verb.ACCORD_COMMIT_REQ);
 builder.put(MessageType.COMMIT_MAXIMAL_REQ,   
Verb.ACCORD_COMMIT_REQ);
 builder.put(MessageType.STABLE_FAST_PATH_REQ, 
Verb.ACCORD_COMMIT_REQ);
diff --git 
a/test/distributed/org/apache/cassandra/distributed/test/accord/AccordBootstrapTest.java
 
b/test/distributed/org/apache/cassandra/distributed/test/accord/AccordBootstrapTest.java
index f040e9d4db..2241a8c911 100644
--- 
a/test/distributed/org/apache/cassandra/distributed/test/accord/AccordBootstrapTest.java
+++ 
b/test/distributed/org/apache/cassandra/distributed/test/accord/AccordBootstrapTest.java
@@ -91,7 +91,7 @@ public class AccordBootstrapTest extends TestBaseImpl
 //withProperty(BOOTSTRAP_SCHEMA_DELAY_MS.getKey(), Integer.toString(90 
* 1000),
 // () -> withProperty("cassandra.join_ring", false, () -> 
newInstance.startup(cluster)));
 //newInstance.nodetoolResult("join").asserts().success();
-newInstance.nodetoolResult("describecms").asserts().success(); // just 
make sure we're joined, remove later
+newInstance.nodetoolResult("cms", "describe").asserts().success(); // 
just make sure we're joined, remove later
 }
 
 private static AccordService service()
diff --git a/test/unit/org/apache/cassandra/service/accord/MockJournal.java 
b/test/unit/org/apache/cassandra/service/accord/MockJournal.java
index 575b996e1e..8a68163ede 100644
--- 

(cassandra-accord) branch trunk updated: (Accord) Cassandra bootstrap no longer using the range txn and instead uses the sync point empty txn for reads

2024-04-01 Thread dcapwell
This is an automated email from the ASF dual-hosted git repository.

dcapwell pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra-accord.git


The following commit(s) were added to refs/heads/trunk by this push:
 new 1a5cb4f  (Accord) Cassandra bootstrap no longer using the range txn 
and instead uses the sync point empty txn for reads
1a5cb4f is described below

commit 1a5cb4f10002fb3650ad464b3a77664f18e2a901
Author: David Capwell 
AuthorDate: Mon Apr 1 10:14:04 2024 -0700

(Accord) Cassandra bootstrap no longer using the range txn and instead uses 
the sync point empty txn for reads

patch by David Capwell; reviewed by Blake Eggleston for CASSANDRA-19503
---
 .../java/accord/coordinate/FetchMaxConflict.java   |  2 +-
 .../java/accord/impl/AbstractFetchCoordinator.java | 30 +++--
 .../main/java/accord/local/SerializerSupport.java  | 33 ++
 .../main/java/accord/messages/GetMaxConflict.java  |  4 +-
 .../src/main/java/accord/messages/MessageType.java |  2 +
 .../messages/WaitUntilAppliedAndReadData.java  | 51 --
 .../main/java/accord/topology/TopologyManager.java |  7 ++-
 .../src/main/java/accord/utils/Invariants.java |  7 ++-
 .../main/java/accord/utils/async/AsyncResults.java |  7 +--
 .../src/test/java/accord/impl/list/ListRead.java   |  4 ++
 10 files changed, 77 insertions(+), 70 deletions(-)

diff --git a/accord-core/src/main/java/accord/coordinate/FetchMaxConflict.java 
b/accord-core/src/main/java/accord/coordinate/FetchMaxConflict.java
index 0793ae7..5963ebd 100644
--- a/accord-core/src/main/java/accord/coordinate/FetchMaxConflict.java
+++ b/accord-core/src/main/java/accord/coordinate/FetchMaxConflict.java
@@ -85,7 +85,7 @@ public class FetchMaxConflict extends 
AbstractCoordinatePreAccept nodes, Topologies topologies, 
Callback callback)
 {
-node.send(nodes, to -> new GetMaxConflict(to, topologies, route, 
keysOrRanges, executionEpoch));
+node.send(nodes, to -> new GetMaxConflict(to, topologies, route, 
keysOrRanges, executionEpoch), callback);
 }
 
 @Override
diff --git 
a/accord-core/src/main/java/accord/impl/AbstractFetchCoordinator.java 
b/accord-core/src/main/java/accord/impl/AbstractFetchCoordinator.java
index 23de1e2..a88c541 100644
--- a/accord-core/src/main/java/accord/impl/AbstractFetchCoordinator.java
+++ b/accord-core/src/main/java/accord/impl/AbstractFetchCoordinator.java
@@ -23,6 +23,9 @@ import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
 
+import accord.local.SafeCommandStore;
+import accord.messages.ReadData;
+import accord.utils.async.AsyncChain;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
@@ -37,7 +40,6 @@ import accord.messages.MessageType;
 import accord.messages.ReadData.CommitOrReadNack;
 import accord.messages.ReadData.ReadOk;
 import accord.messages.ReadData.ReadReply;
-import accord.messages.WaitUntilAppliedAndReadData;
 import accord.primitives.PartialDeps;
 import accord.primitives.PartialTxn;
 import accord.primitives.Ranges;
@@ -50,6 +52,7 @@ import accord.utils.async.AsyncResult;
 import accord.utils.async.AsyncResults;
 import javax.annotation.Nullable;
 
+import static accord.local.SaveStatus.Applied;
 import static accord.messages.ReadData.CommitOrReadNack.Insufficient;
 import static accord.primitives.Routables.Slice.Minimal;
 
@@ -231,16 +234,37 @@ public abstract class AbstractFetchCoordinator extends 
FetchCoordinator
 // TODO (expected): implement abort
 }
 
-public static class FetchRequest extends WaitUntilAppliedAndReadData
+public static class FetchRequest extends ReadData
 {
+private static final ExecuteOn EXECUTE_ON = new ExecuteOn(Applied, 
Applied);
+public final PartialTxn read;
+
 public final PartialDeps partialDeps;
 
 public FetchRequest(long sourceEpoch, TxnId syncId, Ranges ranges, 
PartialDeps partialDeps, PartialTxn partialTxn)
 {
-super(syncId, ranges, sourceEpoch, partialTxn);
+super(syncId, ranges, sourceEpoch);
+this.read = partialTxn;
 this.partialDeps = partialDeps;
 }
 
+@Override
+protected ExecuteOn executeOn()
+{
+return EXECUTE_ON;
+}
+
+@Override
+public ReadType kind()
+{
+return ReadType.waitUntilApplied;
+}
+
+@Override
+protected AsyncChain beginRead(SafeCommandStore safeStore, 
Timestamp executeAt, PartialTxn txn, Ranges unavailable) {
+return read.read(safeStore, executeAt, unavailable);
+}
+
 @Override
 protected void readComplete(CommandStore commandStore, Data result, 
Ranges unavailable)
 {
diff --git a/accord-core/src/main/java/accord/local/SerializerSupport.java 
b/accord-core/src/main/java/accord/local/SerializerSupport.java
index 962e7d9..7ca59e0 100644
--- 

[jira] [Updated] (CASSANDRA-19503) (Accord) Cassandra bootstrap no longer using the range txn and instead uses the sync point empty txn for reads

2024-04-01 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-19503:

Status: Ready to Commit  (was: Review In Progress)

+1

> (Accord) Cassandra bootstrap no longer using the range txn and instead uses 
> the sync point empty txn for reads
> --
>
> Key: CASSANDRA-19503
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19503
> Project: Cassandra
>  Issue Type: Bug
>  Components: Accord
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 5.x
>
>
> Ephemeral reads made a change to ReadData which caused Bootstrap to no longer 
> use the range txn it generates and instead uses the empty txn from the sync 
> point.  This was not detected in Accord due to ListRead supporting ranges, 
> and failed in Cassandra as we don’t support range reads.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19503) (Accord) Cassandra bootstrap no longer using the range txn and instead uses the sync point empty txn for reads

2024-04-01 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-19503:

Reviewers: Blake Eggleston, Blake Eggleston  (was: Blake Eggleston)
   Blake Eggleston, Blake Eggleston  (was: Blake Eggleston)
   Status: Review In Progress  (was: Patch Available)

> (Accord) Cassandra bootstrap no longer using the range txn and instead uses 
> the sync point empty txn for reads
> --
>
> Key: CASSANDRA-19503
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19503
> Project: Cassandra
>  Issue Type: Bug
>  Components: Accord
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 5.x
>
>
> Ephemeral reads made a change to ReadData which caused Bootstrap to no longer 
> use the range txn it generates and instead uses the empty txn from the sync 
> point.  This was not detected in Accord due to ListRead supporting ranges, 
> and failed in Cassandra as we don’t support range reads.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19448) CommitlogArchiver only has granularity to seconds for restore_point_in_time

2024-04-01 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-19448:
-
Status: Open  (was: Patch Available)

> CommitlogArchiver only has granularity to seconds for restore_point_in_time
> ---
>
> Key: CASSANDRA-19448
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19448
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Jeremy Hanna
>Assignee: Maxwell Guo
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Commitlog archiver allows users to backup commitlog files for the purpose of 
> doing point in time restores.  The [configuration 
> file|https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties]
>  gives an example down to seconds granularity, but then asks whether the 
> timestamps are in microseconds or milliseconds, defaulting to 
> microseconds.  Because the [CommitLogArchiver uses a second-based date 
> format|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java#L52],
>  if a user specifies a restore point at a finer granularity, such as 
> milliseconds or microseconds, it will truncate everything 
> after the second and restore to that second.  So say you specify a 
> restore_point_in_time like this:
> restore_point_in_time=2024:01:18 17:01:01.623392
> it will silently truncate everything after the 01 seconds, so effectively the 
> user is missing updates between 01 and 01.623392.
> This appears to be a bug in the intent.  We should allow users to specify 
> down to the millisecond or even microsecond level. If we allow them to 
> specify down to microseconds for the restore point in time, then it may 
> internally need to change from a long.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19428) Clean up KeyRangeIterator classes

2024-04-01 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-19428:

Attachment: 
Make_sure_the_builders_attach_the_onClose_hook_when_there_is_only_a_single_sub-iterator.patch

> Clean up KeyRangeIterator classes
> -
>
> Key: CASSANDRA-19428
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19428
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/2i Index
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
> Attachments: 
> Make_sure_the_builders_attach_the_onClose_hook_when_there_is_only_a_single_sub-iterator.patch
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Remove KeyRangeIterator.current and simplify



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19444) AccordRepairJob should be async like CassandraRepairJob

2024-04-01 Thread Ariel Weisberg (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832886#comment-17832886
 ] 

Ariel Weisberg commented on CASSANDRA-19444:


Blake will be fixing this in CASSANDRA-19472

> AccordRepairJob should be async like CassandraRepairJob
> ---
>
> Key: CASSANDRA-19444
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19444
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ariel Weisberg
>Priority: Normal
>
> The thread that manages repairs needs to be available and not block.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19444) AccordRepairJob should be async like CassandraRepairJob

2024-04-01 Thread Ariel Weisberg (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-19444:
---
Resolution: Fixed
Status: Resolved  (was: Triage Needed)

> AccordRepairJob should be async like CassandraRepairJob
> ---
>
> Key: CASSANDRA-19444
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19444
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Ariel Weisberg
>Priority: Normal
>
> The thread that manages repairs needs to be available and not block.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19448) CommitlogArchiver only has granularity to seconds for restore_point_in_time

2024-04-01 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832879#comment-17832879
 ] 

Brandon Williams commented on CASSANDRA-19448:
--

bq. What I wonder is: in which scenarios would microsecond-level PIT restore be 
useful?

I think there could be a DR scenario where a delete was accidentally issued; 
to easily excise it, the restore could be done to just before the delete and 
then to just after it.  There may be other scenarios as well, but my main point 
here is that time synchronization may not be necessary for the precision level.

bq. couldn't we detect automatically the granularity of PIT restore based on 
the value 

This seems viable to me and the most user-friendly.
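The "detect granularity from the value" idea could be sketched roughly as below. This is a hypothetical illustration, not a patch; the class and enum names are assumptions, and granularity is inferred from the number of fractional digits after the seconds field:

```java
// Hypothetical sketch of inferring PIT granularity from the value the
// user supplies, rather than asking them to configure it separately.
public final class PitGranularity
{
    public enum Unit { SECONDS, MILLISECONDS, MICROSECONDS }

    public static Unit of(String restorePointInTime)
    {
        int dot = restorePointInTime.lastIndexOf('.');
        if (dot < 0)
            return Unit.SECONDS;               // e.g. "2024:01:18 17:01:01"
        int digits = restorePointInTime.length() - dot - 1;
        return digits <= 3 ? Unit.MILLISECONDS // e.g. ".623"
                           : Unit.MICROSECONDS; // e.g. ".623392"
    }
}
```

The user then writes whatever precision they need and never has to pick a unit explicitly.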


> CommitlogArchiver only has granularity to seconds for restore_point_in_time
> ---
>
> Key: CASSANDRA-19448
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19448
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Jeremy Hanna
>Assignee: Maxwell Guo
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Commitlog archiver allows users to backup commitlog files for the purpose of 
> doing point in time restores.  The [configuration 
> file|https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties]
>  gives an example down to seconds granularity, but then asks whether the 
> timestamps are in microseconds or milliseconds, defaulting to 
> microseconds.  Because the [CommitLogArchiver uses a second-based date 
> format|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java#L52],
>  if a user specifies a restore point at a finer granularity, such as 
> milliseconds or microseconds, it will truncate everything 
> after the second and restore to that second.  So say you specify a 
> restore_point_in_time like this:
> restore_point_in_time=2024:01:18 17:01:01.623392
> it will silently truncate everything after the 01 seconds, so effectively the 
> user is missing updates between 01 and 01.623392.
> This appears to be a bug in the intent.  We should allow users to specify 
> down to the millisecond or even microsecond level. If we allow them to 
> specify down to microseconds for the restore point in time, then it may 
> internally need to change from a long.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-19448) CommitlogArchiver only has granularity to seconds for restore_point_in_time

2024-04-01 Thread Tiago L. Alves (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832876#comment-17832876
 ] 

Tiago L. Alves commented on CASSANDRA-19448:


[~maxwellguo] Thanks for providing a more general patch. I understand that C* 
timestamps allow microsecond granularity, and aligning all those timestamp 
granularities sounds good.

What I wonder is: in which scenarios would microsecond-level PIT restore be 
useful?

According to 
[https://stackoverflow.com/questions/97853/whats-the-best-way-to-synchronize-times-to-millisecond-accuracy-and-precision-b]
 it's possible to synchronize time to tenths of milliseconds. Wouldn't this 
mean that millisecond-level PIT restore would suffice? On the other hand, 
Amazon claims microsecond accuracy is possible according to 
[https://aws.amazon.com/blogs/compute/its-about-time-microsecond-accurate-clocks-on-amazon-ec2-instances/],
 opening the door for microsecond-level PIT restore.

I would be happy to learn from your thoughts on those.

Regarding the code changes, couldn't we automatically detect the granularity of 
the PIT restore based on the value a user specifies? We could make our parsing 
more lenient by just doing `DateTimeFormatter.ofPattern("yyyy:MM:dd 
HH:mm:ss[.[SSS][SSSSSS]]")`, allowing it to parse seconds, milliseconds, and 
microseconds. Also, we could leverage the `Instant` datatype instead of 
`long` in `CommitLogArchiver` and postpone conversions to `CommitLogRestorer`. 
Wdyt?
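The lenient-parsing suggestion could look roughly like the sketch below. This is an assumption-laden illustration, not the committed code; it uses `DateTimeFormatterBuilder` with an optional fractional-seconds section of 1 to 6 digits, so one formatter accepts second-, millisecond-, and microsecond-precision restore_point_in_time values:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

// Sketch of the lenient parsing suggested above (not the committed code):
// the fractional part after the seconds field is optional and may carry
// 1 to 6 digits.
public final class PitParser
{
    static final DateTimeFormatter FORMAT = new DateTimeFormatterBuilder()
            .appendPattern("yyyy:MM:dd HH:mm:ss")
            .optionalStart()
            // min 1, max 6 fraction digits; 'true' means a leading '.' is expected
            .appendFraction(ChronoField.NANO_OF_SECOND, 1, 6, true)
            .optionalEnd()
            .toFormatter();

    public static LocalDateTime parse(String value)
    {
        return LocalDateTime.parse(value, FORMAT);
    }
}
```

A single parser like this removes the need for the user to declare the precision up front.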

> CommitlogArchiver only has granularity to seconds for restore_point_in_time
> ---
>
> Key: CASSANDRA-19448
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19448
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Jeremy Hanna
>Assignee: Maxwell Guo
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Commitlog archiver allows users to backup commitlog files for the purpose of 
> doing point in time restores.  The [configuration 
> file|https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties]
>  gives an example down to seconds granularity, but then asks whether the 
> timestamps are in microseconds or milliseconds, defaulting to 
> microseconds.  Because the [CommitLogArchiver uses a second-based date 
> format|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java#L52],
>  if a user specifies a restore point at a finer granularity, such as 
> milliseconds or microseconds, it will truncate everything 
> after the second and restore to that second.  So say you specify a 
> restore_point_in_time like this:
> restore_point_in_time=2024:01:18 17:01:01.623392
> it will silently truncate everything after the 01 seconds.  So effectively to 
> the user, it is missing updates between 01 and 01.623392.
> This appears to be a bug in the intent.  We should allow users to specify 
> down to the millisecond or even microsecond level. If we allow them to 
> specify down to microseconds for the restore point in time, then it may 
> internally need to change from a long.
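For illustration, the silent truncation described above can be reproduced with 
a seconds-only pattern in a few lines (a standalone sketch, not Cassandra 
code):

```java
import java.text.SimpleDateFormat;
import java.util.Date;

public class SilentTruncation {
    public static void main(String[] args) throws Exception {
        // A seconds-only pattern, as in the CommitLogArchiver date format.
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy:MM:dd HH:mm:ss");
        Date d = fmt.parse("2024:01:18 17:01:01.623392");
        // parse() stops where the pattern ends: the trailing ".623392"
        // is ignored without any error being reported to the caller.
        System.out.println(d.getTime() % 1000); // 0 — sub-second part is gone
    }
}
```

This is why the restore point silently lands on 17:01:01 even though the user 
asked for 17:01:01.623392.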



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-18165) Investigate removing PriorityQueue usage from KeyRangeConcatIterator

2024-04-01 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832874#comment-17832874
 ] 

Ekaterina Dimitrova commented on CASSANDRA-18165:
-

Indeed, agreed with [~maedhroz], this will be done as part of CASSANDRA-19428. 
To be committed soon  

> Investigate removing PriorityQueue usage from KeyRangeConcatIterator
> 
>
> Key: CASSANDRA-18165
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18165
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/SAI
>Reporter: Mike Adamson
>Assignee: Mike Adamson
>Priority: Normal
>  Labels: SAI
> Fix For: 5.0, 5.1
>
>
> It has been identified during the review of CASSANDRA-18058 that the 
> KeyRangeConcatIterator could potentially stop using a PriorityQueue to 
> maintain its active list of sorted KeyRangeIterators.
> The code suggested by [~maedhroz] is as follows:
> {code:java}
> private int i = 0;
> ...
> protected void performSkipTo(PrimaryKey primaryKey)
> {
> while (i < toRelease.size())
> {
> RangeIterator currentIterator = toRelease.get(i);
> if (currentIterator.getCurrent().compareTo(primaryKey) >= 0)
> break;
> if (currentIterator.getMaximum().compareTo(primaryKey) >= 0)
> {
> currentIterator.skipTo(primaryKey);
> break;
> }
> i++;
> }
> }
> ...
> protected PrimaryKey computeNext()
> {
> while (i < toRelease.size())
> {
> RangeIterator currentIterator = toRelease.get(i);
> 
> if (currentIterator.hasNext())
> return currentIterator.next();
> 
> i++;
> }
> return endOfData();
> }
> {code}
> It was decided that this change would need performance and correctness 
> testing in its own right and would not be included in the original SAI CEP 
> ticket.
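The idea behind the suggested change can be shown with a standalone sketch 
(illustrative names, not the real SAI API): when the child iterators cover 
disjoint, ascending key ranges, a plain index into the list replaces the 
PriorityQueue, because an exhausted iterator can never contribute again.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ConcatSketch {
    private final List<Iterator<Integer>> parts; // sorted, disjoint ranges
    private int i = 0;                           // index of the active range

    ConcatSketch(List<Iterator<Integer>> parts) { this.parts = parts; }

    Integer computeNext() {
        while (i < parts.size()) {
            Iterator<Integer> current = parts.get(i);
            if (current.hasNext())
                return current.next();
            i++; // exhausted: advance to the next range, never revisit earlier ones
        }
        return null; // end of data
    }

    public static void main(String[] args) {
        ConcatSketch it = new ConcatSketch(Arrays.asList(
                Arrays.asList(1, 2).iterator(),
                Arrays.asList(5, 6).iterator()));
        List<Integer> out = new ArrayList<>();
        for (Integer k = it.computeNext(); k != null; k = it.computeNext())
            out.add(k);
        System.out.println(out); // [1, 2, 5, 6]
    }
}
```

Each element is produced in O(1) with no heap maintenance, versus O(log n) 
per element with a PriorityQueue.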






[jira] [Comment Edited] (CASSANDRA-19495) Hints not stored after node goes down for the second time

2024-04-01 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832865#comment-17832865
 ] 

Brandon Williams edited comment on CASSANDRA-19495 at 4/1/24 3:18 PM:
--

bq. I see there are 3 instances of failing tests in multiplexer for trunk

I restarted them again 
[here|https://app.circleci.com/pipelines/github/driftx/cassandra/1549/workflows/eb577e9a-0937-4735-a8c9-dd932c54ec6f/jobs/80742],
 let's see what happens.

edit: they all passed, I don't think there's an issue.


was (Author: brandon.williams):
bq. I see there are 3 instances of failing tests in multiplexer for trunk

I restarted them again 
[here|https://app.circleci.com/pipelines/github/driftx/cassandra/1549/workflows/eb577e9a-0937-4735-a8c9-dd932c54ec6f/jobs/80742],
 let's see what happens.

> Hints not stored after node goes down for the second time
> -
>
> Key: CASSANDRA-19495
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19495
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Hints
>Reporter: Paul Chandler
>Assignee: Stefan Miklosovic
>Priority: Urgent
> Fix For: 4.1.x, 5.0-rc, 5.x
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> I have a scenario where a node goes down, hints are recorded on the second 
> node and replayed, as expected. If the first node goes down for a second time 
> and the time span between the first time it stopped and the second time it 
> stopped is more than max_hint_window, then the hint is not recorded, no 
> hint file is created, and the mutation never arrives at the node after it 
> comes up again.
> I have debugged this and it appears to be due to the way the hint window is 
> persisted after https://issues.apache.org/jira/browse/CASSANDRA-14309
> The code here: 
> [https://github.com/apache/cassandra/blame/cassandra-4.1/src/java/org/apache/cassandra/service/StorageProxy.java#L2402]
>  uses the time stored in the HintsBuffer.earliestHintByHost map.  This map is 
> based on the UUID of the host, but this does not seem to be cleared when the 
> node is back up, and I think this is what is causing the problem.
>  
> This is in cassandra 4.1.5






[jira] [Comment Edited] (CASSANDRA-19495) Hints not stored after node goes down for the second time

2024-04-01 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832856#comment-17832856
 ] 

Stefan Miklosovic edited comment on CASSANDRA-19495 at 4/1/24 3:13 PM:
---

BTW this ticket ... while it does what it should, if one starts to dig deeper, 
there are some not-so-obvious consequences of how hints delivery / deletion is 
done. For example, if one calls HintsServiceMBean.deleteAllHints (or 
deleteAllHintsForEndpoint), it will indeed delete the hints, but only those 
which are already on disk and for which the writer is closed. That means it 
will not delete hints which are in buffers or for which the writer is not 
closed yet. I consider this to be a bug. If a user invokes "delete all hints", 
it should indeed delete them _ALL_, buffers included.

The behavior described above, when it comes to this ticket, has a consequence: 
if buffers are not cleared on deleting all hints, then when the node checks 
whether it should hint, it looks up "what is the oldest hint" to decide whether 
the threshold was violated. There will indeed be no descriptors, which is fine, 
but because the buffers were not cleared it will resolve the oldest hint from 
the buffers. So deleting hints won't be a 100% "reset". On the other hand, when 
a node comes back online, all hints will be delivered and the earliest-hint 
record will be removed from the underlying map. 

edit: deletion / cleanup of all hints is done in HintsService.excise(uuid); it 
does close the writer and hence flushes the buffers etc ... I think this should 
be applied to normal deletion too.


was (Author: smiklosovic):
BTW this ticket ... while it does what it should, if one starts to dig deeper, 
there are some not-so-obvious consequences of how hints delivery / deletion is 
done. For example, if one calls HintsServiceMBean.deleteAllHints (or 
deleteAllHintsForEndpoint), it will indeed delete the hints, but only these 
which are already on disk and for which writer is closed. That means that it 
will not delete hints which are in buffers or for which the writer is not 
closed yet. I consider this to be a bug. If a user invokes "delete all hints", 
it should indeed delete them _ALL_, in buffers included.

The behavior described above, when it comes to this ticket, might have a 
consequence that if buffers are not cleared on deleting all hints, then once it 
goes to check if we should hint, it will look into "what is the oldest hint" in 
order to decide if we violated the threshold or not, and there will be no 
descriptors, indeed, that is fine, but buffers will not be cleared so it will 
resolve the oldest hint in buffers. So by deletion of hints, it wont be 100% 
"reset".  On the other hand, when a node comes up online, all hints will be 
delivered and earliest hint record will be removed from the underlying map. 

> Hints not stored after node goes down for the second time
> -
>
> Key: CASSANDRA-19495
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19495
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Hints
>Reporter: Paul Chandler
>Assignee: Stefan Miklosovic
>Priority: Urgent
> Fix For: 4.1.x, 5.0-rc, 5.x
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> I have a scenario where a node goes down, hints are recorded on the second 
> node and replayed, as expected. If the first node goes down for a second time 
> and the time span between the first time it stopped and the second time it 
> stopped is more than max_hint_window, then the hint is not recorded, no 
> hint file is created, and the mutation never arrives at the node after it 
> comes up again.
> I have debugged this and it appears to be due to the way the hint window is 
> persisted after https://issues.apache.org/jira/browse/CASSANDRA-14309
> The code here: 
> [https://github.com/apache/cassandra/blame/cassandra-4.1/src/java/org/apache/cassandra/service/StorageProxy.java#L2402]
>  uses the time stored in the HintsBuffer.earliestHintByHost map.  This map is 
> based on the UUID of the host, but this does not seem to be cleared when the 
> node is back up, and I think this is what is causing the problem.
>  
> This is in cassandra 4.1.5






[jira] [Commented] (CASSANDRA-19448) CommitlogArchiver only has granularity to seconds for restore_point_in_time

2024-04-01 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832870#comment-17832870
 ] 

Brandon Williams commented on CASSANDRA-19448:
--

bq. I feel that the timestamp granularity should be aligned with the timestamp 
of c*

I agree, and I think it can be argued that it's a bug if it is not possible to 
restore with the same granularity with which the data was created.

> CommitlogArchiver only has granularity to seconds for restore_point_in_time
> ---
>
> Key: CASSANDRA-19448
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19448
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Jeremy Hanna
>Assignee: Maxwell Guo
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Commitlog archiver allows users to back up commitlog files for the purpose of 
> doing point-in-time restores.  The [configuration 
> file|https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties]
>  gives an example down to seconds granularity but then asks whether the 
> timestamps are microseconds or milliseconds - defaulting to microseconds.  
> Because the [CommitLogArchiver uses a second-based date 
> format|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java#L52],
>  if a user specifies a restore point at a finer granularity, such as 
> milliseconds or microseconds, it will truncate everything after the second 
> and restore to that second.  So say you specify a restore_point_in_time like 
> this:
> restore_point_in_time=2024:01:18 17:01:01.623392
> it will silently truncate everything after the 01 seconds, so effectively, to 
> the user, updates between 01 and 01.623392 are missing.
> This appears to be a bug in the intent.  We should allow users to specify 
> down to the millisecond or even microsecond level. If we allow them to 
> specify down to microseconds for the restore point in time, the internal 
> representation may need to change from a long.






[jira] [Updated] (CASSANDRA-18165) Investigate removing PriorityQueue usage from KeyRangeConcatIterator

2024-04-01 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-18165:

Fix Version/s: 5.0
   5.1
   (was: 5.x)

> Investigate removing PriorityQueue usage from KeyRangeConcatIterator
> 
>
> Key: CASSANDRA-18165
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18165
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/SAI
>Reporter: Mike Adamson
>Assignee: Mike Adamson
>Priority: Normal
>  Labels: SAI
> Fix For: 5.0, 5.1
>
>
> It has been identified during the review of CASSANDRA-18058 that the 
> KeyRangeConcatIterator could potentially stop using a PriorityQueue to 
> maintain its active list of sorted KeyRangeIterators.
> The code suggested by [~maedhroz] is as follows:
> {code:java}
> private int i = 0;
> ...
> protected void performSkipTo(PrimaryKey primaryKey)
> {
> while (i < toRelease.size())
> {
> RangeIterator currentIterator = toRelease.get(i);
> if (currentIterator.getCurrent().compareTo(primaryKey) >= 0)
> break;
> if (currentIterator.getMaximum().compareTo(primaryKey) >= 0)
> {
> currentIterator.skipTo(primaryKey);
> break;
> }
> i++;
> }
> }
> ...
> protected PrimaryKey computeNext()
> {
> while (i < toRelease.size())
> {
> RangeIterator currentIterator = toRelease.get(i);
> 
> if (currentIterator.hasNext())
> return currentIterator.next();
> 
> i++;
> }
> return endOfData();
> }
> {code}
> It was decided that this change would need performance and correctness 
> testing in its own right and would not be included in the original SAI CEP 
> ticket.






[jira] [Updated] (CASSANDRA-18165) Investigate removing PriorityQueue usage from KeyRangeConcatIterator

2024-04-01 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-18165:

Resolution: Fixed
Status: Resolved  (was: Open)

> Investigate removing PriorityQueue usage from KeyRangeConcatIterator
> 
>
> Key: CASSANDRA-18165
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18165
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/SAI
>Reporter: Mike Adamson
>Assignee: Mike Adamson
>Priority: Normal
>  Labels: SAI
> Fix For: 5.x
>
>
> It has been identified during the review of CASSANDRA-18058 that the 
> KeyRangeConcatIterator could potentially stop using a PriorityQueue to 
> maintain its active list of sorted KeyRangeIterators.
> The code suggested by [~maedhroz] is as follows:
> {code:java}
> private int i = 0;
> ...
> protected void performSkipTo(PrimaryKey primaryKey)
> {
> while (i < toRelease.size())
> {
> RangeIterator currentIterator = toRelease.get(i);
> if (currentIterator.getCurrent().compareTo(primaryKey) >= 0)
> break;
> if (currentIterator.getMaximum().compareTo(primaryKey) >= 0)
> {
> currentIterator.skipTo(primaryKey);
> break;
> }
> i++;
> }
> }
> ...
> protected PrimaryKey computeNext()
> {
> while (i < toRelease.size())
> {
> RangeIterator currentIterator = toRelease.get(i);
> 
> if (currentIterator.hasNext())
> return currentIterator.next();
> 
> i++;
> }
> return endOfData();
> }
> {code}
> It was decided that this change would need performance and correctness 
> testing in its own right and would not be included in the original SAI CEP 
> ticket.






[jira] [Commented] (CASSANDRA-19489) Guardrail to warn clients about possible transient incorrect responses for filtering queries against multiple mutable columns

2024-04-01 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832869#comment-17832869
 ] 

Caleb Rackliffe commented on CASSANDRA-19489:
-

I'm going to try to get a patch up for this today or tomorrow, FWIW

> Guardrail to warn clients about possible transient incorrect responses for 
> filtering queries against multiple mutable columns
> -
>
> Key: CASSANDRA-19489
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19489
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination, CQL/Semantics, Messaging/Client
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 5.0-rc, 5.1
>
>
> Given we may not have time to fully resolve CASSANDRA-19007 before we release 
> 5.0, it would still be helpful to have, at the very minimum, a client warning 
> for cases where a user filters on two or more mutable (static or regular) 
> columns at consistency levels that require coordinator reconciliation. We may 
> also want the option to fail these queries outright, although that need not 
> be the default.
> The only art involved in this is deciding what we want to say in the 
> warning/error message. It's probably reasonable to mention there that this 
> only happens when we have unrepaired data. It's also worth noting that SAI 
> queries are no longer vulnerable to this after the resolution of 
> CASSANDRA-19018.






[jira] [Updated] (CASSANDRA-19508) Getting tons of msgs "Failed to get peer certificates for peer /x.x.x.x:45796" when require_client_auth is set to false

2024-04-01 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-19508:
-
  Component/s: Feature/Encryption
Discovered By: User Report
Fix Version/s: 4.0.x
   4.1.x
   5.0.x
   5.x
 Severity: Normal  (was: Critical)
   Status: Open  (was: Triage Needed)

> Getting tons of msgs "Failed to get peer certificates for peer 
> /x.x.x.x:45796" when require_client_auth is set to false
> ---
>
> Key: CASSANDRA-19508
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19508
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Encryption
>Reporter: Mohammad Aburadeh
>Priority: Urgent
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>
> We recently upgraded our production clusters from 3.11.15 to 4.1.4. We 
> started seeing thousands of msgs "Failed to get peer certificates for peer 
> /x.x.x.x:45796". SSL is enabled but require_client_auth is disabled.  This is 
> causing a huge problem for us because Cassandra log files are growing very 
> fast: our connections are short-lived, we open more than 1K connections per 
> second, and they stay alive for 1-2 seconds. 
> {code:java}
> DEBUG [Native-Transport-Requests-2] 2024-03-31 21:26:38,026 
> ServerConnection.java:140 - Failed to get peer certificates for peer 
> /172.31.2.23:45796
> javax.net.ssl.SSLPeerUnverifiedException: peer not verified
>         at 
> io.netty.handler.ssl.ReferenceCountedOpenSslEngine$DefaultOpenSslSession.getPeerCertificateChain(ReferenceCountedOpenSslEngine.java:2414)
>         at 
> io.netty.handler.ssl.ExtendedOpenSslSession.getPeerCertificateChain(ExtendedOpenSslSession.java:140)
>         at 
> org.apache.cassandra.transport.ServerConnection.certificates(ServerConnection.java:136)
>         at 
> org.apache.cassandra.transport.ServerConnection.getSaslNegotiator(ServerConnection.java:120)
>         at 
> org.apache.cassandra.transport.messages.AuthResponse.execute(AuthResponse.java:76)
>         at 
> org.apache.cassandra.transport.Message$Request.execute(Message.java:255)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:166)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:185)
>         at 
> org.apache.cassandra.transport.Dispatcher.processRequest(Dispatcher.java:212)
>         at 
> org.apache.cassandra.transport.Dispatcher$RequestProcessor.run(Dispatcher.java:109)
>         at 
> org.apache.cassandra.concurrent.FutureTask$1.call(FutureTask.java:96)
>         at org.apache.cassandra.concurrent.FutureTask.call(FutureTask.java:61)
>         at org.apache.cassandra.concurrent.FutureTask.run(FutureTask.java:71)
>         at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:142)
>         at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>  {code}
> *Our SSL config:*
> {code:java}
> client_encryption_options:
>   enabled: true
>   keystore: /path/to/keystore
>   keystore_password: x
>   optional: false
>   require_client_auth: false {code}
>  
> We should stop logging this msg when require_client_auth is set to false, or 
> at least it should be logged at TRACE, not DEBUG. 
> I'm working on preparing a PR. 
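Until a fix lands, an operator can silence the message locally. A hedged 
workaround sketch, assuming the default logback configuration shipped in 
Cassandra's conf/logback.xml (the logger name is taken from the stack trace 
above):

```xml
<!-- conf/logback.xml: raise the level for just this class so the
     per-connection DEBUG message is suppressed while other DEBUG
     logging stays intact. Workaround sketch, not an official fix. -->
<logger name="org.apache.cassandra.transport.ServerConnection" level="INFO"/>
```

This keeps the log volume down without changing the cluster-wide root level.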






[jira] [Created] (CASSANDRA-19510) Analytics Writer should support all valid partition/clustering key types

2024-04-01 Thread Doug Rohrer (Jira)
Doug Rohrer created CASSANDRA-19510:
---

 Summary: Analytics Writer should support all valid 
partition/clustering key types
 Key: CASSANDRA-19510
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19510
 Project: Cassandra
  Issue Type: Improvement
Reporter: Doug Rohrer


Today, the Analytics Writer supports only a subset of valid CQL types as 
partition/clustering key values, because it cannot properly serialize all of 
them in the `Tokenizer` class; the unsupported types therefore can't be used as 
part of the partition/clustering key when sorting data in Spark. We should 
support all valid partition/clustering key types, including but not limited to 
frozen UDTs and collections.







[jira] [Updated] (CASSANDRA-19428) Clean up KeyRangeIterator classes

2024-04-01 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-19428:

Priority: Normal  (was: Low)

> Clean up KeyRangeIterator classes
> -
>
> Key: CASSANDRA-19428
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19428
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/2i Index
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Remove KeyRangeIterator.current and simplify






[jira] [Updated] (CASSANDRA-19428) Clean up KeyRangeIterator classes

2024-04-01 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-19428:

Change Category: Operability  (was: Code Clarity)

> Clean up KeyRangeIterator classes
> -
>
> Key: CASSANDRA-19428
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19428
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/2i Index
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Remove KeyRangeIterator.current and simplify






[jira] [Commented] (CASSANDRA-19495) Hints not stored after node goes down for the second time

2024-04-01 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832865#comment-17832865
 ] 

Brandon Williams commented on CASSANDRA-19495:
--

bq. I see there are 3 instances of failing tests in multiplexer for trunk

I restarted them again 
[here|https://app.circleci.com/pipelines/github/driftx/cassandra/1549/workflows/eb577e9a-0937-4735-a8c9-dd932c54ec6f/jobs/80742],
 let's see what happens.

> Hints not stored after node goes down for the second time
> -
>
> Key: CASSANDRA-19495
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19495
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Hints
>Reporter: Paul Chandler
>Assignee: Stefan Miklosovic
>Priority: Urgent
> Fix For: 4.1.x, 5.0-rc, 5.x
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> I have a scenario where a node goes down, hints are recorded on the second 
> node and replayed, as expected. If the first node goes down for a second time 
> and the time span between the first time it stopped and the second time it 
> stopped is more than max_hint_window, then the hint is not recorded, no 
> hint file is created, and the mutation never arrives at the node after it 
> comes up again.
> I have debugged this and it appears to be due to the way the hint window is 
> persisted after https://issues.apache.org/jira/browse/CASSANDRA-14309
> The code here: 
> [https://github.com/apache/cassandra/blame/cassandra-4.1/src/java/org/apache/cassandra/service/StorageProxy.java#L2402]
>  uses the time stored in the HintsBuffer.earliestHintByHost map.  This map is 
> based on the UUID of the host, but this does not seem to be cleared when the 
> node is back up, and I think this is what is causing the problem.
>  
> This is in cassandra 4.1.5






[jira] [Commented] (CASSANDRA-19489) Guardrail to warn clients about possible transient incorrect responses for filtering queries against multiple mutable columns

2024-04-01 Thread Jeremiah Jordan (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832864#comment-17832864
 ] 

Jeremiah Jordan commented on CASSANDRA-19489:
-

This is a long-standing issue in C*; I'm not sure we need to block the 5.0 
release for adding a warning? I think we can add a warning whenever?

> Guardrail to warn clients about possible transient incorrect responses for 
> filtering queries against multiple mutable columns
> -
>
> Key: CASSANDRA-19489
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19489
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination, CQL/Semantics, Messaging/Client
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 5.0-rc, 5.1
>
>
> Given we may not have time to fully resolve CASSANDRA-19007 before we release 
> 5.0, it would still be helpful to have, at the very minimum, a client warning 
> for cases where a user filters on two or more mutable (static or regular) 
> columns at consistency levels that require coordinator reconciliation. We may 
> also want the option to fail these queries outright, although that need not 
> be the default.
> The only art involved in this is deciding what we want to say in the 
> warning/error message. It's probably reasonable to mention there that this 
> only happens when we have unrepaired data. It's also worth noting that SAI 
> queries are no longer vulnerable to this after the resolution of 
> CASSANDRA-19018.






[jira] [Updated] (CASSANDRA-19448) CommitlogArchiver only has granularity to seconds for restore_point_in_time

2024-04-01 Thread Maxwell Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maxwell Guo updated CASSANDRA-19448:

Test and Documentation Plan: UT
 Status: Patch Available  (was: Open)

> CommitlogArchiver only has granularity to seconds for restore_point_in_time
> ---
>
> Key: CASSANDRA-19448
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19448
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Jeremy Hanna
>Assignee: Maxwell Guo
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Commitlog archiver allows users to back up commitlog files for the purpose of 
> doing point-in-time restores.  The [configuration 
> file|https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties]
>  gives an example down to seconds granularity but then asks whether the 
> timestamps are microseconds or milliseconds - defaulting to microseconds.  
> Because the [CommitLogArchiver uses a second-based date 
> format|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java#L52],
>  if a user specifies a restore point at a finer granularity, such as 
> milliseconds or microseconds, it will truncate everything after the second 
> and restore to that second.  So say you specify a restore_point_in_time like 
> this:
> restore_point_in_time=2024:01:18 17:01:01.623392
> it will silently truncate everything after the 01 seconds, so effectively, to 
> the user, updates between 01 and 01.623392 are missing.
> This appears to be a bug in the intent.  We should allow users to specify 
> down to the millisecond or even microsecond level. If we allow them to 
> specify down to microseconds for the restore point in time, the internal 
> representation may need to change from a long.






[jira] [Commented] (CASSANDRA-19495) Hints not stored after node goes down for the second time

2024-04-01 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832856#comment-17832856
 ] 

Stefan Miklosovic commented on CASSANDRA-19495:
---

BTW this ticket ... while it does what it should, if one starts to dig deeper, 
there are some not-so-obvious consequences of how hints delivery / deletion is 
done. For example, if one calls HintsServiceMBean.deleteAllHints (or 
deleteAllHintsForEndpoint), it will indeed delete the hints, but only those 
which are already on disk and for which the writer is closed. That means it 
will not delete hints which are in buffers or for which the writer is not 
closed yet. I consider this to be a bug. If a user invokes "delete all hints", 
it should indeed delete them _ALL_, buffers included.

The behavior described above, when it comes to this ticket, has a consequence: 
if buffers are not cleared on deleting all hints, then when the node checks 
whether it should hint, it looks up "what is the oldest hint" to decide whether 
the threshold was violated. There will indeed be no descriptors, which is fine, 
but because the buffers were not cleared it will resolve the oldest hint from 
the buffers. So deleting hints won't be a 100% "reset". On the other hand, when 
a node comes back online, all hints will be delivered and the earliest-hint 
record will be removed from the underlying map. 

> Hints not stored after node goes down for the second time
> -
>
> Key: CASSANDRA-19495
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19495
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Hints
>Reporter: Paul Chandler
>Assignee: Stefan Miklosovic
>Priority: Urgent
> Fix For: 4.1.x, 5.0-rc, 5.x
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> I have a scenario where a node goes down, hints are recorded on the second 
> node and replayed, as expected. If the first node goes down for a second time 
> and the time span between the first time it stopped and the second time it 
> stopped is more than max_hint_window, then the hint is not recorded, no 
> hint file is created, and the mutation never arrives at the node after it 
> comes up again.
> I have debugged this and it appears to be due to the way the hint window is 
> persisted after https://issues.apache.org/jira/browse/CASSANDRA-14309
> The code here: 
> [https://github.com/apache/cassandra/blame/cassandra-4.1/src/java/org/apache/cassandra/service/StorageProxy.java#L2402]
>  uses the time stored in the HintsBuffer.earliestHintByHost map.  This map is 
> keyed by the UUID of the host, but it does not seem to be cleared when the 
> node comes back up, and I think this is what is causing the problem.
>  
> This is in Cassandra 4.1.5
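The mechanism described above can be sketched generically. This is a hypothetical illustration (the names only loosely mirror `HintsBuffer.earliestHintByHost`; it is not the real implementation) of why a stale oldest-hint entry must be removed once the host is seen alive again:

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: the oldest pending hint per host drives the
// max_hint_window check, so the entry must be cleared on node-up.
public class EarliestHintTracker
{
    private final ConcurrentHashMap<UUID, Long> earliestHintByHost = new ConcurrentHashMap<>();

    public void recordHint(UUID hostId, long creationTimeMillis)
    {
        // remember only the oldest hint time per host
        earliestHintByHost.merge(hostId, creationTimeMillis, Math::min);
    }

    public boolean shouldHint(UUID hostId, long nowMillis, long maxHintWindowMillis)
    {
        Long earliest = earliestHintByHost.get(hostId);
        return earliest == null || nowMillis - earliest <= maxHintWindowMillis;
    }

    // The bug amounts to this call being missing: without it, a stale
    // "earliest hint" keeps shouldHint returning false after hint delivery.
    public void onHostAlive(UUID hostId)
    {
        earliestHintByHost.remove(hostId);
    }

    public static boolean demo()
    {
        EarliestHintTracker tracker = new EarliestHintTracker();
        UUID host = UUID.randomUUID();
        tracker.recordHint(host, 0L);
        boolean blockedWhileStale = !tracker.shouldHint(host, 10_000L, 5_000L);
        tracker.onHostAlive(host); // node came back: reset the window
        boolean hintsAgain = tracker.shouldHint(host, 10_000L, 5_000L);
        return blockedWhileStale && hintsAgain;
    }
}
```

If the `onHostAlive` removal is skipped, the stale entry keeps the window check failing, which matches the reported behavior of no hint file being created on the second outage.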






[jira] [Commented] (CASSANDRA-19495) Hints not stored after node goes down for the second time

2024-04-01 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832854#comment-17832854
 ] 

Stefan Miklosovic commented on CASSANDRA-19495:
---

[~aleksey] do you want to have a look?







[jira] [Commented] (CASSANDRA-19495) Hints not stored after node goes down for the second time

2024-04-01 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832853#comment-17832853
 ] 

Brandon Williams commented on CASSANDRA-19495:
--

I think they were just environmental since they didn't fail on the other 
branches.







[jira] [Commented] (CASSANDRA-19495) Hints not stored after node goes down for the second time

2024-04-01 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832849#comment-17832849
 ] 

Stefan Miklosovic commented on CASSANDRA-19495:
---

I see there are 3 instances of failing tests in the multiplexer for trunk, but 
I do not know the cause; it does not happen on lower branches. 

It times out at 

"at 
org.apache.cassandra.cql3.statements.ModificationStatement.executeWithoutCondition(ModificationStatement.java:529)"

ModificationStatement is either DeleteStatement or UpdateStatement; we are not 
deleting anything, only selecting and inserting, so it must be an insertion 
(it also throws WriteTimeoutException). I could probably just wrap the 
insertion execution in a try-catch, catch WriteTimeoutException and ignore it, 
but I am not sure that solves anything ... some writes might obviously time 
out as we populate the db 

https://app.circleci.com/pipelines/github/driftx/cassandra/1549/workflows/e0e763fa-e847-45c0-82ed-34f394758cd9/jobs/80339/tests
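A minimal sketch of the wrap-and-ignore idea, assuming a stand-in `WriteTimeoutException` and statement type (these are illustrative placeholders, not the actual Cassandra test API):

```java
public class TolerantWriter
{
    // Stand-in for the driver/test exception; in Cassandra this would be
    // something like o.a.c.exceptions.WriteTimeoutException.
    public static class WriteTimeoutException extends RuntimeException {}

    public interface Statement { void execute(); }

    // Execute a write, swallowing only write timeouts, as suggested above.
    public static boolean tryExecute(Statement insert)
    {
        try
        {
            insert.execute();
            return true;
        }
        catch (WriteTimeoutException e)
        {
            return false; // timed-out writes during population are tolerated
        }
    }
}
```

Catching only the timeout keeps any other failure loud, which is why a blanket catch of `Exception` would be the wrong shape for this workaround.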







[jira] [Comment Edited] (CASSANDRA-19448) CommitlogArchiver only has granularity to seconds for restore_point_in_time

2024-04-01 Thread Maxwell Guo (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832369#comment-17832369
 ] 

Maxwell Guo edited comment on CASSANDRA-19448 at 4/1/24 1:45 PM:
-

Hi, sorry for the late reply; I was in a hurry recently so I was away for a 
while. [~brandon.williams][~tiagomlalves] [~jeromatron], I have made a PR for 
this, and I want a more general solution to this problem, whether the RIP is 
in seconds, milliseconds, or microseconds. 
I have just prepared the PR but it has not been tested very thoroughly; here 
is the PR, and I will update the CI results after they finish. 
Besides, I used to pay more attention to whether this ticket is a bug or an 
improvement, because that determines whether to prepare a patch for 4.x.


||Branch||PR||
|PR for trunk |[pr|https://github.com/apache/cassandra/pull/3215/]|

The CI runs are in progress; I will check them when they finish.

||JDK||CI run||
|java11|[java11|https://app.circleci.com/pipelines/github/Maxwell-Guo/cassandra/549/workflows/f1110dc1-b08e-4db5-97bf-e945658dc28b]|
|java17|[java17|https://app.circleci.com/pipelines/github/Maxwell-Guo/cassandra/549/workflows/e9cf230a-9b78-4568-9ae4-14c0e68510a4]|


Update:
fix build error for unused imports 
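One generic way to accept seconds, milliseconds or microseconds with a single pattern is an optional fraction of up to six digits. This is a hypothetical `java.time` sketch of the idea, not the patch itself:

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

public class RestorePointParser
{
    // Optional fraction of 0..6 digits covers "17:01:01", "17:01:01.623"
    // and "17:01:01.623392" with one formatter.
    private static final DateTimeFormatter FORMAT = new DateTimeFormatterBuilder()
        .appendPattern("yyyy:MM:dd HH:mm:ss")
        .optionalStart()
        .appendFraction(ChronoField.MICRO_OF_SECOND, 0, 6, true)
        .optionalEnd()
        .toFormatter();

    public static long toMicros(String restorePointInTime)
    {
        LocalDateTime time = LocalDateTime.parse(restorePointInTime, FORMAT);
        // Convert to a single microsecond-precision epoch value (UTC assumed here).
        return time.toEpochSecond(ZoneOffset.UTC) * 1_000_000L
               + time.get(ChronoField.MICRO_OF_SECOND);
    }
}
```

With this shape the granularity is detected from the value itself, so no extra precision option is needed in the properties file.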



was (Author: maxwellguo):
Hi Sorry for the late reply. I was in a hurry recently so I was away for a 
while. [~brandon.williams][~tiagomlalves] [~jeromatron], I have also made a PR 
for this ,And I might want a more general solution to this problem, whether rip 
is seconds, milliseconds, or microseconds. 
I have just prepare the pr but have not been tested very complete, here is the 
pr , and I will update the CI after they finished. 
Besides, I used to pay more attention to whether this category is a bug or an 
improvement, because it is related to whether to prepare a copy for 4.x.


||Heading 1||Heading 2||
|PR for trunk |[pr|https://github.com/apache/cassandra/pull/3215/]|

And the ci are running ,will check if the they finished.

||Heading 1||Heading 2||
|java11|[java11|https://app.circleci.com/pipelines/github/Maxwell-Guo/cassandra/547/workflows/8102b770-a623-4417-a5ea-5c602b7e6949]|
|java17|[java17|https://app.circleci.com/pipelines/github/Maxwell-Guo/cassandra/547/workflows/82eb1c09-ca27-46b0-996c-7d841e062423]|



> CommitlogArchiver only has granularity to seconds for restore_point_in_time
> ---
>
> Key: CASSANDRA-19448
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19448
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Jeremy Hanna
>Assignee: Maxwell Guo
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Commitlog archiver allows users to back up commitlog files for the purpose of 
> doing point in time restores.  The [configuration 
> file|https://github.com/apache/cassandra/blob/trunk/conf/commitlog_archiving.properties]
>  gives an example down to seconds granularity but then asks whether the 
> timestamps are in microseconds or milliseconds - defaulting to microseconds.  
> Because the [CommitLogArchiver uses a second-based date 
> format|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogArchiver.java#L52],
>  if a user specifies a restore point at a finer granularity such as 
> milliseconds or microseconds, it will truncate everything after the second 
> and restore to that second.  So say you specify a restore_point_in_time like 
> this:
> restore_point_in_time=2024:01:18 17:01:01.623392
> it will silently truncate everything after the seconds value (01).  So 
> effectively to the user, it is missing updates between 01 and 01.623392.
> This appears to be a bug in the intent.  We should allow users to specify 
> down to the millisecond or even microsecond level. If we allow them to 
> specify down to microseconds for the restore point in time, then the internal 
> representation may need to change from a long.






[jira] [Commented] (CASSANDRA-19495) Hints not stored after node goes down for the second time

2024-04-01 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832847#comment-17832847
 ] 

Stefan Miklosovic commented on CASSANDRA-19495:
---

[CASSANDRA-19495-4.1|https://github.com/instaclustr/cassandra/tree/CASSANDRA-19495-4.1]
{noformat}
java11_pre-commit_tests 
  ✓ j11_build1m 45s
  ✓ j11_cqlsh_dtests_py3 5m 31s
  ✓ j11_cqlsh_dtests_py311   6m 18s
  ✓ j11_cqlsh_dtests_py311_vnode 6m 14s
  ✓ j11_cqlsh_dtests_py385m 50s
  ✓ j11_cqlsh_dtests_py38_vnode  5m 58s
  ✓ j11_cqlsh_dtests_py3_vnode   5m 35s
  ✓ j11_cqlshlib_cython_tests7m 34s
  ✓ j11_cqlshlib_tests   8m 35s
  ✓ j11_dtests  33m 30s
  ✓ j11_dtests_vnode36m 31s
  ✓ j11_jvm_dtests  17m 43s
  ✓ j11_jvm_dtests_repeat   35m 54s
  ✓ j11_jvm_dtests_vnode11m 57s
  ✓ j11_jvm_dtests_vnode_repeat 35m 40s
  ✕ j11_unit_tests8m 3s
  org.apache.cassandra.cql3.MemtableSizeTest testSize[skiplist]
java11_separate_tests
java8_pre-commit_tests  
  ✓ j8_build 7m 13s
  ✓ j8_cqlsh_dtests_py3  6m 10s
  ✓ j8_cqlsh_dtests_py3117m 43s
  ✓ j8_cqlsh_dtests_py311_vnode  9m 56s
  ✓ j8_cqlsh_dtests_py38 8m 16s
  ✓ j8_cqlsh_dtests_py38_vnode   6m 38s
  ✓ j8_cqlsh_dtests_py3_vnode8m 52s
  ✓ j8_cqlshlib_cython_tests12m 26s
  ✓ j8_cqlshlib_tests   12m 25s
  ✓ j8_dtests   34m 21s
  ✓ j8_dtests_vnode 37m 57s
  ✓ j8_jvm_dtests   18m 17s
  ✓ j8_jvm_dtests_repeat38m 45s
  ✓ j8_jvm_dtests_vnode 12m 50s
  ✓ j8_jvm_dtests_vnode_repeat  37m 28s
  ✓ j8_simulator_dtests  5m 18s
  ✓ j11_jvm_dtests_vnode_repeat  35m 8s
  ✓ j11_jvm_dtests_vnode11m 41s
  ✓ j11_jvm_dtests_repeat   35m 23s
  ✓ j11_jvm_dtests  17m 38s
  ✓ j11_dtests_vnode 36m 3s
  ✓ j11_dtests   34m 1s
  ✓ j11_cqlshlib_tests   8m 22s
  ✓ j11_cqlshlib_cython_tests6m 51s
  ✓ j11_cqlsh_dtests_py3_vnode   5m 55s
  ✓ j11_cqlsh_dtests_py38_vnode  6m 15s
  ✓ j11_cqlsh_dtests_py386m 12s
  ✓ j11_cqlsh_dtests_py311_vnode 5m 57s
  ✓ j11_cqlsh_dtests_py311   5m 48s
  ✓ j11_cqlsh_dtests_py3 5m 25s
  ✕ j8_unit_tests9m 32s
  org.apache.cassandra.cql3.MemtableSizeTest testSize[skiplist]
  ✕ j8_utests_system_keyspace_directory 10m 16s
  org.apache.cassandra.cql3.MemtableSizeTest testSize[skiplist]
  ✕ j11_unit_tests   8m 37s
  org.apache.cassandra.cql3.MemtableSizeTest testSize[skiplist]
java8_separate_tests 
{noformat}

[java11_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/4098/workflows/53289114-3af8-4a95-bbf2-03ec435d828d]
[java11_separate_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/4098/workflows/270b6da9-47a5-45a3-8cde-c00c175a2a7d]
[java8_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/4098/workflows/d0852ed4-be80-4d24-a560-e9e7f57d93af]
[java8_separate_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/4098/workflows/9a14c4ef-2d6e-4c57-aa8e-fb00093416ad]



[jira] [Commented] (CASSANDRA-19495) Hints not stored after node goes down for the second time

2024-04-01 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832846#comment-17832846
 ] 

Stefan Miklosovic commented on CASSANDRA-19495:
---

[CASSANDRA-19495 
5.0|https://github.com/instaclustr/cassandra/tree/CASSANDRA-19495]
{noformat}
java17_pre-commit_tests 
  ✓ j17_build4m 15s
  ✓ j17_cqlsh_dtests_py3116m 2s
  ✓ j17_cqlsh_dtests_py311_vnode 6m 36s
  ✓ j17_cqlsh_dtests_py385m 57s
  ✓ j17_cqlsh_dtests_py38_vnode   6m 5s
  ✓ j17_cqlshlib_cython_tests7m 55s
  ✓ j17_cqlshlib_tests   6m 35s
  ✓ j17_dtests  33m 31s
  ✓ j17_dtests_latest   32m 42s
  ✓ j17_dtests_vnode34m 28s
  ✓ j17_jvm_dtests   19m 6s
  ✓ j17_jvm_dtests_latest_vnode 15m 12s
  ✓ j17_jvm_dtests_latest_vnode_repeat  28m 55s
  ✓ j17_jvm_dtests_repeat   28m 33s
  ✓ j17_unit_tests  15m 52s
  ✓ j17_utests_latest   16m 15s
  ✓ j17_utests_oa   16m 20s
java17_separate_tests
java11_pre-commit_tests 
  ✓ j11_build6m 25s
  ✓ j11_cqlsh_dtests_py311   10m 1s
  ✓ j11_cqlsh_dtests_py311_vnode 7m 19s
  ✓ j11_cqlsh_dtests_py386m 33s
  ✓ j11_cqlsh_dtests_py38_vnode   8m 3s
  ✓ j11_cqlshlib_cython_tests   11m 22s
  ✓ j11_cqlshlib_tests   7m 56s
  ✓ j11_dtests  33m 47s
  ✓ j11_dtests_latest   35m 46s
  ✓ j11_dtests_vnode39m 29s
  ✓ j11_jvm_dtests  21m 55s
  ✓ j11_jvm_dtests_latest_vnode 13m 42s
  ✓ j11_jvm_dtests_latest_vnode_repeat  31m 53s
  ✓ j11_jvm_dtests_repeat   31m 53s
  ✓ j11_simulator_dtests  2m 8s
  ✓ j11_unit_tests  15m 42s
  ✓ j11_utests_latest   17m 32s
  ✓ j11_utests_oa   16m 47s
  ✓ j11_utests_system_keyspace_directory18m 50s
  ✓ j17_cqlsh_dtests_py311   6m 16s
  ✓ j17_cqlsh_dtests_py311_vnode 6m 30s
  ✓ j17_cqlsh_dtests_py386m 33s
  ✓ j17_cqlsh_dtests_py38_vnode  6m 21s
  ✓ j17_cqlshlib_cython_tests7m 44s
  ✓ j17_cqlshlib_tests   6m 43s
  ✓ j17_dtests  31m 36s
  ✓ j17_dtests_latest   33m 18s
  ✓ j17_dtests_vnode 33m 4s
  ✓ j17_jvm_dtests  18m 52s
  ✓ j17_jvm_dtests_latest_vnode 14m 44s
  ✓ j17_jvm_dtests_latest_vnode_repeat  28m 34s
  ✓ j17_jvm_dtests_repeat   28m 15s
  ✓ j17_utests_latest14m 8s
  ✓ j17_utests_oa   14m 37s
  ✕ j17_unit_tests  15m 35s
  org.apache.cassandra.net.ConnectionTest testTimeout
java11_separate_tests
{noformat}

[java17_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/4097/workflows/93d0a79e-b3d2-42d9-aa30-0255a1dab960]
[java17_separate_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/4097/workflows/ec5296fd-7d3a-4bb1-8047-259db70e789a]
[java11_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/4097/workflows/dbc1290f-6694-42aa-9e1f-f8f93bd2b041]
[java11_separate_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/4097/workflows/4b774ce9-8e90-4fdd-a0d2-555a42685eed]



[jira] [Commented] (CASSANDRA-19428) Clean up KeyRangeIterator classes

2024-04-01 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832844#comment-17832844
 ] 

Ekaterina Dimitrova commented on CASSANDRA-19428:
-

Update: I cannot reproduce the testSnapshot failure locally. I thought I had, 
but it was the test failing when run in isolation, which is expected, as 
explained in this comment: 
https://github.com/apache/cassandra/blob/cassandra-5.0/test/unit/org/apache/cassandra/index/sai/functional/SnapshotTest.java#L60-L61

Though I suspect this one could be a timing issue.

> Clean up KeyRangeIterator classes
> -
>
> Key: CASSANDRA-19428
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19428
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/2i Index
>Reporter: Ekaterina Dimitrova
>Assignee: Ekaterina Dimitrova
>Priority: Low
> Fix For: 5.0-rc, 5.x
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Remove KeyRangeIterator.current and simplify






[jira] [Comment Edited] (CASSANDRA-19428) Clean up KeyRangeIterator classes

2024-04-01 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832843#comment-17832843
 ] 

Ekaterina Dimitrova edited comment on CASSANDRA-19428 at 4/1/24 1:30 PM:
-

I realized I forgot to post the rebased branch - 
[https://github.com/ekaterinadimitrova2/cassandra/tree/19428-5.0-4|https://github.com/ekaterinadimitrova2/cassandra/tree/19428-5.0-4]

Please ignore the last Jenkins commit, it is for CI test purposes


was (Author: e.dimitrova):
I realized I forgot to post the rebased branch - 
https://github.com/ekaterinadimitrova2/cassandra/tree/19428-5.0-4







[jira] [Commented] (CASSANDRA-19509) Consider using XOR filters or Ribbon filters instead of bloom filters

2024-04-01 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832822#comment-17832822
 ] 

Brandon Williams commented on CASSANDRA-19509:
--

Since there are tradeoffs with any of the options, we could make this 
configurable; however, I think we need to demonstrate significant enough 
effects on the system as a whole (not just microbenchmarks) to justify any new 
alternative first.

> Consider using XOR filters or Ribbon filters instead of bloom filters
> -
>
> Key: CASSANDRA-19509
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19509
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Other
>Reporter: Vladimir Sitnikov
>Priority: Normal
> Fix For: 5.x
>
>
> See 
> https://lemire.me/blog/2019/12/19/xor-filters-faster-and-smaller-than-bloom-filters/
> It seems to use less memory (1.23 vs 1.44) for the same false-positive rate 
> at a cost of immutability.
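As a back-of-envelope check of the memory claim, the sketch below uses the overhead factors cited in the linked post (roughly 1.44·log2(1/ε) bits per key for a Bloom filter vs 1.23·log2(1/ε) for an XOR filter). These are illustrative constants, not Cassandra code:

```java
public class FilterSizing
{
    // Bloom filters need about 1.44 * log2(1/fpp) bits per key.
    public static double bloomBitsPerKey(double fpp)
    {
        return 1.44 * (Math.log(1.0 / fpp) / Math.log(2.0));
    }

    // XOR filters need about 1.23 * log2(1/fpp) bits per key.
    public static double xorBitsPerKey(double fpp)
    {
        return 1.23 * (Math.log(1.0 / fpp) / Math.log(2.0));
    }

    public static void main(String[] args)
    {
        double fpp = 0.01; // 1% false positives
        System.out.printf("bloom: %.2f bits/key, xor: %.2f bits/key%n",
                          bloomBitsPerKey(fpp), xorBitsPerKey(fpp));
    }
}
```

At a 1% false-positive rate this works out to roughly 9.6 vs 8.2 bits per key, i.e. about a 15% space saving, traded against the immutability noted in the description.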






[jira] [Updated] (CASSANDRA-19509) Consider using XOR filters or Ribbon filters instead of bloom filters

2024-04-01 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-19509:
-
Change Category: Performance
 Complexity: Normal
Component/s: Local/Other
  Fix Version/s: 5.x
 Status: Open  (was: Triage Needed)







[jira] [Updated] (CASSANDRA-19509) Consider using XOR filters or Ribbon filters instead of bloom filters

2024-04-01 Thread Vladimir Sitnikov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Sitnikov updated CASSANDRA-19509:
--
Summary: Consider using XOR filters or Ribbon filters instead of bloom 
filters  (was: Consider using XOR filters instead of bloom filters)







[jira] [Assigned] (CASSANDRA-17019) JNA 5.6.0 does not support arm64

2024-04-01 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reassigned CASSANDRA-17019:


Assignee: Yuqi Gu  (was: Ganesh Raju)

> JNA 5.6.0 does not support arm64
> 
>
> Key: CASSANDRA-17019
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17019
> Project: Cassandra
>  Issue Type: Bug
>  Components: Dependencies
>Reporter: Igamr Palsenberg
>Assignee: Yuqi Gu
>Priority: Normal
> Fix For: 4.1-alpha1, 4.1
>
> Attachments: UTs_snappy_upgrading.txt, cassandra_UTs.txt
>
>
> Cassandra depends on net.java.dev.jna.jna version 5.6.0 to do the native 
> binding into the C library.
> JNA 5.6.0 does not support the arm64 architecture (Apple M1 devices), causing 
> Cassandra to fail on bootstrap.
>  Bumping the dependency to 5.9.0 adds arm64 support. Will a PR to bump the 
> dependency be acceptable?






[jira] [Commented] (CASSANDRA-19509) Consider using XOR filters instead of bloom filters

2024-04-01 Thread Vladimir Sitnikov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832807#comment-17832807
 ] 

Vladimir Sitnikov commented on CASSANDRA-19509:
---

{quote}As I got XOR filters consume more CPU compared to Bloom filters.{quote}
Do you mean "membership test time" or "construction time"?

The authors say "but once built, it uses less memory and is about 25% faster", 
so tests against XOR filters would probably take less CPU.

My guess would be that "membership test time" is more important for Cassandra 
than "construction time".



Ribbon filters might be a nice alternative as well.







[jira] [Comment Edited] (CASSANDRA-19509) Consider using XOR filters instead of bloom filters

2024-04-01 Thread Maxwell Guo (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832802#comment-17832802
 ] 

Maxwell Guo edited comment on CASSANDRA-19509 at 4/1/24 10:02 AM:
--

If we really want to discuss filters, I might be more inclined toward the 
Ribbon filter; see [here|https://arxiv.org/abs/2103.02515] and 
[here|https://rocksdb.org/blog/2021/12/29/ribbon-filter.html].


was (Author: maxwellguo):
If we really want to discuss filters, I might be more inclined to [Ribbon 
filter|https://arxiv.org/abs/2103.02515] and [here 
|https://rocksdb.org/blog/2021/12/29/ribbon-filter.html].







[jira] [Commented] (CASSANDRA-19509) Consider using XOR filters instead of bloom filters

2024-04-01 Thread Maxwell Guo (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832802#comment-17832802
 ] 

Maxwell Guo commented on CASSANDRA-19509:
-

If we really want to discuss filters, I might be more inclined toward the 
[Ribbon filter|https://arxiv.org/abs/2103.02515]; see also 
[here|https://rocksdb.org/blog/2021/12/29/ribbon-filter.html].







[jira] [Comment Edited] (CASSANDRA-19509) Consider using XOR filters instead of bloom filters

2024-04-01 Thread Dmitry Konstantinov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832799#comment-17832799
 ] 

Dmitry Konstantinov edited comment on CASSANDRA-19509 at 4/1/24 9:57 AM:
-

A related discussion: https://news.ycombinator.com/item?id=21840821
As I understand it, XOR filters consume more CPU than Bloom filters.
As an alternative, Ribbon filters can be considered: 
* https://engineering.fb.com/2021/07/09/core-infra/ribbon-filter/
* https://rocksdb.org/blog/2021/12/29/ribbon-filter.html


was (Author: dnk):
As I understand it, XOR filters consume more CPU than Bloom filters; as an 
alternative, Ribbon filters can be considered: 
* https://engineering.fb.com/2021/07/09/core-infra/ribbon-filter/
* https://rocksdb.org/blog/2021/12/29/ribbon-filter.html




[jira] [Commented] (CASSANDRA-19509) Consider using XOR filters instead of bloom filters

2024-04-01 Thread Dmitry Konstantinov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17832799#comment-17832799
 ] 

Dmitry Konstantinov commented on CASSANDRA-19509:
-

As I understand it, XOR filters consume more CPU than Bloom filters; as an 
alternative, Ribbon filters can be considered: 
* https://engineering.fb.com/2021/07/09/core-infra/ribbon-filter/
* https://rocksdb.org/blog/2021/12/29/ribbon-filter.html
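For context on the CPU comparison: a standard Bloom filter performs k dependent bit probes per lookup, while an XOR filter always probes exactly three fingerprint slots but is more expensive to construct and cannot be updated in place. A minimal Bloom filter sketch (hypothetical illustration only, not Cassandra's actual implementation):

```java
import java.util.BitSet;

public class TinyBloom {
    private final BitSet bits;
    private final int m; // number of bits in the filter
    private final int k; // probes per key

    TinyBloom(int m, int k) {
        this.bits = new BitSet(m);
        this.m = m;
        this.k = k;
    }

    // Derive k indexes from one 64-bit hash via double hashing
    // (Kirsch-Mitzenmacher), avoiding k independent hash computations.
    private int index(long key, int i) {
        long h1 = key * 0x9E3779B97F4A7C15L;    // cheap mixing; a real filter hashes properly
        long h2 = Long.rotateLeft(h1, 31) | 1L; // force odd step so probes spread over the table
        return (int) Math.floorMod(h1 + (long) i * h2, (long) m);
    }

    void add(long key) {
        for (int i = 0; i < k; i++) bits.set(index(key, i));
    }

    // May return false positives, never false negatives.
    boolean mightContain(long key) {
        for (int i = 0; i < k; i++) {
            if (!bits.get(index(key, i))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        TinyBloom f = new TinyBloom(1 << 16, 7); // 64 Kibit, 7 probes per key
        f.add(42);
        f.add(7);
        System.out.println(f.mightContain(42));  // true: no false negatives
        System.out.println(f.mightContain(7));   // true
    }
}
```

Note the mutability the discussion refers to: keys can keep being added after construction (at a growing false-positive cost), which is exactly what XOR filters give up.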




[jira] [Created] (CASSANDRA-19509) Consider using XOR filters instead of bloom filters

2024-04-01 Thread Vladimir Sitnikov (Jira)
Vladimir Sitnikov created CASSANDRA-19509:
-

 Summary: Consider using XOR filters instead of bloom filters
 Key: CASSANDRA-19509
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19509
 Project: Cassandra
  Issue Type: Improvement
Reporter: Vladimir Sitnikov


See 
https://lemire.me/blog/2019/12/19/xor-filters-faster-and-smaller-than-bloom-filters/
XOR filters appear to use less memory than Bloom filters for the same 
false-positive rate (a per-key space overhead factor of roughly 1.23x vs 
1.44x), at the cost of immutability.




