[jira] [Updated] (CASSANDRA-15295) Running into deadlock when do CommitLog initialization

2019-08-30 Thread Zephyr Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zephyr Guo updated CASSANDRA-15295:
---
Description: 
Recently, I found a cassandra(3.11.4) node stuck in STARTING status for a long 
time.
 I used jstack to saw what happened. The main thread stuck in 
*AbstractCommitLogSegmentManager.awaitAvailableSegment*
 !screenshot-1.png! 

The strange thing is COMMIT-LOG-ALLOCATOR thread state was runnable but it was 
not actually running.  
 !screenshot-2.png! 

And then I used pstack to troubleshoot. I found COMMIT-LOG-ALLOCATOR block on 
java class initialization.
  !screenshot-3.png! 

This is a deadlock obviously. CommitLog waits for a CommitLogSegment when 
initializing. In this moment, the CommitLog class is not initialized and the 
main thread holds the class lock. After that, COMMIT-LOG-ALLOCATOR creates a 
CommitLogSegment with exception and call *CommitLog.handleCommitError*(static 
method).  COMMIT-LOG-ALLOCATOR will block on this line because CommitLog class 
is still initializing.



 

 

  was:
Recently, I found a cassandra(3.11.4) node stuck in STARTING status for a long 
time.
 I used jstack to saw what happened. The main thread stuck in 
*AbstractCommitLogSegmentManager.awaitAvailableSegment*
 !screenshot-1.png! 

The strange thing is COMMIT-LOG-ALLOCATOR thread state was runnable but it was 
not actually running.  
 !screenshot-2.png! 

And then I used pstack to troubleshoot. I found COMMIT-LOG-ALLOCATOR block on 
java class initialization.
  !screenshot-3.png! 

This is a deadlock obviously. CommitLog waits for a CommitLogSegment when 
initializing. In this moment, the CommitLog class is not initialized and the 
main thread holds the class lock. After that, COMMIT-LOG-ALLOCATOR creates a 
CommitLogSegment with exception and call *CommitLog.handleCommitError*(static 
method).  COMMIT-LOG-ALLOCATOR will stick on this line because CommitLog class 
is still initializing.



 

 


> Running into deadlock when do CommitLog initialization
> --
>
> Key: CASSANDRA-15295
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15295
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Normal
> Attachments: jstack.log, pstack.log, screenshot-1.png, 
> screenshot-2.png, screenshot-3.png
>
>
> Recently, I found a cassandra(3.11.4) node stuck in STARTING status for a 
> long time.
>  I used jstack to saw what happened. The main thread stuck in 
> *AbstractCommitLogSegmentManager.awaitAvailableSegment*
>  !screenshot-1.png! 
> The strange thing is COMMIT-LOG-ALLOCATOR thread state was runnable but it 
> was not actually running.  
>  !screenshot-2.png! 
> And then I used pstack to troubleshoot. I found COMMIT-LOG-ALLOCATOR block on 
> java class initialization.
>   !screenshot-3.png! 
> This is a deadlock obviously. CommitLog waits for a CommitLogSegment when 
> initializing. In this moment, the CommitLog class is not initialized and the 
> main thread holds the class lock. After that, COMMIT-LOG-ALLOCATOR creates a 
> CommitLogSegment with exception and call *CommitLog.handleCommitError*(static 
> method).  COMMIT-LOG-ALLOCATOR will block on this line because CommitLog 
> class is still initializing.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-15296) ZstdCompressor compression_level setting

2019-08-30 Thread DeepakVohra (Jira)
DeepakVohra created CASSANDRA-15296:
---

 Summary: ZstdCompressor compression_level setting
 Key: CASSANDRA-15296
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15296
 Project: Cassandra
  Issue Type: Bug
  Components: Dependencies, Feature/Compression
Reporter: DeepakVohra



The DEFAULT_COMPRESSION_LEVEL for ZstdCompressor is set to 3, but its range for 
compression_level is indicated to be between -131072 and 2.  The default value 
is outside the range. Is it by design or a bug?

{code:java}

- ``compression_level`` is only applicable for ``ZstdCompressor`` and accepts 
values between ``-131072`` and ``2``.

// Compressor Defaults
public static final int DEFAULT_COMPRESSION_LEVEL = 3;

{code}

https://github.com/apache/cassandra/commit/dccf53061a61e7c632669c60cd94626e405518e9



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15281) help for clearsnapshot needs to be updated to indicate requirement for --all

2019-08-30 Thread Josh Turner (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Turner updated CASSANDRA-15281:

Attachment: 15281.trunk

> help for clearsnapshot needs to be updated to indicate requirement for --all
> 
>
> Key: CASSANDRA-15281
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15281
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/nodetool
>Reporter: DeepakVohra
>Assignee: Josh Turner
>Priority: Normal
> Fix For: 4.x
>
> Attachments: 15281.trunk
>
>
> According to _CASSANDRA-13391_
> _nodetool clearsnapshot should require --all to clear all snapshots_
> But the help for clearsnapshot does not indicate the same.
> {code:java}
> [ec2-user@ip-10-0-2-238 ~]$ nodetool help clearsnapshot
>  NAME nodetool clearsnapshot - Remove the snapshot with the given name from 
> the given keyspaces. If no snapshotName is specified we will remove all 
> snapshots{code}
> The help for clearsnapshot needs to be updated to indicate requirement for 
> --all to remove all snapshots.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15281) help for clearsnapshot needs to be updated to indicate requirement for --all

2019-08-30 Thread Josh Turner (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Turner updated CASSANDRA-15281:

Test and Documentation Plan: Updated help message to match functionality.
 Status: Patch Available  (was: Open)

> help for clearsnapshot needs to be updated to indicate requirement for --all
> 
>
> Key: CASSANDRA-15281
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15281
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/nodetool
>Reporter: DeepakVohra
>Assignee: Josh Turner
>Priority: Normal
> Fix For: 4.x
>
> Attachments: 15281.trunk
>
>
> According to _CASSANDRA-13391_
> _nodetool clearsnapshot should require --all to clear all snapshots_
> But the help for clearsnapshot does not indicate the same.
> {code:java}
> [ec2-user@ip-10-0-2-238 ~]$ nodetool help clearsnapshot
>  NAME nodetool clearsnapshot - Remove the snapshot with the given name from 
> the given keyspaces. If no snapshotName is specified we will remove all 
> snapshots{code}
> The help for clearsnapshot needs to be updated to indicate requirement for 
> --all to remove all snapshots.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15281) help for clearsnapshot needs to be updated to indicate requirement for --all

2019-08-30 Thread Josh Turner (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Turner updated CASSANDRA-15281:

Component/s: (was: CQL/Syntax)
 Tool/nodetool

> help for clearsnapshot needs to be updated to indicate requirement for --all
> 
>
> Key: CASSANDRA-15281
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15281
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/nodetool
>Reporter: DeepakVohra
>Assignee: Josh Turner
>Priority: Normal
> Fix For: 4.x
>
>
> According to _CASSANDRA-13391_
> _nodetool clearsnapshot should require --all to clear all snapshots_
> But the help for clearsnapshot does not indicate the same.
> {code:java}
> [ec2-user@ip-10-0-2-238 ~]$ nodetool help clearsnapshot
>  NAME nodetool clearsnapshot - Remove the snapshot with the given name from 
> the given keyspaces. If no snapshotName is specified we will remove all 
> snapshots{code}
> The help for clearsnapshot needs to be updated to indicate requirement for 
> --all to remove all snapshots.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15281) help for clearsnapshot needs to be updated to indicate requirement for --all

2019-08-30 Thread Josh Turner (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Turner updated CASSANDRA-15281:

 Bug Category: Parent values: Code(13163)Level 1 values: Bug - Unclear 
Impact(13164)
   Complexity: Low Hanging Fruit
Discovered By: User Report
Fix Version/s: 4.x
 Severity: Low
   Status: Open  (was: Triage Needed)

> help for clearsnapshot needs to be updated to indicate requirement for --all
> 
>
> Key: CASSANDRA-15281
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15281
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL/Syntax
>Reporter: DeepakVohra
>Assignee: Josh Turner
>Priority: Normal
> Fix For: 4.x
>
>
> According to _CASSANDRA-13391_
> _nodetool clearsnapshot should require --all to clear all snapshots_
> But the help for clearsnapshot does not indicate the same.
> {code:java}
> [ec2-user@ip-10-0-2-238 ~]$ nodetool help clearsnapshot
>  NAME nodetool clearsnapshot - Remove the snapshot with the given name from 
> the given keyspaces. If no snapshotName is specified we will remove all 
> snapshots{code}
> The help for clearsnapshot needs to be updated to indicate requirement for 
> --all to remove all snapshots.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-15281) help for clearsnapshot needs to be updated to indicate requirement for --all

2019-08-30 Thread Josh Turner (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Turner reassigned CASSANDRA-15281:
---

Assignee: Josh Turner

> help for clearsnapshot needs to be updated to indicate requirement for --all
> 
>
> Key: CASSANDRA-15281
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15281
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL/Syntax
>Reporter: DeepakVohra
>Assignee: Josh Turner
>Priority: Normal
>
> According to _CASSANDRA-13391_
> _nodetool clearsnapshot should require --all to clear all snapshots_
> But the help for clearsnapshot does not indicate the same.
> {code:java}
> [ec2-user@ip-10-0-2-238 ~]$ nodetool help clearsnapshot
>  NAME nodetool clearsnapshot - Remove the snapshot with the given name from 
> the given keyspaces. If no snapshotName is specified we will remove all 
> snapshots{code}
> The help for clearsnapshot needs to be updated to indicate requirement for 
> --all to remove all snapshots.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15295) Running into deadlock when do CommitLog initialization

2019-08-30 Thread Zephyr Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zephyr Guo updated CASSANDRA-15295:
---
Attachment: screenshot-3.png

> Running into deadlock when do CommitLog initialization
> --
>
> Key: CASSANDRA-15295
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15295
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Normal
> Attachments: jstack.log, pstack.log, screenshot-1.png, 
> screenshot-2.png, screenshot-3.png
>
>
> Recently, I found a cassandra(3.11.4) node stuck in STARTING status for a 
> long time.
>  I used jstack to saw what happened. The main thread stuck in 
> *AbstractCommitLogSegmentManager.awaitAvailableSegment*
>  !image-2019-08-30-21-40-43-748.png|thumbnail! 
> The strange thing is COMMIT-LOG-ALLOCATOR thread state was runnable but it 
> was not actually running.  
>  !image-2019-08-30-21-30-11-683.png|thumbnail! 
> And then I used pstack to troubleshoot. I found COMMIT-LOG-ALLOCATOR block on 
> java class initialization.
>  !image-2019-08-30-21-25-17-638.png|thumbnail! 
> This is a deadlock obviously. CommitLog waits for a CommitLogSegment when 
> initializing. In this moment, the CommitLog class is not initialized and the 
> main thread holds the class lock. After that, COMMIT-LOG-ALLOCATOR creates a 
> CommitLogSegment with exception and call *CommitLog.handleCommitError*(static 
> method).  COMMIT-LOG-ALLOCATOR will stick on this line because CommitLog 
> class is still initializing.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15295) Running into deadlock when do CommitLog initialization

2019-08-30 Thread Zephyr Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zephyr Guo updated CASSANDRA-15295:
---
Description: 
Recently, I found a cassandra(3.11.4) node stuck in STARTING status for a long 
time.
 I used jstack to saw what happened. The main thread stuck in 
*AbstractCommitLogSegmentManager.awaitAvailableSegment*
 !screenshot-1.png! 

The strange thing is COMMIT-LOG-ALLOCATOR thread state was runnable but it was 
not actually running.  
 !screenshot-2.png! 

And then I used pstack to troubleshoot. I found COMMIT-LOG-ALLOCATOR block on 
java class initialization.
  !screenshot-3.png! 

This is a deadlock obviously. CommitLog waits for a CommitLogSegment when 
initializing. In this moment, the CommitLog class is not initialized and the 
main thread holds the class lock. After that, COMMIT-LOG-ALLOCATOR creates a 
CommitLogSegment with exception and call *CommitLog.handleCommitError*(static 
method).  COMMIT-LOG-ALLOCATOR will stick on this line because CommitLog class 
is still initializing.



 

 

  was:
Recently, I found a cassandra(3.11.4) node stuck in STARTING status for a long 
time.
 I used jstack to saw what happened. The main thread stuck in 
*AbstractCommitLogSegmentManager.awaitAvailableSegment*
 !image-2019-08-30-21-40-43-748.png|thumbnail! 

The strange thing is COMMIT-LOG-ALLOCATOR thread state was runnable but it was 
not actually running.  
 !image-2019-08-30-21-30-11-683.png|thumbnail! 

And then I used pstack to troubleshoot. I found COMMIT-LOG-ALLOCATOR block on 
java class initialization.
 !image-2019-08-30-21-25-17-638.png|thumbnail! 

This is a deadlock obviously. CommitLog waits for a CommitLogSegment when 
initializing. In this moment, the CommitLog class is not initialized and the 
main thread holds the class lock. After that, COMMIT-LOG-ALLOCATOR creates a 
CommitLogSegment with exception and call *CommitLog.handleCommitError*(static 
method).  COMMIT-LOG-ALLOCATOR will stick on this line because CommitLog class 
is still initializing.



 

 


> Running into deadlock when do CommitLog initialization
> --
>
> Key: CASSANDRA-15295
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15295
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Normal
> Attachments: jstack.log, pstack.log, screenshot-1.png, 
> screenshot-2.png, screenshot-3.png
>
>
> Recently, I found a cassandra(3.11.4) node stuck in STARTING status for a 
> long time.
>  I used jstack to saw what happened. The main thread stuck in 
> *AbstractCommitLogSegmentManager.awaitAvailableSegment*
>  !screenshot-1.png! 
> The strange thing is COMMIT-LOG-ALLOCATOR thread state was runnable but it 
> was not actually running.  
>  !screenshot-2.png! 
> And then I used pstack to troubleshoot. I found COMMIT-LOG-ALLOCATOR block on 
> java class initialization.
>   !screenshot-3.png! 
> This is a deadlock obviously. CommitLog waits for a CommitLogSegment when 
> initializing. In this moment, the CommitLog class is not initialized and the 
> main thread holds the class lock. After that, COMMIT-LOG-ALLOCATOR creates a 
> CommitLogSegment with exception and call *CommitLog.handleCommitError*(static 
> method).  COMMIT-LOG-ALLOCATOR will stick on this line because CommitLog 
> class is still initializing.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15295) Running into deadlock when do CommitLog initialization

2019-08-30 Thread Zephyr Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zephyr Guo updated CASSANDRA-15295:
---
Attachment: screenshot-2.png

> Running into deadlock when do CommitLog initialization
> --
>
> Key: CASSANDRA-15295
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15295
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Normal
> Attachments: jstack.log, pstack.log, screenshot-1.png, 
> screenshot-2.png, screenshot-3.png
>
>
> Recently, I found a cassandra(3.11.4) node stuck in STARTING status for a 
> long time.
>  I used jstack to saw what happened. The main thread stuck in 
> *AbstractCommitLogSegmentManager.awaitAvailableSegment*
>  !image-2019-08-30-21-40-43-748.png|thumbnail! 
> The strange thing is COMMIT-LOG-ALLOCATOR thread state was runnable but it 
> was not actually running.  
>  !image-2019-08-30-21-30-11-683.png|thumbnail! 
> And then I used pstack to troubleshoot. I found COMMIT-LOG-ALLOCATOR block on 
> java class initialization.
>  !image-2019-08-30-21-25-17-638.png|thumbnail! 
> This is a deadlock obviously. CommitLog waits for a CommitLogSegment when 
> initializing. In this moment, the CommitLog class is not initialized and the 
> main thread holds the class lock. After that, COMMIT-LOG-ALLOCATOR creates a 
> CommitLogSegment with exception and call *CommitLog.handleCommitError*(static 
> method).  COMMIT-LOG-ALLOCATOR will stick on this line because CommitLog 
> class is still initializing.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15295) Running into deadlock when do CommitLog initialization

2019-08-30 Thread Zephyr Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zephyr Guo updated CASSANDRA-15295:
---
Attachment: (was: image-2019-08-30-21-30-11-683.png)

> Running into deadlock when do CommitLog initialization
> --
>
> Key: CASSANDRA-15295
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15295
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Normal
> Attachments: jstack.log, pstack.log, screenshot-1.png
>
>
> Recently, I found a cassandra(3.11.4) node stuck in STARTING status for a 
> long time.
>  I used jstack to saw what happened. The main thread stuck in 
> *AbstractCommitLogSegmentManager.awaitAvailableSegment*
>  !image-2019-08-30-21-40-43-748.png|thumbnail! 
> The strange thing is COMMIT-LOG-ALLOCATOR thread state was runnable but it 
> was not actually running.  
>  !image-2019-08-30-21-30-11-683.png|thumbnail! 
> And then I used pstack to troubleshoot. I found COMMIT-LOG-ALLOCATOR block on 
> java class initialization.
>  !image-2019-08-30-21-25-17-638.png|thumbnail! 
> This is a deadlock obviously. CommitLog waits for a CommitLogSegment when 
> initializing. In this moment, the CommitLog class is not initialized and the 
> main thread holds the class lock. After that, COMMIT-LOG-ALLOCATOR creates a 
> CommitLogSegment with exception and call *CommitLog.handleCommitError*(static 
> method).  COMMIT-LOG-ALLOCATOR will stick on this line because CommitLog 
> class is still initializing.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15295) Running into deadlock when do CommitLog initialization

2019-08-30 Thread Zephyr Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zephyr Guo updated CASSANDRA-15295:
---
Attachment: (was: image-2019-08-30-21-40-43-748.png)

> Running into deadlock when do CommitLog initialization
> --
>
> Key: CASSANDRA-15295
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15295
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Normal
> Attachments: jstack.log, pstack.log, screenshot-1.png
>
>
> Recently, I found a cassandra(3.11.4) node stuck in STARTING status for a 
> long time.
>  I used jstack to saw what happened. The main thread stuck in 
> *AbstractCommitLogSegmentManager.awaitAvailableSegment*
>  !image-2019-08-30-21-40-43-748.png|thumbnail! 
> The strange thing is COMMIT-LOG-ALLOCATOR thread state was runnable but it 
> was not actually running.  
>  !image-2019-08-30-21-30-11-683.png|thumbnail! 
> And then I used pstack to troubleshoot. I found COMMIT-LOG-ALLOCATOR block on 
> java class initialization.
>  !image-2019-08-30-21-25-17-638.png|thumbnail! 
> This is a deadlock obviously. CommitLog waits for a CommitLogSegment when 
> initializing. In this moment, the CommitLog class is not initialized and the 
> main thread holds the class lock. After that, COMMIT-LOG-ALLOCATOR creates a 
> CommitLogSegment with exception and call *CommitLog.handleCommitError*(static 
> method).  COMMIT-LOG-ALLOCATOR will stick on this line because CommitLog 
> class is still initializing.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15295) Running into deadlock when do CommitLog initialization

2019-08-30 Thread Zephyr Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zephyr Guo updated CASSANDRA-15295:
---
Attachment: screenshot-1.png

> Running into deadlock when do CommitLog initialization
> --
>
> Key: CASSANDRA-15295
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15295
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Normal
> Attachments: jstack.log, pstack.log, screenshot-1.png
>
>
> Recently, I found a cassandra(3.11.4) node stuck in STARTING status for a 
> long time.
>  I used jstack to saw what happened. The main thread stuck in 
> *AbstractCommitLogSegmentManager.awaitAvailableSegment*
>  !image-2019-08-30-21-40-43-748.png|thumbnail! 
> The strange thing is COMMIT-LOG-ALLOCATOR thread state was runnable but it 
> was not actually running.  
>  !image-2019-08-30-21-30-11-683.png|thumbnail! 
> And then I used pstack to troubleshoot. I found COMMIT-LOG-ALLOCATOR block on 
> java class initialization.
>  !image-2019-08-30-21-25-17-638.png|thumbnail! 
> This is a deadlock obviously. CommitLog waits for a CommitLogSegment when 
> initializing. In this moment, the CommitLog class is not initialized and the 
> main thread holds the class lock. After that, COMMIT-LOG-ALLOCATOR creates a 
> CommitLogSegment with exception and call *CommitLog.handleCommitError*(static 
> method).  COMMIT-LOG-ALLOCATOR will stick on this line because CommitLog 
> class is still initializing.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15295) Running into deadlock when do CommitLog initialization

2019-08-30 Thread Zephyr Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zephyr Guo updated CASSANDRA-15295:
---
Attachment: (was: image-2019-08-30-21-29-48-623.png)

> Running into deadlock when do CommitLog initialization
> --
>
> Key: CASSANDRA-15295
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15295
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Normal
> Attachments: image-2019-08-30-21-40-43-748.png, jstack.log, pstack.log
>
>
> Recently, I found a cassandra(3.11.4) node stuck in STARTING status for a 
> long time.
>  I used jstack to saw what happened. The main thread stuck in 
> *AbstractCommitLogSegmentManager.awaitAvailableSegment*
>  !image-2019-08-30-21-40-43-748.png|thumbnail! 
> The strange thing is COMMIT-LOG-ALLOCATOR thread state was runnable but it 
> was not actually running.  
>  !image-2019-08-30-21-30-11-683.png|thumbnail! 
> And then I used pstack to troubleshoot. I found COMMIT-LOG-ALLOCATOR block on 
> java class initialization.
>  !image-2019-08-30-21-25-17-638.png|thumbnail! 
> This is a deadlock obviously. CommitLog waits for a CommitLogSegment when 
> initializing. In this moment, the CommitLog class is not initialized and the 
> main thread holds the class lock. After that, COMMIT-LOG-ALLOCATOR creates a 
> CommitLogSegment with exception and call *CommitLog.handleCommitError*(static 
> method).  COMMIT-LOG-ALLOCATOR will stick on this line because CommitLog 
> class is still initializing.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15295) Running into deadlock when do CommitLog initialization

2019-08-30 Thread Zephyr Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zephyr Guo updated CASSANDRA-15295:
---
Attachment: (was: image-2019-08-30-21-25-17-638.png)

> Running into deadlock when do CommitLog initialization
> --
>
> Key: CASSANDRA-15295
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15295
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Normal
> Attachments: image-2019-08-30-21-40-43-748.png, jstack.log, pstack.log
>
>
> Recently, I found a cassandra(3.11.4) node stuck in STARTING status for a 
> long time.
>  I used jstack to saw what happened. The main thread stuck in 
> *AbstractCommitLogSegmentManager.awaitAvailableSegment*
>  !image-2019-08-30-21-40-43-748.png|thumbnail! 
> The strange thing is COMMIT-LOG-ALLOCATOR thread state was runnable but it 
> was not actually running.  
>  !image-2019-08-30-21-30-11-683.png|thumbnail! 
> And then I used pstack to troubleshoot. I found COMMIT-LOG-ALLOCATOR block on 
> java class initialization.
>  !image-2019-08-30-21-25-17-638.png|thumbnail! 
> This is a deadlock obviously. CommitLog waits for a CommitLogSegment when 
> initializing. In this moment, the CommitLog class is not initialized and the 
> main thread holds the class lock. After that, COMMIT-LOG-ALLOCATOR creates a 
> CommitLogSegment with exception and call *CommitLog.handleCommitError*(static 
> method).  COMMIT-LOG-ALLOCATOR will stick on this line because CommitLog 
> class is still initializing.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15295) Running into deadlock when do CommitLog initialization

2019-08-30 Thread Zephyr Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zephyr Guo updated CASSANDRA-15295:
---
Attachment: pstack.log

> Running into deadlock when do CommitLog initialization
> --
>
> Key: CASSANDRA-15295
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15295
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Normal
> Attachments: image-2019-08-30-21-25-17-638.png, 
> image-2019-08-30-21-29-48-623.png, image-2019-08-30-21-30-11-683.png, 
> image-2019-08-30-21-40-43-748.png, jstack.log, pstack.log
>
>
> Recently, I found a cassandra(3.11.4) node stuck in STARTING status for a 
> long time.
>  I used jstack to saw what happened. The main thread stuck in 
> *AbstractCommitLogSegmentManager.awaitAvailableSegment*
>  !image-2019-08-30-21-40-43-748.png|thumbnail! 
> The strange thing is COMMIT-LOG-ALLOCATOR thread state was runnable but it 
> was not actually running.  
>  !image-2019-08-30-21-30-11-683.png|thumbnail! 
> And then I used pstack to troubleshoot. I found COMMIT-LOG-ALLOCATOR block on 
> java class initialization.
>  !image-2019-08-30-21-25-17-638.png|thumbnail! 
> This is a deadlock obviously. CommitLog waits for a CommitLogSegment when 
> initializing. In this moment, the CommitLog class is not initialized and the 
> main thread holds the class lock. After that, COMMIT-LOG-ALLOCATOR creates a 
> CommitLogSegment with exception and call *CommitLog.handleCommitError*(static 
> method).  COMMIT-LOG-ALLOCATOR will stick on this line because CommitLog 
> class is still initializing.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15295) Running into deadlock when do CommitLog initialization

2019-08-30 Thread Zephyr Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zephyr Guo updated CASSANDRA-15295:
---
Attachment: jstack.log

> Running into deadlock when do CommitLog initialization
> --
>
> Key: CASSANDRA-15295
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15295
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Zephyr Guo
>Assignee: Zephyr Guo
>Priority: Normal
> Attachments: image-2019-08-30-21-25-17-638.png, 
> image-2019-08-30-21-29-48-623.png, image-2019-08-30-21-30-11-683.png, 
> image-2019-08-30-21-40-43-748.png, jstack.log
>
>
> Recently, I found a cassandra(3.11.4) node stuck in STARTING status for a 
> long time.
>  I used jstack to saw what happened. The main thread stuck in 
> *AbstractCommitLogSegmentManager.awaitAvailableSegment*
>  !image-2019-08-30-21-40-43-748.png|thumbnail! 
> The strange thing is COMMIT-LOG-ALLOCATOR thread state was runnable but it 
> was not actually running.  
>  !image-2019-08-30-21-30-11-683.png|thumbnail! 
> And then I used pstack to troubleshoot. I found COMMIT-LOG-ALLOCATOR block on 
> java class initialization.
>  !image-2019-08-30-21-25-17-638.png|thumbnail! 
> This is a deadlock obviously. CommitLog waits for a CommitLogSegment when 
> initializing. In this moment, the CommitLog class is not initialized and the 
> main thread holds the class lock. After that, COMMIT-LOG-ALLOCATOR creates a 
> CommitLogSegment with exception and call *CommitLog.handleCommitError*(static 
> method).  COMMIT-LOG-ALLOCATOR will stick on this line because CommitLog 
> class is still initializing.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-15295) Running into deadlock when do CommitLog initialization

2019-08-30 Thread Zephyr Guo (Jira)
Zephyr Guo created CASSANDRA-15295:
--

 Summary: Running into deadlock when do CommitLog initialization
 Key: CASSANDRA-15295
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15295
 Project: Cassandra
  Issue Type: Bug
  Components: Local/Commit Log
Reporter: Zephyr Guo
Assignee: Zephyr Guo
 Attachments: image-2019-08-30-21-25-17-638.png, 
image-2019-08-30-21-29-48-623.png, image-2019-08-30-21-30-11-683.png, 
image-2019-08-30-21-40-43-748.png

Recently, I found a cassandra(3.11.4) node stuck in STARTING status for a long 
time.
 I used jstack to saw what happened. The main thread stuck in 
*AbstractCommitLogSegmentManager.awaitAvailableSegment*
 !image-2019-08-30-21-40-43-748.png|thumbnail! 

The strange thing is COMMIT-LOG-ALLOCATOR thread state was runnable but it was 
not actually running.  
 !image-2019-08-30-21-30-11-683.png|thumbnail! 

And then I used pstack to troubleshoot. I found COMMIT-LOG-ALLOCATOR block on 
java class initialization.
 !image-2019-08-30-21-25-17-638.png|thumbnail! 

This is a deadlock obviously. CommitLog waits for a CommitLogSegment when 
initializing. In this moment, the CommitLog class is not initialized and the 
main thread holds the class lock. After that, COMMIT-LOG-ALLOCATOR creates a 
CommitLogSegment with exception and call *CommitLog.handleCommitError*(static 
method).  COMMIT-LOG-ALLOCATOR will stick on this line because CommitLog class 
is still initializing.



 

 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15223) OutboundTcpConnection leaks direct memory

2019-08-30 Thread Venkata Harikrishna Nukala (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-15223:
---
Test and Documentation Plan: Ran unit tests and tested basic things using 
ccm with compression enabled.
 Status: Patch Available  (was: Open)

> OutboundTcpConnection leaks direct memory
> -
>
> Key: CASSANDRA-15223
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15223
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Benedict
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>
> On disconnect we set {{out}} to null without first closing it



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15260) Add `allocate_tokens_for_dc_rf` yaml option for token allocation

2019-08-30 Thread mck (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-15260:

Impacts: Docs
Test and Documentation Plan: unit test, manual testing
 Status: Patch Available  (was: In Progress)

Have added a unit test in BootStrapperTest. Does not do that much, as 
SummaryStatistics is not available when using 
`allocate_tokens_for_local_replication_factor`.



> Add `allocate_tokens_for_dc_rf` yaml option for token allocation
> 
>
> Key: CASSANDRA-15260
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15260
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: mck
>Assignee: mck
>Priority: Normal
> Fix For: 4.x
>
>
> Similar to DSE's option: {{allocate_tokens_for_local_replication_factor}}
> Currently the 
> [ReplicationAwareTokenAllocator|https://www.datastax.com/dev/blog/token-allocation-algorithm]
>  requires a defined keyspace and a replica factor specified in the current 
> datacenter.
> This is problematic in a number of ways. The real keyspace can not be used 
> when adding new datacenters as, in practice, all its nodes need to be up and 
> running before it has the capacity to replicate data into it. New datacenters 
> (or lift-and-shifting a cluster via datacenter migration) therefore has to be 
> done using a dummy keyspace that duplicates the replication strategy+factor 
> of the real keyspace. This gets even more difficult come version 4.0, as the 
> replica factor can not even be defined in new datacenters before those 
> datacenters are up and running. 
> These issues are removed by avoiding the keyspace definition and lookup, and 
> presuming the replica strategy is by datacenter, ie NTS. This can be done 
> with the use of an {{allocate_tokens_for_dc_rf}} option.
> It may also be of value considering whether {{allocate_tokens_for_dc_rf=3}} 
> becomes the default? as this is the replication factor for the vast majority 
> of datacenters in production. I suspect this would be a good improvement over 
> the existing randomly generated tokens algorithm.
> Initial patch is available in 
> [https://github.com/thelastpickle/cassandra/commit/fc4865b0399570e58f11215565ba17dc4a53da97]
> The patch does not remove the existing {{allocate_tokens_for_keyspace}} 
> option, as that provides the codebase for handling different replication 
> strategies.
>  
> fyi [~blambov] [~jay.zhuang] [~chovatia.jayd...@gmail.com] [~alokamvenki] 
> [~alexchueshev]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-15260) Add `allocate_tokens_for_dc_rf` yaml option for token allocation

2019-08-30 Thread mck (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16905336#comment-16905336
 ] 

mck edited comment on CASSANDRA-15260 at 8/30/19 10:48 AM:
---

Thanks [~blambov]. The rename is done.


||branch||circleci||asf jenkins testall||
|[CASSANDRA-15260|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/trunk__allocate_tokens_for_dc_rf]|[circleci|https://circleci.com/gh/thelastpickle/workflows/cassandra/tree/mck%2Ftrunk__allocate_tokens_for_dc_rf]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/45//badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/45/]|

I've opened the ticket, and will 'Submit Patch' it after I get some unit tests 
in.


was (Author: michaelsembwever):
Thanks [~blambov]. The rename is done.


||branch||circleci||asf jenkins testall||
|[CASSANDRA-15260|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/trunk__allocate_tokens_for_dc_rf]|[circleci|https://circleci.com/gh/thelastpickle/workflows/cassandra/tree/mck%2Ftrunk__allocate_tokens_for_dc_rf]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/43//badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/43/]|

I've opened the ticket, and will 'Submit Patch' it after I get some unit tests 
in.

> Add `allocate_tokens_for_dc_rf` yaml option for token allocation
> 
>
> Key: CASSANDRA-15260
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15260
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: mck
>Assignee: mck
>Priority: Normal
> Fix For: 4.x
>
>
> Similar to DSE's option: {{allocate_tokens_for_local_replication_factor}}
> Currently the 
> [ReplicationAwareTokenAllocator|https://www.datastax.com/dev/blog/token-allocation-algorithm]
>  requires a defined keyspace and a replica factor specified in the current 
> datacenter.
> This is problematic in a number of ways. The real keyspace can not be used 
> when adding new datacenters as, in practice, all its nodes need to be up and 
> running before it has the capacity to replicate data into it. New datacenters 
> (or lift-and-shifting a cluster via datacenter migration) therefore has to be 
> done using a dummy keyspace that duplicates the replication strategy+factor 
> of the real keyspace. This gets even more difficult come version 4.0, as the 
> replica factor can not even be defined in new datacenters before those 
> datacenters are up and running. 
> These issues are removed by avoiding the keyspace definition and lookup, and 
> presuming the replica strategy is by datacenter, ie NTS. This can be done 
> with the use of an {{allocate_tokens_for_dc_rf}} option.
> It may also be of value considering whether {{allocate_tokens_for_dc_rf=3}} 
> becomes the default? as this is the replication factor for the vast majority 
> of datacenters in production. I suspect this would be a good improvement over 
> the existing randomly generated tokens algorithm.
> Initial patch is available in 
> [https://github.com/thelastpickle/cassandra/commit/fc4865b0399570e58f11215565ba17dc4a53da97]
> The patch does not remove the existing {{allocate_tokens_for_keyspace}} 
> option, as that provides the codebase for handling different replication 
> strategies.
>  
> fyi [~blambov] [~jay.zhuang] [~chovatia.jayd...@gmail.com] [~alokamvenki] 
> [~alexchueshev]



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14772) Fix issues in audit / full query log interactions

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14772:
-
Fix Version/s: 4.0-rc

> Fix issues in audit / full query log interactions
> -
>
> Key: CASSANDRA-14772
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14772
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/CQL, Legacy/Tools
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 4.0, 4.0-rc
>
>
> There are some problems with the audit + full query log code that need to be 
> resolved before 4.0 is released:
> * Fix performance regression in FQL that makes it less usable than it should 
> be.
> * move full query log specific code to a separate package 
> * do some audit log class renames (I keep reading {{BinLogAuditLogger}} vs 
> {{BinAuditLogger}} wrong for example)
> * avoid parsing the CQL queries twice in {{QueryMessage}} when audit log is 
> enabled.
> * add a new tool to dump audit logs (ie, let fqltool be full query log 
> specific). fqltool crashes when pointed to them.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13994) Remove COMPACT STORAGE internals before 4.0 release

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-13994:
-
Fix Version/s: (was: 4.x)
   4.0-rc

> Remove COMPACT STORAGE internals before 4.0 release
> ---
>
> Key: CASSANDRA-13994
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13994
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Local Write-Read Paths
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Low
> Fix For: 4.0-rc
>
>
> 4.0 comes without thrift (after [CASSANDRA-5]) and COMPACT STORAGE (after 
> [CASSANDRA-10857]), and since Compact Storage flags are now disabled, all of 
> the related functionality is useless.
> There are still some things to consider:
> 1. One of the system tables (built indexes) was compact. For now, we just 
> added {{value}} column to it to make sure it's backwards-compatible, but we 
> might want to make sure it's just a "normal" table and doesn't have redundant 
> columns.
> 2. Compact Tables were building indexes in {{KEYS}} mode. Removing it is 
> trivial, but this would mean that all built indexes will be defunct. We could 
> log a warning for now and ask users to migrate off those for now and 
> completely remove it from future releases. It's just a couple of classes 
> though.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14801) calculatePendingRanges no longer safe for multiple adjacent range movements

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14801:
-
Fix Version/s: 4.0

> calculatePendingRanges no longer safe for multiple adjacent range movements
> ---
>
> Key: CASSANDRA-14801
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14801
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Coordination, Legacy/Distributed Metadata
>Reporter: Benedict
>Priority: Normal
> Fix For: 4.0, 4.0-beta
>
>
> Correctness depended upon the narrowing to a {{Set}}, 
> which we no longer do - we maintain a collection of all {{Replica}}.  Our 
> {{RangesAtEndpoint}} collection built by {{getPendingRanges}} can as a result 
> contain the same endpoint multiple times; and our {{EndpointsForToken}} 
> obtained by {{TokenMetadata.pendingEndpointsFor}} may fail to be constructed, 
> resulting in cluster-wide failures for writes to the affected token ranges 
> for the duration of the range movement.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15214) OOMs caught and not rethrown

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-15214:
-
Fix Version/s: 4.0

> OOMs caught and not rethrown
> 
>
> Key: CASSANDRA-15214
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15214
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Client, Messaging/Internode
>Reporter: Benedict
>Priority: Normal
> Fix For: 4.0, 4.0-rc
>
> Attachments: oom-experiments.zip
>
>
> Netty (at least, and perhaps elsewhere in Executors) catches all exceptions, 
> so presently there is no way to ensure that an OOM reaches the JVM handler to 
> trigger a crash/heapdump.
> It may be that the simplest most consistent way to do this would be to have a 
> single thread spawned at startup that waits for any exceptions we must 
> propagate to the Runtime.
> We could probably submit a patch upstream to Netty, but for a guaranteed 
> future proof approach, it may be worth paying the cost of a single thread.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14888) Several mbeans are not unregistered when dropping a keyspace and table

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14888:
-
Fix Version/s: 4.0

> Several mbeans are not unregistered when dropping a keyspace and table
> --
>
> Key: CASSANDRA-14888
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14888
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Ariel Weisberg
>Assignee: Alex Deparvu
>Priority: Urgent
>  Labels: patch-available
> Fix For: 4.0, 4.0-rc
>
> Attachments: CASSANDRA-14888.patch
>
>
> CasCommit, CasPrepare, CasPropose, ReadRepairRequests, 
> ShortReadProtectionRequests, AntiCompactionTime, BytesValidated, 
> PartitionsValidated, RepairPrepareTime, RepairSyncTime, 
> RepairedDataInconsistencies, ViewLockAcquireTime, ViewReadTime, 
> WriteFailedIdealCL
> Basically for 3 years people haven't known what they are doing because the 
> entire thing is kind of obscure. Fix it and also add a dtest that detects if 
> any mbeans are left behind after dropping a table and keyspace.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15234) Standardise config and JVM parameters

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-15234:
-
Fix Version/s: 4.0

> Standardise config and JVM parameters
> -
>
> Key: CASSANDRA-15234
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15234
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benedict
>Priority: Normal
> Fix For: 4.0, 4.0-beta
>
>
> We have a bunch of inconsistent names and config patterns in the codebase, 
> both from the yams and JVM properties.  It would be nice to standardise the 
> naming (such as otc_ vs internode_) as well as the provision of values with 
> units - while maintaining perpetual backwards compatibility with the old 
> parameter names, of course.
> For temporal units, I would propose parsing strings with suffixes of:
> {{code}}
> u|micros(econds?)?
> s(econds?)?
> m(inutes?)?
> h(ours?)?
> d(ays?)?
> mo(nths?)?
> {{code}}
> For rate units, I would propose parsing any of the standard {{B/s, KiB/s, 
> MiB/s, GiB/s, TiB/s}}.
> Perhaps for avoiding ambiguity we could not accept bauds {{bs, Mbps}} or 
> powers of 1000 such as {{KB/s}}, given these are regularly used for either 
> their old or new definition e.g. {{KiB/s}}, or we could support them and 
> simply log the value in bytes/s.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-10190) Python 3 support for cqlsh

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-10190:
-
Fix Version/s: 4.0

> Python 3 support for cqlsh
> --
>
> Key: CASSANDRA-10190
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10190
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Tools
>Reporter: Andrew Pennebaker
>Assignee: Patrick Bannister
>Priority: Normal
>  Labels: cqlsh
> Fix For: 4.0, 4.0-alpha
>
> Attachments: coverage_notes.txt
>
>
> Users who operate in a Python 3 environment may have trouble launching cqlsh. 
> Could we please update cqlsh's syntax to run in Python 3?
> As a workaround, users can setup pyenv, and cd to a directory with a 
> .python-version containing "2.7". But it would be nice if cqlsh supported 
> modern Python versions out of the box.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13994) Remove COMPACT STORAGE internals before 4.0 release

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-13994:
-
Fix Version/s: 4.0

> Remove COMPACT STORAGE internals before 4.0 release
> ---
>
> Key: CASSANDRA-13994
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13994
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Local Write-Read Paths
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Low
> Fix For: 4.0, 4.0-rc
>
>
> 4.0 comes without thrift (after [CASSANDRA-5]) and COMPACT STORAGE (after 
> [CASSANDRA-10857]), and since Compact Storage flags are now disabled, all of 
> the related functionality is useless.
> There are still some things to consider:
> 1. One of the system tables (built indexes) was compact. For now, we just 
> added {{value}} column to it to make sure it's backwards-compatible, but we 
> might want to make sure it's just a "normal" table and doesn't have redundant 
> columns.
> 2. Compact Tables were building indexes in {{KEYS}} mode. Removing it is 
> trivial, but this would mean that all built indexes will be defunct. We could 
> log a warning for now and ask users to migrate off those for now and 
> completely remove it from future releases. It's just a couple of classes 
> though.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15216) Cross node message creation times are disabled by default

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-15216:
-
Fix Version/s: 4.0

> Cross node message creation times are disabled by default
> -
>
> Key: CASSANDRA-15216
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15216
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Benedict
>Priority: Normal
> Fix For: 4.0, 4.0-alpha
>
>
> This can cause a lot of wasted work for messages that have timed out on the 
> coordinator.  We should generally assume that our users have setup NTP on 
> their clusters, and that clocks are modestly in sync, since it’s a 
> requirement for general correctness of last write wins.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14973) Bring v5 driver out of beta, introduce v6 before 4.0 release is cut

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14973:
-
Fix Version/s: 4.0

> Bring v5 driver out of beta, introduce v6 before 4.0 release is cut
> ---
>
> Key: CASSANDRA-14973
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14973
> Project: Cassandra
>  Issue Type: Task
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Urgent
> Fix For: 4.0, 4.0-rc
>
>
> In http://issues.apache.org/jira/browse/CASSANDRA-12142, we’ve introduced 
> Beta flag for v5 protocol. However, up till now, v5 is in beta both in 
> [Cassandra|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/transport/ProtocolVersion.java#L46]
>  and in 
> [java-driver|https://github.com/datastax/java-driver/blob/3.x/driver-core/src/main/java/com/datastax/driver/core/ProtocolVersion.java#L35].
>  
> Before the final 4.0 release is cut, we need to bring v5 out of beta and 
> finalise native protocol spec, and start bringing all new changes to v6 
> protocol, which will be in beta.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15229) BufferPool Regression

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-15229:
-
Fix Version/s: 4.0

> BufferPool Regression
> -
>
> Key: CASSANDRA-15229
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15229
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Caching
>Reporter: Benedict
>Priority: Normal
> Fix For: 4.0, 4.0-beta
>
>
> The BufferPool was never intended to be used for a {{ChunkCache}}, and we 
> need to either change our behaviour to handle uncorrelated lifetimes or use 
> something else.  This is particularly important with the default chunk size 
> for compressed sstables being reduced.  If we address the problem, we should 
> also utilise the BufferPool for native transport connections like we do for 
> internode messaging, and reduce the number of pooling solutions we employ.
> Probably the best thing to do is to improve BufferPool’s behaviour when used 
> for things with uncorrelated lifetimes, which essentially boils down to 
> tracking those chunks that have not been freed and re-circulating them when 
> we run out of completely free blocks.  We should probably also permit 
> instantiating separate {{BufferPool}}, so that we can insulate internode 
> messaging from the {{ChunkCache}}, or at least have separate memory bounds 
> for each, and only share fully-freed chunks.
> With these improvements we can also safely increase the {{BufferPool}} chunk 
> size to 128KiB or 256KiB, to guarantee we can fit compressed pages and reduce 
> the amount of global coordination and per-allocation overhead.  We don’t need 
> 1KiB granularity for allocations, nor 16 byte granularity for tiny 
> allocations.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14973) Bring v5 driver out of beta, introduce v6 before 4.0 release is cut

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14973:
-
Fix Version/s: (was: 4.0)
   4.0-rc

> Bring v5 driver out of beta, introduce v6 before 4.0 release is cut
> ---
>
> Key: CASSANDRA-14973
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14973
> Project: Cassandra
>  Issue Type: Task
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Urgent
> Fix For: 4.0-rc
>
>
> In http://issues.apache.org/jira/browse/CASSANDRA-12142, we’ve introduced 
> Beta flag for v5 protocol. However, up till now, v5 is in beta both in 
> [Cassandra|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/transport/ProtocolVersion.java#L46]
>  and in 
> [java-driver|https://github.com/datastax/java-driver/blob/3.x/driver-core/src/main/java/com/datastax/driver/core/ProtocolVersion.java#L35].
>  
> Before the final 4.0 release is cut, we need to bring v5 out of beta and 
> finalise native protocol spec, and start bringing all new changes to v6 
> protocol, which will be in beta.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14775) StreamingTombstoneHistogramBuilder overflows if > 2B in a single bucket/stabile

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14775:
-
Fix Version/s: 4.0

> StreamingTombstoneHistogramBuilder overflows if > 2B in a single 
> bucket/stabile
> ---
>
> Key: CASSANDRA-14775
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14775
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Benedict
>Assignee: Alex Deparvu
>Priority: Normal
> Fix For: 4.0, 4.0-beta
>
>
> This may be unlikely, but is certainly not impossible.  In this event, the 
> count for the bucket will be reset to zero, and the time distorted to 1s in 
> the future.  If MAX_DELETION_TIME were encountered through overflow, this 
> might result in a bucket with NO_DELETION_TIME.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14773) Overflow of 32-bit integer during compaction.

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14773:
-
Fix Version/s: 4.0

> Overflow of 32-bit integer during compaction.
> -
>
> Key: CASSANDRA-14773
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14773
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Vladimir Bukhtoyarov
>Assignee: Vladimir Bukhtoyarov
>Priority: Urgent
> Fix For: 4.0, 4.0-beta
>
>
> In scope of CASSANDRA-13444 the compaction was significantly improved from 
> CPU and memory perspective. Hovewer this improvement introduces the bug in 
> rounding. When rounding the expriration time which is close to  
> *Cell.MAX_DELETION_TIME*(it is just *Integer.MAX_VALUE*) the math overflow 
> happens(because in scope of -CASSANDRA-13444-) data type for point was 
> changed from Long to Integer in order to reduce memory footprint), as result 
> point became negative and acts as silent poison for internal structures of 
> StreamingTombstoneHistogramBuilder like *DistanceHolder* and *DataHolder*. 
> Then depending of point intervals:
>  * The TombstoneHistogram produces wrong values when interval of points is 
> less then binSize, it is not critical.
>  * Compaction crashes with ArrayIndexOutOfBoundsException if amount of point 
> intervals is great then  binSize, this case is very critical.
>  
> This is pull request [https://github.com/apache/cassandra/pull/273] that 
> reproduces the issue and provides the fix. 
>  
> The stacktrace when running(on codebase without fix) 
> *testMathOverflowDuringRoundingOfLargeTimestamp* without -ea JVM flag
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException
> at java.lang.System.arraycopy(Native Method)
> at 
> org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder$DistanceHolder.add(StreamingTombstoneHistogramBuilder.java:208)
> at 
> org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder.flushValue(StreamingTombstoneHistogramBuilder.java:140)
> at 
> org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder$$Lambda$1/1967205423.consume(Unknown
>  Source)
> at 
> org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder$Spool.forEach(StreamingTombstoneHistogramBuilder.java:574)
> at 
> org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder.flushHistogram(StreamingTombstoneHistogramBuilder.java:124)
> at 
> org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder.build(StreamingTombstoneHistogramBuilder.java:184)
> at 
> org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilderTest.testMathOverflowDuringRoundingOfLargeTimestamp(StreamingTombstoneHistogramBuilderTest.java:183)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:44)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:180)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:41)
> at org.junit.runners.ParentRunner$1.evaluate(ParentRunner.java:173)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:220)
> at org.junit.runner.JUnitCore.run(JUnitCore.java:159)
> at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
> at 
> com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
> at 
> com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
> at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> {noformat}
>  
> The stacktrace when running(on codebase without fix)  
> *testMathOverflowDuringRoundingOfLargeTimestamp* with 

[jira] [Updated] (CASSANDRA-14748) Recycler$WeakOrderQueue occupies Heap

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14748:
-
Fix Version/s: 4.0

> Recycler$WeakOrderQueue occupies Heap
> -
>
> Key: CASSANDRA-14748
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14748
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
> Environment: The netty Cassandra using is netty-all-4.0.39.Final.jar
>Reporter: HX
>Priority: Normal
> Fix For: 4.0, 3.11.x, 4.0-rc
>
>
> Heap constantly high on some of the nodes in the cluster, I dump the heap and 
> open it through Eclipse Memory Analyzer, looks like Recycler$WeakOrderQueue 
> occupies most of the heap. 
>  
> ||Package||Retained Heap||Retained Heap, %||# Top Dominators||
> |!/jira/icons/i5.gif! |7,078,140,136|100.00%|379,627|
> |io|5,665,035,800|80.04%|13,306|
> |netty|5,665,035,800|80.04%|13,306|
> |util|5,568,107,344|78.67%|2,965|
> |Recycler$WeakOrderQueue|4,950,021,544|69.93%|2,169|



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14834) Avoid keeping StreamingTombstoneHistogramBuilder.Spool in memory during the whole compaction

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14834:
-
Fix Version/s: 4.0

> Avoid keeping StreamingTombstoneHistogramBuilder.Spool in memory during the 
> whole compaction
> 
>
> Key: CASSANDRA-14834
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14834
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Low
> Fix For: 4.0, 4.0-beta
>
>
> Since CASSANDRA-13444 {{StreamingTombstoneHistogramBuilder.Spool}} is 
> allocated to keep around an array with 131072 * 2 * 2 integers *per written 
> sstable* during the whole compaction. With LCS at times creating 1000s of 
> sstables during a compaction it kills the node.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15229) BufferPool Regression

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-15229:
-
Fix Version/s: (was: 4.0)
   4.0-beta

> BufferPool Regression
> -
>
> Key: CASSANDRA-15229
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15229
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Caching
>Reporter: Benedict
>Priority: Normal
> Fix For: 4.0-beta
>
>
> The BufferPool was never intended to be used for a {{ChunkCache}}, and we 
> need to either change our behaviour to handle uncorrelated lifetimes or use 
> something else.  This is particularly important with the default chunk size 
> for compressed sstables being reduced.  If we address the problem, we should 
> also utilise the BufferPool for native transport connections like we do for 
> internode messaging, and reduce the number of pooling solutions we employ.
> Probably the best thing to do is to improve BufferPool’s behaviour when used 
> for things with uncorrelated lifetimes, which essentially boils down to 
> tracking those chunks that have not been freed and re-circulating them when 
> we run out of completely free blocks.  We should probably also permit 
> instantiating separate {{BufferPool}}, so that we can insulate internode 
> messaging from the {{ChunkCache}}, or at least have separate memory bounds 
> for each, and only share fully-freed chunks.
> With these improvements we can also safely increase the {{BufferPool}} chunk 
> size to 128KiB or 256KiB, to guarantee we can fit compressed pages and reduce 
> the amount of global coordination and per-allocation overhead.  We don’t need 
> 1KiB granularity for allocations, nor 16 byte granularity for tiny 
> allocations.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14775) StreamingTombstoneHistogramBuilder overflows if > 2B in a single bucket/stabile

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14775:
-
Fix Version/s: (was: 4.0)
   4.0-beta

> StreamingTombstoneHistogramBuilder overflows if > 2B in a single 
> bucket/stabile
> ---
>
> Key: CASSANDRA-14775
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14775
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Benedict
>Assignee: Alex Deparvu
>Priority: Normal
> Fix For: 4.0-beta
>
>
> This may be unlikely, but is certainly not impossible.  In this event, the 
> count for the bucket will be reset to zero, and the time distorted to 1s in 
> the future.  If MAX_DELETION_TIME were encountered through overflow, this 
> might result in a bucket with NO_DELETION_TIME.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14834) Avoid keeping StreamingTombstoneHistogramBuilder.Spool in memory during the whole compaction

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14834:
-
Fix Version/s: (was: 4.0)
   4.0-beta

> Avoid keeping StreamingTombstoneHistogramBuilder.Spool in memory during the 
> whole compaction
> 
>
> Key: CASSANDRA-14834
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14834
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Low
> Fix For: 4.0-beta
>
>
> Since CASSANDRA-13444 {{StreamingTombstoneHistogramBuilder.Spool}} is 
> allocated to keep around an array with 131072 * 2 * 2 integers *per written 
> sstable* during the whole compaction. With LCS at times creating 1000s of 
> sstables during a compaction it kills the node.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14748) Recycler$WeakOrderQueue occupies Heap

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14748:
-
Fix Version/s: 4.0-rc

> Recycler$WeakOrderQueue occupies Heap
> -
>
> Key: CASSANDRA-14748
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14748
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
> Environment: The netty Cassandra using is netty-all-4.0.39.Final.jar
>Reporter: HX
>Priority: Normal
> Fix For: 3.11.x, 4.0-rc
>
>
> Heap constantly high on some of the nodes in the cluster, I dump the heap and 
> open it through Eclipse Memory Analyzer, looks like Recycler$WeakOrderQueue 
> occupies most of the heap. 
>  
> ||Package||Retained Heap||Retained Heap, %||# Top Dominators||
> |!/jira/icons/i5.gif! |7,078,140,136|100.00%|379,627|
> |io|5,665,035,800|80.04%|13,306|
> |netty|5,665,035,800|80.04%|13,306|
> |util|5,568,107,344|78.67%|2,965|
> |Recycler$WeakOrderQueue|4,950,021,544|69.93%|2,169|



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14773) Overflow of 32-bit integer during compaction.

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14773:
-
Fix Version/s: (was: 4.x)
   4.0-beta

> Overflow of 32-bit integer during compaction.
> -
>
> Key: CASSANDRA-14773
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14773
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Vladimir Bukhtoyarov
>Assignee: Vladimir Bukhtoyarov
>Priority: Urgent
> Fix For: 4.0-beta
>
>
> In scope of CASSANDRA-13444 the compaction was significantly improved from 
> CPU and memory perspective. Hovewer this improvement introduces the bug in 
> rounding. When rounding the expriration time which is close to  
> *Cell.MAX_DELETION_TIME*(it is just *Integer.MAX_VALUE*) the math overflow 
> happens(because in scope of -CASSANDRA-13444-) data type for point was 
> changed from Long to Integer in order to reduce memory footprint), as result 
> point became negative and acts as silent poison for internal structures of 
> StreamingTombstoneHistogramBuilder like *DistanceHolder* and *DataHolder*. 
> Then depending of point intervals:
>  * The TombstoneHistogram produces wrong values when interval of points is 
> less then binSize, it is not critical.
>  * Compaction crashes with ArrayIndexOutOfBoundsException if amount of point 
> intervals is great then  binSize, this case is very critical.
>  
> This is pull request [https://github.com/apache/cassandra/pull/273] that 
> reproduces the issue and provides the fix. 
>  
> The stacktrace when running(on codebase without fix) 
> *testMathOverflowDuringRoundingOfLargeTimestamp* without -ea JVM flag
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException
> at java.lang.System.arraycopy(Native Method)
> at 
> org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder$DistanceHolder.add(StreamingTombstoneHistogramBuilder.java:208)
> at 
> org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder.flushValue(StreamingTombstoneHistogramBuilder.java:140)
> at 
> org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder$$Lambda$1/1967205423.consume(Unknown
>  Source)
> at 
> org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder$Spool.forEach(StreamingTombstoneHistogramBuilder.java:574)
> at 
> org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder.flushHistogram(StreamingTombstoneHistogramBuilder.java:124)
> at 
> org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder.build(StreamingTombstoneHistogramBuilder.java:184)
> at 
> org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilderTest.testMathOverflowDuringRoundingOfLargeTimestamp(StreamingTombstoneHistogramBuilderTest.java:183)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:44)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:180)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:41)
> at org.junit.runners.ParentRunner$1.evaluate(ParentRunner.java:173)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:220)
> at org.junit.runner.JUnitCore.run(JUnitCore.java:159)
> at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
> at 
> com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
> at 
> com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
> at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> {noformat}
>  
> The stacktrace when running(on codebase without fix)  
> 

[jira] [Updated] (CASSANDRA-15214) OOMs caught and not rethrown

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-15214:
-
Fix Version/s: (was: 4.0-rc)
   4.0-beta

> OOMs caught and not rethrown
> 
>
> Key: CASSANDRA-15214
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15214
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Client, Messaging/Internode
>Reporter: Benedict
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: oom-experiments.zip
>
>
> Netty (at least, and perhaps elsewhere in Executors) catches all exceptions, 
> so presently there is no way to ensure that an OOM reaches the JVM handler to 
> trigger a crash/heapdump.
> It may be that the simplest most consistent way to do this would be to have a 
> single thread spawned at startup that waits for any exceptions we must 
> propagate to the Runtime.
> We could probably submit a patch upstream to Netty, but for a guaranteed 
> future proof approach, it may be worth paying the cost of a single thread.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15214) OOMs caught and not rethrown

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-15214:
-
Fix Version/s: (was: 4.0-beta)
   4.0-rc

> OOMs caught and not rethrown
> 
>
> Key: CASSANDRA-15214
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15214
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Client, Messaging/Internode
>Reporter: Benedict
>Priority: Normal
> Fix For: 4.0-rc
>
> Attachments: oom-experiments.zip
>
>
> Netty (at least, and perhaps elsewhere in Executors) catches all exceptions, 
> so presently there is no way to ensure that an OOM reaches the JVM handler to 
> trigger a crash/heapdump.
> It may be that the simplest most consistent way to do this would be to have a 
> single thread spawned at startup that waits for any exceptions we must 
> propagate to the Runtime.
> We could probably submit a patch upstream to Netty, but for a guaranteed 
> future proof approach, it may be worth paying the cost of a single thread.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14888) Several mbeans are not unregistered when dropping a keyspace and table

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14888:
-
Fix Version/s: (was: 4.0.x)
   4.0-rc

> Several mbeans are not unregistered when dropping a keyspace and table
> --
>
> Key: CASSANDRA-14888
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14888
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Ariel Weisberg
>Assignee: Alex Deparvu
>Priority: Urgent
>  Labels: patch-available
> Fix For: 4.0-rc
>
> Attachments: CASSANDRA-14888.patch
>
>
> CasCommit, CasPrepare, CasPropose, ReadRepairRequests, 
> ShortReadProtectionRequests, AntiCompactionTime, BytesValidated, 
> PartitionsValidated, RepairPrepareTime, RepairSyncTime, 
> RepairedDataInconsistencies, ViewLockAcquireTime, ViewReadTime, 
> WriteFailedIdealCL
> Basically for 3 years people haven't known what they are doing because the 
> entire thing is kind of obscure. Fix it and also add a dtest that detects if 
> any mbeans are left behind after dropping a table and keyspace.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15214) OOMs caught and not rethrown

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-15214:
-
Fix Version/s: (was: 4.0)
   4.0-rc

> OOMs caught and not rethrown
> 
>
> Key: CASSANDRA-15214
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15214
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Client, Messaging/Internode
>Reporter: Benedict
>Priority: Normal
> Fix For: 4.0-rc
>
> Attachments: oom-experiments.zip
>
>
> Netty (at least, and perhaps elsewhere in Executors) catches all exceptions, 
> so presently there is no way to ensure that an OOM reaches the JVM handler to 
> trigger a crash/heapdump.
> It may be that the simplest most consistent way to do this would be to have a 
> single thread spawned at startup that waits for any exceptions we must 
> propagate to the Runtime.
> We could probably submit a patch upstream to Netty, but for a guaranteed 
> future proof approach, it may be worth paying the cost of a single thread.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15234) Standardise config and JVM parameters

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-15234:
-
Fix Version/s: 4.0-beta

> Standardise config and JVM parameters
> -
>
> Key: CASSANDRA-15234
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15234
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benedict
>Priority: Normal
> Fix For: 4.0-beta
>
>
> We have a bunch of inconsistent names and config patterns in the codebase, 
> both from the yams and JVM properties.  It would be nice to standardise the 
> naming (such as otc_ vs internode_) as well as the provision of values with 
> units - while maintaining perpetual backwards compatibility with the old 
> parameter names, of course.
> For temporal units, I would propose parsing strings with suffixes of:
> {{code}}
> u|micros(econds?)?
> s(econds?)?
> m(inutes?)?
> h(ours?)?
> d(ays?)?
> mo(nths?)?
> {{code}}
> For rate units, I would propose parsing any of the standard {{B/s, KiB/s, 
> MiB/s, GiB/s, TiB/s}}.
> Perhaps for avoiding ambiguity we could not accept bauds {{bs, Mbps}} or 
> powers of 1000 such as {{KB/s}}, given these are regularly used for either 
> their old or new definition e.g. {{KiB/s}}, or we could support them and 
> simply log the value in bytes/s.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15216) Cross node message creation times are disabled by default

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-15216:
-
Fix Version/s: (was: 4.0)
   4.0-alpha

> Cross node message creation times are disabled by default
> -
>
> Key: CASSANDRA-15216
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15216
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Benedict
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> This can cause a lot of wasted work for messages that have timed out on the 
> coordinator.  We should generally assume that our users have setup NTP on 
> their clusters, and that clocks are modestly in sync, since it’s a 
> requirement for general correctness of last write wins.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-14801) calculatePendingRanges no longer safe for multiple adjacent range movements

2019-08-30 Thread Benedict (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14801:
-
Fix Version/s: (was: 4.0)
   4.0-beta

> calculatePendingRanges no longer safe for multiple adjacent range movements
> ---
>
> Key: CASSANDRA-14801
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14801
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Coordination, Legacy/Distributed Metadata
>Reporter: Benedict
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Correctness depended upon the narrowing to a {{Set}}, 
> which we no longer do - we maintain a collection of all {{Replica}}.  Our 
> {{RangesAtEndpoint}} collection built by {{getPendingRanges}} can as a result 
> contain the same endpoint multiple times; and our {{EndpointsForToken}} 
> obtained by {{TokenMetadata.pendingEndpointsFor}} may fail to be constructed, 
> resulting in cluster-wide failures for writes to the affected token ranges 
> for the duration of the range movement.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org