[jira] [Commented] (CASSANDRA-14957) Rolling Restart Of Nodes Cause Dataloss Due To Schema Collision

2019-01-08 Thread Alex Petrov (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737948#comment-16737948
 ] 

Alex Petrov commented on CASSANDRA-14957:
-

[~via.vokal] was it the same version or did you upgrade between minor versions?

> Rolling Restart Of Nodes Cause Dataloss Due To Schema Collision
> ---
>
> Key: CASSANDRA-14957
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14957
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Schema
>Reporter: Avraham Kalvo
>Priority: Major
>
> We were issuing a rolling restart on a mission-critical five node C* cluster.
> The first node which was restarted got the following messages in its 
> system.log:
> ```
> January 2nd 2019, 12:06:37.310 - INFO 12:06:35 Initializing 
> tasks_scheduler_external.tasks
> ```
> ```
> WARN 12:06:39 UnknownColumnFamilyException reading from socket; closing
> org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find table for 
> cfId bd7200a0-1567-11e8-8974-855d74ee356f. If a table was just created, this 
> is likely due to the schema not being fully propagated. Please wait for 
> schema agreement on table creation.
> at 
> org.apache.cassandra.config.CFMetaData$Serializer.deserialize(CFMetaData.java:1336)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:660)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:635)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:330)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:349)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:286)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:98) 
> ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:201)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:178)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:92)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> ```
> The latter was then repeated several times across the cluster.
> It then turned out that the table in question, 
> `tasks_scheduler_external.tasks`, had been created with a new schema version 
> after the entire cluster was restarted consecutively and schema agreement 
> settled. The new version started taking requests, leaving the previous 
> version of the schema unavailable to any request and thus causing data loss 
> for our online system.
> The data was recovered by manually copying SSTables from the previous 
> schema-version directory to the new one, followed by a `nodetool refresh` on 
> the relevant table.
> The above repeated itself for several tables across various keyspaces.
> One other thing worth mentioning: a repair was running on the first node to 
> be restarted, which was of course stopped when the daemon was shut down, but 
> at first glance this doesn't seem related to the above.
> Seems somewhat related to:
> https://issues.apache.org/jira/browse/CASSANDRA-13559
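The manual recovery described in the report can be sketched as follows. This is a dry-run illustration only: the data path is a temporary directory and the SSTable files are fakes; on a real node the two directories would live under the keyspace's data directory and the final step would be an actual `nodetool refresh`.

```shell
#!/bin/sh
# Dry-run sketch: copy SSTable components from the orphaned (old cfId) table
# directory into the live one, then refresh. Paths and file names here are
# stand-ins, not the real node layout.
set -e
DATA_DIR=$(mktemp -d)   # stand-in for /var/lib/cassandra/data/tasks_scheduler_external
OLD_DIR="$DATA_DIR/tasks-bd7200a0156711e88974855d74ee356f"   # pre-restart cfId
NEW_DIR="$DATA_DIR/tasks-bd750de0156711e8bdc54f7bcdcb851f"   # post-restart cfId
mkdir -p "$OLD_DIR" "$NEW_DIR"
touch "$OLD_DIR/mc-1-big-Data.db" "$OLD_DIR/mc-1-big-Index.db"   # fake SSTables

# Copy every SSTable component from the orphaned directory into the live one.
cp "$OLD_DIR"/*.db "$NEW_DIR"/

# On the real node you would now load the copied SSTables:
#   nodetool refresh tasks_scheduler_external tasks
ls "$NEW_DIR"
```

After the refresh, the copied SSTables become visible to reads under the new schema version.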



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14957) Rolling Restart Of Nodes Cause Dataloss Due To Schema Collision

2019-01-08 Thread Avraham Kalvo (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737908#comment-16737908
 ] 

Avraham Kalvo commented on CASSANDRA-14957:
---

Thanks [~jeromatron],

Looking at the data directory for one of the keyspaces in question right now, 
even though a couple of full primary range repairs have completed across the 
cluster since the outage, the following is apparent:

```
36 drwxr-xr-x. 4 root root 28672 Jan  2 12:07 
tasks-bd7200a0156711e88974855d74ee356f
   8 drwxr-xr-x. 4 root root  4096 Jan  9 06:38 
tasks-bd750de0156711e8bdc54f7bcdcb851f
```

Data was not lost from disk, but it was no longer available for reads/writes 
via the database, i.e. effectively lost to the application.
As far as I know, anti-entropy actions don't take care of the above situation, 
and indeed it had to be recovered manually as described in the original 
comment for this issue.
Writes only began to succeed once schema agreement had settled across the 
whole cluster. Until then, the application was timing out on every request to 
Cassandra. 

What do you think?
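The point at which writes recover can be watched with `nodetool describecluster`, which lists one entry per schema version. A small sketch of detecting disagreement from that output follows; the sample text is fabricated for illustration, not captured from the affected cluster.

```shell
#!/bin/sh
# Count schema versions in (fabricated) `nodetool describecluster` output;
# more than one version line means the cluster has not reached agreement.
sample='Schema versions:
  bd7200a0-1567-11e8-8974-855d74ee356f: [10.0.0.1, 10.0.0.2]
  bd750de0-1567-11e8-bdc5-4f7bcdcb851f: [10.0.0.3]'

versions=$(printf '%s\n' "$sample" | grep -c ': \[')
if [ "$versions" -gt 1 ]; then
  echo "schema disagreement: $versions versions"
fi
```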







[jira] [Comment Edited] (CASSANDRA-14957) Rolling Restart Of Nodes Cause Dataloss Due To Schema Collision

2019-01-08 Thread Avraham Kalvo (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737908#comment-16737908
 ] 

Avraham Kalvo edited comment on CASSANDRA-14957 at 1/9/19 7:03 AM:
---

Thanks [~jeromatron],

Looking at the data directory for one of the keyspaces in question right now, 
even though a couple of full primary range repairs have completed across the 
cluster since the outage, the following is apparent:

```
36 drwxr-xr-x. 4 root root 28672 Jan  2 12:07 
tasks-bd7200a0156711e88974855d74ee356f
   8 drwxr-xr-x. 4 root root  4096 Jan  9 06:38 
tasks-bd750de0156711e8bdc54f7bcdcb851f
```
and with the following sizes:
```
$ du -sh tasks*
2.7G    tasks-bd7200a0156711e88974855d74ee356f
522M    tasks-bd750de0156711e8bdc54f7bcdcb851f
```

Data was not lost from disk, but it was no longer available for reads/writes 
via the database, i.e. effectively lost to the application.
As far as I know, anti-entropy actions don't take care of the above situation, 
and indeed it had to be recovered manually as described in the original 
comment for this issue.
Writes only began to succeed once schema agreement had settled across the 
whole cluster. Until then, the application was timing out on every request to 
Cassandra. 

What do you think?
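Directory pairs like the one listed above can be found mechanically: strip the 32-hex-digit cfId suffix from each table directory name and look for names that occur more than once. A sketch against a fabricated keyspace data directory (the real path would be the keyspace directory under the node's data directory):

```shell
#!/bin/sh
# Find table names that have more than one on-disk directory (i.e. more than
# one cfId), using a fabricated keyspace data directory as a stand-in.
set -e
KS_DIR=$(mktemp -d)
mkdir -p "$KS_DIR/tasks-bd7200a0156711e88974855d74ee356f" \
         "$KS_DIR/tasks-bd750de0156711e8bdc54f7bcdcb851f" \
         "$KS_DIR/other-00000000000000000000000000000000"

# Strip the trailing 32-hex-digit cfId and report names seen more than once.
dupes=$(ls "$KS_DIR" | sed 's/-[0-9a-f]\{32\}$//' | sort | uniq -d)
echo "$dupes"   # -> tasks
```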


was (Author: via.vokal):
Thanks [~jeromatron],

Looking at the data directory for one of the keyspaces in question right now, 
even though a couple of full primary range repairs have completed across the 
cluster since the outage, the following is apparent:

```
36 drwxr-xr-x. 4 root root 28672 Jan  2 12:07 
tasks-bd7200a0156711e88974855d74ee356f
   8 drwxr-xr-x. 4 root root  4096 Jan  9 06:38 
tasks-bd750de0156711e8bdc54f7bcdcb851f
```

Data was not lost from disk, but it was no longer available for reads/writes 
via the database, i.e. effectively lost to the application.
As far as I know, anti-entropy actions don't take care of the above situation, 
and indeed it had to be recovered manually as described in the original 
comment for this issue.
Writes only began to succeed once schema agreement had settled across the 
whole cluster. Until then, the application was timing out on every request to 
Cassandra. 

What do you think?

[jira] [Comment Edited] (CASSANDRA-14972) Provide a script or method to generate the entire website at the push of a button

2019-01-08 Thread Anthony Grasso (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737893#comment-16737893
 ] 

Anthony Grasso edited comment on CASSANDRA-14972 at 1/9/19 6:39 AM:


The steps involved to generate the entire website from scratch are

# SVN checkout - 
[https://svn.apache.org/repos/asf/cassandra/site|https://svn.apache.org/repos/asf/cassandra/site]
# Git clone Cassandra - 
[git@github.com:apache/cassandra.git|https://github.com/apache/cassandra]
# Set the environment variable {{CASSANDRA_DIR}} to point to the SVN Cassandra 
checkout location
# Install *Ruby 2.3.x* and *Make* for building the website
# Install *Java 1.8.x*, *Ant*, *Maven*, *Python 2.7.x*, and *Python Sphinx* for 
generating the Cassandra docs
# From the SVN Cassandra checkout install *Jekyll* - {{gem install bundler && 
bundle install}}
# From the SVN Cassandra checkout _src_ directory run {{make add-doc}} to make 
the docs for latest
# From the Git Cassandra checkout change to branch {{cassandra-3.11}} - {{git 
checkout cassandra-3.11}}
# From the SVN Cassandra checkout _src_ directory run {{make add-doc}} again to 
make the docs for 3.11
# From the SVN Cassandra checkout _src_ directory run {{make}} to build the 
website

Probably the easiest way to do this is to have a Docker container which 
installs all the tools required to build the docs and the site. Inside the 
container, have an entry-point script which performs the tasks to generate the 
docs and the site.

This is my first take on it. These files are to be placed in the 
_svn.apache.org/repos/asf/cassandra/site_ directory.
* [^docker-compose.yml]
* [^docker-entrypoint.sh]
* [^Dockerfile] 
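The steps above could be collected into an entry-point script along these lines. This is a sketch only: it prints the commands instead of executing them, tool installation (Ruby, Java, Sphinx, Jekyll, etc.) is assumed already done, and the exact meaning of `CASSANDRA_DIR` is an assumption taken from the list above.

```shell
#!/bin/sh
# Dry-run sketch of the website build steps listed above: commands are
# printed, not executed. Change run() to run "$@" to execute for real.
run() { echo "+ $*"; }

build_site() {
  run svn checkout https://svn.apache.org/repos/asf/cassandra/site cassandra-site
  run git clone https://github.com/apache/cassandra.git cassandra
  # CASSANDRA_DIR is assumed to point at the Cassandra source checkout.
  run export CASSANDRA_DIR="$PWD/cassandra"
  run make -C cassandra-site/src add-doc        # docs for latest
  run git -C cassandra checkout cassandra-3.11
  run make -C cassandra-site/src add-doc        # docs for 3.11
  run make -C cassandra-site/src                # build the website
}

build_site
```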



was (Author: anthony grasso):
The steps involved to generate the entire website from scratch are

# SVN checkout - 
[https://svn.apache.org/repos/asf/cassandra/site|https://svn.apache.org/repos/asf/cassandra/site]
# Git clone Cassandra - 
[git@github.com:apache/cassandra.git|https://github.com/apache/cassandra]
# Set the environment variable {{CASSANDRA_DIR}} to point to the SVN Cassandra 
checkout location
# Install *Ruby 2.3.x* and *Make* for building the website
# Install *Java 1.8.x*, *Python 2.7.x*, and *Python Sphinx* for generating the 
Cassandra docs
# From the SVN Cassandra checkout install *Jekyll* - {{gem install bundler && 
bundle install}}
# From the SVN Cassandra checkout _src_ directory run {{make add-doc}} to make 
the docs for latest
# From the Git Cassandra checkout change to branch {{cassandra-3.11}} - {{git 
checkout cassandra-3.11}}
# From the SVN Cassandra checkout _src_ directory run {{make add-doc}} again to 
make the docs for 3.11
# From the SVN Cassandra checkout _src_ directory run {{make}} to build the 
website

Probably the easiest way to do this is to have a Docker container which 
installs all the tools required to build the docs and the site. Inside the 
container, have an entry-point script which performs the tasks to generate the 
docs and the site.

This is my first take on it. These files are to be placed in the 
_svn.apache.org/repos/asf/cassandra/site_ directory.
* [^docker-compose.yml]
* [^docker-entrypoint.sh]
* [^Dockerfile] 


> Provide a script or method to generate the entire website at the push of a 
> button
> -
>
> Key: CASSANDRA-14972
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14972
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation/Website
>Reporter: Anthony Grasso
>Assignee: Anthony Grasso
>Priority: Major
> Attachments: Dockerfile, docker-compose.yml, docker-entrypoint.sh
>
>
> The process to generate the website involves two repositories (Git and SVN), 
> a range of tools, and a number of steps.
> It would be good to have a script or similar that we can run to generate the 
> entire website, ready to commit back into SVN for publication.






[jira] [Updated] (CASSANDRA-14972) Provide a script or method to generate the entire website at the push of a button

2019-01-08 Thread Anthony Grasso (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Grasso updated CASSANDRA-14972:
---
Attachment: Dockerfile
docker-entrypoint.sh
docker-compose.yml







[jira] [Commented] (CASSANDRA-14972) Provide a script or method to generate the entire website at the push of a button

2019-01-08 Thread Anthony Grasso (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737893#comment-16737893
 ] 

Anthony Grasso commented on CASSANDRA-14972:


The steps involved to generate the entire website from scratch are

# SVN checkout - 
[https://svn.apache.org/repos/asf/cassandra/site|https://svn.apache.org/repos/asf/cassandra/site]
# Git clone Cassandra - 
[git@github.com:apache/cassandra.git|https://github.com/apache/cassandra]
# Set the environment variable {{CASSANDRA_DIR}} to point to the SVN Cassandra 
checkout location
# Install *Ruby 2.3.x* and *Make* for building the website
# Install *Java 1.8.x*, *Python 2.7.x*, and *Python Sphinx* for generating the 
Cassandra docs
# From the SVN Cassandra checkout install *Jekyll* - {{gem install bundler && 
bundle install}}
# From the SVN Cassandra checkout _src_ directory run {{make add-doc}} to make 
the docs for latest
# From the Git Cassandra checkout change to branch {{cassandra-3.11}} - {{git 
checkout cassandra-3.11}}
# From the SVN Cassandra checkout _src_ directory run {{make add-doc}} again to 
make the docs for 3.11
# From the SVN Cassandra checkout _src_ directory run {{make}} to build the 
website

Probably the easiest way to do this is to have a Docker container which 
installs all the tools required to build the docs and the site. Inside the 
container, have an entry-point script which performs the tasks to generate the 
docs and the site.

This is my first take on it. These files are to be placed in the 
_svn.apache.org/repos/asf/cassandra/site_ directory.
* [^docker-compose.yml]
* [^docker-entrypoint.sh]
* [^Dockerfile] 








[jira] [Created] (CASSANDRA-14972) Provide a script or method to generate the entire website at the push of a button

2019-01-08 Thread Anthony Grasso (JIRA)
Anthony Grasso created CASSANDRA-14972:
--

 Summary: Provide a script or method to generate the entire website 
at the push of a button
 Key: CASSANDRA-14972
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14972
 Project: Cassandra
  Issue Type: Task
  Components: Documentation/Website
Reporter: Anthony Grasso
Assignee: Anthony Grasso


The process to generate the website involves two repositories (Git and SVN), a 
range of tools, and a number of steps.

It would be good to have a script or similar that we can run to generate the 
entire website, ready to commit back into SVN for publication.






[jira] [Commented] (CASSANDRA-14971) Website documentation search function returns broken links

2019-01-08 Thread Anthony Grasso (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737883#comment-16737883
 ] 

Anthony Grasso commented on CASSANDRA-14971:


Awesome! Thanks [~michaelsembwever].

> Website documentation search function returns broken links 
> ---
>
> Key: CASSANDRA-14971
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14971
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation/Website
>Reporter: Anthony Grasso
>Assignee: Anthony Grasso
>Priority: Major
> Attachments: CASSANDRA-14971_v01.patch
>
>
> The search bar on the main page of the [Cassandra 
> Documentation|http://cassandra.apache.org/doc/latest/] returns search 
> [results|http://cassandra.apache.org/doc/latest/search.html?q=cache_keywords=yes=default]
>  with broken links.
> When a link from a returned search result is clicked, the site returns a 404 
> with a message similar to this:
> {quote}The requested URL /doc/latest/tools/nodetool/nodetool.rst.html was not 
> found on this server.
> {quote}
> From the error, it appears that the links point to pages whose names end in 
> *.rst.html*. The links should point to pages that end in *.html*.






[jira] [Updated] (CASSANDRA-14971) Website documentation search function returns broken links

2019-01-08 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14971:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed as r1850821







svn commit: r1850821 - in /cassandra/site: publish/js/searchtools.js src/js/searchtools.js

2019-01-08 Thread mck
Author: mck
Date: Wed Jan  9 05:12:13 2019
New Revision: 1850821

URL: http://svn.apache.org/viewvc?rev=1850821=rev
Log:
Website documentation search function returns broken links

 patch by Anthony Grasso; reviewed by Mick Semb Wever for CASSANDRA-14971

Modified:
cassandra/site/publish/js/searchtools.js
cassandra/site/src/js/searchtools.js

Modified: cassandra/site/publish/js/searchtools.js
URL: 
http://svn.apache.org/viewvc/cassandra/site/publish/js/searchtools.js?rev=1850821=1850820=1850821=diff
==
--- cassandra/site/publish/js/searchtools.js (original)
+++ cassandra/site/publish/js/searchtools.js Wed Jan  9 05:12:13 2019
@@ -473,7 +473,7 @@ var Search = {
* search for object names
*/
   performObjectSearch : function(object, otherterms) {
-var filenames = this._index.filenames;
+var filenames = this._index.docnames;
 var objects = this._index.objects;
 var objnames = this._index.objnames;
 var titles = this._index.titles;
@@ -539,7 +539,7 @@ var Search = {
* search for full-text terms in the index
*/
   performTermsSearch : function(searchterms, excluded, terms, titleterms) {
-var filenames = this._index.filenames;
+var filenames = this._index.docnames;
 var titles = this._index.titles;
 
 var i, j, file;

Modified: cassandra/site/src/js/searchtools.js
URL: 
http://svn.apache.org/viewvc/cassandra/site/src/js/searchtools.js?rev=1850821=1850820=1850821=diff
==
--- cassandra/site/src/js/searchtools.js (original)
+++ cassandra/site/src/js/searchtools.js Wed Jan  9 05:12:13 2019
@@ -473,7 +473,7 @@ var Search = {
* search for object names
*/
   performObjectSearch : function(object, otherterms) {
-var filenames = this._index.filenames;
+var filenames = this._index.docnames;
 var objects = this._index.objects;
 var objnames = this._index.objnames;
 var titles = this._index.titles;
@@ -539,7 +539,7 @@ var Search = {
* search for full-text terms in the index
*/
   performTermsSearch : function(searchterms, excluded, terms, titleterms) {
-var filenames = this._index.filenames;
+var filenames = this._index.docnames;
 var titles = this._index.titles;
 
 var i, j, file;






[jira] [Comment Edited] (CASSANDRA-14971) Website documentation search function returns broken links

2019-01-08 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737845#comment-16737845
 ] 

mck edited comment on CASSANDRA-14971 at 1/9/19 5:09 AM:
-

Solid write-up, thanks [~Anthony Grasso].

Patch is +1 from me. Going to test it.


was (Author: michaelsembwever):
Solid write-up, thanks [~Anthony Grasso].

Patch is +1 from me. Going to test it, commit it, then update the website.







[jira] [Updated] (CASSANDRA-14971) Website documentation search function returns broken links

2019-01-08 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14971:

Reviewer: mck  (was: Mick Semb Wever)







[jira] [Commented] (CASSANDRA-14971) Website documentation search function returns broken links

2019-01-08 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737845#comment-16737845
 ] 

mck commented on CASSANDRA-14971:
-

Solid write-up, thanks [~Anthony Grasso].

Patch is +1 from me. Going to test it, commit it, then update the website.

> Website documentation search function returns broken links 
> ---
>
> Key: CASSANDRA-14971
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14971
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation/Website
>Reporter: Anthony Grasso
>Assignee: Anthony Grasso
>Priority: Major
> Attachments: CASSANDRA-14971_v01.patch
>
>
> The search bar on the main page of the [Cassandra 
> Documentation|http://cassandra.apache.org/doc/latest/] returns search 
> [results|http://cassandra.apache.org/doc/latest/search.html?q=cache&check_keywords=yes&area=default]
>  with broken links.
> When a link from a returned search is clicked, the site returns a 404 with 
> a message similar to this:
> {quote}The requested URL /doc/latest/tools/nodetool/nodetool.rst.html was not 
> found on this server.
> {quote}
> From the error, it appears that the links are pointing to pages that end in 
> *.rst.html* in their name. The links should point to pages that end in 
> *.html*.






[jira] [Comment Edited] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums

2019-01-08 Thread Michael Shuler (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737808#comment-16737808
 ] 

Michael Shuler edited comment on CASSANDRA-14970 at 1/9/19 3:35 AM:


Our current release process uploads/signs/checksums the tar.gz and maven 
artifacts to nexus via the 'publish' task, then we vote. After the vote, we 
download the tar.gz/.md5/.sha1 files for final release and promote the staging 
repo to release. Since the MD5 and SHA files are there in build.xml, I thought 
the patch for creating the .sha256/.sha512 checksums in the 'release' target 
was used for the release build. It is not. I gave another try at uploading the 
.sha256/.sha512 files, but realized we never build them due to the target 
dependencies, so I looked a little more.

I created ant target graphs for 2.1 and trunk to get an idea of the target 
relations. The release task I patched isn't depended on by anything, and 
currently is completely unused in our release process.
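For reference, the side-by-side checksum files that the patched 'release' target is meant to produce can be sketched in Python. This is illustrative only: the real build uses ant's checksum task, and the artifact name below is made up.

```python
import hashlib
from pathlib import Path

def write_checksums(artifact: Path) -> None:
    """Write .sha256/.sha512 files next to a release artifact, in the
    "<hex digest>  <filename>" format that `shasum -c` accepts."""
    data = artifact.read_bytes()
    for algo in ("sha256", "sha512"):
        digest = hashlib.new(algo, data).hexdigest()
        Path(f"{artifact}.{algo}").write_text(f"{digest}  {artifact.name}\n")

# Hypothetical artifact, just to exercise the helper:
tarball = Path("apache-cassandra-example-bin.tar.gz")
tarball.write_bytes(b"example release artifact")
write_checksums(tarball)
print(Path(f"{tarball}.sha256").read_text().split()[1])
# -> apache-cassandra-example-bin.tar.gz
```

The two-space separator matters: coreutils `sha256sum -c` and `shasum -c` both parse that layout when verifying downloads.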

build_cassandra-2.1.png
build_trunk.png

(edit: removed no-thumb images - they are attached..)


was (Author: mshuler):
Our current release process uploads/signs/checksums the tar.gz and maven 
artifacts to nexus, then we vote. After the vote, we download the 
tar.gz/.md5/.sha1 files for final release and promote the staging repo to 
release. Since the MD5 and SHA files are there in build.xml, I thought the 
patch for creating the .sha256/.sha512 checksums in the 'release' target was 
used for the release build. It is not. I gave another try at uploading the 
.sha256/.sha512 files, but realized we never build them due to the target 
dependencies, so I looked a little more.

I created ant target graphs for 2.1 and trunk to get an idea of the target 
relations. The release task I patched isn't depended on by anything, and 
currently is completely unused in our release process.

build_cassandra-2.1.png
build_trunk.png

(edit: removed no-thumb images - they are attached..)

> New releases must supply SHA-256 and/or SHA-512 checksums
> -
>
> Key: CASSANDRA-14970
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14970
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Michael Shuler
>Assignee: Michael Shuler
>Priority: Blocker
> Fix For: 2.1.21, 2.2.14, 3.0.18, 3.11.4, 4.0
>
> Attachments: 
> 0001-Update-downloads-for-sha256-sha512-checksum-files.patch, 
> 0001-Update-release-checksum-algorithms-to-SHA-256-SHA-512.patch, 
> ant-publish-checksum-fail.jpg, build_cassandra-2.1.png, build_trunk.png
>
>
> Release policy was updated around 9/2018 to state:
> "For new releases, PMCs MUST supply SHA-256 and/or SHA-512; and SHOULD NOT 
> supply MD5 or SHA-1. Existing releases do not need to be changed."
> build.xml needs to be updated from MD5 & SHA-1 to, at least, SHA-256 or both. 
> cassandra-builds/cassandra-release scripts need to be updated to work with 
> the new checksum files.
> http://www.apache.org/dev/release-distribution#sigs-and-sums






[jira] [Updated] (CASSANDRA-14365) Commit log replay failure for static columns with collections in clustering keys

2019-01-08 Thread Vincent White (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent White updated CASSANDRA-14365:
--
Description: 
In the old storage engine, static cells with a collection as part of the 
clustering key fail to validate because a 0 byte collection (like in the cell 
name of a static cell) isn't valid.

To reproduce:

1.
{code:java}
CREATE TABLE test.x (
id int,
id2 frozen<set<int>>,
st int static,
PRIMARY KEY (id, id2)
);

INSERT INTO test.x (id, st) VALUES (1, 2);
{code}
2.
 Kill the cassandra process

3.
 Restart cassandra to replay the commitlog

Outcome:
{noformat}
ERROR [main] 2018-04-05 04:58:23,741 JVMStabilityInspector.java:99 - Exiting 
due to error while processing commit log during initialization.
org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: 
Unexpected error deserializing mutation; saved to 
/tmp/mutation3825739904516830950dat.  This may be caused by replaying a 
mutation against a table with the same name but incompatible schema.  Exception 
follows: org.apache.cassandra.serializers.MarshalException: Not enough bytes to 
read a set
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:638)
 [main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.replayMutation(CommitLogReplayer.java:565)
 [main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.replaySyncSection(CommitLogReplayer.java:517)
 [main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:397)
 [main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:143)
 [main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:181) 
[main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:161) 
[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:284) 
[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:533) 
[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:642) 
[main/:na]


{noformat}
I haven't investigated if there are other more subtle issues caused by these 
cells failing to validate other places in the code, but I believe the fix for 
this is to check for 0 byte length collections and accept them as valid as we 
do with other types.

I haven't had a chance for any extensive testing, but this naive patch seems to 
have the desired effect.


||Patch||
|[2.2 PoC 
Patch|https://github.com/vincewhite/cassandra/commits/zero_length_collection]|
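The proposed check can be modeled with a toy Python deserializer (Cassandra's actual code is Java; the layout below, a 4-byte big-endian element count followed by length-prefixed elements, is assumed for illustration):

```python
import struct

def validate_set_bytes(buf: bytes) -> None:
    """Toy validation of serialized set bytes. The proposed fix treats a
    zero-length buffer (as found in a static cell name) as valid instead
    of raising "Not enough bytes to read a set"."""
    if len(buf) == 0:
        return  # proposed: accept 0-byte collections as valid
    if len(buf) < 4:
        raise ValueError("Not enough bytes to read a set")
    (count,) = struct.unpack(">i", buf[:4])
    offset = 4
    for _ in range(count):
        if len(buf) < offset + 4:
            raise ValueError("Not enough bytes to read a set")
        (elen,) = struct.unpack(">i", buf[offset:offset + 4])
        offset += 4 + elen
    if offset > len(buf):
        raise ValueError("Not enough bytes to read a set")

validate_set_bytes(b"")                   # passes under the proposed rule
validate_set_bytes(struct.pack(">i", 0))  # an explicit empty set also passes
```

Without the early return, the empty buffer would fail the minimum-length check, which mirrors the MarshalException seen during replay.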


  was:
In the old storage engine, static cells with a collection as part of the 
clustering key fail to validate because a 0 byte collection (like in the cell 
name of a static cell) isn't valid.

To reproduce:

1.
{code:java}
CREATE TABLE test.x (
id int,
id2 frozen<set<int>>,
st int static,
PRIMARY KEY (id, id2)
);

INSERT INTO test.x (id, st) VALUES (1, 2);
{code}
2.
 Kill the cassandra process

3.
 Restart cassandra to replay the commitlog

Outcome:
{noformat}
ERROR [main] 2018-04-05 04:58:23,741 JVMStabilityInspector.java:99 - Exiting 
due to error while processing commit log during initialization.
org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: 
Unexpected error deserializing mutation; saved to 
/tmp/mutation3825739904516830950dat.  This may be caused by replaying a 
mutation against a table with the same name but incompatible schema.  Exception 
follows: org.apache.cassandra.serializers.MarshalException: Not enough bytes to 
read a set
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:638)
 [main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.replayMutation(CommitLogReplayer.java:565)
 [main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.replaySyncSection(CommitLogReplayer.java:517)
 [main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:397)
 [main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:143)
 [main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:181) 
[main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:161) 
[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:284) 
[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:533) 
[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:642) 
[main/:na]


{noformat}
I haven't investigated if there are other more subtle issues caused by these 
cells failing to validate other places in the 

[jira] [Comment Edited] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums

2019-01-08 Thread Michael Shuler (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737808#comment-16737808
 ] 

Michael Shuler edited comment on CASSANDRA-14970 at 1/9/19 3:28 AM:


Our current release process uploads/signs/checksums the tar.gz and maven 
artifacts to nexus, then we vote. After the vote, we download the 
tar.gz/.md5/.sha1 files for final release and promote the staging repo to 
release. Since the MD5 and SHA files are there in build.xml, I thought the 
patch for creating the .sha256/.sha512 checksums in the 'release' target was 
used for the release build. It is not. I gave another try at uploading the 
.sha256/.sha512 files, but realized we never build them due to the target 
dependencies, so I looked a little more.

I created ant target graphs for 2.1 and trunk to get an idea of the target 
relations. The release task I patched isn't depended on by anything, and 
currently is completely unused in our release process.

build_cassandra-2.1.png
build_trunk.png

(edit: removed no-thumb images - they are attached..)


was (Author: mshuler):
Our current release process uploads/signs/checksums the tar.gz and maven 
artifacts to nexus, then we vote. After the vote, we download the 
tar.gz/.md5/.sha1 files for final release and promote the staging repo to 
release. Since the MD5 and SHA files are there in build.xml, I thought the 
patch for creating the .sha256/.sha512 checksums in the 'release' target was 
used for the release build. It is not. I gave another try at uploading the 
.sha256/.sha512 files, but realized we never build them due to the target 
dependencies, so I looked a little more.

I created ant target graphs for 2.1 and trunk to get an idea of the target 
relations. The release task I patched isn't depended on by anything, and 
currently is completely unused in our release process.

!build_cassandra-2.1.png! 
 !build_trunk.png!

> New releases must supply SHA-256 and/or SHA-512 checksums
> -
>
> Key: CASSANDRA-14970
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14970
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Michael Shuler
>Assignee: Michael Shuler
>Priority: Blocker
> Fix For: 2.1.21, 2.2.14, 3.0.18, 3.11.4, 4.0
>
> Attachments: 
> 0001-Update-downloads-for-sha256-sha512-checksum-files.patch, 
> 0001-Update-release-checksum-algorithms-to-SHA-256-SHA-512.patch, 
> ant-publish-checksum-fail.jpg, build_cassandra-2.1.png, build_trunk.png
>
>
> Release policy was updated around 9/2018 to state:
> "For new releases, PMCs MUST supply SHA-256 and/or SHA-512; and SHOULD NOT 
> supply MD5 or SHA-1. Existing releases do not need to be changed."
> build.xml needs to be updated from MD5 & SHA-1 to, at least, SHA-256 or both. 
> cassandra-builds/cassandra-release scripts need to be updated to work with 
> the new checksum files.
> http://www.apache.org/dev/release-distribution#sigs-and-sums






[jira] [Updated] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums

2019-01-08 Thread Michael Shuler (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Shuler updated CASSANDRA-14970:
---
Attachment: build_cassandra-2.1.png

> New releases must supply SHA-256 and/or SHA-512 checksums
> -
>
> Key: CASSANDRA-14970
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14970
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Michael Shuler
>Assignee: Michael Shuler
>Priority: Blocker
> Fix For: 2.1.21, 2.2.14, 3.0.18, 3.11.4, 4.0
>
> Attachments: 
> 0001-Update-downloads-for-sha256-sha512-checksum-files.patch, 
> 0001-Update-release-checksum-algorithms-to-SHA-256-SHA-512.patch, 
> ant-publish-checksum-fail.jpg, build_cassandra-2.1.png, build_trunk.png
>
>
> Release policy was updated around 9/2018 to state:
> "For new releases, PMCs MUST supply SHA-256 and/or SHA-512; and SHOULD NOT 
> supply MD5 or SHA-1. Existing releases do not need to be changed."
> build.xml needs to be updated from MD5 & SHA-1 to, at least, SHA-256 or both. 
> cassandra-builds/cassandra-release scripts need to be updated to work with 
> the new checksum files.
> http://www.apache.org/dev/release-distribution#sigs-and-sums






[jira] [Commented] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums

2019-01-08 Thread Michael Shuler (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737808#comment-16737808
 ] 

Michael Shuler commented on CASSANDRA-14970:


Our current release process uploads/signs/checksums the tar.gz and maven 
artifacts to nexus, then we vote. After the vote, we download the 
tar.gz/.md5/.sha1 files for final release and promote the staging repo to 
release. Since the MD5 and SHA files are there in build.xml, I thought the 
patch for creating the .sha256/.sha512 checksums in the 'release' target was 
used for the release build. It is not. I gave another try at uploading the 
.sha256/.sha512 files, but realized we never build them due to the target 
dependencies, so I looked a little more.

I created ant target graphs for 2.1 and trunk to get an idea of the target 
relations. The release task I patched isn't depended on by anything, and 
currently is completely unused in our release process.

!build_cassandra-2.1.png! 
 !build_trunk.png!

> New releases must supply SHA-256 and/or SHA-512 checksums
> -
>
> Key: CASSANDRA-14970
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14970
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Michael Shuler
>Assignee: Michael Shuler
>Priority: Blocker
> Fix For: 2.1.21, 2.2.14, 3.0.18, 3.11.4, 4.0
>
> Attachments: 
> 0001-Update-downloads-for-sha256-sha512-checksum-files.patch, 
> 0001-Update-release-checksum-algorithms-to-SHA-256-SHA-512.patch, 
> ant-publish-checksum-fail.jpg, build_cassandra-2.1.png, build_trunk.png
>
>
> Release policy was updated around 9/2018 to state:
> "For new releases, PMCs MUST supply SHA-256 and/or SHA-512; and SHOULD NOT 
> supply MD5 or SHA-1. Existing releases do not need to be changed."
> build.xml needs to be updated from MD5 & SHA-1 to, at least, SHA-256 or both. 
> cassandra-builds/cassandra-release scripts need to be updated to work with 
> the new checksum files.
> http://www.apache.org/dev/release-distribution#sigs-and-sums






[jira] [Updated] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums

2019-01-08 Thread Michael Shuler (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Shuler updated CASSANDRA-14970:
---
Attachment: build_trunk.png

> New releases must supply SHA-256 and/or SHA-512 checksums
> -
>
> Key: CASSANDRA-14970
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14970
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Michael Shuler
>Assignee: Michael Shuler
>Priority: Blocker
> Fix For: 2.1.21, 2.2.14, 3.0.18, 3.11.4, 4.0
>
> Attachments: 
> 0001-Update-downloads-for-sha256-sha512-checksum-files.patch, 
> 0001-Update-release-checksum-algorithms-to-SHA-256-SHA-512.patch, 
> ant-publish-checksum-fail.jpg, build_cassandra-2.1.png, build_trunk.png
>
>
> Release policy was updated around 9/2018 to state:
> "For new releases, PMCs MUST supply SHA-256 and/or SHA-512; and SHOULD NOT 
> supply MD5 or SHA-1. Existing releases do not need to be changed."
> build.xml needs to be updated from MD5 & SHA-1 to, at least, SHA-256 or both. 
> cassandra-builds/cassandra-release scripts need to be updated to work with 
> the new checksum files.
> http://www.apache.org/dev/release-distribution#sigs-and-sums






[jira] [Updated] (CASSANDRA-14365) Commit log replay failure for static columns with collections in clustering keys

2019-01-08 Thread Vincent White (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent White updated CASSANDRA-14365:
--
Description: 
In the old storage engine, static cells with a collection as part of the 
clustering key fail to validate because a 0 byte collection (like in the cell 
name of a static cell) isn't valid.

To reproduce:

1.
{code:java}
CREATE TABLE test.x (
id int,
id2 frozen<set<int>>,
st int static,
PRIMARY KEY (id, id2)
);

INSERT INTO test.x (id, st) VALUES (1, 2);
{code}
2.
 Kill the cassandra process

3.
 Restart cassandra to replay the commitlog

Outcome:
{noformat}
ERROR [main] 2018-04-05 04:58:23,741 JVMStabilityInspector.java:99 - Exiting 
due to error while processing commit log during initialization.
org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: 
Unexpected error deserializing mutation; saved to 
/tmp/mutation3825739904516830950dat.  This may be caused by replaying a 
mutation against a table with the same name but incompatible schema.  Exception 
follows: org.apache.cassandra.serializers.MarshalException: Not enough bytes to 
read a set
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:638)
 [main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.replayMutation(CommitLogReplayer.java:565)
 [main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.replaySyncSection(CommitLogReplayer.java:517)
 [main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:397)
 [main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:143)
 [main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:181) 
[main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:161) 
[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:284) 
[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:533) 
[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:642) 
[main/:na]


{noformat}
I haven't investigated if there are other more subtle issues caused by these 
cells failing to validate other places in the code, but I believe the fix for 
this is to check for 0 byte length collections and accept them as valid as we 
do with other types.

I haven't had a chance for any extensive testing, but this naive patch seems to 
have the desired effect.


||Patch||
|[2.2 
PoC|https://github.com/vincewhite/cassandra/commits/zero_length_collection]|


  was:
In the old storage engine, static cells with a collection as part of the 
clustering key fail to validate because a 0 byte collection (like in the cell 
name of a static cell) isn't valid.

To reproduce:

1.
{code:java}
CREATE TABLE test.x (
id int,
id2 frozen<set<int>>,
st int static,
PRIMARY KEY (id, id2)
);

INSERT INTO test.x (id, st) VALUES (1, 2);
{code}
2.
 Kill the cassandra process

3.
 Restart cassandra to replay the commitlog

Outcome:
{noformat}
ERROR [main] 2018-04-05 04:58:23,741 JVMStabilityInspector.java:99 - Exiting 
due to error while processing commit log during initialization.
org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: 
Unexpected error deserializing mutation; saved to 
/tmp/mutation3825739904516830950dat.  This may be caused by replaying a 
mutation against a table with the same name but incompatible schema.  Exception 
follows: org.apache.cassandra.serializers.MarshalException: Not enough bytes to 
read a set
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:638)
 [main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.replayMutation(CommitLogReplayer.java:565)
 [main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.replaySyncSection(CommitLogReplayer.java:517)
 [main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:397)
 [main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:143)
 [main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:181) 
[main/:na]
at 
org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:161) 
[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:284) 
[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:533) 
[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:642) 
[main/:na]


{noformat}
I haven't investigated if there are other more subtle issues caused by these 
cells failing to validate other places in the code, 

[jira] [Comment Edited] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums

2019-01-08 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737732#comment-16737732
 ] 

mck edited comment on CASSANDRA-14970 at 1/9/19 1:36 AM:
-

[~mshuler] the asf guidelines applies strictly to the distributed convenience 
binary artefacts. The asf maven repository doesn't support it yet, that is the 
nexus repo only keeps sha1 on the jarfiles.


was (Author: michaelsembwever):
[~mshuler] the asf guidelines applies strictly to the distributed convenience 
binary artefacts. The asf maven repository doesn't support it yet, hat is the 
nexus repo only keeps sha1 on the jarfiles.

> New releases must supply SHA-256 and/or SHA-512 checksums
> -
>
> Key: CASSANDRA-14970
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14970
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Michael Shuler
>Assignee: Michael Shuler
>Priority: Blocker
> Fix For: 2.1.21, 2.2.14, 3.0.18, 3.11.4, 4.0
>
> Attachments: 
> 0001-Update-downloads-for-sha256-sha512-checksum-files.patch, 
> 0001-Update-release-checksum-algorithms-to-SHA-256-SHA-512.patch, 
> ant-publish-checksum-fail.jpg
>
>
> Release policy was updated around 9/2018 to state:
> "For new releases, PMCs MUST supply SHA-256 and/or SHA-512; and SHOULD NOT 
> supply MD5 or SHA-1. Existing releases do not need to be changed."
> build.xml needs to be updated from MD5 & SHA-1 to, at least, SHA-256 or both. 
> cassandra-builds/cassandra-release scripts need to be updated to work with 
> the new checksum files.
> http://www.apache.org/dev/release-distribution#sigs-and-sums






[jira] [Updated] (CASSANDRA-14971) Website documentation search function returns broken links

2019-01-08 Thread Anthony Grasso (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Grasso updated CASSANDRA-14971:
---
  Reviewer: Mick Semb Wever
Attachment: CASSANDRA-14971_v01.patch
Status: Patch Available  (was: In Progress)

Attached {{svn diff}} patch

> Website documentation search function returns broken links 
> ---
>
> Key: CASSANDRA-14971
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14971
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation/Website
>Reporter: Anthony Grasso
>Assignee: Anthony Grasso
>Priority: Major
> Attachments: CASSANDRA-14971_v01.patch
>
>
> The search bar on the main page of the [Cassandra 
> Documentation|http://cassandra.apache.org/doc/latest/] returns search 
> [results|http://cassandra.apache.org/doc/latest/search.html?q=cache&check_keywords=yes&area=default]
>  with broken links.
> When a link from a returned search is clicked, the site returns a 404 with 
> a message similar to this:
> {quote}The requested URL /doc/latest/tools/nodetool/nodetool.rst.html was not 
> found on this server.
> {quote}
> From the error, it appears that the links are pointing to pages that end in 
> *.rst.html* in their name. The links should point to pages that end in 
> *.html*.






[jira] [Commented] (CASSANDRA-14971) Website documentation search function returns broken links

2019-01-08 Thread Anthony Grasso (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737725#comment-16737725
 ] 

Anthony Grasso commented on CASSANDRA-14971:


It looks like the search results are pieced together by the 
[searchtools.js|https://svn.apache.org/repos/asf/cassandra/site/src/js/searchtools.js]
 file that lives in the _js_ directory in the SVN 
[repository|https://svn.apache.org/repos/asf/cassandra/site]. Specifically the 
{{displayNextItem()}} function walks through the returned results and generates 
the HTML output. This function generates the filenames using the data in the 
returned results.

The search results are generated by the {{performObjectSearch}} and 
{{performTermsSearch}} functions. These functions obtain the file information 
from the search index. In this case, it is the search index file 
([searchindex.js|https://svn.apache.org/repos/asf/cassandra/site/src/doc/4.0/searchindex.js]),
 which is generated by Sphinx.

It appears that we are referencing the documents in the {{filenames}} list 
property of the search index. These documents contain the *.rst* extension. We 
should probably be referencing the documents in the {{docnames}} list property 
of the search index.
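The difference can be illustrated with a toy Python model of the two index properties (the entries below are invented for illustration, not the real searchindex.js contents):

```python
# Minimal stand-in for the Sphinx search index: `filenames` keep the
# source extension, `docnames` are extension-free document paths.
searchindex = {
    "filenames": ["tools/nodetool/nodetool.rst", "operating/caching.rst"],
    "docnames":  ["tools/nodetool/nodetool",     "operating/caching"],
}

def broken_link(i: int) -> str:
    # What the search page effectively did: filename (".rst" and all) + ".html"
    return "/doc/latest/" + searchindex["filenames"][i] + ".html"

def fixed_link(i: int) -> str:
    # Proposed fix: build the link from docnames, then append ".html"
    return "/doc/latest/" + searchindex["docnames"][i] + ".html"

print(broken_link(0))  # /doc/latest/tools/nodetool/nodetool.rst.html -> 404
print(fixed_link(0))   # /doc/latest/tools/nodetool/nodetool.html
```

This reproduces exactly the *.rst.html* URLs reported in the bug, and shows why switching the lookup to {{docnames}} yields working *.html* links.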

> Website documentation search function returns broken links 
> ---
>
> Key: CASSANDRA-14971
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14971
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation/Website
>Reporter: Anthony Grasso
>Assignee: Anthony Grasso
>Priority: Major
>
> The search bar on the main page of the [Cassandra 
> Documentation|http://cassandra.apache.org/doc/latest/] returns search 
> [results|http://cassandra.apache.org/doc/latest/search.html?q=cache&check_keywords=yes&area=default]
>  with broken links.
> When a link from a returned search is clicked, the site returns a 404 with 
> a message similar to this:
> {quote}The requested URL /doc/latest/tools/nodetool/nodetool.rst.html was not 
> found on this server.
> {quote}
> From the error, it appears that the links are pointing to pages that end in 
> *.rst.html* in their name. The links should point to pages that end in 
> *.html*.






[jira] [Comment Edited] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums

2019-01-08 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737732#comment-16737732
 ] 

mck edited comment on CASSANDRA-14970 at 1/9/19 1:38 AM:
-

[~mshuler] the asf guidelines applies strictly to the distributed convenience 
binary artefacts. The asf maven repository doesn't support it yet, that is the 
nexus repo only keeps sha1 on the jarfiles. (No asf project is using 
sha-256/512 on maven distributables afaik)


was (Author: michaelsembwever):
[~mshuler] the asf guidelines applies strictly to the distributed convenience 
binary artefacts. The asf maven repository doesn't support it yet, that is the 
nexus repo only keeps sha1 on the jarfiles. (No asf project is using sha-25/512 
on maven distributables afaik)

> New releases must supply SHA-256 and/or SHA-512 checksums
> -
>
> Key: CASSANDRA-14970
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14970
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Michael Shuler
>Assignee: Michael Shuler
>Priority: Blocker
> Fix For: 2.1.21, 2.2.14, 3.0.18, 3.11.4, 4.0
>
> Attachments: 
> 0001-Update-downloads-for-sha256-sha512-checksum-files.patch, 
> 0001-Update-release-checksum-algorithms-to-SHA-256-SHA-512.patch, 
> ant-publish-checksum-fail.jpg
>
>
> Release policy was updated around 9/2018 to state:
> "For new releases, PMCs MUST supply SHA-256 and/or SHA-512; and SHOULD NOT 
> supply MD5 or SHA-1. Existing releases do not need to be changed."
> build.xml needs to be updated from MD5 & SHA-1 to, at least, SHA-256 or both. 
> cassandra-builds/cassandra-release scripts need to be updated to work with 
> the new checksum files.
> http://www.apache.org/dev/release-distribution#sigs-and-sums






[jira] [Comment Edited] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums

2019-01-08 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737732#comment-16737732
 ] 

mck edited comment on CASSANDRA-14970 at 1/9/19 1:38 AM:
-

[~mshuler] the asf guidelines applies strictly to the distributed convenience 
binary artefacts. The asf maven repository doesn't support it yet, that is the 
nexus repo only keeps sha1 on the jarfiles. (No asf project is using sha-25/512 
on maven distributables afaik)


was (Author: michaelsembwever):
[~mshuler] the asf guidelines applies strictly to the distributed convenience 
binary artefacts. The asf maven repository doesn't support it yet, that is the 
nexus repo only keeps sha1 on the jarfiles.







[jira] [Commented] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums

2019-01-08 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737732#comment-16737732
 ] 

mck commented on CASSANDRA-14970:
-

[~mshuler] the ASF guidelines apply strictly to the distributed convenience 
binary artefacts. The ASF Maven repository doesn't support it yet; that is, the 
Nexus repo only keeps SHA-1 on the jar files.







[jira] [Commented] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown

2019-01-08 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737726#comment-16737726
 ] 

Joseph Lynch commented on CASSANDRA-14922:
--

{quote}
Sure thing. I'll start the rebase tomorrow in that case. In that case, also, 
I've pushed my one nit from a quick look through here for Alex to look at, that 
I would have simply ninja'd in (with comment here, of course). This is just 
using the HintsBuffer.free method instead of directly invoking 
DirectByteBuffer.cleaner().clean().
{quote}
Ah cool, yeah that appears to still work (and then we can leave the slab private 
in {{HintsBuffer}} as well).

> In JVM dtests need to clean up after instance shutdown
> --
>
> Key: CASSANDRA-14922
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14922
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Minor
> Fix For: 4.0
>
> Attachments: AllThreadsStopped.png, ClassLoadersRetaining.png, 
> Leaking_Metrics_On_Shutdown.png, MainClassRetaining.png, 
> MemoryReclaimedFix.png, Metaspace_Actually_Collected.png, 
> OnlyThreeRootsLeft.png, no_more_references.png
>
>
> Currently the unit tests are failing on circleci ([example 
> one|https://circleci.com/gh/jolynch/cassandra/300#tests/containers/1], 
> [example 
> two|https://circleci.com/gh/rustyrazorblade/cassandra/44#tests/containers/1]) 
> because we use a small container (medium) for unit tests by default and the 
> in JVM dtests are leaking a few hundred megabytes of memory per test right 
> now. This is not a big deal because the dtest runs with the larger containers 
> continue to function fine as well as local testing as the number of in JVM 
> dtests is not yet high enough to cause a problem with more than 2GB of 
> available heap. However we should fix the memory leak so that going forwards 
> we can add more in JVM dtests without worry.
> I've been working with [~ifesdjeen] to debug, and the issue appears to be 
> unreleased Table/Keyspace metrics (screenshot showing the leak attached). I 
> believe that we have a few potential issues that are leading to the leaks:
> 1. The 
> [{{Instance::shutdown}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/Instance.java#L328-L354]
>  method is not successfully cleaning up all the metrics created by the 
> {{CassandraMetricsRegistry}}
>  2. The 
> [{{TestCluster::close}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/TestCluster.java#L283]
>  method is not waiting for all the instances to finish shutting down and 
> cleaning up before continuing on
> 3. I'm not sure if this is an issue assuming we clear all metrics, but 
> [{{TableMetrics::release}}|https://github.com/apache/cassandra/blob/4ae229f5cd270c2b43475b3f752a7b228de260ea/src/java/org/apache/cassandra/metrics/TableMetrics.java#L951]
>  does not release all the metric references (which could leak them)
> I am working on a patch which shuts down everything and assures that we do 
> not leak memory.
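Point 1 above (metrics surviving instance shutdown) can be illustrated with a hedged, Cassandra-free sketch; the class and method names here are ours, not Cassandra's. A process-wide registry map keeps per-table metric objects reachable until shutdown explicitly removes the instance's entries:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the leak described above: a process-wide registry
// (standing in for CassandraMetricsRegistry) holds per-table metric objects,
// so an in-JVM instance that shuts down without removing its entries leaks.
public class MetricsCleanupSketch {
    static final Map<String, Object> REGISTRY = new ConcurrentHashMap<>();

    static void register(String name) {
        REGISTRY.put(name, new Object()); // stand-in for a real metric
    }

    // On instance shutdown, drop every metric whose name carries this
    // instance's prefix, making the metric objects collectable.
    static void releaseInstanceMetrics(String instancePrefix) {
        REGISTRY.keySet().removeIf(name -> name.startsWith(instancePrefix));
    }

    public static void main(String[] args) {
        register("node1.ks.tbl.ReadLatency");
        register("node1.ks.tbl.WriteLatency");
        register("node2.ks.tbl.ReadLatency");
        releaseInstanceMetrics("node1.");
        System.out.println(REGISTRY.size()); // prints 1
    }
}
```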






[jira] [Comment Edited] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown

2019-01-08 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737726#comment-16737726
 ] 

Joseph Lynch edited comment on CASSANDRA-14922 at 1/9/19 1:30 AM:
--

{quote}
Sure thing. I'll start the rebase tomorrow in that case. In that case, also, 
I've pushed my one nit from a quick look through here for Alex to look at, that 
I would have simply ninja'd in (with comment here, of course). This is just 
using the HintsBuffer.free method instead of directly invoking 
DirectByteBuffer.cleaner().clean().
{quote}
Ah cool, yeah that appears to still work (and then we can leave the slab private 
in {{HintsBuffer}} as well).


was (Author: jolynch):
{quote}
Sure thing. I'll start the rebase tomorrow in that case. In that case, also, 
I've pushed my one nit from a quick look through here for Alex to look at, that 
I would have simply ninja'd in (with comment here, of course). This is just 
using the HintsBuffer.free method instead of directly invoking 
DirectByteBuffer.cleaner().clean().
{quote}
Ah cool, yea that appears to still work (and then we can leave the slab private 
in {{HintsBuffer}} as well.







[jira] [Created] (CASSANDRA-14971) Website documentation search function returns broken links

2019-01-08 Thread Anthony Grasso (JIRA)
Anthony Grasso created CASSANDRA-14971:
--

 Summary: Website documentation search function returns broken 
links 
 Key: CASSANDRA-14971
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14971
 Project: Cassandra
  Issue Type: Bug
  Components: Documentation/Website
Reporter: Anthony Grasso
Assignee: Anthony Grasso


The search bar on the main page of the [Cassandra 
Documentation|http://cassandra.apache.org/doc/latest/] returns search 
[results|http://cassandra.apache.org/doc/latest/search.html?q=cache_keywords=yes=default]
 with broken links.

When a link from a returned search is clicked, the site returns a 404 with the 
message similar to this:
{quote}The requested URL /doc/latest/tools/nodetool/nodetool.rst.html was not 
found on this server.
{quote}
From the error, it appears that the links are pointing to pages whose names end 
in *.rst.html*. The links should point to pages that end in *.html*.
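One plausible (entirely hypothetical) post-build fix is rewriting the generated links, e.g. stripping the stray *.rst* suffix in the search index; the directory and file names below are made up for illustration:

```shell
# Hypothetical sketch: rewrite ".rst.html" links to ".html" in a built site.
# The directory and file names are stand-ins, not the real site layout.
mkdir -p site
printf 'href="tools/nodetool/nodetool.rst.html"' > site/searchindex.js

# Strip the stray ".rst" from every generated link in place (GNU sed)
sed -i 's/\.rst\.html/.html/g' site/searchindex.js

cat site/searchindex.js   # link now ends in nodetool.html
```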






[jira] [Commented] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown

2019-01-08 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737711#comment-16737711
 ] 

Benedict commented on CASSANDRA-14922:
--

bq.  can we wait for Alex to see the latest diff though... I've changed the 
patch a bit since he last looked.

Sure thing.  I'll start the rebase tomorrow in that case.  In that case, also, 
I've pushed my one nit from a quick look through 
[here|https://github.com/belliottsmith/cassandra/tree/14922] for Alex to look 
at, that I would have simply ninja'd in (with comment here, of course).  This 
is just using the {{HintsBuffer.free}} method instead of directly invoking 
{{DirectByteBuffer.cleaner().clean()}}.
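A hedged sketch of that encapsulation (class and method names are ours; Cassandra's real {{HintsBuffer.free}} also triggers the buffer's native cleaner rather than just dropping the reference):

```java
import java.nio.ByteBuffer;

// Sketch: the class that owns the direct slab exposes free(), so callers
// never reach into DirectByteBuffer internals via reflection themselves.
public class SlabHolder {
    private ByteBuffer slab = ByteBuffer.allocateDirect(1024);

    public boolean isFreed() {
        return slab == null;
    }

    // In Cassandra, free() would also invoke the buffer's cleaner; here we
    // only drop the reference so the GC can reclaim the native memory.
    public void free() {
        slab = null;
    }

    public static void main(String[] args) {
        SlabHolder holder = new SlabHolder();
        holder.free();
        System.out.println("freed=" + holder.isFreed()); // prints freed=true
    }
}
```

Keeping the slab private and funneling cleanup through one method also means the test harness never depends on JDK-internal classes.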

bq. Regarding the backport, I am slightly concerned about the NativeLibrary 
changes being backported in their current form.

Thanks for highlighting this.  I'll be sure to take a close look at the 
behaviour on each version we backport to.  I expect there will be other places 
that need similar treatment to what you've done here, as well, so I need to 
double check anyway.

bq. I think the Soft references are coming from 
java.io.ObjectStreamClass$Caches.localDescs, but the object serder we're doing 
in InvokableInstance is a bit beyond my JVM skills I'm afraid.

No worries at all, thanks very much for reproducing this information here for 
posterity.  If we ever want to clean this up, it would probably be easiest to 
simply avoid ser/deser entirely (or use custom ser/deser), but your approach is 
a much more suitable compromise for now.  Thanks again also for all the 
investigative work to plug these gaps.







[jira] [Commented] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums

2019-01-08 Thread Michael Shuler (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737710#comment-16737710
 ] 

Michael Shuler commented on CASSANDRA-14970:


INFRA-14923 is the issue.







[jira] [Commented] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums

2019-01-08 Thread Michael Shuler (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737701#comment-16737701
 ] 

Michael Shuler commented on CASSANDRA-14970:


I have no idea how the {{ant publish}} task works... :( I did a staging publish 
and we still get .md5 and .sha1 checksums.

!ant-publish-checksum-fail.jpg|thumbnail!







[jira] [Updated] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums

2019-01-08 Thread Michael Shuler (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Shuler updated CASSANDRA-14970:
---
Attachment: ant-publish-checksum-fail.jpg







[jira] [Commented] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums

2019-01-08 Thread Brandon Williams (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737672#comment-16737672
 ] 

Brandon Williams commented on CASSANDRA-14970:
--

+1







[jira] [Comment Edited] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums

2019-01-08 Thread Michael Shuler (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737643#comment-16737643
 ] 

Michael Shuler edited comment on CASSANDRA-14970 at 1/8/19 11:52 PM:
-

[^0001-Update-release-checksum-algorithms-to-SHA-256-SHA-512.patch]

Patch against {{cassandra-2.1}} branch. Merges up without conflict.
{noformat}
(cassandra-2.1)mshuler@hana:~/git/cassandra$ ls -l build/*.{gz,sha*}    
-rw-r--r-- 1 mshuler mshuler 25342702 Jan  8 17:04 
build/apache-cassandra-2.1.20-SNAPSHOT-bin.tar.gz 
-rw-r--r-- 1 mshuler mshuler   65 Jan  8 17:04 
build/apache-cassandra-2.1.20-SNAPSHOT-bin.tar.gz.sha256 
-rw-r--r-- 1 mshuler mshuler  129 Jan  8 17:04 
build/apache-cassandra-2.1.20-SNAPSHOT-bin.tar.gz.sha512 
-rw-r--r-- 1 mshuler mshuler 17265833 Jan  8 17:04 
build/apache-cassandra-2.1.20-SNAPSHOT-src.tar.gz 
-rw-r--r-- 1 mshuler mshuler   65 Jan  8 17:04 
build/apache-cassandra-2.1.20-SNAPSHOT-src.tar.gz.sha256 
-rw-r--r-- 1 mshuler mshuler  129 Jan  8 17:04 
build/apache-cassandra-2.1.20-SNAPSHOT-src.tar.gz.sha512
{noformat}


was (Author: mshuler):
Patch against {{cassandra-2.1}} branch. Merges up without conflict.
{noformat}
(cassandra-2.1)mshuler@hana:~/git/cassandra$ ls -l build/*.{gz,sha*}    
-rw-r--r-- 1 mshuler mshuler 25342702 Jan  8 17:04 
build/apache-cassandra-2.1.20-SNAPSHOT-bin.tar.gz 
-rw-r--r-- 1 mshuler mshuler   65 Jan  8 17:04 
build/apache-cassandra-2.1.20-SNAPSHOT-bin.tar.gz.sha256 
-rw-r--r-- 1 mshuler mshuler  129 Jan  8 17:04 
build/apache-cassandra-2.1.20-SNAPSHOT-bin.tar.gz.sha512 
-rw-r--r-- 1 mshuler mshuler 17265833 Jan  8 17:04 
build/apache-cassandra-2.1.20-SNAPSHOT-src.tar.gz 
-rw-r--r-- 1 mshuler mshuler   65 Jan  8 17:04 
build/apache-cassandra-2.1.20-SNAPSHOT-src.tar.gz.sha256 
-rw-r--r-- 1 mshuler mshuler  129 Jan  8 17:04 
build/apache-cassandra-2.1.20-SNAPSHOT-src.tar.gz.sha512
{noformat}







[jira] [Commented] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums

2019-01-08 Thread Michael Shuler (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737662#comment-16737662
 ] 

Michael Shuler commented on CASSANDRA-14970:


[^0001-Update-downloads-for-sha256-sha512-checksum-files.patch] attached for 
the cassandra-builds repo - download the new checksum files for release 
publication.







[jira] [Updated] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums

2019-01-08 Thread Michael Shuler (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Shuler updated CASSANDRA-14970:
---
Attachment: 0001-Update-downloads-for-sha256-sha512-checksum-files.patch







[jira] [Commented] (CASSANDRA-14968) Investigate GPG signing of deb and rpm repositories via bintray

2019-01-08 Thread Michael Shuler (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737654#comment-16737654
 ] 

Michael Shuler commented on CASSANDRA-14968:


The Apache organization there has a key; I don't know whether it would be 
feasible to use the org key to sign the repositories. Individual users can 
upload public (and private (eww..)) keys, and the Bintray API includes notes 
about signing via curl POST calls. I personally would not upload my private key 
anywhere, regardless of what the ASF's opinion on that might be. Uploading a 
public key so the repo makes it available for download is pretty normal; the 
signing portion can then (I guess) be done offline(?) and uploaded.

I don't know all the ins and outs of how it works. This is precisely why this 
ticket suggests investigating the topic. Is this something you would like 
assigned to you?

> Investigate GPG signing of deb and rpm repositories via bintray
> ---
>
> Key: CASSANDRA-14968
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14968
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Michael Shuler
>Priority: Major
>  Labels: packaging
>
> Currently, the release manager uploads debian packages and built/signed 
> metadata to a generic bintray repository. Perhaps we could utilize Bintray's 
> GPG signing feature on the repositories post-upload.
> https://www.jfrog.com/confluence/display/BT/Managing+Uploaded+Content#ManagingUploadedContent-GPGSigning
>  Depends on CASSANDRA-14967






[jira] [Updated] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums

2019-01-08 Thread Michael Shuler (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Shuler updated CASSANDRA-14970:
---
Attachment: 0001-Update-release-checksum-algorithms-to-SHA-256-SHA-512.patch
Status: Patch Available  (was: Open)

Patch against {{cassandra-2.1}} branch. Merges up without conflict.
{noformat}
(cassandra-2.1)mshuler@hana:~/git/cassandra$ ls -l build/*.{gz,sha*}    
-rw-r--r-- 1 mshuler mshuler 25342702 Jan  8 17:04 
build/apache-cassandra-2.1.20-SNAPSHOT-bin.tar.gz 
-rw-r--r-- 1 mshuler mshuler   65 Jan  8 17:04 
build/apache-cassandra-2.1.20-SNAPSHOT-bin.tar.gz.sha256 
-rw-r--r-- 1 mshuler mshuler  129 Jan  8 17:04 
build/apache-cassandra-2.1.20-SNAPSHOT-bin.tar.gz.sha512 
-rw-r--r-- 1 mshuler mshuler 17265833 Jan  8 17:04 
build/apache-cassandra-2.1.20-SNAPSHOT-src.tar.gz 
-rw-r--r-- 1 mshuler mshuler   65 Jan  8 17:04 
build/apache-cassandra-2.1.20-SNAPSHOT-src.tar.gz.sha256 
-rw-r--r-- 1 mshuler mshuler  129 Jan  8 17:04 
build/apache-cassandra-2.1.20-SNAPSHOT-src.tar.gz.sha512
{noformat}

> New releases must supply SHA-256 and/or SHA-512 checksums
> -
>
> Key: CASSANDRA-14970
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14970
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Michael Shuler
>Assignee: Michael Shuler
>Priority: Blocker
> Fix For: 2.1.21, 2.2.14, 3.0.18, 3.11.4, 4.0
>
> Attachments: 
> 0001-Update-release-checksum-algorithms-to-SHA-256-SHA-512.patch
>
>
> Release policy was updated around 9/2018 to state:
> "For new releases, PMCs MUST supply SHA-256 and/or SHA-512; and SHOULD NOT 
> supply MD5 or SHA-1. Existing releases do not need to be changed."
> build.xml needs to be updated from MD5 & SHA-1 to at least SHA-256, or to 
> both SHA-256 and SHA-512. The cassandra-builds/cassandra-release scripts need 
> to be updated to work with the new checksum files.
> http://www.apache.org/dev/release-distribution#sigs-and-sums






[jira] [Assigned] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums

2019-01-08 Thread Michael Shuler (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Shuler reassigned CASSANDRA-14970:
--

Assignee: Michael Shuler

> New releases must supply SHA-256 and/or SHA-512 checksums
> -
>
> Key: CASSANDRA-14970
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14970
> Project: Cassandra
>  Issue Type: Bug
>  Components: Packaging
>Reporter: Michael Shuler
>Assignee: Michael Shuler
>Priority: Blocker
> Fix For: 2.1.21, 2.2.14, 3.0.18, 3.11.4, 4.0
>
>
> Release policy was updated around 9/2018 to state:
> "For new releases, PMCs MUST supply SHA-256 and/or SHA-512; and SHOULD NOT 
> supply MD5 or SHA-1. Existing releases do not need to be changed."
> build.xml needs to be updated from MD5 & SHA-1 to at least SHA-256, or to 
> both SHA-256 and SHA-512. The cassandra-builds/cassandra-release scripts need 
> to be updated to work with the new checksum files.
> http://www.apache.org/dev/release-distribution#sigs-and-sums






[jira] [Commented] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown

2019-01-08 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737614#comment-16737614
 ] 

Joseph Lynch commented on CASSANDRA-14922:
--

[~benedict],

Awesome, but can we wait for Alex to see the latest diff, with the reflection 
removed in favor of his proposed fast local thread pool cleanup method? I've 
changed the patch a bit since he last looked.

Regarding the backport, I am slightly concerned about the NativeLibrary changes 
being backported in their current form. From my reading of the JNA source code 
(version 4.2.2 in trunk), we're just skipping the cache by calling 
[NativeLibrary::getInstance|https://github.com/java-native-access/jna/blob/4bcc6191c5467361b5c1f12fb5797354cc3aa897/src/com/sun/jna/NativeLibrary.java#L341]
 directly and passing the result to 
[Native::register(NativeLibrary)|https://github.com/java-native-access/jna/blob/4bcc6191c5467361b5c1f12fb5797354cc3aa897/src/com/sun/jna/Native.java#L1260]
 instead of having 
[Native::register(String)|https://github.com/java-native-access/jna/blob/4bcc6191c5467361b5c1f12fb5797354cc3aa897/src/com/sun/jna/Native.java#L1251]
 do that for us and cache the classloader along the way 
[here|https://github.com/java-native-access/jna/blob/4bcc6191c5467361b5c1f12fb5797354cc3aa897/src/com/sun/jna/NativeLibrary.java#L363].
 But if I'm wrong, it's unlikely we'd know: while our tests cover Linux pretty 
thoroughly, darwin/windows are less covered.

Also, I forgot to respond to your question about SoftReferences; I answered it 
on IRC but not here.
{quote}Do you know where the soft references originate? I wonder if there's 
anything we can do to simply eliminate them.
{quote}
I think the SoftReferences are coming from 
{{java.io.ObjectStreamClass$Caches.localDescs}}, but the object serde we're 
doing in {{InvokableInstance}} is a bit beyond my JVM skills, I'm afraid. I 
don't know how we can prevent the object serialization from caching the class 
descriptions. Perhaps the JVM option is sufficient for now, and if we don't 
like that going forward we can dive in more?
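For context, a minimal, self-contained sketch (not Cassandra code) of where those softly-referenced entries come from: the JDK's serialization machinery looks up and caches an {{ObjectStreamClass}} descriptor per serialized class, and those cached descriptors are what accumulate under {{ObjectStreamClass$Caches}}.

```java
// Minimal sketch (not Cassandra code): serializing any object makes the JDK
// look up and cache an ObjectStreamClass descriptor for its class. These
// cached descriptors are what show up under ObjectStreamClass$Caches.
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerdeCacheDemo {
    static class Payload implements Serializable {
        private static final long serialVersionUID = 1L;
        int value = 42;
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(new Payload()); // populates the descriptor cache
        }
        // lookup() returns the (now cached) descriptor for the class
        ObjectStreamClass desc = ObjectStreamClass.lookup(Payload.class);
        System.out.println(desc.forClass().getSimpleName()); // Payload
    }
}
```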

> In JVM dtests need to clean up after instance shutdown
> --
>
> Key: CASSANDRA-14922
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14922
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Minor
> Fix For: 4.0
>
> Attachments: AllThreadsStopped.png, ClassLoadersRetaining.png, 
> Leaking_Metrics_On_Shutdown.png, MainClassRetaining.png, 
> MemoryReclaimedFix.png, Metaspace_Actually_Collected.png, 
> OnlyThreeRootsLeft.png, no_more_references.png
>
>
> Currently the unit tests are failing on circleci ([example 
> one|https://circleci.com/gh/jolynch/cassandra/300#tests/containers/1], 
> [example 
> two|https://circleci.com/gh/rustyrazorblade/cassandra/44#tests/containers/1]) 
> because we use a small container (medium) for unit tests by default and the 
> in JVM dtests are leaking a few hundred megabytes of memory per test right 
> now. This is not a big deal, because the dtest runs, which use the larger 
> containers, continue to function fine, as does local testing, since the 
> number of in-JVM dtests is not yet high enough to cause a problem with more 
> than 2GB of available heap. However, we should fix the memory leak so that 
> going forward we can add more in-JVM dtests without worry.
> I've been working with [~ifesdjeen] to debug, and the issue appears to be 
> unreleased Table/Keyspace metrics (screenshot showing the leak attached). I 
> believe that we have a few potential issues that are leading to the leaks:
> 1. The 
> [{{Instance::shutdown}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/Instance.java#L328-L354]
>  method is not successfully cleaning up all the metrics created by the 
> {{CassandraMetricsRegistry}}
>  2. The 
> [{{TestCluster::close}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/TestCluster.java#L283]
>  method is not waiting for all the instances to finish shutting down and 
> cleaning up before continuing on
> 3. I'm not sure if this is an issue assuming we clear all metrics, but 
> [{{TableMetrics::release}}|https://github.com/apache/cassandra/blob/4ae229f5cd270c2b43475b3f752a7b228de260ea/src/java/org/apache/cassandra/metrics/TableMetrics.java#L951]
>  does not release all the metric references (which could leak them)
> I am working on a patch which shuts down everything and ensures that we do 
> not leak memory.




[jira] [Created] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums

2019-01-08 Thread Michael Shuler (JIRA)
Michael Shuler created CASSANDRA-14970:
--

 Summary: New releases must supply SHA-256 and/or SHA-512 checksums
 Key: CASSANDRA-14970
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14970
 Project: Cassandra
  Issue Type: Bug
  Components: Packaging
Reporter: Michael Shuler
 Fix For: 2.1.21, 2.2.14, 3.0.18, 3.11.4, 4.0


Release policy was updated around 9/2018 to state:

"For new releases, PMCs MUST supply SHA-256 and/or SHA-512; and SHOULD NOT 
supply MD5 or SHA-1. Existing releases do not need to be changed."

build.xml needs to be updated from MD5 & SHA-1 to at least SHA-256, or to both 
SHA-256 and SHA-512. The cassandra-builds/cassandra-release scripts need to be 
updated to work with the new checksum files.

http://www.apache.org/dev/release-distribution#sigs-and-sums






[jira] [Commented] (CASSANDRA-14525) streaming failure during bootstrap makes new node into inconsistent state

2019-01-08 Thread Jaydeepkumar Chovatia (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737608#comment-16737608
 ] 

Jaydeepkumar Chovatia commented on CASSANDRA-14525:
---

[~aweisberg] I've already taken care of the dtests as part of 
https://issues.apache.org/jira/browse/CASSANDRA-14526; here is the [patch for 
dtest|https://github.com/apache/cassandra-dtest/compare/master...jaydeepkumar1984:14526-trunk].
 Not sure if [~jay.zhuang] got a chance to fire the dtests; if possible, could 
you please help me start a dtest run with this patch?

> streaming failure during bootstrap makes new node into inconsistent state
> -
>
> Key: CASSANDRA-14525
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14525
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Major
> Fix For: 2.2.14, 3.0.18, 3.11.4, 4.0
>
>
> If bootstrap fails for a newly joining node (most commonly due to a streaming 
> failure), the node remains in the {{joining}} state, which is fine, but 
> Cassandra also enables the native transport, which makes the overall state 
> inconsistent. This further causes a NullPointerException if auth is enabled 
> on the new node; reproducible steps follow.
> For example, if bootstrap fails due to a streaming error like
> {quote}java.util.concurrent.ExecutionException: 
> org.apache.cassandra.streaming.StreamException: Stream failed
>  at 
> com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
>  ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
>  ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) 
> ~[guava-18.0.jar:na]
>  at 
> org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1256)
>  [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:894)
>  [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:660)
>  [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:573)
>  [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:330) 
> [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567)
>  [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:695) 
> [apache-cassandra-3.0.16.jar:3.0.16]
>  Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
>  at 
> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) 
> ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
>  ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
>  ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
>  ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
>  ~[guava-18.0.jar:na]
>  at 
> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:211)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:187)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:440)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.streaming.StreamSession.onError(StreamSession.java:540) 
> ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:307)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
> {quote}
> then variable [StorageService.java::dataAvailable 
> |https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L892]
>  will be {{false}}. Since {{dataAvailable}} is {{false}}, it will not 
> call 

[jira] [Commented] (CASSANDRA-14968) Investigate GPG signing of deb and rpm repositories via bintray

2019-01-08 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737603#comment-16737603
 ] 

mck commented on CASSANDRA-14968:
-

I don't think the ASF permits/encourages shared (or even uploaded?) private 
keys. This needs to be checked.

> Investigate GPG signing of deb and rpm repositories via bintray
> ---
>
> Key: CASSANDRA-14968
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14968
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Michael Shuler
>Priority: Major
>  Labels: packaging
>
> Currently, the release manager uploads debian packages and built/signed 
> metadata to a generic bintray repository. Perhaps we could utilize bintray's 
> post-upload GPG signing feature for the repository.
> https://www.jfrog.com/confluence/display/BT/Managing+Uploaded+Content#ManagingUploadedContent-GPGSigning
>  Depends on CASSANDRA-14967






[jira] [Resolved] (CASSANDRA-14969) Clean up ThreadLocals directly instead of via reflection

2019-01-08 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch resolved CASSANDRA-14969.
--
Resolution: Invalid

My mistake; it appears that after Alex's suggestion to clean up the 
{{FastThreadLocalThread}}'s ThreadLocalMap directly via netty, we don't need 
the reflection hack any more. Closing this out; sorry for the ticket spam. 

> Clean up ThreadLocals directly instead of via reflection
> 
>
> Key: CASSANDRA-14969
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14969
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Joseph Lynch
>Priority: Minor
>
> In CASSANDRA-14922 we have to institute a bit of a hack via reflection to 
> clean up thread local variables that are not properly {{destroyed}} in 
> {{DistributedTestBase::cleanup}}. Let's make sure that all of the thread 
> locals we have are cleaned up via {{destroy}} calls instead of relying on 
> reflection here.






[jira] [Comment Edited] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown

2019-01-08 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737552#comment-16737552
 ] 

Joseph Lynch edited comment on CASSANDRA-14922 at 1/8/19 10:10 PM:
---

{quote}The patch looks good, and I'd say [~jolynch] let's merge it,
{quote}
Ok, yea I agree let's merge what we have so that the unit tests can pass on 
trunk again. I've put up a patch against trunk with what we have so far 
(including your changes from the demo branch which as far as I can tell remove 
the need for the ThreadLocal clearing).

||trunk||
|[024e6943|https://github.com/apache/cassandra/commit/024e69436e89bb79cdbf4e136a1f6d9c2747275d]|
|[!https://circleci.com/gh/jolynch/cassandra/tree/CASSANDRA-14922.png?circle-token=
 
1102a59698d04899ec971dd36e925928f7b521f5!|https://circleci.com/gh/jolynch/cassandra/tree/CASSANDRA-14922]|

If I attach a profiler during an intellij "run this test until it fails" mode I 
can see that the memory is indeed getting cleaned up:

!MemoryReclaimedFix.png!


was (Author: jolynch):
{quote}The patch looks good, and I'd say [~jolynch] let's merge it,
{quote}
Ok, yea I agree let's merge what we have so that the unit tests can pass on 
trunk again and we can follow up in CASSANDRA-14969. I've put up a patch 
against trunk with what we have so far (including your changes from the demo 
branch).

||trunk||
|[0e4460b2e|https://github.com/apache/cassandra/commit/0e4460b2e0996802f02579c104b68ff165522875]|
|[!https://circleci.com/gh/jolynch/cassandra/tree/CASSANDRA-14922.png?circle-token=
 
1102a59698d04899ec971dd36e925928f7b521f5!|https://circleci.com/gh/jolynch/cassandra/tree/CASSANDRA-14922]|

If I attach a profiler during an intellij "run this test until it fails" mode I 
can see that the memory is indeed getting cleaned up:

!MemoryReclaimedFix.png!

> In JVM dtests need to clean up after instance shutdown
> --
>
> Key: CASSANDRA-14922
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14922
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Minor
> Fix For: 4.0
>
> Attachments: AllThreadsStopped.png, ClassLoadersRetaining.png, 
> Leaking_Metrics_On_Shutdown.png, MainClassRetaining.png, 
> MemoryReclaimedFix.png, Metaspace_Actually_Collected.png, 
> OnlyThreeRootsLeft.png, no_more_references.png
>
>
> Currently the unit tests are failing on circleci ([example 
> one|https://circleci.com/gh/jolynch/cassandra/300#tests/containers/1], 
> [example 
> two|https://circleci.com/gh/rustyrazorblade/cassandra/44#tests/containers/1]) 
> because we use a small container (medium) for unit tests by default and the 
> in JVM dtests are leaking a few hundred megabytes of memory per test right 
> now. This is not a big deal, because the dtest runs, which use the larger 
> containers, continue to function fine, as does local testing, since the 
> number of in-JVM dtests is not yet high enough to cause a problem with more 
> than 2GB of available heap. However, we should fix the memory leak so that 
> going forward we can add more in-JVM dtests without worry.
> I've been working with [~ifesdjeen] to debug, and the issue appears to be 
> unreleased Table/Keyspace metrics (screenshot showing the leak attached). I 
> believe that we have a few potential issues that are leading to the leaks:
> 1. The 
> [{{Instance::shutdown}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/Instance.java#L328-L354]
>  method is not successfully cleaning up all the metrics created by the 
> {{CassandraMetricsRegistry}}
>  2. The 
> [{{TestCluster::close}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/TestCluster.java#L283]
>  method is not waiting for all the instances to finish shutting down and 
> cleaning up before continuing on
> 3. I'm not sure if this is an issue assuming we clear all metrics, but 
> [{{TableMetrics::release}}|https://github.com/apache/cassandra/blob/4ae229f5cd270c2b43475b3f752a7b228de260ea/src/java/org/apache/cassandra/metrics/TableMetrics.java#L951]
>  does not release all the metric references (which could leak them)
> I am working on a patch which shuts down everything and ensures that we do 
> not leak memory.






[jira] [Comment Edited] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown

2019-01-08 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737552#comment-16737552
 ] 

Joseph Lynch edited comment on CASSANDRA-14922 at 1/8/19 9:53 PM:
--

{quote}The patch looks good, and I'd say [~jolynch] let's merge it,
{quote}
Ok, yea I agree let's merge what we have so that the unit tests can pass on 
trunk again and we can follow up in CASSANDRA-14969. I've put up a patch 
against trunk with what we have so far (including your changes from the demo 
branch).

||trunk||
|[0e4460b2e|https://github.com/apache/cassandra/commit/0e4460b2e0996802f02579c104b68ff165522875]|
|[!https://circleci.com/gh/jolynch/cassandra/tree/CASSANDRA-14922.png?circle-token=
 
1102a59698d04899ec971dd36e925928f7b521f5!|https://circleci.com/gh/jolynch/cassandra/tree/CASSANDRA-14922]|

If I attach a profiler during an intellij "run this test until it fails" mode I 
can see that the memory is indeed getting cleaned up:

!MemoryReclaimedFix.png!


was (Author: jolynch):
{quote}The patch looks good, and I'd say [~jolynch] let's merge it,
{quote}
Ok, yea I agree let's merge what we have so that the unit tests can pass on 
trunk again and we can follow up in CASSANDRA-14969. I've put up a patch 
against trunk with what we have so far (including your changes from the demo 
branch).

 
||trunk||
|[0e4460b2e|https://github.com/apache/cassandra/commit/0e4460b2e0996802f02579c104b68ff165522875]|
|[!https://circleci.com/gh/jolynch/cassandra/tree/CASSANDRA-14922.png?circle-token=
 
1102a59698d04899ec971dd36e925928f7b521f5!|https://circleci.com/gh/jolynch/cassandra/tree/CASSANDRA-14922]|

 

If I attach a profiler during an intellij "run this test until it fails" mode I 
can see that the memory is indeed getting cleaned up:

!MemoryReclaimedFix.png!

> In JVM dtests need to clean up after instance shutdown
> --
>
> Key: CASSANDRA-14922
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14922
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Minor
> Fix For: 4.0
>
> Attachments: AllThreadsStopped.png, ClassLoadersRetaining.png, 
> Leaking_Metrics_On_Shutdown.png, MainClassRetaining.png, 
> MemoryReclaimedFix.png, Metaspace_Actually_Collected.png, 
> OnlyThreeRootsLeft.png, no_more_references.png
>
>
> Currently the unit tests are failing on circleci ([example 
> one|https://circleci.com/gh/jolynch/cassandra/300#tests/containers/1], 
> [example 
> two|https://circleci.com/gh/rustyrazorblade/cassandra/44#tests/containers/1]) 
> because we use a small container (medium) for unit tests by default and the 
> in JVM dtests are leaking a few hundred megabytes of memory per test right 
> now. This is not a big deal, because the dtest runs, which use the larger 
> containers, continue to function fine, as does local testing, since the 
> number of in-JVM dtests is not yet high enough to cause a problem with more 
> than 2GB of available heap. However, we should fix the memory leak so that 
> going forward we can add more in-JVM dtests without worry.
> I've been working with [~ifesdjeen] to debug, and the issue appears to be 
> unreleased Table/Keyspace metrics (screenshot showing the leak attached). I 
> believe that we have a few potential issues that are leading to the leaks:
> 1. The 
> [{{Instance::shutdown}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/Instance.java#L328-L354]
>  method is not successfully cleaning up all the metrics created by the 
> {{CassandraMetricsRegistry}}
>  2. The 
> [{{TestCluster::close}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/TestCluster.java#L283]
>  method is not waiting for all the instances to finish shutting down and 
> cleaning up before continuing on
> 3. I'm not sure if this is an issue assuming we clear all metrics, but 
> [{{TableMetrics::release}}|https://github.com/apache/cassandra/blob/4ae229f5cd270c2b43475b3f752a7b228de260ea/src/java/org/apache/cassandra/metrics/TableMetrics.java#L951]
>  does not release all the metric references (which could leak them)
> I am working on a patch which shuts down everything and ensures that we do 
> not leak memory.






[jira] [Comment Edited] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown

2019-01-08 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737552#comment-16737552
 ] 

Joseph Lynch edited comment on CASSANDRA-14922 at 1/8/19 9:44 PM:
--

{quote}The patch looks good, and I'd say [~jolynch] let's merge it,
{quote}
Ok, yea I agree let's merge what we have so that the unit tests can pass on 
trunk again and we can follow up in CASSANDRA-14969. I've put up a patch 
against trunk with what we have so far (including your changes from the demo 
branch).

 
||trunk||
|[0e4460b2e|https://github.com/apache/cassandra/commit/0e4460b2e0996802f02579c104b68ff165522875]|
|[!https://circleci.com/gh/jolynch/cassandra/tree/CASSANDRA-14922.png?circle-token=
 
1102a59698d04899ec971dd36e925928f7b521f5!|https://circleci.com/gh/jolynch/cassandra/tree/CASSANDRA-14922]|

 

If I attach a profiler during an intellij "run this test until it fails" mode I 
can see that the memory is indeed getting cleaned up:

!MemoryReclaimedFix.png!


was (Author: jolynch):
{quote}The patch looks good, and I'd say [~jolynch] let's merge it,
{quote}
Ok, yea I agree let's merge what we have so that the unit tests can pass on 
trunk again and we can follow up in CASSANDRA-14969. I've put up a patch 
against trunk with what we have so far (including your changes from the demo 
branch).

 
||trunk||
|[d361ba9b|https://github.com/apache/cassandra/commit/d361ba9b846cf6dc9c3ef5daca7aab5a39ec8fcc]|
|[!https://circleci.com/gh/jolynch/cassandra/tree/CASSANDRA-14922.png?circle-token=
 
1102a59698d04899ec971dd36e925928f7b521f5!|https://circleci.com/gh/jolynch/cassandra/tree/CASSANDRA-14922]|

 

If I attach a profiler during an intellij "run this test until it fails" mode I 
can see that the memory is indeed getting cleaned up:

!MemoryReclaimedFix.png!

> In JVM dtests need to clean up after instance shutdown
> --
>
> Key: CASSANDRA-14922
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14922
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Minor
> Attachments: AllThreadsStopped.png, ClassLoadersRetaining.png, 
> Leaking_Metrics_On_Shutdown.png, MainClassRetaining.png, 
> MemoryReclaimedFix.png, Metaspace_Actually_Collected.png, 
> OnlyThreeRootsLeft.png, no_more_references.png
>
>
> Currently the unit tests are failing on circleci ([example 
> one|https://circleci.com/gh/jolynch/cassandra/300#tests/containers/1], 
> [example 
> two|https://circleci.com/gh/rustyrazorblade/cassandra/44#tests/containers/1]) 
> because we use a small container (medium) for unit tests by default and the 
> in JVM dtests are leaking a few hundred megabytes of memory per test right 
> now. This is not a big deal, because the dtest runs, which use the larger 
> containers, continue to function fine, as does local testing, since the 
> number of in-JVM dtests is not yet high enough to cause a problem with more 
> than 2GB of available heap. However, we should fix the memory leak so that 
> going forward we can add more in-JVM dtests without worry.
> I've been working with [~ifesdjeen] to debug, and the issue appears to be 
> unreleased Table/Keyspace metrics (screenshot showing the leak attached). I 
> believe that we have a few potential issues that are leading to the leaks:
> 1. The 
> [{{Instance::shutdown}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/Instance.java#L328-L354]
>  method is not successfully cleaning up all the metrics created by the 
> {{CassandraMetricsRegistry}}
>  2. The 
> [{{TestCluster::close}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/TestCluster.java#L283]
>  method is not waiting for all the instances to finish shutting down and 
> cleaning up before continuing on
> 3. I'm not sure if this is an issue assuming we clear all metrics, but 
> [{{TableMetrics::release}}|https://github.com/apache/cassandra/blob/4ae229f5cd270c2b43475b3f752a7b228de260ea/src/java/org/apache/cassandra/metrics/TableMetrics.java#L951]
>  does not release all the metric references (which could leak them)
> I am working on a patch which shuts down everything and ensures that we do 
> not leak memory.






[jira] [Commented] (CASSANDRA-14525) streaming failure during bootstrap makes new node into inconsistent state

2019-01-08 Thread Ariel Weisberg (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737561#comment-16737561
 ] 

Ariel Weisberg commented on CASSANDRA-14525:


I think secondary_indexes_test.py:TestPreJoinCallback.test_resume has the same 
issue.


> streaming failure during bootstrap makes new node into inconsistent state
> -
>
> Key: CASSANDRA-14525
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14525
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Major
> Fix For: 2.2.14, 3.0.18, 3.11.4, 4.0
>
>
> If bootstrap fails for a newly joining node (most commonly due to a streaming 
> failure), the node remains in the {{joining}} state, which is fine, but 
> Cassandra also enables the native transport, which makes the overall state 
> inconsistent. This further causes a NullPointerException if auth is enabled 
> on the new node; reproducible steps follow.
> For example, if bootstrap fails due to a streaming error like
> {quote}java.util.concurrent.ExecutionException: 
> org.apache.cassandra.streaming.StreamException: Stream failed
>  at 
> com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
>  ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
>  ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) 
> ~[guava-18.0.jar:na]
>  at 
> org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1256)
>  [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:894)
>  [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:660)
>  [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:573)
>  [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:330) 
> [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567)
>  [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:695) 
> [apache-cassandra-3.0.16.jar:3.0.16]
>  Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
>  at 
> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) 
> ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
>  ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
>  ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
>  ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
>  ~[guava-18.0.jar:na]
>  at 
> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:211)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:187)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:440)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.streaming.StreamSession.onError(StreamSession.java:540) 
> ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:307)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
> {quote}
> then variable [StorageService.java::dataAvailable 
> |https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L892]
>  will be {{false}}. Since {{dataAvailable}} is {{false}}, it will not 
> call [StorageService.java::finishJoiningRing 
> |https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L933]
>  and as a result 
> 

[jira] [Commented] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown

2019-01-08 Thread Benedict (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737557#comment-16737557
 ] 

Benedict commented on CASSANDRA-14922:
--

Marking 'Ready to Commit' given [~ifesdjeen]'s comments. I'll give it another 
quick once-over and then commit, so I can rebase CASSANDRA-14931 and 
CASSANDRA-14937.

> In JVM dtests need to clean up after instance shutdown
> --
>
> Key: CASSANDRA-14922
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14922
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Minor
> Fix For: 4.0
>
> Attachments: AllThreadsStopped.png, ClassLoadersRetaining.png, 
> Leaking_Metrics_On_Shutdown.png, MainClassRetaining.png, 
> MemoryReclaimedFix.png, Metaspace_Actually_Collected.png, 
> OnlyThreeRootsLeft.png, no_more_references.png
>
>
> Currently the unit tests are failing on circleci ([example 
> one|https://circleci.com/gh/jolynch/cassandra/300#tests/containers/1], 
> [example 
> two|https://circleci.com/gh/rustyrazorblade/cassandra/44#tests/containers/1]) 
> because we use a small container (medium) for unit tests by default and the 
> in JVM dtests are leaking a few hundred megabytes of memory per test right 
> now. This is not a big deal yet: the dtest runs with the larger containers 
> continue to function fine, as does local testing, because the number of in 
> JVM dtests is not yet high enough to cause a problem with more than 2GB of 
> available heap. However, we should fix the memory leak so that going 
> forward we can add more in JVM dtests without worry.
> I've been working with [~ifesdjeen] to debug, and the issue appears to be 
> unreleased Table/Keyspace metrics (screenshot showing the leak attached). I 
> believe that we have a few potential issues that are leading to the leaks:
> 1. The 
> [{{Instance::shutdown}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/Instance.java#L328-L354]
>  method is not successfully cleaning up all the metrics created by the 
> {{CassandraMetricsRegistry}}
>  2. The 
> [{{TestCluster::close}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/TestCluster.java#L283]
>  method is not waiting for all the instances to finish shutting down and 
> cleaning up before continuing on
> 3. I'm not sure if this is an issue assuming we clear all metrics, but 
> [{{TableMetrics::release}}|https://github.com/apache/cassandra/blob/4ae229f5cd270c2b43475b3f752a7b228de260ea/src/java/org/apache/cassandra/metrics/TableMetrics.java#L951]
>  does not release all the metric references (which could leak them)
> I am working on a patch which shuts down everything and ensures that we do 
> not leak memory.
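The release-everything requirement in point 3 of the description can be shown with a minimal sketch. Plain JDK collections stand in for the Dropwizard-backed {{CassandraMetricsRegistry}}; the metric names and prefix convention are assumptions for illustration only, not Cassandra's actual naming:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy registry: metrics are keyed by dotted names such as
// "Table.<metric>.<keyspace>.<table>" (illustrative naming).
public class MetricLeakSketch {
    public static final Map<String, Object> REGISTRY = new ConcurrentHashMap<>();

    public static void register(String name) { REGISTRY.put(name, new Object()); }

    // Leaky release: only removes the metrics it remembers to list.
    public static void releaseSome(String ks, String table) {
        REGISTRY.remove("Table.ReadLatency." + ks + "." + table);
        // WriteLatency is forgotten here -> its reference stays alive.
    }

    // Safe release: drop every metric whose name mentions the table,
    // so nothing registered for it can be leaked.
    public static void releaseAll(String ks, String table) {
        REGISTRY.keySet().removeIf(n -> n.endsWith("." + ks + "." + table));
    }

    public static void main(String[] args) {
        register("Table.ReadLatency.ks1.t1");
        register("Table.WriteLatency.ks1.t1");
        releaseSome("ks1", "t1");
        System.out.println("after leaky release: " + REGISTRY.size()); // 1 leaked
        releaseAll("ks1", "t1");
        System.out.println("after full release: " + REGISTRY.size());  // 0
    }
}
```

The same filter-by-name idea applies to a real Dropwizard registry via {{MetricRegistry.removeMatching}}.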



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14525) streaming failure during bootstrap makes new node into inconsistent state

2019-01-08 Thread Ariel Weisberg (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737556#comment-16737556
 ] 

Ariel Weisberg commented on CASSANDRA-14525:


This breaks bootstrap_test.py:TestBootstrap.test_resumable_bootstrap. The test 
expects the cluster to start the native interface when bootstrap fails.

> streaming failure during bootstrap makes new node into inconsistent state
> -
>
> Key: CASSANDRA-14525
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14525
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Major
> Fix For: 2.2.14, 3.0.18, 3.11.4, 4.0
>
>
> If bootstrap fails for a newly joining node (most commonly due to a 
> streaming failure), the node correctly remains in the {{joining}} state, but 
> Cassandra also enables the native transport, which makes the overall state 
> inconsistent. This further causes a NullPointerException if auth is enabled 
> on the new node; reproducible steps follow.
> For example, if bootstrap fails due to streaming errors like
> {quote}java.util.concurrent.ExecutionException: 
> org.apache.cassandra.streaming.StreamException: Stream failed
>  at 
> com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
>  ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
>  ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) 
> ~[guava-18.0.jar:na]
>  at 
> org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1256)
>  [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:894)
>  [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:660)
>  [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:573)
>  [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:330) 
> [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567)
>  [apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:695) 
> [apache-cassandra-3.0.16.jar:3.0.16]
>  Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
>  at 
> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) 
> ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
>  ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
>  ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
>  ~[guava-18.0.jar:na]
>  at 
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
>  ~[guava-18.0.jar:na]
>  at 
> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:211)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:187)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:440)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.streaming.StreamSession.onError(StreamSession.java:540) 
> ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:307)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at 
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
>  ~[apache-cassandra-3.0.16.jar:3.0.16]
>  at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121]
> {quote}
> then variable [StorageService.java::dataAvailable 
> |https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L892]
>  will be {{false}}. Since {{dataAvailable}} is {{false}}, it will not 
> call [StorageService.java::finishJoiningRing 
> |https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L933]
>  and as a result 
> 

[jira] [Updated] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown

2019-01-08 Thread Benedict (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict updated CASSANDRA-14922:
-
Status: Ready to Commit  (was: Patch Available)

> In JVM dtests need to clean up after instance shutdown
> --
>
> Key: CASSANDRA-14922
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14922
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Minor
> Fix For: 4.0
>
> Attachments: AllThreadsStopped.png, ClassLoadersRetaining.png, 
> Leaking_Metrics_On_Shutdown.png, MainClassRetaining.png, 
> MemoryReclaimedFix.png, Metaspace_Actually_Collected.png, 
> OnlyThreeRootsLeft.png, no_more_references.png
>
>
> Currently the unit tests are failing on circleci ([example 
> one|https://circleci.com/gh/jolynch/cassandra/300#tests/containers/1], 
> [example 
> two|https://circleci.com/gh/rustyrazorblade/cassandra/44#tests/containers/1]) 
> because we use a small container (medium) for unit tests by default and the 
> in JVM dtests are leaking a few hundred megabytes of memory per test right 
> now. This is not a big deal yet: the dtest runs with the larger containers 
> continue to function fine, as does local testing, because the number of in 
> JVM dtests is not yet high enough to cause a problem with more than 2GB of 
> available heap. However, we should fix the memory leak so that going 
> forward we can add more in JVM dtests without worry.
> I've been working with [~ifesdjeen] to debug, and the issue appears to be 
> unreleased Table/Keyspace metrics (screenshot showing the leak attached). I 
> believe that we have a few potential issues that are leading to the leaks:
> 1. The 
> [{{Instance::shutdown}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/Instance.java#L328-L354]
>  method is not successfully cleaning up all the metrics created by the 
> {{CassandraMetricsRegistry}}
>  2. The 
> [{{TestCluster::close}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/TestCluster.java#L283]
>  method is not waiting for all the instances to finish shutting down and 
> cleaning up before continuing on
> 3. I'm not sure if this is an issue assuming we clear all metrics, but 
> [{{TableMetrics::release}}|https://github.com/apache/cassandra/blob/4ae229f5cd270c2b43475b3f752a7b228de260ea/src/java/org/apache/cassandra/metrics/TableMetrics.java#L951]
>  does not release all the metric references (which could leak them)
> I am working on a patch which shuts down everything and ensures that we do 
> not leak memory.






[jira] [Updated] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown

2019-01-08 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-14922:
-
Fix Version/s: 4.0
   Status: Patch Available  (was: Open)

> In JVM dtests need to clean up after instance shutdown
> --
>
> Key: CASSANDRA-14922
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14922
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Minor
> Fix For: 4.0
>
> Attachments: AllThreadsStopped.png, ClassLoadersRetaining.png, 
> Leaking_Metrics_On_Shutdown.png, MainClassRetaining.png, 
> MemoryReclaimedFix.png, Metaspace_Actually_Collected.png, 
> OnlyThreeRootsLeft.png, no_more_references.png
>
>
> Currently the unit tests are failing on circleci ([example 
> one|https://circleci.com/gh/jolynch/cassandra/300#tests/containers/1], 
> [example 
> two|https://circleci.com/gh/rustyrazorblade/cassandra/44#tests/containers/1]) 
> because we use a small container (medium) for unit tests by default and the 
> in JVM dtests are leaking a few hundred megabytes of memory per test right 
> now. This is not a big deal yet: the dtest runs with the larger containers 
> continue to function fine, as does local testing, because the number of in 
> JVM dtests is not yet high enough to cause a problem with more than 2GB of 
> available heap. However, we should fix the memory leak so that going 
> forward we can add more in JVM dtests without worry.
> I've been working with [~ifesdjeen] to debug, and the issue appears to be 
> unreleased Table/Keyspace metrics (screenshot showing the leak attached). I 
> believe that we have a few potential issues that are leading to the leaks:
> 1. The 
> [{{Instance::shutdown}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/Instance.java#L328-L354]
>  method is not successfully cleaning up all the metrics created by the 
> {{CassandraMetricsRegistry}}
>  2. The 
> [{{TestCluster::close}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/TestCluster.java#L283]
>  method is not waiting for all the instances to finish shutting down and 
> cleaning up before continuing on
> 3. I'm not sure if this is an issue assuming we clear all metrics, but 
> [{{TableMetrics::release}}|https://github.com/apache/cassandra/blob/4ae229f5cd270c2b43475b3f752a7b228de260ea/src/java/org/apache/cassandra/metrics/TableMetrics.java#L951]
>  does not release all the metric references (which could leak them)
> I am working on a patch which shuts down everything and ensures that we do 
> not leak memory.






[jira] [Updated] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown

2019-01-08 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-14922:
-
Attachment: MemoryReclaimedFix.png

> In JVM dtests need to clean up after instance shutdown
> --
>
> Key: CASSANDRA-14922
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14922
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Minor
> Attachments: AllThreadsStopped.png, ClassLoadersRetaining.png, 
> Leaking_Metrics_On_Shutdown.png, MainClassRetaining.png, 
> MemoryReclaimedFix.png, Metaspace_Actually_Collected.png, 
> OnlyThreeRootsLeft.png, no_more_references.png
>
>
> Currently the unit tests are failing on circleci ([example 
> one|https://circleci.com/gh/jolynch/cassandra/300#tests/containers/1], 
> [example 
> two|https://circleci.com/gh/rustyrazorblade/cassandra/44#tests/containers/1]) 
> because we use a small container (medium) for unit tests by default and the 
> in JVM dtests are leaking a few hundred megabytes of memory per test right 
> now. This is not a big deal yet: the dtest runs with the larger containers 
> continue to function fine, as does local testing, because the number of in 
> JVM dtests is not yet high enough to cause a problem with more than 2GB of 
> available heap. However, we should fix the memory leak so that going 
> forward we can add more in JVM dtests without worry.
> I've been working with [~ifesdjeen] to debug, and the issue appears to be 
> unreleased Table/Keyspace metrics (screenshot showing the leak attached). I 
> believe that we have a few potential issues that are leading to the leaks:
> 1. The 
> [{{Instance::shutdown}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/Instance.java#L328-L354]
>  method is not successfully cleaning up all the metrics created by the 
> {{CassandraMetricsRegistry}}
>  2. The 
> [{{TestCluster::close}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/TestCluster.java#L283]
>  method is not waiting for all the instances to finish shutting down and 
> cleaning up before continuing on
> 3. I'm not sure if this is an issue assuming we clear all metrics, but 
> [{{TableMetrics::release}}|https://github.com/apache/cassandra/blob/4ae229f5cd270c2b43475b3f752a7b228de260ea/src/java/org/apache/cassandra/metrics/TableMetrics.java#L951]
>  does not release all the metric references (which could leak them)
> I am working on a patch which shuts down everything and ensures that we do 
> not leak memory.






[jira] [Commented] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown

2019-01-08 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737552#comment-16737552
 ] 

Joseph Lynch commented on CASSANDRA-14922:
--

{quote}The patch looks good, and I'd say [~jolynch] let's merge it,
{quote}
OK, yeah, I agree; let's merge what we have so that the unit tests can pass on 
trunk again, and we can follow up in CASSANDRA-14969. I've put up a patch 
against trunk with what we have so far (including your changes from the demo 
branch).

 
||trunk||
|[d361ba9b|https://github.com/apache/cassandra/commit/d361ba9b846cf6dc9c3ef5daca7aab5a39ec8fcc]|
|[!https://circleci.com/gh/jolynch/cassandra/tree/CASSANDRA-14922.png?circle-token=1102a59698d04899ec971dd36e925928f7b521f5!|https://circleci.com/gh/jolynch/cassandra/tree/CASSANDRA-14922]|

 

If I attach a profiler while running the test in IntelliJ's "run this test 
until it fails" mode, I can see that the memory is indeed getting cleaned up:

!MemoryReclaimedFix.png!

> In JVM dtests need to clean up after instance shutdown
> --
>
> Key: CASSANDRA-14922
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14922
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Minor
> Attachments: AllThreadsStopped.png, ClassLoadersRetaining.png, 
> Leaking_Metrics_On_Shutdown.png, MainClassRetaining.png, 
> MemoryReclaimedFix.png, Metaspace_Actually_Collected.png, 
> OnlyThreeRootsLeft.png, no_more_references.png
>
>
> Currently the unit tests are failing on circleci ([example 
> one|https://circleci.com/gh/jolynch/cassandra/300#tests/containers/1], 
> [example 
> two|https://circleci.com/gh/rustyrazorblade/cassandra/44#tests/containers/1]) 
> because we use a small container (medium) for unit tests by default and the 
> in JVM dtests are leaking a few hundred megabytes of memory per test right 
> now. This is not a big deal yet: the dtest runs with the larger containers 
> continue to function fine, as does local testing, because the number of in 
> JVM dtests is not yet high enough to cause a problem with more than 2GB of 
> available heap. However, we should fix the memory leak so that going 
> forward we can add more in JVM dtests without worry.
> I've been working with [~ifesdjeen] to debug, and the issue appears to be 
> unreleased Table/Keyspace metrics (screenshot showing the leak attached). I 
> believe that we have a few potential issues that are leading to the leaks:
> 1. The 
> [{{Instance::shutdown}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/Instance.java#L328-L354]
>  method is not successfully cleaning up all the metrics created by the 
> {{CassandraMetricsRegistry}}
>  2. The 
> [{{TestCluster::close}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/TestCluster.java#L283]
>  method is not waiting for all the instances to finish shutting down and 
> cleaning up before continuing on
> 3. I'm not sure if this is an issue assuming we clear all metrics, but 
> [{{TableMetrics::release}}|https://github.com/apache/cassandra/blob/4ae229f5cd270c2b43475b3f752a7b228de260ea/src/java/org/apache/cassandra/metrics/TableMetrics.java#L951]
>  does not release all the metric references (which could leak them)
> I am working on a patch which shuts down everything and ensures that we do 
> not leak memory.






[jira] [Updated] (CASSANDRA-14688) Update protocol spec and class level doc with protocol checksumming details

2019-01-08 Thread mck (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14688:

Reviewers: Alex Petrov, mck
 Reviewer:   (was: mck)

> Update protocol spec and class level doc with protocol checksumming details
> ---
>
> Key: CASSANDRA-14688
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14688
> Project: Cassandra
>  Issue Type: Task
>  Components: Legacy/Documentation and Website
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Major
> Fix For: 4.0
>
>
> CASSANDRA-13304 provides an option to add checksumming to the frame body of 
> native protocol messages. The native protocol spec needs to be updated to 
> reflect this ASAP. We should also verify that the javadoc comments describing 
> the on-wire format in 
> {{o.a.c.transport.frame.checksum.ChecksummingTransformer}} are up to date.
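The general idea behind frame-body checksumming can be sketched with the JDK's CRC32. This is illustration only: the actual chunk layout, checksum algorithm options, and on-wire placement are defined by the protocol spec and {{ChecksummingTransformer}}, not by this sketch:

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Computes and verifies a CRC32 checksum over one frame-body chunk,
// so corruption introduced in transit can be detected by the receiver.
public class ChecksumSketch {
    public static long checksum(byte[] chunk) {
        CRC32 crc = new CRC32();
        crc.update(chunk, 0, chunk.length);
        return crc.getValue();
    }

    public static boolean verify(byte[] chunk, long expected) {
        return checksum(chunk) == expected;
    }

    public static void main(String[] args) {
        byte[] body = "SELECT * FROM ks.t".getBytes(StandardCharsets.UTF_8);
        long crc = checksum(body);   // sender attaches this to the chunk
        System.out.println(verify(body, crc));
        body[0] ^= 0x01;             // simulate a corrupted bit on the wire
        System.out.println(verify(body, crc));
    }
}
```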






[jira] [Created] (CASSANDRA-14969) Clean up ThreadLocals directly instead of via reflection

2019-01-08 Thread Joseph Lynch (JIRA)
Joseph Lynch created CASSANDRA-14969:


 Summary: Clean up ThreadLocals directly instead of via reflection
 Key: CASSANDRA-14969
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14969
 Project: Cassandra
  Issue Type: Improvement
  Components: Test/dtest
Reporter: Joseph Lynch


In CASSANDRA-14922 we have to institute a bit of a hack via reflection to clean 
up thread local variables that are not properly {{destroyed}} in 
{{DistributedTestBase::cleanup}}. Let's make sure that all of the thread locals 
we have are cleaned up via {{destroy}} calls instead of relying on reflection 
here.
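The direct-destroy style being asked for can be sketched with plain JDK thread locals. The class and field names below are illustrative, not Cassandra's; the point is that an owner that exposes a {{destroy}} calling {{ThreadLocal.remove()}} needs no reflection over thread-local maps afterwards:

```java
// Sketch of the preferred cleanup style: each owner tracks its thread-local
// and removes it explicitly, instead of a test harness scraping
// Thread.threadLocals via reflection after shutdown.
public class ThreadLocalCleanupSketch {
    public static class Scratch {
        private final ThreadLocal<byte[]> buffer =
                ThreadLocal.withInitial(() -> new byte[1024]);

        public byte[] get() { return buffer.get(); }

        // Explicit destroy: every thread that used the value calls this,
        // dropping the reference from its thread-local map.
        public void destroy() { buffer.remove(); }
    }

    public static void main(String[] args) {
        Scratch scratch = new Scratch();
        System.out.println(scratch.get().length);
        scratch.destroy();           // value dropped for this thread
        System.out.println(scratch.get().length); // re-initialized on next use
    }
}
```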






[jira] [Commented] (CASSANDRA-14953) Failed to reclaim the memory and too many MemtableReclaimMemory pending task

2019-01-08 Thread Jeremy Hanna (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737416#comment-16737416
 ] 

Jeremy Hanna commented on CASSANDRA-14953:
--

This appears to be a use-case/configuration-specific problem rather than a bug 
in the software itself. I would engage with the Cassandra user list or Stack 
Overflow to troubleshoot further; see http://cassandra.apache.org/community/ 
for links to both. Jira is primarily meant for development and bugs rather 
than operational issues.

> Failed to reclaim the memory and too many MemtableReclaimMemory pending task
> 
>
> Key: CASSANDRA-14953
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14953
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Memtable
> Environment: version : cassandra 2.1.15
> jdk: 8
> os:suse
>Reporter: HUANG DUICAN
>Priority: Major
> Attachments: 1.PNG, 2.PNG, cassandra_20190105.zip
>
>
> We found that Cassandra has a lot of write accumulation in the production 
> environment, and our business has experienced a lot of write failures.
>  Through the system.log, it was found that MemtableReclaimMemory was pending 
> at the beginning, and then a large number of MutationStage stacks appeared at 
> a certain moment.
>  Finally, the heap filled up, GC times reached tens of seconds, and the 
> node showed as DN in nodetool even though the Cassandra process was still 
> running. We killed and restarted the node, and the above situation 
> disappeared.
>  
> Also the number of Active MemtableReclaimMemory threads seems to stay at 1.
> (you can see the 1.PNG)
> a large number of MutationStage stacks appeared at a certain moment.
> (you can see the 2.PNG)
>  
> long GC time:
>  - MemtableReclaimMemory 1 156 24565 0 0
>  - G1 Old Generation GC in 87121ms. G1 Old Gen: 51175946656 -> 50082999760;
>  - MutationStage 128 11931622 1983820772 0 0
>  - CounterMutationStage 0 0 0 0 0
>  - MemtableReclaimMemory 1 156 24565 0 0
>  - G1 Young Generation GC in {color:#FF}969ms{color}. G1 Eden Space: 
> 1090519040 -> 0; G1 Old Gen: 50082999760 -> 51156741584;
>  - MutationStage 128 11953653 1983820772 0 0
>  - CounterMutationStage 0 0 0 0 0
>  - MemtableReclaimMemory 1 156 24565 0 0
>  - G1 Old Generation GC in {color:#FF}84785ms{color}. G1 Old Gen: 
> 51173518800 -> 50180911432;
>  - MutationStage 128 11967484 1983820772 0 0
>  - CounterMutationStage 0 0 0 0 0
>  - MemtableReclaimMemory 1 156 24565 0 0
>  - G1 Young Generation GC in 611ms. G1 Eden Space: 989855744 -> 0; G1 Old 
> Gen: 50180911432 -> 51153989960;
>  - MutationStage 128 11975849 1983820772 0 0
>  - CounterMutationStage 0 0 0 0 0
>  - MemtableReclaimMemory 1 156 24565 0 0
>  - G1 Old Generation GC in {color:#FF}85845ms{color}. G1 Old Gen: 
> 51170767176 -> 50238295416;
>  - MutationStage 128 11978192 1983820772 0 0
>  - CounterMutationStage 0 0 0 0 0
>  - MemtableReclaimMemory 1 156 24565 0 0
>  - G1 Young Generation GC in 602ms. G1 Eden Space: 939524096 -> 0; G1 Old 
> Gen: 50238295416 -> 51161042296;
>  - MutationStage 128 11994295 1983820772 0 0
>  - CounterMutationStage 0 0 0 0 0
>  - MemtableReclaimMemory 1 156 24565 0 0
>  - G1 Old Generation GC in {color:#FF}85307ms{color}. G1 Old Gen: 
> 51177819512 -> 50288829624; Metaspace: 36544536 -> 36525696
>  - MutationStage 128 12001932 1983820772 0 0
>  - CounterMutationStage 0 0 0 0 0
> 66 - MutationStage 128 12004395 1983820772 0 0
> 66 - CounterMutationStage 0 0 0 0 0
>  - MemtableReclaimMemory 1 156 24565 0 0
> 66 - MemtableReclaimMemory 1 156 24565 0 0
>  - G1 Young Generation GC in 610ms. G1 Eden Space: 889192448 -> 0; G1 Old 
> Gen: 50288829624 -> 51178022072;
>  - MutationStage 128 12023677 1983820772 0 0
> Why is this happening? 






[jira] [Comment Edited] (CASSANDRA-14953) Failed to reclaim the memory and too many MemtableReclaimMemory pending task

2019-01-08 Thread Jeremy Hanna (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737416#comment-16737416
 ] 

Jeremy Hanna edited comment on CASSANDRA-14953 at 1/8/19 6:55 PM:
--

This appears to be a use-case/configuration-specific problem rather than a bug 
in the software itself. I would engage with the Cassandra user list or Stack 
Overflow to troubleshoot further; see http://cassandra.apache.org/community/ 
for links to both. Jira is primarily meant for development and bugs rather 
than operational questions.


was (Author: jeromatron):
This appears to be a use case/configuration specific problem and not a bug with 
the software itself.  I would engage with those on the Cassandra user list or 
stack overflow to troubleshoot further.  See 
http://cassandra.apache.org/community/ for links to both.  Jira is primarily 
meant for development and bugs rather than operational issues.

> Failed to reclaim the memory and too many MemtableReclaimMemory pending task
> 
>
> Key: CASSANDRA-14953
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14953
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Memtable
> Environment: version : cassandra 2.1.15
> jdk: 8
> os:suse
>Reporter: HUANG DUICAN
>Priority: Major
> Attachments: 1.PNG, 2.PNG, cassandra_20190105.zip
>
>
> We found that Cassandra has a lot of write accumulation in the production 
> environment, and our business has experienced a lot of write failures.
>  Through the system.log, it was found that MemtableReclaimMemory was pending 
> at the beginning, and then a large number of MutationStage stacks appeared at 
> a certain moment.
>  Finally, the heap filled up, GC times reached tens of seconds, and the 
> node showed as DN in nodetool even though the Cassandra process was still 
> running. We killed and restarted the node, and the above situation 
> disappeared.
>  
> Also, the number of active MemtableReclaimMemory threads seems to stay at 1 
> (see 1.PNG).
> A large number of pending MutationStage tasks appeared at a certain moment 
> (see 2.PNG).
>  
> long GC time:
>  - MemtableReclaimMemory 1 156 24565 0 0
>  - G1 Old Generation GC in 87121ms. G1 Old Gen: 51175946656 -> 50082999760;
>  - MutationStage 128 11931622 1983820772 0 0
>  - CounterMutationStage 0 0 0 0 0
>  - MemtableReclaimMemory 1 156 24565 0 0
>  - G1 Young Generation GC in 969ms. G1 Eden Space: 
> 1090519040 -> 0; G1 Old Gen: 50082999760 -> 51156741584;
>  - MutationStage 128 11953653 1983820772 0 0
>  - CounterMutationStage 0 0 0 0 0
>  - MemtableReclaimMemory 1 156 24565 0 0
>  - G1 Old Generation GC in 84785ms. G1 Old Gen: 
> 51173518800 -> 50180911432;
>  - MutationStage 128 11967484 1983820772 0 0
>  - CounterMutationStage 0 0 0 0 0
>  - MemtableReclaimMemory 1 156 24565 0 0
>  - G1 Young Generation GC in 611ms. G1 Eden Space: 989855744 -> 0; G1 Old 
> Gen: 50180911432 -> 51153989960;
>  - MutationStage 128 11975849 1983820772 0 0
>  - CounterMutationStage 0 0 0 0 0
>  - MemtableReclaimMemory 1 156 24565 0 0
>  - G1 Old Generation GC in 85845ms. G1 Old Gen: 
> 51170767176 -> 50238295416;
>  - MutationStage 128 11978192 1983820772 0 0
>  - CounterMutationStage 0 0 0 0 0
>  - MemtableReclaimMemory 1 156 24565 0 0
>  - G1 Young Generation GC in 602ms. G1 Eden Space: 939524096 -> 0; G1 Old 
> Gen: 50238295416 -> 51161042296;
>  - MutationStage 128 11994295 1983820772 0 0
>  - CounterMutationStage 0 0 0 0 0
>  - MemtableReclaimMemory 1 156 24565 0 0
>  - G1 Old Generation GC in 85307ms. G1 Old Gen: 
> 51177819512 -> 50288829624; Metaspace: 36544536 -> 36525696
>  - MutationStage 128 12001932 1983820772 0 0
>  - CounterMutationStage 0 0 0 0 0
> 66 - MutationStage 128 12004395 1983820772 0 0
> 66 - CounterMutationStage 0 0 0 0 0
>  - MemtableReclaimMemory 1 156 24565 0 0
> 66 - MemtableReclaimMemory 1 156 24565 0 0
>  - G1 Young Generation GC in 610ms. G1 Eden Space: 889192448 -> 0; G1 Old 
> Gen: 50288829624 -> 51178022072;
>  - MutationStage 128 12023677 1983820772 0 0
> Why is this happening? 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14957) Rolling Restart Of Nodes Cause Dataloss Due To Schema Collision

2019-01-08 Thread Jeremy Hanna (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737379#comment-16737379
 ] 

Jeremy Hanna commented on CASSANDRA-14957:
--

The schema has to agree across the cluster.  If a node is being restarted, it 
has to catch up with the schema before it can process writes to the new 
table.  Until then, it will likely log messages saying that it can't identify 
a table with a certain id.

How did you determine that there was data loss, as opposed to temporary 
inconsistency between nodes?  If the writes succeeded on other nodes at the 
consistency level you specified, then there wasn't data loss; you just had a 
temporary inconsistency on the node being restarted.  The normal anti-entropy 
operations, such as read repair and full repair, should bring it back into a 
consistent state.
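The "wait for schema agreement" behaviour can be made concrete with a small standalone sketch. This is a hypothetical helper, not a Cassandra or driver API: it simply polls some source of per-node schema version ids until every node reports the same version.

```java
import java.util.Set;
import java.util.UUID;
import java.util.function.Supplier;

// Hypothetical helper (not a Cassandra API): schema agreement means every
// node in the cluster reports the same schema version UUID. We poll a
// caller-supplied source of the currently observed versions until only one
// distinct version remains, or we give up.
public class SchemaAgreement {
    public static boolean waitForAgreement(Supplier<Set<UUID>> observedVersions,
                                           int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            // One distinct version across all nodes means the schemas agree.
            if (observedVersions.get().size() == 1)
                return true;
            // A real tool would back off here before re-polling.
        }
        return false;
    }

    public static void main(String[] args) {
        // All nodes report the same version: agreement on the first poll.
        Set<UUID> agreed =
                Set.of(UUID.fromString("bd7200a0-1567-11e8-8974-855d74ee356f"));
        System.out.println(waitForAgreement(() -> agreed, 1)); // prints: true
    }
}
```

Operationally, the same check corresponds to confirming that `nodetool describecluster` shows a single schema version for all reachable nodes before sending writes to a newly created table.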

> Rolling Restart Of Nodes Cause Dataloss Due To Schema Collision
> ---
>
> Key: CASSANDRA-14957
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14957
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Schema
>Reporter: Avraham Kalvo
>Priority: Major
>
> We were issuing a rolling restart on a mission-critical five node C* cluster.
> The first node which was restarted got the following messages in its 
> system.log:
> ```
> January 2nd 2019, 12:06:37.310 - INFO 12:06:35 Initializing 
> tasks_scheduler_external.tasks
> ```
> ```
> WARN 12:06:39 UnknownColumnFamilyException reading from socket; closing
> org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find table for 
> cfId bd7200a0-1567-11e8-8974-855d74ee356f. If a table was just created, this 
> is likely due to the schema not being fully propagated. Please wait for 
> schema agreement on table creation.
> at 
> org.apache.cassandra.config.CFMetaData$Serializer.deserialize(CFMetaData.java:1336)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:660)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:635)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:330)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:349)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:286)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:98) 
> ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:201)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:178)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> at 
> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:92)
>  ~[apache-cassandra-3.0.10.jar:3.0.10]
> ```
> The latter was then repeated several times across the cluster.
> It was then found that the table in question, 
> `tasks_scheduler_external.tasks`, had been created under a new schema version 
> after the entire cluster was restarted consecutively and schema agreement 
> settled. The new version started taking requests, leaving the previous 
> version of the schema unavailable to any request and thus causing data loss 
> in our online system.
> Data loss was recovered by manually copying SSTables from the previous 
> version directory of the schema to the new one followed by `nodetool refresh` 
> to the relevant table.
> The above has repeated itself for several tables across various keyspaces.
> One other thing to mention is that a repair was in place for the first node 
> to be restarted, which was obviously stopped as the daemon was shut down, but 
> this doesn't seem related to the above at first glance.
> Seems somewhat related to:
> https://issues.apache.org/jira/browse/CASSANDRA-13559






[jira] [Updated] (CASSANDRA-14931) Backport In-JVM dtests to 2.2, 3.0 and 3.11

2019-01-08 Thread Alex Petrov (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-14931:

Reviewer: Alex Petrov

> Backport In-JVM dtests to 2.2, 3.0 and 3.11
> ---
>
> Key: CASSANDRA-14931
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14931
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
> Fix For: 2.2.14, 3.0.18, 3.11.4
>
>
> The In-JVM dtests are of significant value, and much of the testing we are 
> exploring with it can easily be utilised on all presently maintained 
> versions.  We should backport the functionality to at least 3.0.x and 3.11.x 
> - and perhaps even consider 2.2.






[jira] [Comment Edited] (CASSANDRA-14914) Deserialization Error

2019-01-08 Thread EDSON VICENTE CARLI JUNIOR (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737136#comment-16737136
 ] 

EDSON VICENTE CARLI JUNIOR edited comment on CASSANDRA-14914 at 1/8/19 1:50 PM:


No, I did not use custom timestamps in my application


was (Author: gandbranco):
No, I not used no one custom timestamp into my application

> Deserialization Error
> -
>
> Key: CASSANDRA-14914
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14914
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
> Environment: I use cassandra 3.9, but I tried to upgrade to 3.11 and 
> nothing has changed.
>Reporter: EDSON VICENTE CARLI JUNIOR
>Priority: Critical
> Fix For: 3.11.x
>
> Attachments: mutation4465429258841992355dat
>
>
>  
>  I have a single Cassandra node; this error now appears when I start the server:
> {code:java}
> ERROR 11:18:45 Exiting due to error while processing commit log during 
> initialization.
> org.apache.cassandra.db.commitlog.CommitLogReadHandler$CommitLogReadException:
>  Unexpected error deserializing mutation; saved to 
> /tmp/mutation4787806670239768067dat.  This may be caused by replaying a 
> mutation against a table with the same name but incompatible schema.  
> Exception follows: org.apache.cassandra.serializers.MarshalException: A local 
> deletion time should not be negative
> {code}
> If I delete all the commitlog and saved_caches files the server comes up, but 
> the next day, when I restart Cassandra, the error occurs again.
> The file mutationDDdat changes its name on each restart. I attached an 
> example mutation file.
> What's wrong? How can I make Cassandra stable again?
>  






[jira] [Commented] (CASSANDRA-14914) Deserialization Error

2019-01-08 Thread EDSON VICENTE CARLI JUNIOR (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737136#comment-16737136
 ] 

EDSON VICENTE CARLI JUNIOR commented on CASSANDRA-14914:


No, I did not use any custom timestamps in my application

> Deserialization Error
> -
>
> Key: CASSANDRA-14914
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14914
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
> Environment: I use cassandra 3.9, but I tried to upgrade to 3.11 and 
> nothing has changed.
>Reporter: EDSON VICENTE CARLI JUNIOR
>Priority: Critical
> Fix For: 3.11.x
>
> Attachments: mutation4465429258841992355dat
>
>
>  
>  I have a single Cassandra node; this error now appears when I start the server:
> {code:java}
> ERROR 11:18:45 Exiting due to error while processing commit log during 
> initialization.
> org.apache.cassandra.db.commitlog.CommitLogReadHandler$CommitLogReadException:
>  Unexpected error deserializing mutation; saved to 
> /tmp/mutation4787806670239768067dat.  This may be caused by replaying a 
> mutation against a table with the same name but incompatible schema.  
> Exception follows: org.apache.cassandra.serializers.MarshalException: A local 
> deletion time should not be negative
> {code}
> If I delete all the commitlog and saved_caches files the server comes up, but 
> the next day, when I restart Cassandra, the error occurs again.
> The file mutationDDdat changes its name on each restart. I attached an 
> example mutation file.
> What's wrong? How can I make Cassandra stable again?
>  






[jira] [Commented] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown

2019-01-08 Thread Alex Petrov (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737083#comment-16737083
 ] 

Alex Petrov commented on CASSANDRA-14922:
-

The patch looks good, and I'd say [~jolynch] let's merge it, since tests have 
been failing for a while now, unless there's something else you wanted to 
include in the patch immediately. 

I have a couple of minor suggestions. All of the issues are easier to see and 
reproduce with a very small heap, ~256 MB: 
  * Hints are leaking direct memory 
  * Thread locals are leaked 
  * FastThreadLocalThread thread locals are leaked (sorry for the tongue-twister)

I've put together a small 
[demo|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:CASSANDRA-14922]
 in case you want to see the impact of the suggested changes.
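For illustration only, here is the general shape of a thread-local leak, using plain `java.lang.ThreadLocal` rather than Cassandra's metrics classes or Netty's `FastThreadLocal`: state created lazily per thread stays strongly reachable through the thread's internal map until the thread dies or the entry is removed, so an instance shutdown path needs an explicit release step.

```java
// Illustrative sketch only (plain java.lang.ThreadLocal, not Cassandra's or
// Netty's classes). A ThreadLocal entry pins its value -- here a 1 MiB
// buffer -- for as long as the owning thread lives, unless the entry is
// removed explicitly during shutdown.
public class ThreadLocalLeakSketch {
    private static final ThreadLocal<byte[]> BUFFER =
            ThreadLocal.withInitial(() -> new byte[1024 * 1024]);

    // The first call on a given thread lazily allocates that thread's buffer.
    public static int useBuffer() {
        return BUFFER.get().length;
    }

    // Without this, the buffer stays reachable via the thread's internal
    // ThreadLocalMap even after the "instance" is logically shut down.
    public static void release() {
        BUFFER.remove();
    }

    public static void main(String[] args) {
        System.out.println(useBuffer()); // prints: 1048576
        release();
    }
}
```

With long-lived pooled threads (as in the in-JVM dtests, where instances share a JVM), skipping the `release()` step accumulates one retained buffer per thread per test.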

> In JVM dtests need to clean up after instance shutdown
> --
>
> Key: CASSANDRA-14922
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14922
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Minor
> Attachments: AllThreadsStopped.png, ClassLoadersRetaining.png, 
> Leaking_Metrics_On_Shutdown.png, MainClassRetaining.png, 
> Metaspace_Actually_Collected.png, OnlyThreeRootsLeft.png, 
> no_more_references.png
>
>
> Currently the unit tests are failing on circleci ([example 
> one|https://circleci.com/gh/jolynch/cassandra/300#tests/containers/1], 
> [example 
> two|https://circleci.com/gh/rustyrazorblade/cassandra/44#tests/containers/1]) 
> because we use a small container (medium) for unit tests by default and the 
> in JVM dtests are leaking a few hundred megabytes of memory per test right 
> now. This is not a big deal yet: the dtest runs with the larger containers 
> continue to function fine, as does local testing, because the number of 
> in-JVM dtests is not yet high enough to cause problems with more than 2GB of 
> available heap. However, we should fix the memory leak so that going forward 
> we can add more in-JVM dtests without worry.
> I've been working with [~ifesdjeen] to debug, and the issue appears to be 
> unreleased Table/Keyspace metrics (screenshot showing the leak attached). I 
> believe that we have a few potential issues that are leading to the leaks:
> 1. The 
> [{{Instance::shutdown}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/Instance.java#L328-L354]
>  method is not successfully cleaning up all the metrics created by the 
> {{CassandraMetricsRegistry}}
>  2. The 
> [{{TestCluster::close}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/TestCluster.java#L283]
>  method is not waiting for all the instances to finish shutting down and 
> cleaning up before continuing on
> 3. I'm not sure if this is an issue assuming we clear all metrics, but 
> [{{TableMetrics::release}}|https://github.com/apache/cassandra/blob/4ae229f5cd270c2b43475b3f752a7b228de260ea/src/java/org/apache/cassandra/metrics/TableMetrics.java#L951]
>  does not release all the metric references (which could leak them)
> I am working on a patch which shuts down everything and ensures that we do 
> not leak memory.






[jira] [Updated] (CASSANDRA-14937) Multi-version In-JVM dtests

2019-01-08 Thread Alex Petrov (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-14937:

Reviewer: Alex Petrov

> Multi-version In-JVM dtests
> ---
>
> Key: CASSANDRA-14937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14937
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Test/dtest
>Reporter: Benedict
>Assignee: Benedict
>Priority: Major
> Fix For: 2.2.x, 3.0.x, 3.11.x
>
>
> In order to support more sophisticated upgrade tests, including complex fuzz 
> tests that can span a sequence of version upgrades, we propose abstracting a 
> cross-version API for the in-jvm dtests.  This will permit starting a node 
> with an arbitrary compatible C* version, stopping the node, and restarting it 
> with another C* version.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14905) if SizeEstimatesRecorder misses a 'onDropTable' notification, the size_estimates table will never be cleared for that table.

2019-01-08 Thread Aleksandr Sorokoumov (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736995#comment-16736995
 ] 

Aleksandr Sorokoumov commented on CASSANDRA-14905:
--

Thanks for the review! It would be great if both authors could get the credit; 
otherwise, please give it to Joel.

> if SizeEstimatesRecorder misses a 'onDropTable' notification, the 
> size_estimates table will never be cleared for that table.
> 
>
> Key: CASSANDRA-14905
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14905
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Aleksandr Sorokoumov
>Assignee: Aleksandr Sorokoumov
>Priority: Minor
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
> Attachments: 14905-3.0-dtest.png, 14905-3.0-testall.png, 
> 14905-3.11-dtest.png, 14905-3.11-testall.png, 14905-4.0-dtest.png, 
> 14905-4.0-testall.png
>
>
> If a node is down when a keyspace/table is dropped, it will receive the 
> schema notification before the size estimates listener is registered, so the 
> entries for the dropped keyspace/table will never be cleaned from the table. 






[jira] [Commented] (CASSANDRA-14956) Paged Range Slice queries with DISTINCT can drop rows from results

2019-01-08 Thread Sam Tunnicliffe (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736989#comment-16736989
 ] 

Sam Tunnicliffe commented on CASSANDRA-14956:
-

Pushed a 2.1 branch after the discussion on dev@ about one last 2.1 release 
before EOL. Unfortunately, CircleCI no longer supports the v1.0 job 
configuration, so a CI run will need the v2.0 config backported (which we may 
want to do before a release anyway).

[14956-2.1|https://github.com/beobal/cassandra/tree/14956-2.1]

> Paged Range Slice queries with DISTINCT can drop rows from results
> --
>
> Key: CASSANDRA-14956
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14956
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL/Interpreter
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Major
> Fix For: 2.2.14
>
>
> If we have a partition where the first CQL row is fully deleted (possibly via 
> TTLs), and that partition happens to fall on the page boundary of a paged 
> range query which is using SELECT DISTINCT, the next live partition *after* 
> it is omitted from the result set. This is due to over-fetching of pages and 
> a bug in trimming those pages where they overlap.
> This does not affect 3.0+.






[jira] [Commented] (CASSANDRA-14688) Update protocol spec and class level doc with protocol checksumming details

2019-01-08 Thread Alex Petrov (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736941#comment-16736941
 ] 

Alex Petrov commented on CASSANDRA-14688:
-

Thank you for adding this much-needed doc! 

+1. The patch looks good; I just have a couple of nits: 
  * Bytes for lengths seem to be added in [big-endian 
order|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/transport/frame/checksum/ChecksummingTransformer.java#L353]
 for checksumming (which is also the protocol's default), and the code is 
equivalent to calling {{ByteBuffer#putInt}}. Since the ordering is handled 
explicitly here, despite the {{ByteBuffer}} overload of {{Checksum#of}}, do we 
want to specify the endianness explicitly?
  * This is more of a code comment, but since it's always good to have a 
mapping from the code to the protocol documentation: currently 
[numCompressedChunks|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/transport/frame/checksum/ChecksummingTransformer.java#L177]
 is a variable that represents the number of all chunks, not only the 
compressed ones. Maybe we'd like to rename it to reduce ambiguity.

The remaining comments, of even smaller significance, are in the patch.
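To make the endianness nit concrete, here is a small standalone sketch (not Cassandra's code; the class and method names are illustrative): assembling a length byte-by-byte from the most significant byte down produces exactly the same bytes as `ByteBuffer#putInt`, whose default byte order is big-endian.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Illustrative sketch (not Cassandra's ChecksummingTransformer): writing an
// int length one byte at a time, most significant byte first, yields the
// same byte sequence as ByteBuffer#putInt, since ByteBuffer defaults to
// ByteOrder.BIG_ENDIAN.
public class BigEndianSketch {
    // Manual big-endian encoding: shift each byte down and truncate.
    public static byte[] manual(int v) {
        return new byte[] {
            (byte) (v >>> 24), (byte) (v >>> 16), (byte) (v >>> 8), (byte) v
        };
    }

    // Equivalent encoding via ByteBuffer, whose default order is big-endian.
    public static byte[] viaByteBuffer(int v) {
        ByteBuffer buf = ByteBuffer.allocate(4);
        buf.putInt(v);
        return buf.array();
    }

    public static void main(String[] args) {
        int length = 0x0A0B0C0D;
        System.out.println(Arrays.equals(manual(length), viaByteBuffer(length))); // prints: true
    }
}
```

This is also why feeding the manually assembled bytes into a checksum matches feeding it the `ByteBuffer` contents: both see the same big-endian byte sequence.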

> Update protocol spec and class level doc with protocol checksumming details
> ---
>
> Key: CASSANDRA-14688
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14688
> Project: Cassandra
>  Issue Type: Task
>  Components: Legacy/Documentation and Website
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Major
> Fix For: 4.0
>
>
> CASSANDRA-13304 provides an option to add checksumming to the frame body of 
> native protocol messages. The native protocol spec needs to be updated to 
> reflect this ASAP. We should also verify that the javadoc comments describing 
> the on-wire format in 
> {{o.a.c.transport.frame.checksum.ChecksummingTransformer}} are up to date.


