[jira] [Commented] (CASSANDRA-14957) Rolling Restart Of Nodes Cause Dataloss Due To Schema Collision
[ https://issues.apache.org/jira/browse/CASSANDRA-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16737948#comment-16737948 ] Alex Petrov commented on CASSANDRA-14957:
-

[~via.vokal] was it the same version, or did you upgrade between minor versions?

> Rolling Restart Of Nodes Cause Dataloss Due To Schema Collision
> ---
>
> Key: CASSANDRA-14957
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14957
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Schema
> Reporter: Avraham Kalvo
> Priority: Major
>
> We were issuing a rolling restart on a mission-critical five-node C* cluster.
> The first node that was restarted logged the following messages in its system.log:
> ```
> January 2nd 2019, 12:06:37.310 - INFO 12:06:35 Initializing tasks_scheduler_external.tasks
> ```
> ```
> WARN 12:06:39 UnknownColumnFamilyException reading from socket; closing
> org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find table for cfId bd7200a0-1567-11e8-8974-855d74ee356f. If a table was just created, this is likely due to the schema not being fully propagated. Please wait for schema agreement on table creation.
> at org.apache.cassandra.config.CFMetaData$Serializer.deserialize(CFMetaData.java:1336) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:660) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:635) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:330) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:349) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:286) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:98) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:201) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:178) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:92) ~[apache-cassandra-3.0.10.jar:3.0.10]
> ```
> The latter was then repeated several times across the cluster.
> It was then found out that the table in question `tasks_scheduler_external.tasks` was created with a new schema version after the entire cluster was restarted consecutively and schema agreement settled; the new version started taking requests, leaving the previous version of the schema unavailable for any request, thus causing data loss to our online system.
> Data loss was recovered by manually copying SSTables from the previous schema version's directory to the new one, followed by `nodetool refresh` on the relevant table.
> The above has repeated itself for several tables across various keyspaces.
> One other thing to mention: a repair was in progress on the first node to be restarted, which was obviously stopped as the daemon was shut down, but at first glance this doesn't seem related to the above.
> Seems somewhat related to: https://issues.apache.org/jira/browse/CASSANDRA-13559

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
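The manual recovery described in the report (copying SSTables from the previous schema version's directory into the newly created one, then running `nodetool refresh`) can be sketched roughly as follows. The data-directory path and ownership are assumptions, and the script only builds and echoes the commands (a dry run) rather than touching a live node:

```shell
# Dry-run sketch of the manual recovery: move SSTables from the orphaned
# old-cfId table directory into the newly created one, then have Cassandra
# load them with `nodetool refresh`. Paths are illustrative assumptions.
DATA_DIR="/var/lib/cassandra/data/tasks_scheduler_external"
OLD_DIR="$DATA_DIR/tasks-bd7200a0156711e88974855d74ee356f"
NEW_DIR="$DATA_DIR/tasks-bd750de0156711e8bdc54f7bcdcb851f"
REFRESH_CMD="nodetool refresh tasks_scheduler_external tasks"

# Echo instead of execute; drop the `echo`s to actually run the commands.
echo "cp -a $OLD_DIR/* $NEW_DIR/"
echo "chown -R cassandra:cassandra $NEW_DIR"
echo "$REFRESH_CMD"
```

`nodetool refresh <keyspace> <table>` picks up SSTables placed in the table's data directory without a restart, which matches the recovery described above.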
[jira] [Commented] (CASSANDRA-14957) Rolling Restart Of Nodes Cause Dataloss Due To Schema Collision
[ https://issues.apache.org/jira/browse/CASSANDRA-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16737908#comment-16737908 ] Avraham Kalvo commented on CASSANDRA-14957:
---

Thanks [~jeromatron],
Looking at the data directory for one of the keyspaces in question right now, even though a couple of full primary-range repairs have completed across the cluster since the outage, the following is apparent:
```
36 drwxr-xr-x. 4 root root 28672 Jan 2 12:07 tasks-bd7200a0156711e88974855d74ee356f
8 drwxr-xr-x. 4 root root 4096 Jan 9 06:38 tasks-bd750de0156711e8bdc54f7bcdcb851f
```
Data was not lost from disk, but it was no longer available for reads/writes via the database, i.e. effectively lost to the application.
As far as I know, anti-entropy actions don't take care of the above situation, and indeed it needed to be recovered manually as described in the original comment for this issue.
Writes only began to succeed once schema agreement had settled across the whole cluster; until then, the application was timing out on any request to Cassandra.
What do you think?
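Whether schema agreement has settled, as discussed in the comment above, can be checked from the command line: `nodetool describecluster` lists the schema versions each node reports. A small parse of its output counts the distinct versions; the sample output embedded below is illustrative, and in practice you would pipe the real command instead:

```shell
# Count distinct schema versions from `nodetool describecluster` output.
# The sample below stands in for the real command's output; in practice:
#   nodetool describecluster | grep -c ': \['
sample_output='Cluster Information:
	Name: Production
	Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
	Schema versions:
		bd7200a0-1567-11e8-8974-855d74ee356f: [10.0.0.1, 10.0.0.2]
		bd750de0-1567-11e8-bdc5-4f7bcdcb851f: [10.0.0.3, 10.0.0.4, 10.0.0.5]'

# Each schema-version line has the shape "<uuid>: [ip, ip, ...]".
versions=$(printf '%s\n' "$sample_output" | grep -c ': \[')
echo "distinct schema versions: $versions"
# More than one distinct version means the cluster has not reached agreement.
```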
[jira] [Comment Edited] (CASSANDRA-14957) Rolling Restart Of Nodes Cause Dataloss Due To Schema Collision
[ https://issues.apache.org/jira/browse/CASSANDRA-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16737908#comment-16737908 ] Avraham Kalvo edited comment on CASSANDRA-14957 at 1/9/19 7:03 AM:
---

Thanks [~jeromatron],
Looking at the data directory for one of the keyspaces in question right now, even though a couple of full primary-range repairs have completed across the cluster since the outage, the following is apparent:
```
36 drwxr-xr-x. 4 root root 28672 Jan 2 12:07 tasks-bd7200a0156711e88974855d74ee356f
8 drwxr-xr-x. 4 root root 4096 Jan 9 06:38 tasks-bd750de0156711e8bdc54f7bcdcb851f
```
and with the following sizes:
```
$ du -sh tasks*
2.7G	tasks-bd7200a0156711e88974855d74ee356f
522M	tasks-bd750de0156711e8bdc54f7bcdcb851f
```
Data was not lost from disk, but it was no longer available for reads/writes via the database, i.e. effectively lost to the application.
As far as I know, anti-entropy actions don't take care of the above situation, and indeed it needed to be recovered manually as described in the original comment for this issue.
Writes only began to succeed once schema agreement had settled across the whole cluster; until then, the application was timing out on any request to Cassandra.
What do you think?
[jira] [Comment Edited] (CASSANDRA-14972) Provide a script or method to generate the entire website at the push of a button
[ https://issues.apache.org/jira/browse/CASSANDRA-14972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16737893#comment-16737893 ] Anthony Grasso edited comment on CASSANDRA-14972 at 1/9/19 6:39 AM:

The steps involved to generate the entire website from scratch are
# SVN checkout - [https://svn.apache.org/repos/asf/cassandra/site|https://svn.apache.org/repos/asf/cassandra/site]
# Git clone Cassandra - [g...@github.com:apache/cassandra.git|https://github.com/apache/cassandra]
# Set the environment variable {{CASSANDRA_DIR}} to point to the SVN Cassandra checkout location
# Install *Ruby 2.3.x* and *Make* for building the website
# Install *Java 1.8.x*, *Ant*, *Maven*, *Python 2.7.x*, and *Python Sphinx* for generating the Cassandra docs
# From the SVN Cassandra checkout install *Jekyll* - {{gem install bundler && bundle install}}
# From the SVN Cassandra checkout _src_ directory run {{make add-doc}} to make the docs for latest
# From the Git Cassandra checkout change to branch {{cassandra-3.11}} - {{git checkout cassandra-3.11}}
# From the SVN Cassandra checkout _src_ directory run {{make add-doc}} again to make the docs for 3.11
# From the SVN Cassandra checkout _src_ directory run {{make}} to build the website

Probably the easiest way to do this is to have a Docker container which installs all the tools required to build the docs and the site. Inside the container, have an entry-point script which performs the tasks to generate the docs and the site.

This is my first take on it. These files are to be placed in the _svn.apache.org/repos/asf/cassandra/site_ directory.
* [^docker-compose.yml]
* [^docker-entrypoint.sh]
* [^Dockerfile]
> Provide a script or method to generate the entire website at the push of a button
> -
>
> Key: CASSANDRA-14972
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14972
> Project: Cassandra
> Issue Type: Task
> Components: Documentation/Website
> Reporter: Anthony Grasso
> Assignee: Anthony Grasso
> Priority: Major
> Attachments: Dockerfile, docker-compose.yml, docker-entrypoint.sh
>
> The process to generate the website involves two repositories (Git and SVN), a range of tools, and a number of steps.
> It would be good to have a script or something similar which we run and it will generate the entire website for us, ready to commit back into SVN for publication.
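The enumerated build steps can be collapsed into one script, which is roughly what the proposed entry-point script would do. This is a sketch, not the attached `docker-entrypoint.sh`: the checkout paths are assumptions, and the `run` helper only echoes each command (dry run), since executing them requires the SVN/Git checkouts and the full toolchain:

```shell
# Dry-run sketch of the website build steps; swap `echo` for `eval "$*"`
# in run() to actually execute. Paths are illustrative assumptions.
SITE_DIR="$HOME/cassandra-site"   # SVN checkout of cassandra/site
SRC_DIR="$HOME/cassandra"         # Git clone of apache/cassandra
export CASSANDRA_DIR="$SRC_DIR"

run() { echo "+ $*"; }

run svn checkout https://svn.apache.org/repos/asf/cassandra/site "$SITE_DIR"
run git clone https://github.com/apache/cassandra "$SRC_DIR"
run "cd $SITE_DIR && gem install bundler && bundle install"   # Jekyll
run "cd $SITE_DIR/src && make add-doc"                        # docs for latest
run "git -C $SRC_DIR checkout cassandra-3.11"
run "cd $SITE_DIR/src && make add-doc"                        # docs for 3.11
run "cd $SITE_DIR/src && make"                                # build the site
```

Baking this into a Docker image, as suggested above, would pin the Ruby/Java/Python tool versions so the build is reproducible on any machine.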
[jira] [Updated] (CASSANDRA-14972) Provide a script or method to generate the entire website at the push of a button
[ https://issues.apache.org/jira/browse/CASSANDRA-14972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anthony Grasso updated CASSANDRA-14972:
---
Attachment: Dockerfile
            docker-entrypoint.sh
            docker-compose.yml
[jira] [Commented] (CASSANDRA-14972) Provide a script or method to generate the entire website at the push of a button
[ https://issues.apache.org/jira/browse/CASSANDRA-14972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16737893#comment-16737893 ] Anthony Grasso commented on CASSANDRA-14972:

The steps involved to generate the entire website from scratch are
# SVN checkout - [https://svn.apache.org/repos/asf/cassandra/site|https://svn.apache.org/repos/asf/cassandra/site]
# Git clone Cassandra - [g...@github.com:apache/cassandra.git|https://github.com/apache/cassandra]
# Set the environment variable {{CASSANDRA_DIR}} to point to the SVN Cassandra checkout location
# Install *Ruby 2.3.x* and *Make* for building the website
# Install *Java 1.8.x*, *Python 2.7.x*, and *Python Sphinx* for generating the Cassandra docs
# From the SVN Cassandra checkout install *Jekyll* - {{gem install bundler && bundle install}}
# From the SVN Cassandra checkout _src_ directory run {{make add-doc}} to make the docs for latest
# From the Git Cassandra checkout change to branch {{cassandra-3.11}} - {{git checkout cassandra-3.11}}
# From the SVN Cassandra checkout _src_ directory run {{make add-doc}} again to make the docs for 3.11
# From the SVN Cassandra checkout _src_ directory run {{make}} to build the website

Probably the easiest way to do this is to have a Docker container which installs all the tools required to build the docs and the site. Inside the container, have an entry-point script which performs the tasks to generate the docs and the site.

This is my first take on it. These files are to be placed in the _svn.apache.org/repos/asf/cassandra/site_ directory.
* [^docker-compose.yml]
* [^docker-entrypoint.sh]
* [^Dockerfile]
[jira] [Created] (CASSANDRA-14972) Provide a script or method to generate the entire website at the push of a button
Anthony Grasso created CASSANDRA-14972:
--
Summary: Provide a script or method to generate the entire website at the push of a button
Key: CASSANDRA-14972
URL: https://issues.apache.org/jira/browse/CASSANDRA-14972
Project: Cassandra
Issue Type: Task
Components: Documentation/Website
Reporter: Anthony Grasso
Assignee: Anthony Grasso

The process to generate the website involves two repositories (Git and SVN), a range of tools, and a number of steps.
It would be good to have a script or something similar which we run and it will generate the entire website for us, ready to commit back into SVN for publication.
[jira] [Commented] (CASSANDRA-14971) Website documentation search function returns broken links
[ https://issues.apache.org/jira/browse/CASSANDRA-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16737883#comment-16737883 ] Anthony Grasso commented on CASSANDRA-14971:

Awesome! Thanks [~michaelsembwever].

> Website documentation search function returns broken links
> ---
>
> Key: CASSANDRA-14971
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14971
> Project: Cassandra
> Issue Type: Bug
> Components: Documentation/Website
> Reporter: Anthony Grasso
> Assignee: Anthony Grasso
> Priority: Major
> Attachments: CASSANDRA-14971_v01.patch
>
> The search bar on the main page of the [Cassandra Documentation|http://cassandra.apache.org/doc/latest/] returns search [results|http://cassandra.apache.org/doc/latest/search.html?q=cache&check_keywords=yes&area=default] with broken links.
> When a link from a returned search is clicked, the site returns a 404 with a message similar to this:
> {quote}The requested URL /doc/latest/tools/nodetool/nodetool.rst.html was not found on this server.
> {quote}
> From the error, it appears that the links point to pages whose names end in *.rst.html*. The links should point to pages ending in *.html*.
[jira] [Updated] (CASSANDRA-14971) Website documentation search function returns broken links
[ https://issues.apache.org/jira/browse/CASSANDRA-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mck updated CASSANDRA-14971:

Resolution: Fixed
Status: Resolved (was: Patch Available)

Committed as r1850821
svn commit: r1850821 - in /cassandra/site: publish/js/searchtools.js src/js/searchtools.js
Author: mck
Date: Wed Jan  9 05:12:13 2019
New Revision: 1850821

URL: http://svn.apache.org/viewvc?rev=1850821&view=rev
Log:
Website documentation search function returns broken links

patch by Anthony Grasso; reviewed by Mick Semb Wever for CASSANDRA-14971

Modified:
    cassandra/site/publish/js/searchtools.js
    cassandra/site/src/js/searchtools.js

Modified: cassandra/site/publish/js/searchtools.js
URL: http://svn.apache.org/viewvc/cassandra/site/publish/js/searchtools.js?rev=1850821&r1=1850820&r2=1850821&view=diff
==============================================================================
--- cassandra/site/publish/js/searchtools.js (original)
+++ cassandra/site/publish/js/searchtools.js Wed Jan  9 05:12:13 2019
@@ -473,7 +473,7 @@ var Search = {
    * search for object names
    */
   performObjectSearch : function(object, otherterms) {
-    var filenames = this._index.filenames;
+    var filenames = this._index.docnames;
     var objects = this._index.objects;
     var objnames = this._index.objnames;
     var titles = this._index.titles;
@@ -539,7 +539,7 @@ var Search = {
    * search for full-text terms in the index
    */
   performTermsSearch : function(searchterms, excluded, terms, titleterms) {
-    var filenames = this._index.filenames;
+    var filenames = this._index.docnames;
     var titles = this._index.titles;
     var i, j, file;

Modified: cassandra/site/src/js/searchtools.js
URL: http://svn.apache.org/viewvc/cassandra/site/src/js/searchtools.js?rev=1850821&r1=1850820&r2=1850821&view=diff
==============================================================================
--- cassandra/site/src/js/searchtools.js (original)
+++ cassandra/site/src/js/searchtools.js Wed Jan  9 05:12:13 2019
@@ -473,7 +473,7 @@ var Search = {
    * search for object names
    */
   performObjectSearch : function(object, otherterms) {
-    var filenames = this._index.filenames;
+    var filenames = this._index.docnames;
     var objects = this._index.objects;
     var objnames = this._index.objnames;
     var titles = this._index.titles;
@@ -539,7 +539,7 @@ var Search = {
    * search for full-text terms in the index
    */
   performTermsSearch : function(searchterms, excluded, terms, titleterms) {
-    var filenames = this._index.filenames;
+    var filenames = this._index.docnames;
     var titles = this._index.titles;
     var i, j, file;
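The commit replaces the stale Sphinx search-index key `filenames` with `docnames` in both copies of `searchtools.js`. The substitution is mechanical, as the snippet below demonstrates with `sed` on an inline sample line; the batch command in the comment uses the file paths from the commit and is shown only as a sketch:

```shell
# Demonstrate the searchtools.js fix: swap the stale Sphinx index key
# `filenames` for `docnames`. The sample line stands in for the real file.
sample='var filenames = this._index.filenames;'
fixed=$(printf '%s\n' "$sample" | sed 's/this\._index\.filenames/this._index.docnames/')
echo "$fixed"

# Against the real checkouts it would be roughly:
#   sed -i 's/this\._index\.filenames/this._index.docnames/' \
#       cassandra/site/publish/js/searchtools.js cassandra/site/src/js/searchtools.js
```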
[jira] [Comment Edited] (CASSANDRA-14971) Website documentation search function returns broken links
[ https://issues.apache.org/jira/browse/CASSANDRA-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16737845#comment-16737845 ] mck edited comment on CASSANDRA-14971 at 1/9/19 5:09 AM:
-

Solid write-up, thanks [~Anthony Grasso]. Patch is +1 from me. Going to test it.
[jira] [Updated] (CASSANDRA-14971) Website documentation search function returns broken links
[ https://issues.apache.org/jira/browse/CASSANDRA-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mck updated CASSANDRA-14971:

Reviewer: mck (was: Mick Semb Wever)
[jira] [Commented] (CASSANDRA-14971) Website documentation search function returns broken links
[ https://issues.apache.org/jira/browse/CASSANDRA-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16737845#comment-16737845 ] mck commented on CASSANDRA-14971:
-

Solid write-up, thanks [~Anthony Grasso]. Patch is +1 from me. Going to test it, commit it, then update the website.
[jira] [Comment Edited] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums
[ https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737808#comment-16737808 ] Michael Shuler edited comment on CASSANDRA-14970 at 1/9/19 3:35 AM: Our current release process uploads/signs/checksums the tar.gz and maven artifacts to nexus via the 'publish' task, then we vote. After vote, we download the tar.gz/.md5/.sha1 files for final release and promote the staging repo to release. Since the MD5 and SHA files are there in build.xml, I thought the patch for creating the .sha256/.sha512 checksums in the 'release' target were used for release build. They are not. I gave another try at uploading the .sha256/.sha512 files, but realized we never build them due to the target dependencies, so looked a little more. I created ant target graphs for 2.1 and trunk to get an idea of the target relations. The release task I patched isn't depended on by anything, and currently is completely unused in our release process. build_cassandra-2.1.png build_trunk.png (edit: removed no-thumb images - they are attached..) was (Author: mshuler): Our current release process uploads/signs/checksums the tar.gz and maven artifacts to nexus, then we vote. After vote, we download the tar.gz/.md5/.sha1 files for final release and promote the staging repo to release. Since the MD5 and SHA files are there in build.xml, I thought the patch for creating the .sha256/.sha512 checksums in the 'release' target were used for release build. They are not. I gave another try at uploading the .sha256/.sha512 files, but realized we never build them due to the target dependencies, so looked a little more. I created ant target graphs for 2.1 and trunk to get an idea of the target relations. The release task I patched isn't depended on by anything, and currently is completely unused in our release process. build_cassandra-2.1.png build_trunk.png (edit: removed no-thumb images - they are attached..) 
> New releases must supply SHA-256 and/or SHA-512 checksums > - > > Key: CASSANDRA-14970 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14970 > Project: Cassandra > Issue Type: Bug > Components: Packaging >Reporter: Michael Shuler >Assignee: Michael Shuler >Priority: Blocker > Fix For: 2.1.21, 2.2.14, 3.0.18, 3.11.4, 4.0 > > Attachments: > 0001-Update-downloads-for-sha256-sha512-checksum-files.patch, > 0001-Update-release-checksum-algorithms-to-SHA-256-SHA-512.patch, > ant-publish-checksum-fail.jpg, build_cassandra-2.1.png, build_trunk.png > > > Release policy was updated around 9/2018 to state: > "For new releases, PMCs MUST supply SHA-256 and/or SHA-512; and SHOULD NOT > supply MD5 or SHA-1. Existing releases do not need to be changed." > build.xml needs to be updated from MD5 & SHA-1 to, at least, SHA-256 or both. > cassandra-builds/cassandra-release scripts need to be updated to work with > the new checksum files. > http://www.apache.org/dev/release-distribution#sigs-and-sums -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
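For reference, the SHA-256/SHA-512 checksum files the policy asks for can be produced with the JDK's built-in `MessageDigest`. This is a minimal standalone sketch, not the actual build.xml/ant wiring; the `.sha256`/`.sha512` file naming is an assumption mirroring what the attached patches appear to introduce:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ChecksumFiles {
    // Hex-encode a digest in the lowercase form used by sha256sum/sha512sum output.
    static String hex(byte[] digest) {
        StringBuilder sb = new StringBuilder(digest.length * 2);
        for (byte b : digest)
            sb.append(String.format("%02x", b & 0xff));
        return sb.toString();
    }

    // Compute the named digest over a file's bytes.
    static String checksum(Path file, String algorithm) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance(algorithm);
        return hex(md.digest(Files.readAllBytes(file)));
    }

    public static void main(String[] args) throws Exception {
        Path artifact = Paths.get(args[0]);
        // Write <artifact>.sha256 and <artifact>.sha512 beside the artifact,
        // in place of the deprecated .md5/.sha1 files.
        for (String alg : new String[] { "SHA-256", "SHA-512" }) {
            String ext = alg.equals("SHA-256") ? ".sha256" : ".sha512";
            String line = checksum(artifact, alg) + "  " + artifact.getFileName() + "\n";
            Files.write(Paths.get(artifact + ext), line.getBytes(StandardCharsets.UTF_8));
        }
    }
}
```

The `algorithm  filename` line format matches what GNU coreutils `sha256sum -c` can verify.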
[jira] [Updated] (CASSANDRA-14365) Commit log replay failure for static columns with collections in clustering keys
[ https://issues.apache.org/jira/browse/CASSANDRA-14365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vincent White updated CASSANDRA-14365: -- Description: In the old storage engine, static cells with a collection as part of the clustering key fail to validate because a 0 byte collection (like in the cell name of a static cell) isn't valid. To reproduce: 1. {code:java} CREATE TABLE test.x ( id int, id2 frozen>, st int static, PRIMARY KEY (id, id2) ); INSERT INTO test.x (id, st) VALUES (1, 2); {code} 2. Kill the cassandra process 3. Restart cassandra to replay the commitlog Outcome: {noformat} ERROR [main] 2018-04-05 04:58:23,741 JVMStabilityInspector.java:99 - Exiting due to error while processing commit log during initialization. org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: Unexpected error deserializing mutation; saved to /tmp/mutation3825739904516830950dat. This may be caused by replaying a mutation against a table with the same name but incompatible schema. 
Exception follows: org.apache.cassandra.serializers.MarshalException: Not enough bytes to read a set at org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:638) [main/:na] at org.apache.cassandra.db.commitlog.CommitLogReplayer.replayMutation(CommitLogReplayer.java:565) [main/:na] at org.apache.cassandra.db.commitlog.CommitLogReplayer.replaySyncSection(CommitLogReplayer.java:517) [main/:na] at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:397) [main/:na] at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:143) [main/:na] at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:181) [main/:na] at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:161) [main/:na] at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:284) [main/:na] at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:533) [main/:na] at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:642) [main/:na] {noformat} I haven't investigated if there are other more subtle issues caused by these cells failing to validate other places in the code, but I believe the fix for this is to check for 0 byte length collections and accept them as valid as we do with other types. I haven't had a chance for any extensive testing but this naive patch seems to have the desired affect. ||Patch|| |[2.2 PoC Patch|https://github.com/vincewhite/cassandra/commits/zero_length_collection]| was: In the old storage engine, static cells with a collection as part of the clustering key fail to validate because a 0 byte collection (like in the cell name of a static cell) isn't valid. To reproduce: 1. {code:java} CREATE TABLE test.x ( id int, id2 frozen>, st int static, PRIMARY KEY (id, id2) ); INSERT INTO test.x (id, st) VALUES (1, 2); {code} 2. Kill the cassandra process 3. 
Restart cassandra to replay the commitlog Outcome: {noformat} ERROR [main] 2018-04-05 04:58:23,741 JVMStabilityInspector.java:99 - Exiting due to error while processing commit log during initialization. org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: Unexpected error deserializing mutation; saved to /tmp/mutation3825739904516830950dat. This may be caused by replaying a mutation against a table with the same name but incompatible schema. Exception follows: org.apache.cassandra.serializers.MarshalException: Not enough bytes to read a set at org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:638) [main/:na] at org.apache.cassandra.db.commitlog.CommitLogReplayer.replayMutation(CommitLogReplayer.java:565) [main/:na] at org.apache.cassandra.db.commitlog.CommitLogReplayer.replaySyncSection(CommitLogReplayer.java:517) [main/:na] at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:397) [main/:na] at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:143) [main/:na] at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:181) [main/:na] at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:161) [main/:na] at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:284) [main/:na] at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:533) [main/:na] at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:642) [main/:na] {noformat} I haven't investigated if there are other more subtle issues caused by these cells failing to validate other places in the
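The proposed fix, accepting zero-byte collection values as valid, can be illustrated with a simplified stand-in for set deserialization. The wire format below (an element count followed by length-prefixed elements) is illustrative only, not Cassandra's actual `CollectionSerializer`:

```java
import java.nio.ByteBuffer;

public class SetValidation {
    // Simplified model of a serialized set: a 4-byte element count, then for
    // each element a 4-byte length prefix and the element bytes. A real 2.x
    // serializer also depends on the protocol version; this is a stand-in.
    static void validate(ByteBuffer bytes) {
        ByteBuffer input = bytes.duplicate();
        // The proposed fix: a zero-byte value (as appears in the cell name of
        // a static cell) is accepted as a valid, empty collection instead of
        // failing with "Not enough bytes to read a set".
        if (!input.hasRemaining())
            return;
        if (input.remaining() < 4)
            throw new IllegalArgumentException("Not enough bytes to read a set");
        int n = input.getInt();
        for (int i = 0; i < n; i++) {
            if (input.remaining() < 4)
                throw new IllegalArgumentException("Not enough bytes to read a set");
            int len = input.getInt();
            if (input.remaining() < len)
                throw new IllegalArgumentException("Not enough bytes to read a set");
            input.position(input.position() + len);
        }
    }
}
```

Without the early return, the empty buffer would fall through to the length check and raise the same kind of MarshalException seen during commit log replay.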
[jira] [Comment Edited] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums
[ https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737808#comment-16737808 ] Michael Shuler edited comment on CASSANDRA-14970 at 1/9/19 3:28 AM: Our current release process uploads/signs/checksums the tar.gz and maven artifacts to nexus, then we vote. After vote, we download the tar.gz/.md5/.sha1 files for final release and promote the staging repo to release. Since the MD5 and SHA files are there in build.xml, I thought the patch for creating the .sha256/.sha512 checksums in the 'release' target were used for release build. They are not. I gave another try at uploading the .sha256/.sha512 files, but realized we never build them due to the target dependencies, so looked a little more. I created ant target graphs for 2.1 and trunk to get an idea of the target relations. The release task I patched isn't depended on by anything, and currently is completely unused in our release process. build_cassandra-2.1.png build_trunk.png (edit: removed no-thumb images - they are attached..) was (Author: mshuler): Our current release process uploads/signs/checksums the tar.gz and maven artifacts to nexus, then we vote. After vote, we download the tar.gz/.md5/.sha1 files for final release and promote the staging repo to release. Since the MD5 and SHA files are there in build.xml, I thought the patch for creating the .sha256/.sha512 checksums in the 'release' target were used for release build. They are not. I gave another try at uploading the .sha256/.sha512 files, but realized we never build them due to the target dependencies, so looked a little more. I created ant target graphs for 2.1 and trunk to get an idea of the target relations. The release task I patched isn't depended on by anything, and currently is completely unused in our release process. !build_cassandra-2.1.png! !build_trunk.png! 
> New releases must supply SHA-256 and/or SHA-512 checksums > - > > Key: CASSANDRA-14970 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14970 > Project: Cassandra > Issue Type: Bug > Components: Packaging >Reporter: Michael Shuler >Assignee: Michael Shuler >Priority: Blocker > Fix For: 2.1.21, 2.2.14, 3.0.18, 3.11.4, 4.0 > > Attachments: > 0001-Update-downloads-for-sha256-sha512-checksum-files.patch, > 0001-Update-release-checksum-algorithms-to-SHA-256-SHA-512.patch, > ant-publish-checksum-fail.jpg, build_cassandra-2.1.png, build_trunk.png > > > Release policy was updated around 9/2018 to state: > "For new releases, PMCs MUST supply SHA-256 and/or SHA-512; and SHOULD NOT > supply MD5 or SHA-1. Existing releases do not need to be changed." > build.xml needs to be updated from MD5 & SHA-1 to, at least, SHA-256 or both. > cassandra-builds/cassandra-release scripts need to be updated to work with > the new checksum files. > http://www.apache.org/dev/release-distribution#sigs-and-sums -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums
[ https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Shuler updated CASSANDRA-14970: --- Attachment: build_cassandra-2.1.png > New releases must supply SHA-256 and/or SHA-512 checksums > - > > Key: CASSANDRA-14970 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14970 > Project: Cassandra > Issue Type: Bug > Components: Packaging >Reporter: Michael Shuler >Assignee: Michael Shuler >Priority: Blocker > Fix For: 2.1.21, 2.2.14, 3.0.18, 3.11.4, 4.0 > > Attachments: > 0001-Update-downloads-for-sha256-sha512-checksum-files.patch, > 0001-Update-release-checksum-algorithms-to-SHA-256-SHA-512.patch, > ant-publish-checksum-fail.jpg, build_cassandra-2.1.png, build_trunk.png > > > Release policy was updated around 9/2018 to state: > "For new releases, PMCs MUST supply SHA-256 and/or SHA-512; and SHOULD NOT > supply MD5 or SHA-1. Existing releases do not need to be changed." > build.xml needs to be updated from MD5 & SHA-1 to, at least, SHA-256 or both. > cassandra-builds/cassandra-release scripts need to be updated to work with > the new checksum files. > http://www.apache.org/dev/release-distribution#sigs-and-sums -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums
[ https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737808#comment-16737808 ] Michael Shuler commented on CASSANDRA-14970: Our current release process uploads/signs/checksums the tar.gz and maven artifacts to nexus, then we vote. After vote, we download the tar.gz/.md5/.sha1 files for final release and promote the staging repo to release. Since the MD5 and SHA files are there in build.xml, I thought the patch for creating the .sha256/.sha512 checksums in the 'release' target were used for release build. They are not. I gave another try at uploading the .sha256/.sha512 files, but realized we never build them due to the target dependencies, so looked a little more. I created ant target graphs for 2.1 and trunk to get an idea of the target relations. The release task I patched isn't depended on by anything, and currently is completely unused in our release process. !build_cassandra-2.1.png! !build_trunk.png! > New releases must supply SHA-256 and/or SHA-512 checksums > - > > Key: CASSANDRA-14970 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14970 > Project: Cassandra > Issue Type: Bug > Components: Packaging >Reporter: Michael Shuler >Assignee: Michael Shuler >Priority: Blocker > Fix For: 2.1.21, 2.2.14, 3.0.18, 3.11.4, 4.0 > > Attachments: > 0001-Update-downloads-for-sha256-sha512-checksum-files.patch, > 0001-Update-release-checksum-algorithms-to-SHA-256-SHA-512.patch, > ant-publish-checksum-fail.jpg, build_cassandra-2.1.png, build_trunk.png > > > Release policy was updated around 9/2018 to state: > "For new releases, PMCs MUST supply SHA-256 and/or SHA-512; and SHOULD NOT > supply MD5 or SHA-1. Existing releases do not need to be changed." > build.xml needs to be updated from MD5 & SHA-1 to, at least, SHA-256 or both. > cassandra-builds/cassandra-release scripts need to be updated to work with > the new checksum files. 
> http://www.apache.org/dev/release-distribution#sigs-and-sums
[jira] [Updated] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums
[ https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Shuler updated CASSANDRA-14970: --- Attachment: build_trunk.png > New releases must supply SHA-256 and/or SHA-512 checksums > - > > Key: CASSANDRA-14970 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14970 > Project: Cassandra > Issue Type: Bug > Components: Packaging >Reporter: Michael Shuler >Assignee: Michael Shuler >Priority: Blocker > Fix For: 2.1.21, 2.2.14, 3.0.18, 3.11.4, 4.0 > > Attachments: > 0001-Update-downloads-for-sha256-sha512-checksum-files.patch, > 0001-Update-release-checksum-algorithms-to-SHA-256-SHA-512.patch, > ant-publish-checksum-fail.jpg, build_cassandra-2.1.png, build_trunk.png > > > Release policy was updated around 9/2018 to state: > "For new releases, PMCs MUST supply SHA-256 and/or SHA-512; and SHOULD NOT > supply MD5 or SHA-1. Existing releases do not need to be changed." > build.xml needs to be updated from MD5 & SHA-1 to, at least, SHA-256 or both. > cassandra-builds/cassandra-release scripts need to be updated to work with > the new checksum files. > http://www.apache.org/dev/release-distribution#sigs-and-sums -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14365) Commit log replay failure for static columns with collections in clustering keys
[ https://issues.apache.org/jira/browse/CASSANDRA-14365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vincent White updated CASSANDRA-14365: -- Description: In the old storage engine, static cells with a collection as part of the clustering key fail to validate because a 0 byte collection (like in the cell name of a static cell) isn't valid. To reproduce: 1. {code:java} CREATE TABLE test.x ( id int, id2 frozen>, st int static, PRIMARY KEY (id, id2) ); INSERT INTO test.x (id, st) VALUES (1, 2); {code} 2. Kill the cassandra process 3. Restart cassandra to replay the commitlog Outcome: {noformat} ERROR [main] 2018-04-05 04:58:23,741 JVMStabilityInspector.java:99 - Exiting due to error while processing commit log during initialization. org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: Unexpected error deserializing mutation; saved to /tmp/mutation3825739904516830950dat. This may be caused by replaying a mutation against a table with the same name but incompatible schema. 
Exception follows: org.apache.cassandra.serializers.MarshalException: Not enough bytes to read a set at org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:638) [main/:na] at org.apache.cassandra.db.commitlog.CommitLogReplayer.replayMutation(CommitLogReplayer.java:565) [main/:na] at org.apache.cassandra.db.commitlog.CommitLogReplayer.replaySyncSection(CommitLogReplayer.java:517) [main/:na] at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:397) [main/:na] at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:143) [main/:na] at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:181) [main/:na] at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:161) [main/:na] at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:284) [main/:na] at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:533) [main/:na] at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:642) [main/:na] {noformat} I haven't investigated if there are other more subtle issues caused by these cells failing to validate other places in the code, but I believe the fix for this is to check for 0 byte length collections and accept them as valid as we do with other types. I haven't had a chance for any extensive testing but this naive patch seems to have the desired affect. ||Patch|| |[2.2 PoC|https://github.com/vincewhite/cassandra/commits/zero_length_collection]| was: In the old storage engine, static cells with a collection as part of the clustering key fail to validate because a 0 byte collection (like in the cell name of a static cell) isn't valid. To reproduce: 1. {code:java} CREATE TABLE test.x ( id int, id2 frozen>, st int static, PRIMARY KEY (id, id2) ); INSERT INTO test.x (id, st) VALUES (1, 2); {code} 2. Kill the cassandra process 3. 
Restart cassandra to replay the commitlog Outcome: {noformat} ERROR [main] 2018-04-05 04:58:23,741 JVMStabilityInspector.java:99 - Exiting due to error while processing commit log during initialization. org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: Unexpected error deserializing mutation; saved to /tmp/mutation3825739904516830950dat. This may be caused by replaying a mutation against a table with the same name but incompatible schema. Exception follows: org.apache.cassandra.serializers.MarshalException: Not enough bytes to read a set at org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(CommitLogReplayer.java:638) [main/:na] at org.apache.cassandra.db.commitlog.CommitLogReplayer.replayMutation(CommitLogReplayer.java:565) [main/:na] at org.apache.cassandra.db.commitlog.CommitLogReplayer.replaySyncSection(CommitLogReplayer.java:517) [main/:na] at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:397) [main/:na] at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:143) [main/:na] at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:181) [main/:na] at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:161) [main/:na] at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:284) [main/:na] at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:533) [main/:na] at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:642) [main/:na] {noformat} I haven't investigated if there are other more subtle issues caused by these cells failing to validate other places in the code,
[jira] [Comment Edited] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums
[ https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737732#comment-16737732 ] mck edited comment on CASSANDRA-14970 at 1/9/19 1:36 AM: - [~mshuler] the asf guidelines applies strictly to the distributed convenience binary artefacts. The asf maven repository doesn't support it yet, that is the nexus repo only keeps sha1 on the jarfiles. was (Author: michaelsembwever): [~mshuler] the asf guidelines applies strictly to the distributed convenience binary artefacts. The asf maven repository doesn't support it yet, hat is the nexus repo only keeps sha1 on the jarfiles. > New releases must supply SHA-256 and/or SHA-512 checksums > - > > Key: CASSANDRA-14970 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14970 > Project: Cassandra > Issue Type: Bug > Components: Packaging >Reporter: Michael Shuler >Assignee: Michael Shuler >Priority: Blocker > Fix For: 2.1.21, 2.2.14, 3.0.18, 3.11.4, 4.0 > > Attachments: > 0001-Update-downloads-for-sha256-sha512-checksum-files.patch, > 0001-Update-release-checksum-algorithms-to-SHA-256-SHA-512.patch, > ant-publish-checksum-fail.jpg > > > Release policy was updated around 9/2018 to state: > "For new releases, PMCs MUST supply SHA-256 and/or SHA-512; and SHOULD NOT > supply MD5 or SHA-1. Existing releases do not need to be changed." > build.xml needs to be updated from MD5 & SHA-1 to, at least, SHA-256 or both. > cassandra-builds/cassandra-release scripts need to be updated to work with > the new checksum files. > http://www.apache.org/dev/release-distribution#sigs-and-sums -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14971) Website documentation search function returns broken links
[ https://issues.apache.org/jira/browse/CASSANDRA-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anthony Grasso updated CASSANDRA-14971: --- Reviewer: Mick Semb Wever Attachment: CASSANDRA-14971_v01.patch Status: Patch Available (was: In Progress) Attached {{svn diff}} patch > Website documentation search function returns broken links > --- > > Key: CASSANDRA-14971 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14971 > Project: Cassandra > Issue Type: Bug > Components: Documentation/Website >Reporter: Anthony Grasso >Assignee: Anthony Grasso >Priority: Major > Attachments: CASSANDRA-14971_v01.patch > > > The search bar on the main page of the [Cassandra > Documentation|http://cassandra.apache.org/doc/latest/] returns search > [results|http://cassandra.apache.org/doc/latest/search.html?q=cache_keywords=yes=default] > with broken links. > When a link from a returned search is clicked, the site returns a 404 with > the message similar to this: > {quote}The requested URL /doc/latest/tools/nodetool/nodetool.rst.html was not > found on this server. > {quote} > From the error, it appears that the links are pointing to pages that end in > *.rst.html* in their name. The links should point to pages that end in > *.html*. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14971) Website documentation search function returns broken links
[ https://issues.apache.org/jira/browse/CASSANDRA-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737725#comment-16737725 ] Anthony Grasso commented on CASSANDRA-14971: It looks like the search results are pieced together by the [searchtools.js|https://svn.apache.org/repos/asf/cassandra/site/src/js/searchtools.js] file that lives in the _js_ directory in the SVN [repository|https://svn.apache.org/repos/asf/cassandra/site]. Specifically the {{displayNextItem()}} function walks through the returned results and generates the HTML output. This function generates the filenames using the data in the returned results. The search results are generated by the {{performObjectSearch}} and {{performTermsSearch}} functions. These functions obtain the file information from the search index. In this case, it is the search index file ([searchindex.js|https://svn.apache.org/repos/asf/cassandra/site/src/doc/4.0/searchindex.js] which is generated by Sphinx. It appears that we are referencing the documents in the {{filenames}} list property of the search index. These documents contain the *.rst* extension. We should probably be referencing the documents in the {{docnames}} list property of the search index. > Website documentation search function returns broken links > --- > > Key: CASSANDRA-14971 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14971 > Project: Cassandra > Issue Type: Bug > Components: Documentation/Website >Reporter: Anthony Grasso >Assignee: Anthony Grasso >Priority: Major > > The search bar on the main page of the [Cassandra > Documentation|http://cassandra.apache.org/doc/latest/] returns search > [results|http://cassandra.apache.org/doc/latest/search.html?q=cache_keywords=yes=default] > with broken links. > When a link from a returned search is clicked, the site returns a 404 with > the message similar to this: > {quote}The requested URL /doc/latest/tools/nodetool/nodetool.rst.html was not > found on this server. 
> {quote} > From the error, it appears that the links are pointing to pages that end in > *.rst.html* in their name. The links should point to pages that end in > *.html*.
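The distinction can be sketched in a couple of lines: building links from the search index's {{filenames}} entries (which end in *.rst*) produces the broken URLs, while {{docnames}} entries yield the correct ones. A hypothetical illustration; the real logic lives in {{searchtools.js}}:

```java
public class SearchLinks {
    // Sphinx's searchindex.js exposes both "filenames" (source files, ending
    // in .rst) and "docnames" (extension-less document names). Appending
    // ".html" to a filename yields the broken ".rst.html" URLs.
    static String linkFromFilename(String filename) {
        return "/doc/latest/" + filename + ".html"; // broken: ends in .rst.html
    }

    // Appending ".html" to a docname yields the working URL.
    static String linkFromDocname(String docname) {
        return "/doc/latest/" + docname + ".html";
    }
}
```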
[jira] [Comment Edited] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums
[ https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737732#comment-16737732 ] mck edited comment on CASSANDRA-14970 at 1/9/19 1:38 AM: - [~mshuler] the asf guidelines applies strictly to the distributed convenience binary artefacts. The asf maven repository doesn't support it yet, that is the nexus repo only keeps sha1 on the jarfiles. (No asf project is using sha-256/512 on maven distributables afaik) was (Author: michaelsembwever): [~mshuler] the asf guidelines applies strictly to the distributed convenience binary artefacts. The asf maven repository doesn't support it yet, that is the nexus repo only keeps sha1 on the jarfiles. (No asf project is using sha-25/512 on maven distributables afaik) > New releases must supply SHA-256 and/or SHA-512 checksums > - > > Key: CASSANDRA-14970 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14970 > Project: Cassandra > Issue Type: Bug > Components: Packaging >Reporter: Michael Shuler >Assignee: Michael Shuler >Priority: Blocker > Fix For: 2.1.21, 2.2.14, 3.0.18, 3.11.4, 4.0 > > Attachments: > 0001-Update-downloads-for-sha256-sha512-checksum-files.patch, > 0001-Update-release-checksum-algorithms-to-SHA-256-SHA-512.patch, > ant-publish-checksum-fail.jpg > > > Release policy was updated around 9/2018 to state: > "For new releases, PMCs MUST supply SHA-256 and/or SHA-512; and SHOULD NOT > supply MD5 or SHA-1. Existing releases do not need to be changed." > build.xml needs to be updated from MD5 & SHA-1 to, at least, SHA-256 or both. > cassandra-builds/cassandra-release scripts need to be updated to work with > the new checksum files. > http://www.apache.org/dev/release-distribution#sigs-and-sums -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums
[ https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737732#comment-16737732 ] mck edited comment on CASSANDRA-14970 at 1/9/19 1:38 AM: - [~mshuler] the asf guidelines applies strictly to the distributed convenience binary artefacts. The asf maven repository doesn't support it yet, that is the nexus repo only keeps sha1 on the jarfiles. (No asf project is using sha-25/512 on maven distributables afaik) was (Author: michaelsembwever): [~mshuler] the asf guidelines applies strictly to the distributed convenience binary artefacts. The asf maven repository doesn't support it yet, that is the nexus repo only keeps sha1 on the jarfiles. > New releases must supply SHA-256 and/or SHA-512 checksums > - > > Key: CASSANDRA-14970 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14970 > Project: Cassandra > Issue Type: Bug > Components: Packaging >Reporter: Michael Shuler >Assignee: Michael Shuler >Priority: Blocker > Fix For: 2.1.21, 2.2.14, 3.0.18, 3.11.4, 4.0 > > Attachments: > 0001-Update-downloads-for-sha256-sha512-checksum-files.patch, > 0001-Update-release-checksum-algorithms-to-SHA-256-SHA-512.patch, > ant-publish-checksum-fail.jpg > > > Release policy was updated around 9/2018 to state: > "For new releases, PMCs MUST supply SHA-256 and/or SHA-512; and SHOULD NOT > supply MD5 or SHA-1. Existing releases do not need to be changed." > build.xml needs to be updated from MD5 & SHA-1 to, at least, SHA-256 or both. > cassandra-builds/cassandra-release scripts need to be updated to work with > the new checksum files. > http://www.apache.org/dev/release-distribution#sigs-and-sums -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums
[ https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737732#comment-16737732 ] mck commented on CASSANDRA-14970: - [~mshuler] the asf guidelines apply strictly to the distributed convenience binary artefacts. The asf maven repository doesn't support it yet, that is, the nexus repo only keeps sha1 on the jarfiles. > New releases must supply SHA-256 and/or SHA-512 checksums > - > > Key: CASSANDRA-14970 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14970 > Project: Cassandra > Issue Type: Bug > Components: Packaging >Reporter: Michael Shuler >Assignee: Michael Shuler >Priority: Blocker > Fix For: 2.1.21, 2.2.14, 3.0.18, 3.11.4, 4.0 > > Attachments: > 0001-Update-downloads-for-sha256-sha512-checksum-files.patch, > 0001-Update-release-checksum-algorithms-to-SHA-256-SHA-512.patch, > ant-publish-checksum-fail.jpg > > > Release policy was updated around 9/2018 to state: > "For new releases, PMCs MUST supply SHA-256 and/or SHA-512; and SHOULD NOT > supply MD5 or SHA-1. Existing releases do not need to be changed." > build.xml needs to be updated from MD5 & SHA-1 to, at least, SHA-256 or both. > cassandra-builds/cassandra-release scripts need to be updated to work with > the new checksum files. > http://www.apache.org/dev/release-distribution#sigs-and-sums
[jira] [Commented] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown
[ https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737726#comment-16737726 ] Joseph Lynch commented on CASSANDRA-14922: -- {quote} Sure thing. I'll start the rebase tomorrow in that case. In that case, also, I've pushed my one nit from a quick look through here for Alex to look at, that I would have simply ninja'd in (with comment here, of course). This is just using the HintsBuffer.free method instead of directly invoking DirectByteBuffer.cleaner().clean(). {quote} Ah cool, yea that appears to still work (and then we can leave the slab private in {{HintsBuffer}} as well. > In JVM dtests need to clean up after instance shutdown > -- > > Key: CASSANDRA-14922 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14922 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Minor > Fix For: 4.0 > > Attachments: AllThreadsStopped.png, ClassLoadersRetaining.png, > Leaking_Metrics_On_Shutdown.png, MainClassRetaining.png, > MemoryReclaimedFix.png, Metaspace_Actually_Collected.png, > OnlyThreeRootsLeft.png, no_more_references.png > > > Currently the unit tests are failing on circleci ([example > one|https://circleci.com/gh/jolynch/cassandra/300#tests/containers/1], > [example > two|https://circleci.com/gh/rustyrazorblade/cassandra/44#tests/containers/1]) > because we use a small container (medium) for unit tests by default and the > in JVM dtests are leaking a few hundred megabytes of memory per test right > now. This is not a big deal because the dtest runs with the larger containers > continue to function fine as well as local testing as the number of in JVM > dtests is not yet high enough to cause a problem with more than 2GB of > available heap. However we should fix the memory leak so that going forwards > we can add more in JVM dtests without worry. 
> I've been working with [~ifesdjeen] to debug, and the issue appears to be > unreleased Table/Keyspace metrics (screenshot showing the leak attached). I > believe that we have a few potential issues that are leading to the leaks: > 1. The > [{{Instance::shutdown}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/Instance.java#L328-L354] > method is not successfully cleaning up all the metrics created by the > {{CassandraMetricsRegistry}} > 2. The > [{{TestCluster::close}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/TestCluster.java#L283] > method is not waiting for all the instances to finish shutting down and > cleaning up before continuing on > 3. I'm not sure if this is an issue assuming we clear all metrics, but > [{{TableMetrics::release}}|https://github.com/apache/cassandra/blob/4ae229f5cd270c2b43475b3f752a7b228de260ea/src/java/org/apache/cassandra/metrics/TableMetrics.java#L951] > does not release all the metric references (which could leak them) > I am working on a patch which shuts down everything and assures that we do > not leak memory.
[jira] [Comment Edited] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown
[ https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737726#comment-16737726 ] Joseph Lynch edited comment on CASSANDRA-14922 at 1/9/19 1:30 AM: -- {quote} Sure thing. I'll start the rebase tomorrow in that case. In that case, also, I've pushed my one nit from a quick look through here for Alex to look at, that I would have simply ninja'd in (with comment here, of course). This is just using the HintsBuffer.free method instead of directly invoking DirectByteBuffer.cleaner().clean(). {quote} Ah cool, yea that appears to still work (and then we can leave the slab private in {{HintsBuffer}} as well.) was (Author: jolynch): {quote} Sure thing. I'll start the rebase tomorrow in that case. In that case, also, I've pushed my one nit from a quick look through here for Alex to look at, that I would have simply ninja'd in (with comment here, of course). This is just using the HintsBuffer.free method instead of directly invoking DirectByteBuffer.cleaner().clean(). {quote} Ah cool, yea that appears to still work (and then we can leave the slab private in {{HintsBuffer}} as well. 
[jira] [Created] (CASSANDRA-14971) Website documentation search function returns broken links
Anthony Grasso created CASSANDRA-14971: -- Summary: Website documentation search function returns broken links Key: CASSANDRA-14971 URL: https://issues.apache.org/jira/browse/CASSANDRA-14971 Project: Cassandra Issue Type: Bug Components: Documentation/Website Reporter: Anthony Grasso Assignee: Anthony Grasso The search bar on the main page of the [Cassandra Documentation|http://cassandra.apache.org/doc/latest/] returns search [results|http://cassandra.apache.org/doc/latest/search.html?q=cache_keywords=yes=default] with broken links. When a link from a returned search result is clicked, the site returns a 404 with a message similar to this: {quote}The requested URL /doc/latest/tools/nodetool/nodetool.rst.html was not found on this server. {quote} From the error, it appears that the links are pointing to pages whose names end in *.rst.html*. The links should point to pages that end in *.html*.
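The reported breakage is purely a suffix problem in the generated links. A minimal illustration of the rewrite (the actual fix belongs in the docs search template; the path below is just the example taken from the error message):

```shell
# Hypothetical sketch: the search results emit "<page>.rst.html" where the
# published page is actually "<page>.html". Stripping the stray ".rst" fixes it.
link="/doc/latest/tools/nodetool/nodetool.rst.html"
fixed="${link%.rst.html}.html"   # remove the ".rst.html" suffix, re-append ".html"
echo "$fixed"                    # /doc/latest/tools/nodetool/nodetool.html
```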
[jira] [Commented] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown
[ https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737711#comment-16737711 ] Benedict commented on CASSANDRA-14922: -- bq. can we wait for Alex to see the latest diff though... I've changed the patch a bit since he last looked. Sure thing. I'll start the rebase tomorrow in that case. In that case, also, I've pushed my one nit from a quick look through [here|https://github.com/belliottsmith/cassandra/tree/14922] for Alex to look at, that I would have simply ninja'd in (with comment here, of course). This is just using the {{HintsBuffer.free}} method instead of directly invoking {{DirectByteBuffer.cleaner().clean()}}. bq. Regarding the backport, I am slightly concerned about the NativeLibrary changes being backported in their current form. Thanks for highlighting this. I'll be sure to take a close look at the behaviour on each version we backport to. I expect there will be other places that need similar treatment to what you've done here, as well, so I need to double check anyway. bq. I think the Soft references are coming from java.io.ObjectStreamClass$Caches.localDescs, but the object serder we're doing in InvokableInstance is a bit beyond my JVM skills I'm afraid. No worries at all, thanks very much for reproducing this information here for posterity. If we ever want to clean this up, it would probably be easiest to simply avoid ser/deser entirely (or use custom ser/deser), but your approach is a much more suitable compromise for now. Thanks again also for all the investigative work to plug these gaps. 
[jira] [Commented] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums
[ https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737710#comment-16737710 ] Michael Shuler commented on CASSANDRA-14970: INFRA-14923 is the issue.
[jira] [Commented] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums
[ https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737701#comment-16737701 ] Michael Shuler commented on CASSANDRA-14970: I have no idea how the {{ant publish}} task works.. :( I did a staging publish and we still get .md5 and .sha1 checksums. !ant-publish-checksum-fail.jpg|thumbnail!
[jira] [Updated] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums
[ https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Shuler updated CASSANDRA-14970: --- Attachment: ant-publish-checksum-fail.jpg
[jira] [Commented] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums
[ https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737672#comment-16737672 ] Brandon Williams commented on CASSANDRA-14970: -- +1
[jira] [Comment Edited] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums
[ https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737643#comment-16737643 ] Michael Shuler edited comment on CASSANDRA-14970 at 1/8/19 11:52 PM: - [^0001-Update-release-checksum-algorithms-to-SHA-256-SHA-512.patch] Patch against {{cassandra-2.1}} branch. Merges up without conflict. {noformat} (cassandra-2.1)mshuler@hana:~/git/cassandra$ ls -l build/*.{gz,sha*} -rw-r--r-- 1 mshuler mshuler 25342702 Jan 8 17:04 build/apache-cassandra-2.1.20-SNAPSHOT-bin.tar.gz -rw-r--r-- 1 mshuler mshuler 65 Jan 8 17:04 build/apache-cassandra-2.1.20-SNAPSHOT-bin.tar.gz.sha256 -rw-r--r-- 1 mshuler mshuler 129 Jan 8 17:04 build/apache-cassandra-2.1.20-SNAPSHOT-bin.tar.gz.sha512 -rw-r--r-- 1 mshuler mshuler 17265833 Jan 8 17:04 build/apache-cassandra-2.1.20-SNAPSHOT-src.tar.gz -rw-r--r-- 1 mshuler mshuler 65 Jan 8 17:04 build/apache-cassandra-2.1.20-SNAPSHOT-src.tar.gz.sha256 -rw-r--r-- 1 mshuler mshuler 129 Jan 8 17:04 build/apache-cassandra-2.1.20-SNAPSHOT-src.tar.gz.sha512 {noformat} was (Author: mshuler): Patch against {{cassandra-2.1}} branch. Merges up without conflict. 
{noformat} (cassandra-2.1)mshuler@hana:~/git/cassandra$ ls -l build/*.{gz,sha*} -rw-r--r-- 1 mshuler mshuler 25342702 Jan 8 17:04 build/apache-cassandra-2.1.20-SNAPSHOT-bin.tar.gz -rw-r--r-- 1 mshuler mshuler 65 Jan 8 17:04 build/apache-cassandra-2.1.20-SNAPSHOT-bin.tar.gz.sha256 -rw-r--r-- 1 mshuler mshuler 129 Jan 8 17:04 build/apache-cassandra-2.1.20-SNAPSHOT-bin.tar.gz.sha512 -rw-r--r-- 1 mshuler mshuler 17265833 Jan 8 17:04 build/apache-cassandra-2.1.20-SNAPSHOT-src.tar.gz -rw-r--r-- 1 mshuler mshuler 65 Jan 8 17:04 build/apache-cassandra-2.1.20-SNAPSHOT-src.tar.gz.sha256 -rw-r--r-- 1 mshuler mshuler 129 Jan 8 17:04 build/apache-cassandra-2.1.20-SNAPSHOT-src.tar.gz.sha512 {noformat}
[jira] [Commented] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums
[ https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737662#comment-16737662 ] Michael Shuler commented on CASSANDRA-14970: [^0001-Update-downloads-for-sha256-sha512-checksum-files.patch] attached for the cassandra-builds repo - download the new checksum files for release publication.
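For anyone following along, verifying a download against one of the new bare-hash checksum files would look roughly like this. File names below are stand-ins, not actual release artifacts; the real checksum files come from the ASF distribution area:

```shell
# Stand-in for a downloaded release artifact and its published .sha256 file.
printf 'stand-in release bytes' > release.tar.gz
sha256sum release.tar.gz | cut -d' ' -f1 > release.tar.gz.sha256

# sha256sum -c expects "<hash>  <filename>", so rejoin the bare hash with the name:
printf '%s  release.tar.gz\n' "$(cat release.tar.gz.sha256)" | sha256sum -c -
# prints: release.tar.gz: OK
```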
[jira] [Updated] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums
[ https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Shuler updated CASSANDRA-14970: --- Attachment: 0001-Update-downloads-for-sha256-sha512-checksum-files.patch
[jira] [Commented] (CASSANDRA-14968) Investigate GPG signing of deb and rpm repositories via bintray
[ https://issues.apache.org/jira/browse/CASSANDRA-14968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737654#comment-16737654 ] Michael Shuler commented on CASSANDRA-14968: The apache organization there has a key - I don't know if it would be feasible to use the org key to sign the repositories? Individual users can upload public (and private (eww..)) keys and the API for bintray includes notes about signing via curl POST calls. I personally would not upload my private key anywhere, regardless of what ASF's opinion on that might be. Uploading a public key so the repo makes it available for download is pretty normal, then the signing portion (I guess) can be done offline(?) and uploaded. I don't know all the ins and outs of how it works. This is precisely why this ticket suggests investigating the topic. Is this something you would like assigned to you? > Investigate GPG signing of deb and rpm repositories via bintray > --- > > Key: CASSANDRA-14968 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14968 > Project: Cassandra > Issue Type: Bug >Reporter: Michael Shuler >Priority: Major > Labels: packaging > > Currently, the release manager uploads debian packages and built/signed > metadata to a generic bintray repository. Perhaps we could utilize the GPG > signing feature of the repository, post-upload, via the bintray GPG signing > feature. > https://www.jfrog.com/confluence/display/BT/Managing+Uploaded+Content#ManagingUploadedContent-GPGSigning > Depends on CASSANDRA-14967
[jira] [Updated] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums
[ https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Shuler updated CASSANDRA-14970: --- Attachment: 0001-Update-release-checksum-algorithms-to-SHA-256-SHA-512.patch Status: Patch Available (was: Open) Patch against {{cassandra-2.1}} branch. Merges up without conflict. {noformat} (cassandra-2.1)mshuler@hana:~/git/cassandra$ ls -l build/*.{gz,sha*} -rw-r--r-- 1 mshuler mshuler 25342702 Jan 8 17:04 build/apache-cassandra-2.1.20-SNAPSHOT-bin.tar.gz -rw-r--r-- 1 mshuler mshuler 65 Jan 8 17:04 build/apache-cassandra-2.1.20-SNAPSHOT-bin.tar.gz.sha256 -rw-r--r-- 1 mshuler mshuler 129 Jan 8 17:04 build/apache-cassandra-2.1.20-SNAPSHOT-bin.tar.gz.sha512 -rw-r--r-- 1 mshuler mshuler 17265833 Jan 8 17:04 build/apache-cassandra-2.1.20-SNAPSHOT-src.tar.gz -rw-r--r-- 1 mshuler mshuler 65 Jan 8 17:04 build/apache-cassandra-2.1.20-SNAPSHOT-src.tar.gz.sha256 -rw-r--r-- 1 mshuler mshuler 129 Jan 8 17:04 build/apache-cassandra-2.1.20-SNAPSHOT-src.tar.gz.sha512 {noformat}
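The 65- and 129-byte .sha256/.sha512 files in the build listing are consistent with bare hex digests (64/128 hex characters plus a trailing newline, no filename column). A rough shell sketch of producing such files — artifact names here are placeholders, and this is not the actual build.xml/ant logic:

```shell
# Placeholder artifact standing in for the release tarball.
printf 'placeholder tarball contents' > artifact.tar.gz

# Emit one bare-hash checksum file per artifact and algorithm,
# keeping only the digest field (no "  <filename>" suffix).
sha256sum artifact.tar.gz | cut -d' ' -f1 > artifact.tar.gz.sha256
sha512sum artifact.tar.gz | cut -d' ' -f1 > artifact.tar.gz.sha512

# The resulting files are 65 and 129 bytes, matching the listing above.
wc -c artifact.tar.gz.sha256 artifact.tar.gz.sha512
```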
[jira] [Assigned] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums
[ https://issues.apache.org/jira/browse/CASSANDRA-14970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Shuler reassigned CASSANDRA-14970: -- Assignee: Michael Shuler
[jira] [Commented] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown
[ https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737614#comment-16737614 ] Joseph Lynch commented on CASSANDRA-14922: -- [~benedict], Awesome, can we wait for Alex to see the latest diff though with the reflection removed in favor of his proposed fast local thread pool cleanup method? I've changed the patch a bit since he last looked. Regarding the backport, I am slightly concerned about the NativeLibrary changes being backported in their current form. From my reading of the JNA source code in version 4.2.2 in trunk we're just skipping the cache by using [NativeLibrary::getInstance|https://github.com/java-native-access/jna/blob/4bcc6191c5467361b5c1f12fb5797354cc3aa897/src/com/sun/jna/NativeLibrary.java#L341] directly and passing it to [Native::register(NativeLibrary)|https://github.com/java-native-access/jna/blob/4bcc6191c5467361b5c1f12fb5797354cc3aa897/src/com/sun/jna/Native.java#L1260] instead of having [Native::register(String)|https://github.com/java-native-access/jna/blob/4bcc6191c5467361b5c1f12fb5797354cc3aa897/src/com/sun/jna/Native.java#L1251] do that for us and cache the classloader along the way [here|https://github.com/java-native-access/jna/blob/4bcc6191c5467361b5c1f12fb5797354cc3aa897/src/com/sun/jna/NativeLibrary.java#L363]. But, if I'm wrong it's unlikely we'd know, as while our tests cover Linux pretty thoroughly, darwin/windows are less covered. Also I forgot to respond to your question about SoftReferences here, did it on IRC but not here. {quote}Do you know where the soft references originate? I wonder if there's anything we can do to simply eliminate them. {quote} I think the Soft references are coming from {{java.io.ObjectStreamClass$Caches.localDescs}}, but the object serder we're doing in {{InvokableInstance}} is a bit beyond my JVM skills I'm afraid. I don't know how we can prevent the object serializations from caching the class descriptions... 
Perhaps the JVM option is sufficient for now and if we don't like that going forward we can dive in more? > In JVM dtests need to clean up after instance shutdown > -- > > Key: CASSANDRA-14922 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14922 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Minor > Fix For: 4.0 > > Attachments: AllThreadsStopped.png, ClassLoadersRetaining.png, > Leaking_Metrics_On_Shutdown.png, MainClassRetaining.png, > MemoryReclaimedFix.png, Metaspace_Actually_Collected.png, > OnlyThreeRootsLeft.png, no_more_references.png > > > Currently the unit tests are failing on circleci ([example > one|https://circleci.com/gh/jolynch/cassandra/300#tests/containers/1], > [example > two|https://circleci.com/gh/rustyrazorblade/cassandra/44#tests/containers/1]) > because we use a small container (medium) for unit tests by default and the > in JVM dtests are leaking a few hundred megabytes of memory per test right > now. This is not a big deal because the dtest runs with the larger containers > continue to function fine as well as local testing as the number of in JVM > dtests is not yet high enough to cause a problem with more than 2GB of > available heap. However we should fix the memory leak so that going forwards > we can add more in JVM dtests without worry. > I've been working with [~ifesdjeen] to debug, and the issue appears to be > unreleased Table/Keyspace metrics (screenshot showing the leak attached). I > believe that we have a few potential issues that are leading to the leaks: > 1. The > [{{Instance::shutdown}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/Instance.java#L328-L354] > method is not successfully cleaning up all the metrics created by the > {{CassandraMetricsRegistry}} > 2. 
The > [{{TestCluster::close}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/TestCluster.java#L283] > method is not waiting for all the instances to finish shutting down and > cleaning up before continuing on > 3. I'm not sure if this is an issue assuming we clear all metrics, but > [{{TableMetrics::release}}|https://github.com/apache/cassandra/blob/4ae229f5cd270c2b43475b3f752a7b228de260ea/src/java/org/apache/cassandra/metrics/TableMetrics.java#L951] > does not release all the metric references (which could leak them) > I am working on a patch which shuts down everything and assures that we do > not leak memory. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To
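To make the SoftReference discussion above concrete: {{java.io.ObjectStreamClass}} keeps a cached descriptor per serialized class, and each descriptor holds the {{Class}}, which in turn pins its ClassLoader. A minimal stdlib-only sketch of that caching behavior follows; the {{Payload}} class is purely illustrative, standing in for whatever the in-JVM dtests serialize:

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class DescriptorCacheDemo {
    // Hypothetical payload class standing in for anything the dtest framework serializes.
    public static class Payload implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    // Returns true when two lookups hand back the same cached descriptor instance,
    // demonstrating that the JDK retains a per-class descriptor after serialization.
    public static boolean sameDescriptor(Class<?> cls) {
        ObjectStreamClass first = ObjectStreamClass.lookup(cls);
        ObjectStreamClass second = ObjectStreamClass.lookup(cls);
        return first == second;
    }

    public static void main(String[] args) {
        // The cached descriptor keeps a reference to Payload.class, and through it
        // to Payload's ClassLoader, until the (softly referenced) cache is cleared.
        System.out.println(sameDescriptor(Payload.class));
    }
}
```

This is why per-instance ClassLoaders can linger after shutdown even when all threads are stopped: the soft references only clear under memory pressure, which is consistent with the JVM-option workaround discussed above.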
[jira] [Created] (CASSANDRA-14970) New releases must supply SHA-256 and/or SHA-512 checksums
Michael Shuler created CASSANDRA-14970: -- Summary: New releases must supply SHA-256 and/or SHA-512 checksums Key: CASSANDRA-14970 URL: https://issues.apache.org/jira/browse/CASSANDRA-14970 Project: Cassandra Issue Type: Bug Components: Packaging Reporter: Michael Shuler Fix For: 2.1.21, 2.2.14, 3.0.18, 3.11.4, 4.0 Release policy was updated around 9/2018 to state: "For new releases, PMCs MUST supply SHA-256 and/or SHA-512; and SHOULD NOT supply MD5 or SHA-1. Existing releases do not need to be changed." build.xml needs to be updated from MD5 & SHA-1 to at least SHA-256, preferably both SHA-256 and SHA-512. The cassandra-builds/cassandra-release scripts need to be updated to work with the new checksum files. http://www.apache.org/dev/release-distribution#sigs-and-sums -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
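For reference, the sidecar checksum files the policy asks for are straightforward to produce; in build.xml this would typically be something like Ant's {{checksum}} task with an {{algorithm}} attribute, and the equivalent logic in plain Java (method and class names here are illustrative, not the actual build code) is just a {{MessageDigest}} hex dump:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Checksums {
    // Hex-encode a digest of the given bytes; algorithm is e.g. "SHA-256" or "SHA-512".
    public static String hexDigest(String algorithm, byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance(algorithm).digest(data);
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest)
                sb.append(String.format("%02x", b));
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A release script would digest the artifact bytes and write e.g. foo.tar.gz.sha256;
        // "abc" here is only a demo input.
        System.out.println(hexDigest("SHA-256", "abc".getBytes(StandardCharsets.UTF_8)));
        System.out.println(hexDigest("SHA-512", "abc".getBytes(StandardCharsets.UTF_8)));
    }
}
```

Both algorithms are required to be present in every conforming JRE, so no extra dependencies are needed in the build.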
[jira] [Commented] (CASSANDRA-14525) streaming failure during bootstrap makes new node into inconsistent state
[ https://issues.apache.org/jira/browse/CASSANDRA-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737608#comment-16737608 ] Jaydeepkumar Chovatia commented on CASSANDRA-14525: --- [~aweisberg] I've already taken care of dtests as part of https://issues.apache.org/jira/browse/CASSANDRA-14526, here is the [patch for dtest|https://github.com/apache/cassandra-dtest/compare/master...jaydeepkumar1984:14526-trunk]. I'm not sure if [~jay.zhuang] got a chance to run the dtests; if possible, could you please help me start a dtest run with this patch? > streaming failure during bootstrap makes new node into inconsistent state > - > > Key: CASSANDRA-14525 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14525 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Core >Reporter: Jaydeepkumar Chovatia >Assignee: Jaydeepkumar Chovatia >Priority: Major > Fix For: 2.2.14, 3.0.18, 3.11.4, 4.0 > > > If bootstrap fails for newly joining node (most common reason is due to > streaming failure) then Cassandra state remains in {{joining}} state which is > fine but Cassandra also enables Native transport which makes overall state > inconsistent. 
This further creates NullPointer exception if auth is enabled > on the new node, please find reproducible steps here: > For example if bootstrap fails due to streaming errors like > {quote}java.util.concurrent.ExecutionException: > org.apache.cassandra.streaming.StreamException: Stream failed > at > com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) > ~[guava-18.0.jar:na] > at > org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1256) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:894) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:660) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:573) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:330) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:695) > [apache-cassandra-3.0.16.jar:3.0.16] > Caused by: org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) > ~[guava-18.0.jar:na] > at > 
com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) > ~[guava-18.0.jar:na] > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:211) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:187) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:440) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.streaming.StreamSession.onError(StreamSession.java:540) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:307) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121] > {quote} > then variable [StorageService.java::dataAvailable > |https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L892] > will be {{false}}. Since {{dataAvailable}} is {{false}} hence it will not > call
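The inconsistency described above comes down to gating: client-facing transports should only come up once the join actually completed. A simplified, self-contained sketch of that state machine follows; the names {{BootstrapGate}}, {{finishJoining}}, and {{acceptsClients}} are illustrative stand-ins, not the real StorageService API:

```java
public class BootstrapGate {
    enum Mode { JOINING, NORMAL }

    private Mode mode = Mode.JOINING;
    private boolean nativeTransportRunning = false;

    // Called at the end of bootstrap; dataAvailable is false when streaming failed,
    // mirroring the StorageService.dataAvailable flag referenced in the report.
    public void finishJoining(boolean dataAvailable) {
        if (dataAvailable) {
            mode = Mode.NORMAL;            // analogous to finishJoiningRing running
            nativeTransportRunning = true; // only now accept client connections
        }
        // else: stay in JOINING with the native transport down, so clients
        // (including auth queries) never reach a node that has no data.
    }

    public boolean acceptsClients() {
        return nativeTransportRunning;
    }
}
```

The bug is that the real code starts the native transport regardless of which branch is taken, so a node stuck in {{joining}} still accepts client traffic.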
[jira] [Commented] (CASSANDRA-14968) Investigate GPG signing of deb and rpm repositories via bintray
[ https://issues.apache.org/jira/browse/CASSANDRA-14968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737603#comment-16737603 ] mck commented on CASSANDRA-14968: - I don't think ASF permits/encourages shared (or even uploaded?) private keys? This needs to be checked. > Investigate GPG signing of deb and rpm repositories via bintray > --- > > Key: CASSANDRA-14968 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14968 > Project: Cassandra > Issue Type: Bug >Reporter: Michael Shuler >Priority: Major > Labels: packaging > > Currently, the release manager uploads debian packages and built/signed > metadata to a generic bintray repository. Perhaps we could utilize the > repository's GPG signing feature post-upload. > https://www.jfrog.com/confluence/display/BT/Managing+Uploaded+Content#ManagingUploadedContent-GPGSigning > Depends on CASSANDRA-14967
[jira] [Resolved] (CASSANDRA-14969) Clean up ThreadLocals directly instead of via reflection
[ https://issues.apache.org/jira/browse/CASSANDRA-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch resolved CASSANDRA-14969. -- Resolution: Invalid My mistake, it appears that after Alex's suggestion to clean up the {{FastThreadLocalThread}}'s ThreadLocalMap directly via netty we don't need the reflection hack any more. Closing this out, sorry for the ticket spam. > Clean up ThreadLocals directly instead of via reflection > > > Key: CASSANDRA-14969 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14969 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest >Reporter: Joseph Lynch >Priority: Minor > > In CASSANDRA-14922 we have to institute a bit of a hack via reflection to > clean up thread local variables that are not properly {{destroyed}} in > {{DistributedTestBase::cleanup}}. Let's make sure that all of the thread > locals we have are cleaned up via {{destroy}} calls instead of relying on > reflection here.
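The principle behind the resolution above, explicit removal instead of reflective scraping of {{Thread.threadLocals}}, can be shown with the plain JDK {{ThreadLocal}} API; netty's {{FastThreadLocal.removeAll()}} does the analogous map-wide cleanup for {{FastThreadLocalThread}}s. The {{CACHE}} field and 1024-byte payload here are purely illustrative:

```java
public class ThreadLocalCleanup {
    // Illustrative per-thread cache; in the real test framework such values
    // retain the per-instance ClassLoader through the thread's ThreadLocalMap.
    public static final ThreadLocal<byte[]> CACHE =
            ThreadLocal.withInitial(() -> new byte[1024]);

    public static int useAndCleanUp() {
        try {
            return CACHE.get().length; // get() populates this thread's ThreadLocalMap entry
        } finally {
            // Deterministic cleanup: drop the entry immediately rather than waiting
            // for the thread to die or resorting to reflection to clear the map.
            CACHE.remove();
        }
    }
}
```

The same idea at shutdown scale is a single {{FastThreadLocal.removeAll()}} call on each instance thread, which is what made the reflection hack unnecessary.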
[jira] [Comment Edited] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown
[ https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737552#comment-16737552 ] Joseph Lynch edited comment on CASSANDRA-14922 at 1/8/19 10:10 PM: --- {quote}The patch looks good, and I'd say [~jolynch] let's merge it, {quote} Ok, yea I agree let's merge what we have so that the unit tests can pass on trunk again. I've put up a patch against trunk with what we have so far (including your changes from the demo branch which as far as I can tell remove the need for the ThreadLocal clearing). ||trunk|| |[024e6943|https://github.com/apache/cassandra/commit/024e69436e89bb79cdbf4e136a1f6d9c2747275d]| |[!https://circleci.com/gh/jolynch/cassandra/tree/CASSANDRA-14922.png?circle-token=1102a59698d04899ec971dd36e925928f7b521f5!|https://circleci.com/gh/jolynch/cassandra/tree/CASSANDRA-14922]| If I attach a profiler during an intellij "run this test until it fails" mode I can see that the memory is indeed getting cleaned up: !MemoryReclaimedFix.png! 
[jira] [Commented] (CASSANDRA-14525) streaming failure during bootstrap makes new node into inconsistent state
[ https://issues.apache.org/jira/browse/CASSANDRA-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737561#comment-16737561 ] Ariel Weisberg commented on CASSANDRA-14525: I think secondary_indexes_test.py:TestPreJoinCallback.test_resume has the same issue.
[jira] [Commented] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown
[ https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737557#comment-16737557 ] Benedict commented on CASSANDRA-14922: -- Marking 'Ready to Commit' given [~ifesdjeen]'s comments. I'll give it another quick once-over, then commit so I can rebase CASSANDRA-14931 and CASSANDRA-14937.
[jira] [Commented] (CASSANDRA-14525) streaming failure during bootstrap makes new node into inconsistent state
[ https://issues.apache.org/jira/browse/CASSANDRA-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737556#comment-16737556 ] Ariel Weisberg commented on CASSANDRA-14525: This breaks bootstrap_test.py:TestBootstrap.test_resumable_bootstrap. The test expects the cluster to start the native interface when bootstrap fails. > streaming failure during bootstrap makes new node into inconsistent state > - > > Key: CASSANDRA-14525 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14525 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Core >Reporter: Jaydeepkumar Chovatia >Assignee: Jaydeepkumar Chovatia >Priority: Major > Fix For: 2.2.14, 3.0.18, 3.11.4, 4.0 > > > If bootstrap fails for newly joining node (most common reason is due to > streaming failure) then Cassandra state remains in {{joining}} state which is > fine but Cassandra also enables Native transport which makes overall state > inconsistent. This further creates NullPointer exception if auth is enabled > on the new node, please find reproducible steps here: > For example if bootstrap fails due to streaming errors like > {quote}java.util.concurrent.ExecutionException: > org.apache.cassandra.streaming.StreamException: Stream failed > at > com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) > ~[guava-18.0.jar:na] > at > org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1256) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:894) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:660) > [apache-cassandra-3.0.16.jar:3.0.16] > at > 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:573) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:330) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567) > [apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:695) > [apache-cassandra-3.0.16.jar:3.0.16] > Caused by: org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) > ~[guava-18.0.jar:na] > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) > ~[guava-18.0.jar:na] > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:211) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:187) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:440) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.streaming.StreamSession.onError(StreamSession.java:540) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:307) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at > 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79) > ~[apache-cassandra-3.0.16.jar:3.0.16] > at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_121] > {quote} > then variable [StorageService.java::dataAvailable > |https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L892] > will be {{false}}. Since {{dataAvailable}} is {{false}} hence it will not > call [StorageService.java::finishJoiningRing > |https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/service/StorageService.java#L933] > and as a result >
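The behaviour the fix enforces can be sketched as a simplified, hypothetical model of the join sequence (the real logic lives in `StorageService.joinTokenRing` and is considerably more involved): the native transport should only come up once bootstrap has actually delivered the data.

```java
// Hypothetical, simplified model of the join sequence described above;
// names are illustrative, not StorageService's real fields or methods.
public class JoinSequence {
    enum Mode { JOINING, NORMAL }

    Mode mode = Mode.JOINING;
    boolean nativeTransportRunning = false;

    // dataAvailable mirrors the flag that is set only when bootstrap
    // streaming completes successfully.
    void completeJoin(boolean dataAvailable) {
        if (dataAvailable) {
            mode = Mode.NORMAL;            // i.e. finishJoiningRing()
            nativeTransportRunning = true; // safe to serve client requests
        } else {
            // Streaming failed: the node must stay JOINING *and* keep the
            // native transport down, otherwise clients reach a node with
            // incomplete data (and auth lookups can NPE, as reported).
            nativeTransportRunning = false;
        }
    }
}
```

Under this model, the dtest expectation that the native interface starts after a failed bootstrap is exactly the inconsistent state the ticket removes.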
[jira] [Updated] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown
[ https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-14922: - Status: Ready to Commit (was: Patch Available)
[jira] [Updated] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown
[ https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-14922: - Fix Version/s: 4.0 Status: Patch Available (was: Open)
[jira] [Updated] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown
[ https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-14922: - Attachment: MemoryReclaimedFix.png
[jira] [Commented] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown
[ https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737552#comment-16737552 ] Joseph Lynch commented on CASSANDRA-14922: -- {quote}The patch looks good, and I'd say [~jolynch] let's merge it, {quote} Ok, yea I agree let's merge what we have so that the unit tests can pass on trunk again and we can follow up in CASSANDRA-14969. I've put up a patch against trunk with what we have so far (including your changes from the demo branch). ||trunk|| |[d361ba9b|https://github.com/apache/cassandra/commit/d361ba9b846cf6dc9c3ef5daca7aab5a39ec8fcc]| |[!https://circleci.com/gh/jolynch/cassandra/tree/CASSANDRA-14922.png?circle-token= 1102a59698d04899ec971dd36e925928f7b521f5!|https://circleci.com/gh/jolynch/cassandra/tree/CASSANDRA-14922]| If I attach a profiler during an intellij "run this test until it fails" mode I can see that the memory is indeed getting cleaned up: !MemoryReclaimedFix.png!
[jira] [Updated] (CASSANDRA-14688) Update protocol spec and class level doc with protocol checksumming details
[ https://issues.apache.org/jira/browse/CASSANDRA-14688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mck updated CASSANDRA-14688: Reviewers: Alex Petrov, mck Reviewer: (was: mck) > Update protocol spec and class level doc with protocol checksumming details > --- > > Key: CASSANDRA-14688 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14688 > Project: Cassandra > Issue Type: Task > Components: Legacy/Documentation and Website >Reporter: Sam Tunnicliffe >Assignee: Sam Tunnicliffe >Priority: Major > Fix For: 4.0 > > > CASSANDRA-13304 provides an option to add checksumming to the frame body of > native protocol messages. The native protocol spec needs to be updated to > reflect this ASAP. We should also verify that the javadoc comments describing > the on-wire format in > {{o.a.c.transport.frame.checksum.ChecksummingTransformer}} are up to date.
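For context, the checksumming that CASSANDRA-13304 introduces can be illustrated in miniature. This is not the actual on-wire format (chunk lengths, multiple chunks, and the Adler32/CRC32 choice are what the spec update and the `ChecksummingTransformer` javadoc must describe); it only shows the core idea that each chunk of the frame body travels with a checksum the receiver recomputes.

```java
import java.util.zip.CRC32;

// Miniature illustration of per-chunk checksumming; not the real framing.
public class ChunkChecksum {
    // Sender side: checksum computed over the chunk payload.
    static long checksum(byte[] chunk) {
        CRC32 crc = new CRC32();
        crc.update(chunk, 0, chunk.length);
        return crc.getValue();
    }

    // Receiver side: recompute and compare before trusting the payload.
    static boolean verify(byte[] chunk, long expected) {
        return checksum(chunk) == expected;
    }
}
```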
[jira] [Created] (CASSANDRA-14969) Clean up ThreadLocals directly instead of via reflection
Joseph Lynch created CASSANDRA-14969: Summary: Clean up ThreadLocals directly instead of via reflection Key: CASSANDRA-14969 URL: https://issues.apache.org/jira/browse/CASSANDRA-14969 Project: Cassandra Issue Type: Improvement Components: Test/dtest Reporter: Joseph Lynch In CASSANDRA-14922 we have to institute a bit of a hack via reflection to clean up thread local variables that are not properly {{destroyed}} in {{DistributedTestBase::cleanup}}. Let's make sure that all of the thread locals we have are cleaned up via {{destroy}} calls instead of relying on reflection here.
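The direct alternative the ticket asks for can be sketched like this (illustrative names, under the assumption that each thread-local owner exposes an explicit destroy hook): the owner calls `ThreadLocal.remove()` itself, so the test harness never needs a reflective walk over `Thread` internals.

```java
// Illustrative sketch: a thread-local owner that cleans itself up with
// ThreadLocal.remove(), instead of the harness nulling out Thread
// internals via reflection.
public class ScratchBuffers {
    private static final ThreadLocal<byte[]> BUFFER =
            ThreadLocal.withInitial(() -> new byte[4096]);

    static byte[] get() {
        return BUFFER.get();
    }

    // Called from test cleanup (e.g. a DistributedTestBase::cleanup-style
    // hook): drops the calling thread's entry so the value, and any class
    // loader it pins, can be garbage collected.
    static void destroy() {
        BUFFER.remove();
    }
}
```

After `destroy()`, the next `get()` on the same thread lazily builds a fresh value, which is exactly the observable contract a cleanup test can assert on.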
[jira] [Commented] (CASSANDRA-14953) Failed to reclaim the memory and too many MemtableReclaimMemory pending task
[ https://issues.apache.org/jira/browse/CASSANDRA-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737416#comment-16737416 ] Jeremy Hanna commented on CASSANDRA-14953: -- This appears to be a use case/configuration specific problem and not a bug with the software itself. I would engage with those on the Cassandra user list or stack overflow to troubleshoot further. See http://cassandra.apache.org/community/ for links to both. Jira is primarily meant for development and bugs rather than operational issues. > Failed to reclaim the memory and too many MemtableReclaimMemory pending task > > > Key: CASSANDRA-14953 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14953 > Project: Cassandra > Issue Type: Bug > Components: Local/Memtable > Environment: version : cassandra 2.1.15 > jdk: 8 > os:suse >Reporter: HUANG DUICAN >Priority: Major > Attachments: 1.PNG, 2.PNG, cassandra_20190105.zip > > > We found that Cassandra has a lot of write accumulation in the production > environment, and our business has experienced a lot of write failures. > Through the system.log, it was found that MemtableReclaimMemory was pending > at the beginning, and then a large number of MutationStage stacks appeared at > a certain moment. > Finally, the heap memory is full, the GC time reaches tens of seconds, the > node status is DN through nodetool, but the Cassandra process is still > running.We killed the node and restarted the node, and the above situation > disappeared. > > Also the number of Active MemtableReclaimMemory threads seems to stay at 1. > (you can see the 1.PNG) > a large number of MutationStage stacks appeared at a certain moment. > (you can see the 2.PNG) > > long GC time: > - MemtableReclaimMemory 1 156 24565 0 0 > - G1 Old Generation GC in 87121ms. 
G1 Old Gen: 51175946656 -> 50082999760; > - MutationStage 128 11931622 1983820772 0 0 > - CounterMutationStage 0 0 0 0 0 > - MemtableReclaimMemory 1 156 24565 0 0 > - G1 Young Generation GC in {color:#FF}969ms{color}. G1 Eden Space: > 1090519040 -> 0; G1 Old Gen: 50082999760 -> 51156741584; > - MutationStage 128 11953653 1983820772 0 0 > - CounterMutationStage 0 0 0 0 0 > - MemtableReclaimMemory 1 156 24565 0 0 > - G1 Old Generation GC in {color:#FF}84785ms{color}. G1 Old Gen: > 51173518800 -> 50180911432; > - MutationStage 128 11967484 1983820772 0 0 > - CounterMutationStage 0 0 0 0 0 > - MemtableReclaimMemory 1 156 24565 0 0 > - G1 Young Generation GC in 611ms. G1 Eden Space: 989855744 -> 0; G1 Old > Gen: 50180911432 -> 51153989960; > - MutationStage 128 11975849 1983820772 0 0 > - CounterMutationStage 0 0 0 0 0 > - MemtableReclaimMemory 1 156 24565 0 0 > - G1 Old Generation GC in {color:#FF}85845ms{color}. G1 Old Gen: > 51170767176 -> 50238295416; > - MutationStage 128 11978192 1983820772 0 0 > - CounterMutationStage 0 0 0 0 0 > - MemtableReclaimMemory 1 156 24565 0 0 > - G1 Young Generation GC in 602ms. G1 Eden Space: 939524096 -> 0; G1 Old > Gen: 50238295416 -> 51161042296; > - MutationStage 128 11994295 1983820772 0 0 > - CounterMutationStage 0 0 0 0 0 > - MemtableReclaimMemory 1 156 24565 0 0 > - G1 Old Generation GC in {color:#FF}85307ms{color}. G1 Old Gen: > 51177819512 -> 50288829624; Metaspace: 36544536 -> 36525696 > - MutationStage 128 12001932 1983820772 0 0 > - CounterMutationStage 0 0 0 0 0 > 66 - MutationStage 128 12004395 1983820772 0 0 > 66 - CounterMutationStage 0 0 0 0 0 > - MemtableReclaimMemory 1 156 24565 0 0 > 66 - MemtableReclaimMemory 1 156 24565 0 0 > - G1 Young Generation GC in 610ms. G1 Eden Space: 889192448 -> 0; G1 Old > Gen: 50288829624 -> 51178022072; > - MutationStage 128 12023677 1983820772 0 0 > Why is this happening? 
[jira] [Comment Edited] (CASSANDRA-14953) Failed to reclaim the memory and too many MemtableReclaimMemory pending task
[ https://issues.apache.org/jira/browse/CASSANDRA-14953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737416#comment-16737416 ] Jeremy Hanna edited comment on CASSANDRA-14953 at 1/8/19 6:55 PM: -- This appears to be a use case/configuration specific problem and not a bug with the software itself. I would engage with those on the Cassandra user list or stack overflow to troubleshoot further. See http://cassandra.apache.org/community/ for links to both. Jira is primarily meant for development and bugs rather than operational questions. was (Author: jeromatron): This appears to be a use case/configuration specific problem and not a bug with the software itself. I would engage with those on the Cassandra user list or stack overflow to troubleshoot further. See http://cassandra.apache.org/community/ for links to both. Jira is primarily meant for development and bugs rather than operational issues.
[jira] [Commented] (CASSANDRA-14957) Rolling Restart Of Nodes Cause Dataloss Due To Schema Collision
[ https://issues.apache.org/jira/browse/CASSANDRA-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737379#comment-16737379 ] Jeremy Hanna commented on CASSANDRA-14957: -- The schema has to agree across the cluster. If a node is being restarted, it has to catch up with the schema before being able to process writes to the new table. Until then, it will probably have messages in the logs that it can't identify a table with a certain id. How did you determine that there was data loss outside of temporary inconsistency between nodes? If the writes succeeded on other nodes at the consistency level you specified, then there wasn't data loss. You just had a temporary inconsistency on the node being restarted. So the normal anti entropy operations like read repair and full repair should get it back into a consistent state. > Rolling Restart Of Nodes Cause Dataloss Due To Schema Collision > --- > > Key: CASSANDRA-14957 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14957 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Schema >Reporter: Avraham Kalvo >Priority: Major > > We were issuing a rolling restart on a mission-critical five node C* cluster. > The first node which was restarted got the following messages in its > system.log: > ``` > January 2nd 2019, 12:06:37.310 - INFO 12:06:35 Initializing > tasks_scheduler_external.tasks > ``` > ``` > WARN 12:06:39 UnknownColumnFamilyException reading from socket; closing > org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find table for > cfId bd7200a0-1567-11e8-8974-855d74ee356f. If a table was just created, this > is likely due to the schema not being fully propagated. Please wait for > schema agreement on table creation. 
> at > org.apache.cassandra.config.CFMetaData$Serializer.deserialize(CFMetaData.java:1336) > ~[apache-cassandra-3.0.10.jar:3.0.10] > at > org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:660) > ~[apache-cassandra-3.0.10.jar:3.0.10] > at > org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:635) > ~[apache-cassandra-3.0.10.jar:3.0.10] > at > org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:330) > ~[apache-cassandra-3.0.10.jar:3.0.10] > at > org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:349) > ~[apache-cassandra-3.0.10.jar:3.0.10] > at > org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:286) > ~[apache-cassandra-3.0.10.jar:3.0.10] > at org.apache.cassandra.net.MessageIn.read(MessageIn.java:98) > ~[apache-cassandra-3.0.10.jar:3.0.10] > at > org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:201) > ~[apache-cassandra-3.0.10.jar:3.0.10] > at > org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:178) > ~[apache-cassandra-3.0.10.jar:3.0.10] > at > org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:92) > ~[apache-cassandra-3.0.10.jar:3.0.10] > ``` > The latter was then repeated several times across the cluster. > It was then found out that the table in question > `tasks_scheduler_external.tasks` was created with a new schema version after > the entire cluster was restarted consecutively and schema agreement settled, > which started taking requests leaving the previous version of the schema > unavailable for any request, thus generating a data loss to our online system. > Data loss was recovered by manually copying SSTables from the previous > version directory of the schema to the new one followed by `nodetool refresh` > to the relevant table. 
> The above has repeated itself for several tables across various keyspaces. > One other thing to mention is that a repair was in place for the first node > to be restarted, which was obviously stopped as the daemon was shut down, but > this doesn't seem to have to do with the above at first glance. > Seems somewhat related to: > https://issues.apache.org/jira/browse/CASSANDRA-13559
[jira] [Updated] (CASSANDRA-14931) Backport In-JVM dtests to 2.2, 3.0 and 3.11
[ https://issues.apache.org/jira/browse/CASSANDRA-14931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-14931: Reviewer: Alex Petrov > Backport In-JVM dtests to 2.2, 3.0 and 3.11 > --- > > Key: CASSANDRA-14931 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14931 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest >Reporter: Benedict >Assignee: Benedict >Priority: Major > Fix For: 2.2.14, 3.0.18, 3.11.4 > > > The In-JVM dtests are of significant value, and much of the testing we are > exploring with them can easily be utilised on all presently maintained > versions. We should backport the functionality to at least 3.0.x and 3.11.x > - and perhaps even consider 2.2.
[jira] [Comment Edited] (CASSANDRA-14914) Deserialization Error
[ https://issues.apache.org/jira/browse/CASSANDRA-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737136#comment-16737136 ] EDSON VICENTE CARLI JUNIOR edited comment on CASSANDRA-14914 at 1/8/19 1:50 PM: No, I did not use a custom timestamp in my application was (Author: gandbranco): No, I did not use any custom timestamp in my application > Deserialization Error > - > > Key: CASSANDRA-14914 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14914 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Core > Environment: I use Cassandra 3.9, but I tried to upgrade to 3.11 and > nothing changed. >Reporter: EDSON VICENTE CARLI JUNIOR >Priority: Critical > Fix For: 3.11.x > > Attachments: mutation4465429258841992355dat > > > > I have a single Cassandra node; this error now appears when I start the server: > {code:java} > ERROR 11:18:45 Exiting due to error while processing commit log during > initialization. > org.apache.cassandra.db.commitlog.CommitLogReadHandler$CommitLogReadException: > Unexpected error deserializing mutation; saved to > /tmp/mutation4787806670239768067dat. This may be caused by replaying a > mutation against a table with the same name but incompatible schema. > Exception follows: org.apache.cassandra.serializers.MarshalException: A local > deletion time should not be negative > {code} > If I delete all the commitlog and saved_caches files the server starts, but > the next day when I restart Cassandra the error occurs again. > The file mutationDDdat changes name on each restart. I attached an > example mutation file. > What's wrong? How can I make Cassandra stable again? >
[jira] [Commented] (CASSANDRA-14914) Deserialization Error
[ https://issues.apache.org/jira/browse/CASSANDRA-14914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737136#comment-16737136 ] EDSON VICENTE CARLI JUNIOR commented on CASSANDRA-14914: No, I did not use any custom timestamp in my application > Deserialization Error > - > > Key: CASSANDRA-14914 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14914 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Core > Environment: I use Cassandra 3.9, but I tried to upgrade to 3.11 and > nothing changed. >Reporter: EDSON VICENTE CARLI JUNIOR >Priority: Critical > Fix For: 3.11.x > > Attachments: mutation4465429258841992355dat > > > > I have a single Cassandra node; this error now appears when I start the server: > {code:java} > ERROR 11:18:45 Exiting due to error while processing commit log during > initialization. > org.apache.cassandra.db.commitlog.CommitLogReadHandler$CommitLogReadException: > Unexpected error deserializing mutation; saved to > /tmp/mutation4787806670239768067dat. This may be caused by replaying a > mutation against a table with the same name but incompatible schema. > Exception follows: org.apache.cassandra.serializers.MarshalException: A local > deletion time should not be negative > {code} > If I delete all the commitlog and saved_caches files the server starts, but > the next day when I restart Cassandra the error occurs again. > The file mutationDDdat changes name on each restart. I attached an > example mutation file. > What's wrong? How can I make Cassandra stable again? >
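One plausible way a "local deletion time should not be negative" can arise is signed 32-bit overflow when the expiration point (now + TTL, in seconds since the epoch) passes Integer.MAX_VALUE. This is an assumption for illustration, not a confirmed diagnosis of this report (TTL overflow was addressed separately in CASSANDRA-14092):

```java
public class DeletionTimeOverflow {
    // Cassandra stores a cell's local deletion time as seconds-since-epoch
    // in a signed 32-bit int. A large enough TTL pushes the expiration point
    // past Integer.MAX_VALUE (19 Jan 2038), silently wrapping to a negative
    // value -- one plausible source of the MarshalException quoted above.
    static int localDeletionTime(int nowInSeconds, int ttlSeconds) {
        return nowInSeconds + ttlSeconds; // overflows silently
    }

    public static void main(String[] args) {
        int now = 1_546_948_725;            // ~8 Jan 2019, date of this thread
        int ttl = 20 * 365 * 24 * 60 * 60;  // a 20-year TTL, 630,720,000s
        int ldt = localDeletionTime(now, ttl);
        System.out.println(ldt < 0);        // true: wrapped past 2038
    }
}
```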
[jira] [Commented] (CASSANDRA-14922) In JVM dtests need to clean up after instance shutdown
[ https://issues.apache.org/jira/browse/CASSANDRA-14922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16737083#comment-16737083 ] Alex Petrov commented on CASSANDRA-14922: - The patch looks good, and I'd say [~jolynch] let's merge it, since tests have been failing for a while now, unless there's something else you wanted to include in the patch immediately. I have a couple of minor suggestions. All of the issues are easier to see / reproduce with a very small heap, ~256MB: * Hints are leaking direct memory * Thread locals are leaked * FastThreadLocalThread thread locals are leaked (sorry for the tongue-twister) I've put together a small [demo|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:CASSANDRA-14922] just for demonstration purposes, in case you want to see the impact of the suggested changes. > In JVM dtests need to clean up after instance shutdown > -- > > Key: CASSANDRA-14922 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14922 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Minor > Attachments: AllThreadsStopped.png, ClassLoadersRetaining.png, > Leaking_Metrics_On_Shutdown.png, MainClassRetaining.png, > Metaspace_Actually_Collected.png, OnlyThreeRootsLeft.png, > no_more_references.png > > > Currently the unit tests are failing on circleci ([example > one|https://circleci.com/gh/jolynch/cassandra/300#tests/containers/1], > [example > two|https://circleci.com/gh/rustyrazorblade/cassandra/44#tests/containers/1]) > because we use a small container (medium) for unit tests by default and the > in-JVM dtests are leaking a few hundred megabytes of memory per test right > now. This is not a big deal because the dtest runs with the larger containers > continue to function fine, as does local testing, since the number of in-JVM > dtests is not yet high enough to cause a problem with more than 2GB of > available heap. 
However, we should fix the memory leak so that going forward > we can add more in-JVM dtests without worry. > I've been working with [~ifesdjeen] to debug, and the issue appears to be > unreleased Table/Keyspace metrics (screenshot showing the leak attached). I > believe that we have a few potential issues that are leading to the leaks: > 1. The > [{{Instance::shutdown}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/Instance.java#L328-L354] > method is not successfully cleaning up all the metrics created by the > {{CassandraMetricsRegistry}} > 2. The > [{{TestCluster::close}}|https://github.com/apache/cassandra/blob/f22fec927de7ac29120c2f34de5b8cc1c695/test/distributed/org/apache/cassandra/distributed/TestCluster.java#L283] > method is not waiting for all the instances to finish shutting down and > cleaning up before continuing on > 3. I'm not sure if this is an issue assuming we clear all metrics, but > [{{TableMetrics::release}}|https://github.com/apache/cassandra/blob/4ae229f5cd270c2b43475b3f752a7b228de260ea/src/java/org/apache/cassandra/metrics/TableMetrics.java#L951] > does not release all the metric references (which could leak them) > I am working on a patch which shuts down everything and ensures that we do > not leak memory.
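The cleanup that point 1 calls for can be illustrated with a minimal sketch. This is a toy registry, not Cassandra's real `CassandraMetricsRegistry` API; it only shows the prefix-based release an instance shutdown needs to perform so that metric references stop pinning the instance's classloader:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MetricCleanup {
    // Toy stand-in for a metrics registry: names map to metric objects.
    // If entries under an instance's keyspace/table prefix are never removed
    // on shutdown, the registry keeps the instance's objects (and hence its
    // classloader) reachable, and the in-JVM dtest leaks memory.
    static final Map<String, Object> registry = new ConcurrentHashMap<>();

    static void register(String name, Object metric) {
        registry.put(name, metric);
    }

    // Remove everything belonging to one logical instance/table;
    // returns how many metrics were released.
    static int releaseByPrefix(String prefix) {
        int before = registry.size();
        registry.keySet().removeIf(name -> name.startsWith(prefix));
        return before - registry.size();
    }

    public static void main(String[] args) {
        register("ks1.tasks.ReadLatency", new Object());
        register("ks1.tasks.WriteLatency", new Object());
        register("ks2.other.ReadLatency", new Object());
        System.out.println(releaseByPrefix("ks1.tasks.")); // 2
        System.out.println(registry.size());               // 1
    }
}
```

With the real Dropwizard-based registry the same idea is expressed with a metric filter rather than direct map access, but the shutdown-time obligation is identical: release every metric the instance registered.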
[jira] [Updated] (CASSANDRA-14937) Multi-version In-JVM dtests
[ https://issues.apache.org/jira/browse/CASSANDRA-14937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-14937: Reviewer: Alex Petrov > Multi-version In-JVM dtests > --- > > Key: CASSANDRA-14937 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14937 > Project: Cassandra > Issue Type: New Feature > Components: Test/dtest >Reporter: Benedict >Assignee: Benedict >Priority: Major > Fix For: 2.2.x, 3.0.x, 3.11.x > > > In order to support more sophisticated upgrade tests, including complex fuzz > tests that can span a sequence of version upgrades, we propose abstracting a > cross-version API for the in-jvm dtests. This will permit starting a node > with an arbitrary compatible C* version, stopping the node, and restarting it > with another C* version.
[jira] [Commented] (CASSANDRA-14905) if SizeEstimatesRecorder misses a 'onDropTable' notification, the size_estimates table will never be cleared for that table.
[ https://issues.apache.org/jira/browse/CASSANDRA-14905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736995#comment-16736995 ] Aleksandr Sorokoumov commented on CASSANDRA-14905: -- Thanks for the review! It'll be great if both authors can get the credit. Otherwise please give it to Joel. > if SizeEstimatesRecorder misses a 'onDropTable' notification, the > size_estimates table will never be cleared for that table. > > > Key: CASSANDRA-14905 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14905 > Project: Cassandra > Issue Type: Bug > Components: Observability/Metrics >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Minor > Fix For: 3.0.x, 3.11.x, 4.0.x > > Attachments: 14905-3.0-dtest.png, 14905-3.0-testall.png, > 14905-3.11-dtest.png, 14905-3.11-testall.png, 14905-4.0-dtest.png, > 14905-4.0-testall.png > > > if a node is down when a keyspace/table is dropped, then on startup it will > receive the schema notification before the size estimates listener is > registered, so the entries for the dropped keyspace/table will never be > cleaned from the table.
[jira] [Commented] (CASSANDRA-14956) Paged Range Slice queries with DISTINCT can drop rows from results
[ https://issues.apache.org/jira/browse/CASSANDRA-14956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736989#comment-16736989 ] Sam Tunnicliffe commented on CASSANDRA-14956: - Pushed a 2.1 branch after discussion on dev@ about one last 2.1 release before EOL. Unfortunately, CircleCI no longer supports v1.0 job configuration so a CI run is going to need the v2.0 config backporting (which we may want to do before a release anyway). [14956-2.1|https://github.com/beobal/cassandra/tree/14956-2.1] > Paged Range Slice queries with DISTINCT can drop rows from results > -- > > Key: CASSANDRA-14956 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14956 > Project: Cassandra > Issue Type: Bug > Components: CQL/Interpreter >Reporter: Sam Tunnicliffe >Assignee: Sam Tunnicliffe >Priority: Major > Fix For: 2.2.14 > > > If we have a partition where the first CQL row is fully deleted (possibly via > TTLs), and that partition happens to fall on the page boundary of a paged > range query which is using SELECT DISTINCT, the next live partition *after* > it is omitted from the result set. This is due to over fetching of the pages > and a bug in trimming those pages where overlap occurs. > This does not affect 3.0+.
[jira] [Commented] (CASSANDRA-14688) Update protocol spec and class level doc with protocol checksumming details
[ https://issues.apache.org/jira/browse/CASSANDRA-14688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16736941#comment-16736941 ] Alex Petrov commented on CASSANDRA-14688: - Thank you for adding this much needed doc! +1 Patch looks good; I just have a couple of nits: * Bytes for lengths seem to be added in [big-endian order|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/transport/frame/checksum/ChecksummingTransformer.java#L353] for checksumming (which is the default for the protocol, too) and the code is equivalent to doing {{ByteBuffer#putInt}}. Since ordering was handled explicitly here, despite the {{ByteBuffer}} overload for {{Checksum#of}}, do we want to specify endianness here? * This is more of a code comment, but since it's always good to have a mapping from code to protocol documentation: currently [numCompressedChunks |https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/transport/frame/checksum/ChecksummingTransformer.java#L177] is a variable that represents the number of all chunks, not only compressed ones. Maybe we'd like to change it to reduce ambiguity. The remaining comments, of even smaller significance, are in the patch. > Update protocol spec and class level doc with protocol checksumming details > --- > > Key: CASSANDRA-14688 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14688 > Project: Cassandra > Issue Type: Task > Components: Legacy/Documentation and Website >Reporter: Sam Tunnicliffe >Assignee: Sam Tunnicliffe >Priority: Major > Fix For: 4.0 > > > CASSANDRA-13304 provides an option to add checksumming to the frame body of > native protocol messages. The native protocol spec needs to be updated to > reflect this ASAP. We should also verify that the javadoc comments describing > the on-wire format in > {{o.a.c.transport.frame.checksum.ChecksummingTransformer}} are up to date. 
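The endianness nit above can be checked in isolation: writing an int's four bytes most-significant-byte first is byte-for-byte what `ByteBuffer#putInt` produces, since `ByteBuffer` defaults to big-endian order. A small standalone sketch, not the ChecksummingTransformer code itself:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class BigEndianLength {
    // Pack a chunk length big-endian by hand, the way the transformer feeds
    // length bytes into the checksum one at a time.
    static byte[] manualBigEndian(int length) {
        return new byte[] {
            (byte) (length >>> 24),
            (byte) (length >>> 16),
            (byte) (length >>> 8),
            (byte) length
        };
    }

    // Equivalent packing via ByteBuffer, whose default order is BIG_ENDIAN.
    static byte[] viaPutInt(int length) {
        return ByteBuffer.allocate(4).putInt(length).array();
    }

    public static void main(String[] args) {
        int chunkLength = 123456; // arbitrary example length
        System.out.println(Arrays.equals(manualBigEndian(chunkLength),
                                         viaPutInt(chunkLength))); // true
    }
}
```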