[jira] [Assigned] (CASSANDRA-12728) Handling partially written hint files
[ https://issues.apache.org/jira/browse/CASSANDRA-12728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko reassigned CASSANDRA-12728: - Assignee: Aleksey Yeschenko > Handling partially written hint files > - > > Key: CASSANDRA-12728 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12728 > Project: Cassandra > Issue Type: Bug >Reporter: Sharvanath Pathak >Assignee: Aleksey Yeschenko > > {noformat} > ERROR [HintsDispatcher:1] 2016-09-28 17:44:43,397 > HintsDispatchExecutor.java:225 - Failed to dispatch hints file > d5d7257c-9f81-49b2-8633-6f9bda6e3dea-1474892654160-1.hints: file is corrupted > ({}) > org.apache.cassandra.io.FSReadError: java.io.EOFException > at > org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:282) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:252) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:156) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:137) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:119) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:91) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:259) > [apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242) > [apache-cassandra-3.0.6.jar:3.0.6] > at > 
org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220) > [apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199) > [apache-cassandra-3.0.6.jar:3.0.6] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > [na:1.8.0_77] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > [na:1.8.0_77] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > [na:1.8.0_77] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_77] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77] > Caused by: java.io.EOFException: null > at > org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.ChecksummedDataInput.readFully(ChecksummedDataInput.java:126) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsReader$BuffersIterator.readBuffer(HintsReader.java:310) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNextInternal(HintsReader.java:301) > ~[apache-cassandra-3.0.6.jar:3.0.6] > at > org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:278) > ~[apache-cassandra-3.0.6.jar:3.0.6] > ... 15 common frames omitted > {noformat} > We've found out that the hint file was truncated because there was a hard > reboot around the time of last write to the file. I think we basically need > to handle partially written hint files. 
Also, the CRC file does not exist in > this case (probably because the node crashed while writing the hints file). Maybe > ignoring and cleaning up such partially written hint files would be a way to > fix this? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
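One way to picture the "ignore the partially written tail" idea from the report is the sketch below. It assumes a made-up file layout (a 4-byte big-endian length followed by that many payload bytes per record), which is not Cassandra's actual hints format, and the class and method names are hypothetical. An end of file in the middle of a record is reported as truncation, and the complete records before it are kept, instead of failing the whole read.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical reader for a length-prefixed record file; NOT Cassandra's hints format. */
final class TruncationTolerantReader {

    /** Complete records read so far, plus whether the input ended in the middle of a record. */
    static final class Result {
        final List<byte[]> records = new ArrayList<>();
        boolean truncated;
    }

    static Result readAll(InputStream raw) {
        Result result = new Result();
        DataInputStream in = new DataInputStream(raw);
        try {
            while (true) {
                int first = in.read();
                if (first < 0) {
                    return result; // clean end of input on a record boundary
                }
                // Remaining three bytes of the 4-byte big-endian length prefix.
                int length = (first << 24)
                        | (in.readUnsignedByte() << 16)
                        | (in.readUnsignedByte() << 8)
                        | in.readUnsignedByte();
                if (length < 0) {
                    throw new IOException("negative record length: " + length);
                }
                // A real reader would also cap the length before allocating.
                byte[] payload = new byte[length];
                in.readFully(payload); // throws EOFException if the payload is cut short
                result.records.add(payload);
            }
        } catch (EOFException e) {
            // The last record was only partially written: keep what was complete.
            result.truncated = true;
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // One complete 2-byte record, then a record that claims 5 bytes but has only 1.
        byte[] data = {0, 0, 0, 2, 'a', 'b', 0, 0, 0, 5, 'c'};
        Result r = readAll(new ByteArrayInputStream(data));
        System.out.println(r.records.size() + " complete record(s), truncated=" + r.truncated);
    }
}
```

Whether a truncated tail should be dropped silently, logged, or cleaned up is a policy decision for the ticket; the sketch only shows that the two cases (clean end versus cut-off record) can be told apart.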
[jira] [Commented] (CASSANDRA-12705) Add column definition kind to system schema dropped columns
[ https://issues.apache.org/jira/browse/CASSANDRA-12705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532143#comment-15532143 ] Sylvain Lebresne commented on CASSANDRA-12705: -- FYI, trunk is now 4.0 if you want to rebase. > Add column definition kind to system schema dropped columns > --- > > Key: CASSANDRA-12705 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12705 > Project: Cassandra > Issue Type: Bug > Components: Distributed Metadata >Reporter: Stefania >Assignee: Stefania > Fix For: 4.0 > > > Both regular and static columns can currently be dropped by users, but the > kind of the dropped column is not stored in {{SchemaKeyspace.DroppedColumns}}. As > a consequence, {{CFMetadata.getDroppedColumnDefinition}} returns a regular > column even for a dropped static column, and this has caused problems such as CASSANDRA-12582. > We should add the column kind to {{SchemaKeyspace.DroppedColumns}} so that > {{CFMetadata.getDroppedColumnDefinition}} can create the correct column > definition. However, altering schema tables would cause inter-node > communication failures during a rolling upgrade, see CASSANDRA-12236. > Therefore we should wait for a full schema migration when upgrading to the > next major version.
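The problem described in the ticket can be illustrated with a toy model. Everything below is hypothetical (the class, the enum and the method names are not Cassandra's schema or metadata classes): when the kind of a dropped column is not recorded, a lookup has nothing to go on and falls back to a regular column, which is wrong for a column that was static.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative model only; not the actual schema table or metadata classes. */
final class DroppedColumnRegistry {
    enum Kind { REGULAR, STATIC }

    private final Map<String, Kind> droppedKinds = new HashMap<>();

    /** Records a drop. A null kind models a row written before the kind was stored. */
    void recordDrop(String columnName, Kind kindOrNull) {
        droppedKinds.put(columnName, kindOrNull);
    }

    /** Returns the stored kind, falling back to REGULAR when none was recorded. */
    Kind kindOf(String columnName) {
        if (!droppedKinds.containsKey(columnName)) {
            throw new IllegalArgumentException("not a dropped column: " + columnName);
        }
        Kind kind = droppedKinds.get(columnName);
        return kind != null ? kind : Kind.REGULAR;
    }

    public static void main(String[] args) {
        DroppedColumnRegistry registry = new DroppedColumnRegistry();
        registry.recordDrop("s", Kind.STATIC);   // kind stored: lookup is correct
        registry.recordDrop("legacy", null);     // kind missing: lookup guesses REGULAR
        System.out.println(registry.kindOf("s") + " " + registry.kindOf("legacy"));
    }
}
```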
cassandra git commit: Change version to 4.0 after creation of 3.X branch
Repository: cassandra Updated Branches: refs/heads/trunk 25d4c7baa -> 78ff37707 Change version to 4.0 after creation of 3.X branch Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/78ff3770 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/78ff3770 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/78ff3770 Branch: refs/heads/trunk Commit: 78ff3770756c83cd466fe11eb0bf9a5bb7c4 Parents: 25d4c7b Author: Sylvain Lebresne Authored: Thu Sep 29 10:19:34 2016 +0200 Committer: Sylvain Lebresne Committed: Thu Sep 29 10:19:34 2016 +0200 -- CHANGES.txt | 3 +++ NEWS.txt | 10 ++ build.xml | 2 +- 3 files changed, 14 insertions(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/78ff3770/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index a9e46f7..03456a0 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,3 +1,6 @@ +4.0 + + 3.10 * Upgrade metrics-reporter dependencies (CASSANDRA-12089) * Tune compaction thread count via nodetool (CASSANDRA-12248) http://git-wip-us.apache.org/repos/asf/cassandra/blob/78ff3770/NEWS.txt -- diff --git a/NEWS.txt b/NEWS.txt index 9ab7c26..e1e952b 100644 --- a/NEWS.txt +++ b/NEWS.txt @@ -13,6 +13,16 @@ restore snapshots created with the previous major version using the 'sstableloader' tool. You can upgrade the file format of your snapshots using the provided 'sstableupgrade' tool. +4.0 +=== + +New features + + +Upgrading +- + + 3.10 http://git-wip-us.apache.org/repos/asf/cassandra/blob/78ff3770/build.xml -- diff --git a/build.xml b/build.xml index 1808808..21cbc85 100644 --- a/build.xml +++ b/build.xml @@ -25,7 +25,7 @@ - + http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=tree"/>
[cassandra] Git Push Summary
Repository: cassandra Updated Branches: refs/heads/3.X [created] 25d4c7baa
[jira] [Comment Edited] (CASSANDRA-12461) Add hooks to StorageService shutdown
[ https://issues.apache.org/jira/browse/CASSANDRA-12461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15531974#comment-15531974 ] Alex Petrov edited comment on CASSANDRA-12461 at 9/29/16 7:41 AM: -- +1 on changes. The initial patch was marked as a "bug", I've re-labeled it as an "improvement". But since we have fixed several issues with drain process ([CASSANDRA-12509], for example), might be good to have it in 3.0? Or should I remove hooks, only leave bugfix for 3.0 patch (will be a trivial change, too), just don't want to create confusion.. Backport to 3.0 was trivial, but we can just skip committing if anything. CI triggered for both branches (although your latest branch only squashed, without any changes on top). |[12461-trunk|https://github.com/ifesdjeen/cassandra/tree/12461-trunk]|[dtest|http://cassci.datastax.com/job/ifesdjeen-12461-trunk-dtest/]|[utest|http://cassci.datastax.com/job/ifesdjeen-12461-trunk-testall/]| |[12461-3.0|https://github.com/ifesdjeen/cassandra/tree/12461-3.0]|[dtest|http://cassci.datastax.com/job/ifesdjeen-12461-3.0-dtest/]|[utest|http://cassci.datastax.com/job/ifesdjeen-12461-3.0-testall/]| was (Author: ifesdjeen): +1 on changes. The initial patch was marked as a "bug", I've re-labeled it as an "improvement". But since we have fixed several issues with drain process ([CASSANDRA-12509], for example), might be good to have it in 3.0? Or should I remove hooks, only leave bugfix for 3.0 patch (will be a trivial change, too), just don't want to create confusion.. Backport to 3.0 was trivial, but we can just skip committing if anything. CI triggered for both branches (although your latest branch only squashed, without any changes on top). 
|[12461-trunk|https://github.com/ifesdjeen/cassandra/tree/12461-trunk]|[dtest|http://cassci.datastax.com/job/ifesdjeen-12461-trunk-dtest/]|[utest|http://cassci.datastax.com/job/ifesdjeen-12461-trunk-testall/]| |[12461-trunk|https://github.com/ifesdjeen/cassandra/tree/12461-trunk]|[3.0|http://cassci.datastax.com/job/ifesdjeen-12461-trunk-3.0/]|[utest|http://cassci.datastax.com/job/ifesdjeen-12461-trunk-testall/]||[12461-3.0|https://github.com/ifesdjeen/cassandra/tree/12461-3.0]|[dtest|http://cassci.datastax.com/job/ifesdjeen-12461-3.0-dtest/]|[utest|http://cassci.datastax.com/job/ifesdjeen-12461-3.0-testall/]| > Add hooks to StorageService shutdown > > > Key: CASSANDRA-12461 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12461 > Project: Cassandra > Issue Type: Bug >Reporter: Anthony Cozzie >Assignee: Anthony Cozzie > Fix For: 3.x > > Attachments: > 0001-CASSANDRA-12461-add-C-support-for-shutdown-runnables.patch > > > The JVM will usually run shutdown hooks in parallel. This can lead to > synchronization problems between Cassandra, services that depend on it, and > services it depends on. This patch adds some simple support for shutdown > hooks to StorageService. > This should nearly solve CASSANDRA-12011
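The ordering problem in the issue description can be sketched as follows. Hooks registered with Runtime.addShutdownHook are started in an unspecified order and may run concurrently, so one common workaround is to register a single JVM hook that runs callbacks sequentially around the core shutdown step. The class and method names below are hypothetical and are not taken from the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: runs registered callbacks in a fixed order around a core shutdown step. */
final class OrderedShutdown {
    private final List<Runnable> before = new ArrayList<>();
    private final List<Runnable> after = new ArrayList<>();
    private final Runnable core;
    private boolean done;

    OrderedShutdown(Runnable core) {
        this.core = core;
    }

    /** Hook to run before the core step, e.g. stopping a service that depends on it. */
    synchronized void runBefore(Runnable hook) {
        before.add(hook);
    }

    /** Hook to run after the core step, e.g. stopping a service it depends on. */
    synchronized void runAfter(Runnable hook) {
        after.add(hook);
    }

    /** Runs before-hooks, the core step, then after-hooks, sequentially and at most once. */
    synchronized void shutdown() {
        if (done) {
            return;
        }
        done = true;
        for (Runnable hook : before) {
            hook.run();
        }
        core.run();
        for (Runnable hook : after) {
            hook.run();
        }
    }

    /** Registers one JVM shutdown hook, so ordering no longer depends on the JVM. */
    void install() {
        Runtime.getRuntime().addShutdownHook(new Thread(this::shutdown, "ordered-shutdown"));
    }

    public static void main(String[] args) {
        OrderedShutdown s = new OrderedShutdown(() -> System.out.println("drain storage"));
        s.runBefore(() -> System.out.println("stop dependent service"));
        s.runAfter(() -> System.out.println("stop underlying service"));
        s.install(); // the three steps run in order when the JVM exits
    }
}
```

In this sketch an exception thrown by one hook skips the remaining steps; a production version would probably catch and log per hook, which is left out here for brevity.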
[jira] [Comment Edited] (CASSANDRA-12461) Add hooks to StorageService shutdown
[ https://issues.apache.org/jira/browse/CASSANDRA-12461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15531998#comment-15531998 ] Stefania edited comment on CASSANDRA-12461 at 9/29/16 7:38 AM: --- +1 for committing the full patch to 3.0 since it fixes a number of issues, including things like CASSANDRA-12397. I'll do another quick round tomorrow, and if CI results are clean, we can commit. was (Author: stefania): +1 for committing the full patch to 3.0 since it fixes a number of issues, including making problems like CASSANDRA-12397 much harder to reproduce. I'll do another quick round tomorrow, and if CI results are clean, we can commit. > Add hooks to StorageService shutdown > > > Key: CASSANDRA-12461 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12461 > Project: Cassandra > Issue Type: Bug >Reporter: Anthony Cozzie >Assignee: Anthony Cozzie > Fix For: 3.x > > Attachments: > 0001-CASSANDRA-12461-add-C-support-for-shutdown-runnables.patch > > > The JVM will usually run shutdown hooks in parallel. This can lead to > synchronization problems between Cassandra, services that depend on it, and > services it depends on. This patch adds some simple support for shutdown > hooks to StorageService. > This should nearly solve CASSANDRA-12011
[jira] [Commented] (CASSANDRA-12715) Fix exceptions with the new vnode allocation.
[ https://issues.apache.org/jira/browse/CASSANDRA-12715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532039#comment-15532039 ] Branimir Lambov commented on CASSANDRA-12715: - Does this test fail if the patch isn't applied? > Fix exceptions with the new vnode allocation. > - > > Key: CASSANDRA-12715 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12715 > Project: Cassandra > Issue Type: Bug >Reporter: Dikang Gu >Assignee: Dikang Gu > Fix For: 3.0.x, 3.x > > > Problem: see exceptions when bootstrapping nodes using the new vnode > allocation algorithm. I'm able to reproduce it in trunk as well: > {code} > INFO [main] 2016-09-26 15:36:54,978 StorageService.java:1437 - JOINING: > calculation complete, ready to bootstrap > INFO [main] 2016-09-26 15:36:54,978 StorageService.java:1437 - JOINING: > getting bootstrap token > ERROR [main] 2016-09-26 15:36:54,989 CassandraDaemon.java:752 - Exception > encountered during startup > java.lang.AssertionError: null > at > org.apache.cassandra.locator.TokenMetadata.getTopology(TokenMetadata.java:1209) > ~[main/:na] > at > org.apache.cassandra.dht.tokenallocator.TokenAllocation.getStrategy(TokenAllocation.java:201) > ~[main/:na] > at > org.apache.cassandra.dht.tokenallocator.TokenAllocation.getStrategy(TokenAllocation.java:164) > ~[main/:na] > at > org.apache.cassandra.dht.tokenallocator.TokenAllocation.allocateTokens(TokenAllocation.java:54) > ~[main/:na] > at > org.apache.cassandra.dht.BootStrapper.allocateTokens(BootStrapper.java:207) > ~[main/:na] > at > org.apache.cassandra.dht.BootStrapper.getBootstrapTokens(BootStrapper.java:174) > ~[main/:na] > at > org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:929) > ~[main/:na] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:697) > ~[main/:na] > at > org.apache.cassandra.service.StorageService.initServer(StorageService.java:582) > ~[main/:na] > at > 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:392) > [main/:na] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:601) > [main/:na] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:735) > [main/:na] > {code}
[jira] [Commented] (CASSANDRA-12461) Add hooks to StorageService shutdown
[ https://issues.apache.org/jira/browse/CASSANDRA-12461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15531998#comment-15531998 ] Stefania commented on CASSANDRA-12461: -- +1 for committing the full patch to 3.0 since it fixes a number of issues, including making problems like CASSANDRA-12397 much harder to reproduce. I'll do another quick round tomorrow, and if CI results are clean, we can commit. > Add hooks to StorageService shutdown > > > Key: CASSANDRA-12461 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12461 > Project: Cassandra > Issue Type: Bug >Reporter: Anthony Cozzie >Assignee: Anthony Cozzie > Fix For: 3.x > > Attachments: > 0001-CASSANDRA-12461-add-C-support-for-shutdown-runnables.patch > > > The JVM will usually run shutdown hooks in parallel. This can lead to > synchronization problems between Cassandra, services that depend on it, and > services it depends on. This patch adds some simple support for shutdown > hooks to StorageService. > This should nearly solve CASSANDRA-12011
[jira] [Commented] (CASSANDRA-12461) Add hooks to StorageService shutdown
[ https://issues.apache.org/jira/browse/CASSANDRA-12461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15531974#comment-15531974 ] Alex Petrov commented on CASSANDRA-12461: - +1 on changes. The initial patch was marked as a "bug", I've re-labeled it as an "improvement". But since we have fixed several issues with drain process ([CASSANDRA-12509], for example), might be good to have it in 3.0? Or should I remove hooks, only leave bugfix for 3.0 patch (will be a trivial change, too), just don't want to create confusion.. Backport to 3.0 was trivial, but we can just skip committing if anything. CI triggered for both branches (although your latest branch only squashed, without any changes on top). |[12461-trunk|https://github.com/ifesdjeen/cassandra/tree/12461-trunk]|[dtest|http://cassci.datastax.com/job/ifesdjeen-12461-trunk-dtest/]|[utest|http://cassci.datastax.com/job/ifesdjeen-12461-trunk-testall/]| |[12461-trunk|https://github.com/ifesdjeen/cassandra/tree/12461-trunk]|[3.0|http://cassci.datastax.com/job/ifesdjeen-12461-trunk-3.0/]|[utest|http://cassci.datastax.com/job/ifesdjeen-12461-trunk-testall/]||[12461-3.0|https://github.com/ifesdjeen/cassandra/tree/12461-3.0]|[dtest|http://cassci.datastax.com/job/ifesdjeen-12461-3.0-dtest/]|[utest|http://cassci.datastax.com/job/ifesdjeen-12461-3.0-testall/]| > Add hooks to StorageService shutdown > > > Key: CASSANDRA-12461 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12461 > Project: Cassandra > Issue Type: Bug >Reporter: Anthony Cozzie >Assignee: Anthony Cozzie > Fix For: 3.x > > Attachments: > 0001-CASSANDRA-12461-add-C-support-for-shutdown-runnables.patch > > > The JVM will usually run shutdown hooks in parallel. This can lead to > synchronization problems between Cassandra, services that depend on it, and > services it depends on. This patch adds some simple support for shutdown > hooks to StorageService. 
> This should nearly solve CASSANDRA-12011
[jira] [Created] (CASSANDRA-12730) Thousands of empty SSTables created during repair - TMOF death
Benjamin Roth created CASSANDRA-12730: - Summary: Thousands of empty SSTables created during repair - TMOF death Key: CASSANDRA-12730 URL: https://issues.apache.org/jira/browse/CASSANDRA-12730 Project: Cassandra Issue Type: Bug Components: Local Write-Read Paths Reporter: Benjamin Roth Priority: Critical Last night I ran a repair on a keyspace with 7 tables and 4 MVs, each containing a few hundred million records. After a few hours a node died because of "too many open files" (TMOF). Normally one would just raise the limit, but we had already set it to 100k. The problem was that the repair created over 100k SSTables for a certain MV. The strange thing is that these SSTables had almost no data (53 bytes, 90 bytes, ...). Some of them (<5%) had a few hundred KB, and very few (<1%) had normal sizes of a few MB or more. I could understand SSTables queuing up as they are flushed and not compacted in time, but then they should have at least a few MB each (depending on config and available memory), right? Of course the node then runs out of FDs, and I guess it is not a good idea to raise the limit even higher, as I expect this would just create even more empty SSTables before the node eventually dies. Only 1 CF (MV) was affected. All other CFs (also MVs) behaved sanely. The empty SSTables were created evenly over time, 100-150 every minute. Among the empty SSTables there were also SSTables that looked normal, with a few MB. I didn't see any errors or exceptions in the logs until the TMOF occurred, just tons of streams due to the repair (which I actually run via cs-reaper as subrange, full repairs). After restarting that node (with no more repair running), the number of SSTables went down again as they were slowly compacted away. According to [~zznate] this issue may relate to CASSANDRA-10342 + CASSANDRA-8641
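To put a number on the symptom described above, a small standalone check along the following lines could count how many data files in one table directory are suspiciously small. This is only a sketch: the class name, the default threshold of 1024 bytes and the directory argument are all assumptions, and it relies on SSTable data components having names ending in "-Data.db".

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

/** Hypothetical helper: counts very small "-Data.db" files in a single directory. */
final class TinySSTableCounter {

    /** Counts regular files named "*-Data.db" in tableDir that are smaller than thresholdBytes. */
    static long countTiny(Path tableDir, long thresholdBytes) {
        try (Stream<Path> files = Files.list(tableDir)) {
            return files
                    .filter(p -> p.getFileName().toString().endsWith("-Data.db"))
                    .filter(Files::isRegularFile)
                    .filter(p -> sizeOf(p) < thresholdBytes)
                    .count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** File size in bytes; files that vanish mid-scan (e.g. compacted away) are not counted. */
    private static long sizeOf(Path p) {
        try {
            return Files.size(p);
        } catch (IOException e) {
            return Long.MAX_VALUE;
        }
    }

    public static void main(String[] args) {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        long threshold = args.length > 1 ? Long.parseLong(args[1]) : 1024;
        System.out.println(countTiny(dir, threshold) + " data file(s) under "
                + threshold + " bytes in " + dir);
    }
}
```

Run against the affected MV's directory during a repair, a steadily growing count would match the "100-150 every minute" observation; it does not explain why the files are created.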