[jira] [Assigned] (CASSANDRA-12728) Handling partially written hint files

2016-09-29 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko reassigned CASSANDRA-12728:
-

Assignee: Aleksey Yeschenko

> Handling partially written hint files
> -
>
> Key: CASSANDRA-12728
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12728
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sharvanath Pathak
>Assignee: Aleksey Yeschenko
>
> {noformat}
> ERROR [HintsDispatcher:1] 2016-09-28 17:44:43,397 
> HintsDispatchExecutor.java:225 - Failed to dispatch hints file 
> d5d7257c-9f81-49b2-8633-6f9bda6e3dea-1474892654160-1.hints: file is corrupted 
> ({})
> org.apache.cassandra.io.FSReadError: java.io.EOFException
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:282)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:252)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:156)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:137)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:119) 
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:91) 
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:259)
>  [apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242)
>  [apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220)
>  [apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199)
>  [apache-cassandra-3.0.6.jar:3.0.6]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_77]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> [na:1.8.0_77]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_77]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_77]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
> Caused by: java.io.EOFException: null
> at 
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.ChecksummedDataInput.readFully(ChecksummedDataInput.java:126)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) 
> ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.readBuffer(HintsReader.java:310)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNextInternal(HintsReader.java:301)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> at 
> org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:278)
>  ~[apache-cassandra-3.0.6.jar:3.0.6]
> ... 15 common frames omitted
> {noformat}
> We've found out that the hint file was truncated because there was a hard 
> reboot around the time of the last write to the file. I think we basically need 
> to handle partially written hint files. Also, the CRC file does not exist in 
> this case (probably because the node crashed while writing the hints file). Maybe 
> ignoring and cleaning up such partially written hint files can be a way to 
> fix this?
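
To make the suggestion above concrete, here is a minimal sketch of a reader that treats a premature end of file as a truncated tail instead of a fatal error. It is not Cassandra's HintsReader: the class name, the [int length][payload] record layout, and the size cap are all assumptions made for illustration only.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical reader; the record layout is an assumption, not the real hints format. */
final class TruncationTolerantReader {
    // Assumed upper bound so a garbage length cannot trigger a huge allocation.
    private static final int MAX_RECORD_BYTES = 1 << 20;

    /** Reads [int length][payload] records and stops at the first incomplete one. */
    static List<byte[]> readCompleteRecords(InputStream in) throws IOException {
        List<byte[]> records = new ArrayList<>();
        DataInputStream data = new DataInputStream(in);
        while (true) {
            try {
                int length = data.readInt();
                if (length < 0 || length > MAX_RECORD_BYTES) {
                    break; // implausible length: treat the rest of the file as unusable
                }
                byte[] payload = new byte[length];
                data.readFully(payload); // throws EOFException on a partial record
                records.add(payload);
            } catch (EOFException e) {
                // Either a clean end of file or a record cut short by a crash:
                // keep what was read completely and drop the partial tail.
                break;
            }
        }
        return records;
    }
}
```

With this approach the complete records before the truncation point can still be delivered, and the file can be cleaned up afterwards. Whether silently dropping the tail is acceptable is a design decision for the ticket, not something this sketch settles.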



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12705) Add column definition kind to system schema dropped columns

2016-09-29 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532143#comment-15532143
 ] 

Sylvain Lebresne commented on CASSANDRA-12705:
--

fyi, trunk is now 4.0 if you want to rebase.

> Add column definition kind to system schema dropped columns
> ---
>
> Key: CASSANDRA-12705
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12705
> Project: Cassandra
>  Issue Type: Bug
>  Components: Distributed Metadata
>Reporter: Stefania
>Assignee: Stefania
> Fix For: 4.0
>
>
> Both regular and static columns can currently be dropped by users, but the 
> column kind is not stored in {{SchemaKeyspace.DroppedColumns}}. As 
> a consequence, {{CFMetadata.getDroppedColumnDefinition}} always returns a regular 
> column, and this has caused problems such as CASSANDRA-12582.
> We should add the column kind to {{SchemaKeyspace.DroppedColumns}} so that 
> {{CFMetadata.getDroppedColumnDefinition}} can create the correct column 
> definition. However, altering schema tables would cause inter-node 
> communication failures during a rolling upgrade, see CASSANDRA-12236. 
> Therefore we should wait for a full schema migration when upgrading to the 
> next major version.





cassandra git commit: Change version to 4.0 after creation of 3.X branch

2016-09-29 Thread slebresne
Repository: cassandra
Updated Branches:
  refs/heads/trunk 25d4c7baa -> 78ff37707


Change version to 4.0 after creation of 3.X branch


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/78ff3770
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/78ff3770
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/78ff3770

Branch: refs/heads/trunk
Commit: 78ff3770756c83cd466fe11eb0bf9a5bb7c4
Parents: 25d4c7b
Author: Sylvain Lebresne 
Authored: Thu Sep 29 10:19:34 2016 +0200
Committer: Sylvain Lebresne 
Committed: Thu Sep 29 10:19:34 2016 +0200

--
 CHANGES.txt |  3 +++
 NEWS.txt    | 10 ++++++++++
 build.xml   |  2 +-
 3 files changed, 14 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/78ff3770/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index a9e46f7..03456a0 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,3 +1,6 @@
+4.0
+
+
 3.10
  * Upgrade metrics-reporter dependencies (CASSANDRA-12089)
  * Tune compaction thread count via nodetool (CASSANDRA-12248)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/78ff3770/NEWS.txt
--
diff --git a/NEWS.txt b/NEWS.txt
index 9ab7c26..e1e952b 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -13,6 +13,16 @@ restore snapshots created with the previous major version using the
 'sstableloader' tool. You can upgrade the file format of your snapshots
 using the provided 'sstableupgrade' tool.
 
+4.0
+===
+
+New features
+
+
+Upgrading
+-
+
+
 3.10
 
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/78ff3770/build.xml
--
diff --git a/build.xml b/build.xml
index 1808808..21cbc85 100644
--- a/build.xml
+++ b/build.xml
@@ -25,7 +25,7 @@
 
 
 
-
+
 
 
 http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=tree"/>



[cassandra] Git Push Summary

2016-09-29 Thread slebresne
Repository: cassandra
Updated Branches:
  refs/heads/3.X [created] 25d4c7baa


[jira] [Comment Edited] (CASSANDRA-12461) Add hooks to StorageService shutdown

2016-09-29 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15531974#comment-15531974
 ] 

Alex Petrov edited comment on CASSANDRA-12461 at 9/29/16 7:41 AM:
--

+1 on changes. 

The initial patch was marked as a "bug"; I've re-labeled it as an 
"improvement". But since we have fixed several issues with the drain process 
([CASSANDRA-12509], for example), it might be good to have it in 3.0? Or should I 
remove the hooks and leave only the bugfix in the 3.0 patch (that would be a 
trivial change, too)? I just don't want to create confusion.

The backport to 3.0 was trivial, but we can skip committing it if needed. CI 
has been triggered for both branches (although your latest branch was only 
squashed, without any changes on top).

|[12461-trunk|https://github.com/ifesdjeen/cassandra/tree/12461-trunk]|[dtest|http://cassci.datastax.com/job/ifesdjeen-12461-trunk-dtest/]|[utest|http://cassci.datastax.com/job/ifesdjeen-12461-trunk-testall/]|
|[12461-3.0|https://github.com/ifesdjeen/cassandra/tree/12461-3.0]|[dtest|http://cassci.datastax.com/job/ifesdjeen-12461-3.0-dtest/]|[utest|http://cassci.datastax.com/job/ifesdjeen-12461-3.0-testall/]|


was (Author: ifesdjeen):
+1 on changes. 

The initial patch was marked as a "bug"; I've re-labeled it as an 
"improvement". But since we have fixed several issues with the drain process 
([CASSANDRA-12509], for example), it might be good to have it in 3.0? Or should I 
remove the hooks and leave only the bugfix in the 3.0 patch (that would be a 
trivial change, too)? I just don't want to create confusion.

The backport to 3.0 was trivial, but we can skip committing it if needed. CI 
has been triggered for both branches (although your latest branch was only 
squashed, without any changes on top).

|[12461-trunk|https://github.com/ifesdjeen/cassandra/tree/12461-trunk]|[dtest|http://cassci.datastax.com/job/ifesdjeen-12461-trunk-dtest/]|[utest|http://cassci.datastax.com/job/ifesdjeen-12461-trunk-testall/]|
|[12461-trunk|https://github.com/ifesdjeen/cassandra/tree/12461-trunk]|[3.0|http://cassci.datastax.com/job/ifesdjeen-12461-trunk-3.0/]|[utest|http://cassci.datastax.com/job/ifesdjeen-12461-trunk-testall/]||[12461-3.0|https://github.com/ifesdjeen/cassandra/tree/12461-3.0]|[dtest|http://cassci.datastax.com/job/ifesdjeen-12461-3.0-dtest/]|[utest|http://cassci.datastax.com/job/ifesdjeen-12461-3.0-testall/]|

> Add hooks to StorageService shutdown
> 
>
> Key: CASSANDRA-12461
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12461
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Anthony Cozzie
>Assignee: Anthony Cozzie
> Fix For: 3.x
>
> Attachments: 
> 0001-CASSANDRA-12461-add-C-support-for-shutdown-runnables.patch
>
>
> The JVM will usually run shutdown hooks in parallel.  This can lead to 
> synchronization problems between Cassandra, services that depend on it, and 
> services it depends on.  This patch adds some simple support for shutdown 
> hooks to StorageService.
> This should nearly solve CASSANDRA-12011
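
As an illustration of the idea in the description above, here is a minimal sketch of running shutdown work sequentially from a single JVM shutdown hook, so that the order is controlled by the application and not by the JVM. The class and method names are hypothetical and are not taken from the attached patch; only Runtime.addShutdownHook is a real JDK call.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical registry; not the API added by the CASSANDRA-12461 patch. */
final class OrderedShutdownHooks {
    private final List<Runnable> hooks = new CopyOnWriteArrayList<>();

    /** Hooks run in registration order. */
    void register(Runnable hook) {
        hooks.add(hook);
    }

    /** Runs every hook on the calling thread; a failing hook does not stop the rest. */
    void runAll() {
        for (Runnable hook : hooks) {
            try {
                hook.run();
            } catch (RuntimeException e) {
                System.err.println("Shutdown hook failed: " + e);
            }
        }
    }

    /** Installs one JVM shutdown hook that drives all registered hooks in sequence. */
    void install() {
        Runtime.getRuntime().addShutdownHook(new Thread(this::runAll, "ordered-shutdown"));
    }
}
```

Because the JVM sees only one hook thread, services that depend on each other can be stopped in a known order, for example by registering the dependent service first and the service it relies on afterwards.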





[jira] [Comment Edited] (CASSANDRA-12461) Add hooks to StorageService shutdown

2016-09-29 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15531998#comment-15531998
 ] 

Stefania edited comment on CASSANDRA-12461 at 9/29/16 7:38 AM:
---

+1 for committing the full patch to 3.0 since it fixes a number of issues, 
including things like CASSANDRA-12397.

I'll do another quick round tomorrow, and if CI results are clean, we can 
commit.


was (Author: stefania):
+1 for committing the full patch to 3.0 since it fixes a number of issues, 
including making problems like CASSANDRA-12397 much harder to reproduce.

I'll do another quick round tomorrow, and if CI results are clean, we can 
commit.

> Add hooks to StorageService shutdown
> 
>
> Key: CASSANDRA-12461
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12461
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Anthony Cozzie
>Assignee: Anthony Cozzie
> Fix For: 3.x
>
> Attachments: 
> 0001-CASSANDRA-12461-add-C-support-for-shutdown-runnables.patch
>
>
> The JVM will usually run shutdown hooks in parallel.  This can lead to 
> synchronization problems between Cassandra, services that depend on it, and 
> services it depends on.  This patch adds some simple support for shutdown 
> hooks to StorageService.
> This should nearly solve CASSANDRA-12011





[jira] [Commented] (CASSANDRA-12715) Fix exceptions with the new vnode allocation.

2016-09-29 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15532039#comment-15532039
 ] 

Branimir Lambov commented on CASSANDRA-12715:
-

Does this test fail if the patch isn't applied?

> Fix exceptions with the new vnode allocation.
> -
>
> Key: CASSANDRA-12715
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12715
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Dikang Gu
>Assignee: Dikang Gu
> Fix For: 3.0.x, 3.x
>
>
> Problem: see exceptions when bootstrapping nodes using the new vnode 
> allocation algorithm. I'm able to reproduce it in trunk as well:
> {code}
> INFO  [main] 2016-09-26 15:36:54,978 StorageService.java:1437 - JOINING: 
> calculation complete, ready to bootstrap
> INFO  [main] 2016-09-26 15:36:54,978 StorageService.java:1437 - JOINING: 
> getting bootstrap token
> ERROR [main] 2016-09-26 15:36:54,989 CassandraDaemon.java:752 - Exception 
> encountered during startup
> java.lang.AssertionError: null
> at 
> org.apache.cassandra.locator.TokenMetadata.getTopology(TokenMetadata.java:1209)
>  ~[main/:na]
> at 
> org.apache.cassandra.dht.tokenallocator.TokenAllocation.getStrategy(TokenAllocation.java:201)
>  ~[main/:na]
> at 
> org.apache.cassandra.dht.tokenallocator.TokenAllocation.getStrategy(TokenAllocation.java:164)
>  ~[main/:na]
> at 
> org.apache.cassandra.dht.tokenallocator.TokenAllocation.allocateTokens(TokenAllocation.java:54)
>  ~[main/:na]
> at 
> org.apache.cassandra.dht.BootStrapper.allocateTokens(BootStrapper.java:207) 
> ~[main/:na]
> at 
> org.apache.cassandra.dht.BootStrapper.getBootstrapTokens(BootStrapper.java:174)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:929)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:697)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:582)
>  ~[main/:na]
> at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:392) 
> [main/:na]
> at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:601)
>  [main/:na]
> at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:735) 
> [main/:na]
> {code}





[jira] [Commented] (CASSANDRA-12461) Add hooks to StorageService shutdown

2016-09-29 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15531998#comment-15531998
 ] 

Stefania commented on CASSANDRA-12461:
--

+1 for committing the full patch to 3.0 since it fixes a number of issues, 
including making problems like CASSANDRA-12397 much harder to reproduce.

I'll do another quick round tomorrow, and if CI results are clean, we can 
commit.

> Add hooks to StorageService shutdown
> 
>
> Key: CASSANDRA-12461
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12461
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Anthony Cozzie
>Assignee: Anthony Cozzie
> Fix For: 3.x
>
> Attachments: 
> 0001-CASSANDRA-12461-add-C-support-for-shutdown-runnables.patch
>
>
> The JVM will usually run shutdown hooks in parallel.  This can lead to 
> synchronization problems between Cassandra, services that depend on it, and 
> services it depends on.  This patch adds some simple support for shutdown 
> hooks to StorageService.
> This should nearly solve CASSANDRA-12011





[jira] [Commented] (CASSANDRA-12461) Add hooks to StorageService shutdown

2016-09-29 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15531974#comment-15531974
 ] 

Alex Petrov commented on CASSANDRA-12461:
-

+1 on changes. 

The initial patch was marked as a "bug"; I've re-labeled it as an 
"improvement". But since we have fixed several issues with the drain process 
([CASSANDRA-12509], for example), it might be good to have it in 3.0? Or should I 
remove the hooks and leave only the bugfix in the 3.0 patch (that would be a 
trivial change, too)? I just don't want to create confusion.

The backport to 3.0 was trivial, but we can skip committing it if needed. CI 
has been triggered for both branches (although your latest branch was only 
squashed, without any changes on top).

|[12461-trunk|https://github.com/ifesdjeen/cassandra/tree/12461-trunk]|[dtest|http://cassci.datastax.com/job/ifesdjeen-12461-trunk-dtest/]|[utest|http://cassci.datastax.com/job/ifesdjeen-12461-trunk-testall/]|
|[12461-trunk|https://github.com/ifesdjeen/cassandra/tree/12461-trunk]|[3.0|http://cassci.datastax.com/job/ifesdjeen-12461-trunk-3.0/]|[utest|http://cassci.datastax.com/job/ifesdjeen-12461-trunk-testall/]||[12461-3.0|https://github.com/ifesdjeen/cassandra/tree/12461-3.0]|[dtest|http://cassci.datastax.com/job/ifesdjeen-12461-3.0-dtest/]|[utest|http://cassci.datastax.com/job/ifesdjeen-12461-3.0-testall/]|

> Add hooks to StorageService shutdown
> 
>
> Key: CASSANDRA-12461
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12461
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Anthony Cozzie
>Assignee: Anthony Cozzie
> Fix For: 3.x
>
> Attachments: 
> 0001-CASSANDRA-12461-add-C-support-for-shutdown-runnables.patch
>
>
> The JVM will usually run shutdown hooks in parallel.  This can lead to 
> synchronization problems between Cassandra, services that depend on it, and 
> services it depends on.  This patch adds some simple support for shutdown 
> hooks to StorageService.
> This should nearly solve CASSANDRA-12011





[jira] [Created] (CASSANDRA-12730) Thousands of empty SSTables created during repair - TMOF death

2016-09-29 Thread Benjamin Roth (JIRA)
Benjamin Roth created CASSANDRA-12730:
-

 Summary: Thousands of empty SSTables created during repair - TMOF 
death
 Key: CASSANDRA-12730
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12730
 Project: Cassandra
  Issue Type: Bug
  Components: Local Write-Read Paths
Reporter: Benjamin Roth
Priority: Critical


Last night I ran a repair on a keyspace with 7 tables and 4 MVs, each containing 
a few hundred million records. After a few hours a node died because of "too 
many open files".
Normally one would just raise the limit, but we had already set it to 100k. The 
problem was that the repair created somewhat more than 100k SSTables for a certain 
MV. The strange thing is that these SSTables had almost no data (like 53 bytes, 
90 bytes, ...). Some of them (<5%) had a few 100 KB, and very few (<1%) had normal 
sizes (>= a few MB). I could understand that SSTables queue up as they are 
flushed and not compacted in time, but then they should have at least a few MB 
(depending on config and available memory), right?
Of course the node then runs out of FDs, and I guess it is not a good idea to 
raise the limit even higher, as I expect that this would just create even more 
empty SSTables before the node finally dies.

Only 1 CF (MV) was affected. All other CFs (also MVs) behaved sanely. The empty 
SSTables were created evenly over time, 100-150 every minute. Among the 
empty SSTables there were also SSTables that looked normal, with a few MB.
I didn't see any errors or exceptions in the logs until the TMOF occurred, just tons 
of streams due to the repair (which I actually run via cs-reaper as subrange, 
full repairs).
After I restarted that node (with no more repair running), the number of 
SSTables went down again as they were slowly compacted away.

According to [~zznate] this issue may relate to CASSANDRA-10342 + CASSANDRA-8641
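
For anyone trying to confirm the same symptom, here is a small sketch that counts near-empty SSTable data files in one table directory. It assumes the data component of an SSTable is a file whose name ends in "-Data.db" and that the caller passes the table's data directory; the class name and the size threshold are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

/** Hypothetical diagnostic helper; not part of Cassandra. */
final class SmallSSTableCounter {
    /** Counts files named *-Data.db in tableDir whose size is at most maxBytes. */
    static long countSmallDataFiles(Path tableDir, long maxBytes) throws IOException {
        try (Stream<Path> files = Files.list(tableDir)) {
            return files
                .filter(p -> p.getFileName().toString().endsWith("-Data.db"))
                .filter(p -> {
                    try {
                        return Files.size(p) <= maxBytes;
                    } catch (IOException e) {
                        return false; // file may have been compacted away in the meantime
                    }
                })
                .count();
        }
    }
}
```

Running this periodically during a repair with a threshold of, say, 1024 bytes would show whether the count grows at the rate described above (100-150 per minute).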




