[jira] [Updated] (CASSANDRA-12035) Structure for tpstats output (JSON, YAML)

2016-06-20 Thread Shogo Hoshii (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shogo Hoshii updated CASSANDRA-12035:
-
Assignee: (was: Shogo Hoshii)

> Structure for tpstats output (JSON, YAML)
> -
>
> Key: CASSANDRA-12035
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12035
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Hiroyuki Nishi
>Priority: Minor
> Attachments: CASSANDRA-12035-trunk.patch, 
> tablestats_sample_result.json, tablestats_sample_result.txt, 
> tablestats_sample_result.yaml, tpstats_sample_result.json, 
> tpstats_sample_result.txt, tpstats_sample_result.yaml
>
>
> In CASSANDRA-5977, extra output formats such as JSON and YAML were added for 
> nodetool tablestats. 
> Similarly, I would like to add these output formats to nodetool tpstats.
> I have also refactored tablestats' output-format code so that the existing 
> code and the new code share a single implementation.
> Please review the attached patch.
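As an aside, the refactoring described above suggests a shared printer
abstraction between tablestats and tpstats. Below is a minimal sketch of that
idea; all names (StatsHolder, StatsPrinter, forFormat) are illustrative and
not necessarily those used in the attached patch:

{code}
import java.io.PrintStream;
import java.util.Map;

// Sketch of a shared printer abstraction: each command flattens its stats
// into a map, and format-specific printers render that map.
interface StatsHolder
{
    Map<String, Object> toMap();
}

@FunctionalInterface
interface StatsPrinter
{
    void print(StatsHolder data, PrintStream out);

    // Chosen from a -F/--format style nodetool option.
    static StatsPrinter forFormat(String format)
    {
        switch (format)
        {
            case "json": return (d, out) -> out.println(Sketch.json(d.toMap()));
            case "yaml": return (d, out) -> out.println(Sketch.yaml(d.toMap()));
            default:     return (d, out) -> out.println(d.toMap());
        }
    }
}

final class Sketch
{
    // Stand-ins; a real patch would delegate to a JSON/YAML library here.
    static String json(Map<String, Object> m) { return m.toString(); }
    static String yaml(Map<String, Object> m) { return m.toString(); }
}
{code}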





[jira] [Assigned] (CASSANDRA-12040) If a level compaction fails due to no space it should schedule the next one

2016-06-20 Thread sankalp kohli (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sankalp kohli reassigned CASSANDRA-12040:
-

Assignee: sankalp kohli

>   If a level compaction fails due to no space it should schedule the next one
> -
>
> Key: CASSANDRA-12040
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12040
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: sankalp kohli
>Assignee: sankalp kohli
>Priority: Minor
>
> If a level compaction fails the space check, it aborts, but the next time 
> compactions are scheduled it will attempt the same one. It should instead 
> skip it and move on to the next candidate, so it can find smaller compactions 
> to do.
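Schematically, the suggested behaviour might look like the following; the
names here (CompactionPicker, Candidate) are invented for illustration and
are not the actual compaction internals:

{code}
import java.util.List;

// Sketch: instead of aborting when the largest candidate does not fit on
// disk, skip it and try progressively smaller candidates.
final class CompactionPicker
{
    interface Candidate
    {
        long expectedWriteSize(); // estimated bytes the compaction will write
        void run();
    }

    // Candidates ordered largest-first, as level compaction would choose them.
    static boolean runNextThatFits(List<Candidate> candidates, long freeBytes)
    {
        for (Candidate c : candidates)
        {
            if (c.expectedWriteSize() > freeBytes)
                continue; // old behaviour: abort here and retry the same one later
            c.run();
            return true;
        }
        return false; // nothing fits; the caller can log and try again later
    }
}
{code}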





[jira] [Commented] (CASSANDRA-12043) Syncing most recent commit in CAS across replicas can cause all CAS queries in the CQL partition to fail

2016-06-20 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340972#comment-15340972
 ] 

sankalp kohli commented on CASSANDRA-12043:
---

cc [~slebresne] 

> Syncing most recent commit in CAS across replicas can cause all CAS queries 
> in the CQL partition to fail
> 
>
> Key: CASSANDRA-12043
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12043
> Project: Cassandra
>  Issue Type: Bug
>Reporter: sankalp kohli
>
> We update the most recent commit on requiredParticipant replicas if they are 
> out of sync during the prepare round in the beginAndRepairPaxos method. We 
> keep doing this in a loop until the requiredParticipant replicas have the 
> same most recent commit or we hit the timeout. 
> Say we have 3 machines A, B and C, and gc grace on the table is 10 days. We 
> do a CAS write at time 0 and it goes to A and B but not to C. C will get the 
> hint later but will not update the most recent commit in the paxos table. 
> This is how CAS hints work. 
> In the paxos table, whose gc_grace=0, most_recent_commit on A and B will be 
> inserted with timestamp 0 and a TTL of 10 days. After 10 days, this insert 
> becomes a tombstone at time 0 until it is compacted away, since gc_grace=0.
> Do a CAS read after, say, 1 day on the same CQL partition, and this time the 
> prepare phase involves A and C. most_recent_commit on C for this CQL 
> partition is empty. A sends the most_recent_commit to C with a timestamp of 0 
> and a TTL of 10 days. This most_recent_commit on C will expire on the 11th 
> day, since it was inserted one day later. 
> most_recent_commit is now in sync on A, B and C; however, on A and B it will 
> expire on the 10th day, whereas on C it will expire on the 11th day, since it 
> was inserted one day later. 
> Do another CAS read after 10 days, when most_recent_commit on A and B has 
> expired and is treated as a tombstone until compacted. In this CAS read, say 
> A and C are involved in the prepare phase. most_recent_commit will not match 
> between them, since it has expired on A and is still live on C. This will 
> cause most_recent_commit to be applied to A with a timestamp of 0 and a TTL 
> of 10 days. If A has not compacted away the original most_recent_commit, 
> which has expired, this new write to most_recent_commit won't be visible on 
> reads, since there is a tombstone with the same timestamp (a delete wins over 
> data with the same timestamp). 
> Another round of prepare will follow, and again A will say it does not know 
> about the most recent write (it is covered by the original write, which is 
> now a tombstone), and C will again try to send the write to A. This can keep 
> going until the request times out or until only A and B are involved in the 
> prepare phase. 
> When A's original most_recent_commit, which is now a tombstone, is compacted, 
> all the inserts it was covering will become live. These will in turn be 
> played to another replica. This ping-pong can keep going for a long time. 
> The issue is that most_recent_commit expires at different times across 
> replicas. When it is replayed to a replica to bring it in sync, we restart 
> the TTL from that point. 
> During the CAS read that timed out, most_recent_commit was being sent to 
> another replica in a loop. Even successful requests may loop a couple of 
> times when A and C are involved, succeeding only once the responding replicas 
> happen to be A and B. So this has an impact on latencies as well. 
> These timeouts get worse when a machine is down, as no progress can be made: 
> the machine with the unexpired commit is always involved in the CAS prepare 
> round. Also, with range movements, the new machine gaining the range has an 
> empty most recent commit and receives the commit at a later time, causing the 
> same issue. 
> Repro steps:
> 1. The Paxos TTL is max(3 hours, gc_grace) as defined in 
> SystemKeyspace.paxosTtl(). Change this method so it does not impose a minimum 
> TTL of 3 hours: SystemKeyspace.paxosTtl() should return 
> metadata.getGcGraceSeconds(); instead of return Math.max(3 * 3600, 
> metadata.getGcGraceSeconds());
> We do this so that we don't need to wait for 3 hours. 
> Create a 3-node cluster, with machines A, B and C, using the code change 
> suggested above.
> CREATE KEYSPACE test WITH REPLICATION = { 'class' : 'SimpleStrategy', 
> 'replication_factor' : 3 };
> use test;
> CREATE TABLE users (a int PRIMARY KEY, b int);
> alter table users WITH gc_grace_seconds=120;
> consistency QUORUM;
> bring down machine C
> INSERT INTO users (a, b) VALUES (1, 1) IF NOT EXISTS;
> Nodetool flush on machines A and B
> Bring down machine B
> consistency SERIAL;
> tracing on;
> wait 80 seconds
> Bring up machine C
> select * 

[jira] [Created] (CASSANDRA-12043) Syncing most recent commit in CAS across replicas can cause all CAS queries in the CQL partition to fail

2016-06-20 Thread sankalp kohli (JIRA)
sankalp kohli created CASSANDRA-12043:
-

 Summary: Syncing most recent commit in CAS across replicas can 
cause all CAS queries in the CQL partition to fail
 Key: CASSANDRA-12043
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12043
 Project: Cassandra
  Issue Type: Bug
Reporter: sankalp kohli


We update the most recent commit on requiredParticipant replicas if they are 
out of sync during the prepare round in the beginAndRepairPaxos method. We keep 
doing this in a loop until the requiredParticipant replicas have the same most 
recent commit or we hit the timeout. 

Say we have 3 machines A, B and C, and gc grace on the table is 10 days. We do 
a CAS write at time 0 and it goes to A and B but not to C. C will get the hint 
later but will not update the most recent commit in the paxos table. This is 
how CAS hints work. 
In the paxos table, whose gc_grace=0, most_recent_commit on A and B will be 
inserted with timestamp 0 and a TTL of 10 days. After 10 days, this insert 
becomes a tombstone at time 0 until it is compacted away, since gc_grace=0.

Do a CAS read after, say, 1 day on the same CQL partition, and this time the 
prepare phase involves A and C. most_recent_commit on C for this CQL partition 
is empty. A sends the most_recent_commit to C with a timestamp of 0 and a TTL 
of 10 days. This most_recent_commit on C will expire on the 11th day, since it 
was inserted one day later. 

most_recent_commit is now in sync on A, B and C; however, on A and B it will 
expire on the 10th day, whereas on C it will expire on the 11th day, since it 
was inserted one day later. 

Do another CAS read after 10 days, when most_recent_commit on A and B has 
expired and is treated as a tombstone until compacted. In this CAS read, say A 
and C are involved in the prepare phase. most_recent_commit will not match 
between them, since it has expired on A and is still live on C. This will cause 
most_recent_commit to be applied to A with a timestamp of 0 and a TTL of 10 
days. If A has not compacted away the original most_recent_commit, which has 
expired, this new write to most_recent_commit won't be visible on reads, since 
there is a tombstone with the same timestamp (a delete wins over data with the 
same timestamp). 

Another round of prepare will follow, and again A will say it does not know 
about the most recent write (it is covered by the original write, which is now 
a tombstone), and C will again try to send the write to A. This can keep going 
until the request times out or until only A and B are involved in the prepare 
phase. 

When A's original most_recent_commit, which is now a tombstone, is compacted, 
all the inserts it was covering will become live. These will in turn be played 
to another replica. This ping-pong can keep going for a long time. 

The issue is that most_recent_commit expires at different times across 
replicas. When it is replayed to a replica to bring it in sync, we restart the 
TTL from that point. 
During the CAS read that timed out, most_recent_commit was being sent to 
another replica in a loop. Even successful requests may loop a couple of times 
when A and C are involved, succeeding only once the responding replicas happen 
to be A and B. So this has an impact on latencies as well. 

These timeouts get worse when a machine is down, as no progress can be made: 
the machine with the unexpired commit is always involved in the CAS prepare 
round. Also, with range movements, the new machine gaining the range has an 
empty most recent commit and receives the commit at a later time, causing the 
same issue. 

Repro steps:
1. The Paxos TTL is max(3 hours, gc_grace) as defined in 
SystemKeyspace.paxosTtl(). Change this method so it does not impose a minimum 
TTL of 3 hours: SystemKeyspace.paxosTtl() should return 
metadata.getGcGraceSeconds(); instead of return Math.max(3 * 3600, 
metadata.getGcGraceSeconds());
We do this so that we don't need to wait for 3 hours. 
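Spelled out, the repro change to SystemKeyspace.paxosTtl() described in step 1
would look roughly like this (surrounding signature approximate):

{code}
public static int paxosTtl(CFMetaData metadata)
{
    // Original: enforce a 3 hour floor on the paxos TTL.
    // return Math.max(3 * 3600, metadata.getGcGraceSeconds());

    // Repro-only change: use gc_grace_seconds directly, so a table with
    // gc_grace_seconds=120 gives a 120 second paxos TTL.
    return metadata.getGcGraceSeconds();
}
{code}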

Create a 3-node cluster, with machines A, B and C, using the code change 
suggested above.
CREATE KEYSPACE test WITH REPLICATION = { 'class' : 'SimpleStrategy', 
'replication_factor' : 3 };
use test;
CREATE TABLE users (a int PRIMARY KEY, b int);
alter table users WITH gc_grace_seconds=120;
consistency QUORUM;
bring down machine C
INSERT INTO users (a, b) VALUES (1, 1) IF NOT EXISTS;
Nodetool flush on machines A and B
Bring down machine B
consistency SERIAL;
tracing on;
wait 80 seconds
Bring up machine C
select * from users where a = 1;
Wait 40 seconds 
select * from users where a = 1;  // All queries from this point forward will 
time out. 

One potential fix could be to set the TTL based on the remaining time left on 
the other replicas, i.e. the original TTL minus the age of the write. The 
write's timestamp is calculated from the ballot, which uses server time.
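A back-of-envelope sketch of that proposed fix, with invented names: the
replayed commit's TTL is shortened by the commit's age, derived from the
ballot timestamp.

{code}
import java.util.concurrent.TimeUnit;

// Sketch of "TTL = original TTL minus the age of the write": when replaying
// a most_recent_commit to an out-of-sync replica, don't restart the TTL clock.
final class PaxosCommitReplay
{
    static int replayTtlSeconds(int paxosTtlSeconds, long ballotTimestampMicros, long nowMicros)
    {
        long ageSeconds = TimeUnit.MICROSECONDS.toSeconds(nowMicros - ballotTimestampMicros);
        // Clamp at 1s so the commit is still applied, but expires roughly in
        // step with the copies already on the other replicas.
        return (int) Math.max(1, paxosTtlSeconds - ageSeconds);
    }
}
{code}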





[jira] [Assigned] (CASSANDRA-12035) Structure for tpstats output (JSON, YAML)

2016-06-20 Thread Shogo Hoshii (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shogo Hoshii reassigned CASSANDRA-12035:


Assignee: Shogo Hoshii

> Structure for tpstats output (JSON, YAML)
> -
>
> Key: CASSANDRA-12035
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12035
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Hiroyuki Nishi
>Assignee: Shogo Hoshii
>Priority: Minor
> Attachments: CASSANDRA-12035-trunk.patch, 
> tablestats_sample_result.json, tablestats_sample_result.txt, 
> tablestats_sample_result.yaml, tpstats_sample_result.json, 
> tpstats_sample_result.txt, tpstats_sample_result.yaml
>
>
> In CASSANDRA-5977, extra output formats such as JSON and YAML were added for 
> nodetool tablestats. 
> Similarly, I would like to add these output formats to nodetool tpstats.
> I have also refactored tablestats' output-format code so that the existing 
> code and the new code share a single implementation.
> Please review the attached patch.





[jira] [Commented] (CASSANDRA-11611) dtest failure in topology_test.TestTopology.crash_during_decommission_test

2016-06-20 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340882#comment-15340882
 ] 

Philip Thompson commented on CASSANDRA-11611:
-

This has begun happening on Linux as well, FYI.

> dtest failure in topology_test.TestTopology.crash_during_decommission_test
> --
>
> Key: CASSANDRA-11611
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11611
> Project: Cassandra
>  Issue Type: Test
>Reporter: Jim Witschey
>Assignee: DS Test Eng
>  Labels: dtest, windows
>
> Looks like some kind of streaming error. Example failure:
> http://cassci.datastax.com/job/trunk_dtest_win32/382/testReport/topology_test/TestTopology/crash_during_decommission_test
> Failed on CassCI build trunk_dtest_win32 #382
> {code}
> Error Message
> Unexpected error in log, see stdout
>  >> begin captured logging << 
> dtest: DEBUG: cluster ccm directory: d:\temp\dtest-ce_wos
> dtest: DEBUG: Custom init_config not found. Setting defaults.
> dtest: DEBUG: Done setting configuration options:
> {   'initial_token': None,
> 'num_tokens': '32',
> 'phi_convict_threshold': 5,
> 'range_request_timeout_in_ms': 1,
> 'read_request_timeout_in_ms': 1,
> 'request_timeout_in_ms': 1,
> 'truncate_request_timeout_in_ms': 1,
> 'write_request_timeout_in_ms': 1}
> dtest: DEBUG: Status as reported by node 127.0.0.2
> dtest: DEBUG: Datacenter: datacenter1
> 
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  AddressLoad   Tokens   Owns (effective)  Host ID  
>  Rack
> UN  127.0.0.1  98.73 KiB  32   78.4% 
> b8c55c71-bf3d-462b-8c17-3c88d7ac2284  rack1
> UN  127.0.0.2  162.38 KiB  32   65.9% 
> 71aacf1d-8e2f-44cf-b354-f10c71313ec6  rack1
> UN  127.0.0.3  98.71 KiB  32   55.7% 
> 3a4529a3-dc7f-445c-aec3-94417c920fdf  rack1
> dtest: DEBUG: Restarting node2
> dtest: DEBUG: Status as reported by node 127.0.0.2
> dtest: DEBUG: Datacenter: datacenter1
> 
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  AddressLoad   Tokens   Owns (effective)  Host ID  
>  Rack
> UL  127.0.0.1  98.73 KiB  32   78.4% 
> b8c55c71-bf3d-462b-8c17-3c88d7ac2284  rack1
> UN  127.0.0.2  222.26 KiB  32   65.9% 
> 71aacf1d-8e2f-44cf-b354-f10c71313ec6  rack1
> UN  127.0.0.3  98.71 KiB  32   55.7% 
> 3a4529a3-dc7f-445c-aec3-94417c920fdf  rack1
> dtest: DEBUG: Restarting node2
> dtest: DEBUG: Status as reported by node 127.0.0.2
> dtest: DEBUG: Datacenter: datacenter1
> 
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  AddressLoad   Tokens   Owns (effective)  Host ID  
>  Rack
> UL  127.0.0.1  174.2 KiB  32   78.4% 
> b8c55c71-bf3d-462b-8c17-3c88d7ac2284  rack1
> UN  127.0.0.2  336.69 KiB  32   65.9% 
> 71aacf1d-8e2f-44cf-b354-f10c71313ec6  rack1
> UN  127.0.0.3  116.7 KiB  32   55.7% 
> 3a4529a3-dc7f-445c-aec3-94417c920fdf  rack1
> dtest: DEBUG: Restarting node2
> dtest: DEBUG: Status as reported by node 127.0.0.2
> dtest: DEBUG: Datacenter: datacenter1
> 
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  AddressLoad   Tokens   Owns (effective)  Host ID  
>  Rack
> UL  127.0.0.1  174.2 KiB  32   78.4% 
> b8c55c71-bf3d-462b-8c17-3c88d7ac2284  rack1
> UN  127.0.0.2  360.82 KiB  32   65.9% 
> 71aacf1d-8e2f-44cf-b354-f10c71313ec6  rack1
> UN  127.0.0.3  116.7 KiB  32   55.7% 
> 3a4529a3-dc7f-445c-aec3-94417c920fdf  rack1
> dtest: DEBUG: Restarting node2
> dtest: DEBUG: Status as reported by node 127.0.0.2
> dtest: DEBUG: Datacenter: datacenter1
> 
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  AddressLoad   Tokens   Owns (effective)  Host ID  
>  Rack
> UL  127.0.0.1  174.2 KiB  32   78.4% 
> b8c55c71-bf3d-462b-8c17-3c88d7ac2284  rack1
> UN  127.0.0.2  240.54 KiB  32   65.9% 
> 71aacf1d-8e2f-44cf-b354-f10c71313ec6  rack1
> UN  127.0.0.3  116.7 KiB  32   55.7% 
> 3a4529a3-dc7f-445c-aec3-94417c920fdf  rack1
> dtest: DEBUG: Restarting node2
> dtest: DEBUG: Decommission failed with exception: Nodetool command 
> 'D:\jenkins\workspace\trunk_dtest_win32\cassandra\bin\nodetool.bat -h 
> localhost -p 7100 decommission' failed; exit status: 2; stderr: error: Stream 
> 

[jira] [Created] (CASSANDRA-12042) Decouple messaging protocol versioning from individual message types

2016-06-20 Thread Aleksey Yeschenko (JIRA)
Aleksey Yeschenko created CASSANDRA-12042:
-

 Summary: Decouple messaging protocol versioning from individual 
message types
 Key: CASSANDRA-12042
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12042
 Project: Cassandra
  Issue Type: Improvement
  Components: Streaming and Messaging
Reporter: Aleksey Yeschenko
Priority: Blocker
 Fix For: 4.0


At the moment we have a single constant, {{MessagingService.current_version}}, 
defining the serialization format for *everything*, including every possible 
message type.

In practice this means that even the tiniest change to any message requires 
bumping the global {{MessagingService}} version.

This is problematic for several reasons, the primary one currently being the 
schema propagation barrier between differently versioned C* nodes. In the 
tick-tock world, any change (say, to a read command message) would require a 
messaging service bump, putting nodes on split versions of the service and 
making schema changes impossible during what is now considered a minor 
upgrade, which is not neat.

I propose that starting with 4.0 we version all messages individually instead, 
and separately version the messaging service protocol itself - which will 
basically amount to just framing once CASSANDRA-8457 is completed.

In practice, this might be implemented the following way:

# We use an extra byte with each message to specify the version of that 
particular message type's encoding
# Instead of relying on the messaging service version of the sending node 
(determining which can be racy, especially during upgrades), we use that byte 
to determine the version of the message during deserialisation
# On the sending side, we can use the gossipped version of Cassandra itself - 
not the messaging service version - to determine the maximum supported message 
type version of the destination node

In the new world, I expect the framing protocol version to change very rarely 
after 4.0, if ever, and most message types to change extremely rarely as well, 
with schema, read, and write messages changing version most often.
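A rough sketch of the three points above, with invented names (illustrative
only, not a committed design):

{code}
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Sketch: version the payload per message type rather than globally.
final class VersionedMessageCodec
{
    interface Serializer<T>
    {
        void serialize(T msg, DataOutputStream out, int version) throws IOException;
        T deserialize(DataInputStream in, int version) throws IOException;
        int currentVersion();
    }

    static <T> void write(Serializer<T> s, T msg, DataOutputStream out,
                          int maxVersionPeerSupports) throws IOException
    {
        // Point 3: cap by the destination's gossipped Cassandra version.
        int version = Math.min(s.currentVersion(), maxVersionPeerSupports);
        out.writeByte(version);            // point 1: one extra byte per message
        s.serialize(msg, out, version);
    }

    static <T> T read(Serializer<T> s, DataInputStream in) throws IOException
    {
        int version = in.readUnsignedByte(); // point 2: no guessing from connection state
        return s.deserialize(in, version);
    }
}
{code}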





[jira] [Commented] (CASSANDRA-10202) simplify CommitLogSegmentManager

2016-06-20 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340800#comment-15340800
 ] 

Ariel Weisberg commented on CASSANDRA-10202:


bq. We also ideally want lock-free swapping-in of the new segment, no? 
Currently we don't have it, but until we reach pure-TPC (probably never) en 
route fewer application threads exposes us to a higher risk of gumming up the 
system.
This is the part I don't follow. It's an operation that occurs every N writes, 
when the segment needs swapping, and the operation itself is generally going 
to be fast because the segment will already exist, so it shouldn't have to 
wait on anything or even make a system call.

Most allocations will only have to do a volatile read to fetch the current 
segment and an atomic increment on the segment's offset value, or whatever it's 
labeled inside each segment, and only if that fails or there is no space will 
it have to lock and check the condition.
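For reference, the fast path being described (a volatile read plus an atomic
increment, locking only when the segment is exhausted) looks roughly like
this; the names are illustrative, not the actual CommitLogSegmentManager code:

{code}
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the fast path described above: volatile read of the current
// segment + atomic bump of its offset; lock only when the segment is full.
final class SegmentAllocator
{
    static final class Segment
    {
        final int size;
        final AtomicInteger offset = new AtomicInteger();
        Segment(int size) { this.size = size; }
    }

    private volatile Segment current = new Segment(1 << 25); // 32 MiB
    private final Object swapLock = new Object();

    int allocate(int bytes)
    {
        while (true)
        {
            Segment seg = current;                     // volatile read
            int start = seg.offset.getAndAdd(bytes);   // atomic increment
            if (start + bytes <= seg.size)
                return start;                          // fast path: no lock
            synchronized (swapLock)                    // slow path: every N writes
            {
                if (current == seg)                    // another thread may have swapped already
                    current = new Segment(seg.size);   // segment usually pre-created, so fast
            }
        }
    }
}
{code}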

TPC doesn't mean you have to religiously remove every single lock in the 
system, just the ones that are frequently contended or held for a long time - 
in other words, anything that will cause stalls, whether a few long ones or 
many small ones. It will be a while before we get whacked by Amdahl's law in 
this case.

I know you already know this; I am just stating the reasoning I think should 
be applied in this case. Were the bugs related to concurrency, or were they 
just incorrect handling of the spare segments and allocation in general? It 
was always pretty finicky looking.

> simplify CommitLogSegmentManager
> 
>
> Key: CASSANDRA-10202
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10202
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Jonathan Ellis
>Assignee: Branimir Lambov
>Priority: Minor
>
> Now that we only keep one active segment around we can simplify this from the 
> old recycling design.





[jira] [Comment Edited] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-06-20 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15337288#comment-15337288
 ] 

Paulo Motta edited comment on CASSANDRA-8523 at 6/21/16 12:30 AM:
--

Due to the limitations of forwarding writes to replacement nodes with the same 
IP, I propose initially adding this support only to replacement nodes with a 
different IP, since it's much simpler and we can do it in a backward-compatible 
way, so it can probably go into 2.2+.

After CASSANDRA-11559, we can extend this support to nodes with the same IP 
quite easily by setting an inactive flag on nodes being replaced and ignoring 
those nodes on reads.

The central idea is:
{quote}
* Add a new non-dead gossip state for replace, BOOT_REPLACE
* When receiving BOOT_REPLACE, other nodes add the replacing node as a 
bootstrapping endpoint
* Pending ranges are calculated, and writes are sent to the replacing node 
during the replace
* When the replacing node changes state to NORMAL, the old node is removed and 
the new node becomes a natural endpoint in TokenMetadata
{quote}
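In effect, the second and third bullets reuse the existing pending-range
machinery: while the replacement is in the new state it receives writes in
addition to the natural endpoints, but is not read from. Schematically
(hypothetical names, not the real TokenMetadata API):

{code}
import java.net.InetAddress;
import java.util.ArrayList;
import java.util.List;

// Sketch of the write-forwarding idea: a node in BOOT_REPLACE contributes
// pending endpoints, so writes reach it while it is still streaming.
final class WriteEndpoints
{
    static List<InetAddress> forWrite(List<InetAddress> naturalEndpoints,
                                      List<InetAddress> pendingEndpoints /* includes BOOT_REPLACE nodes */)
    {
        List<InetAddress> targets = new ArrayList<>(naturalEndpoints);
        targets.addAll(pendingEndpoints); // pending nodes get writes but are not read from
        return targets;
    }
}
{code}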

Since it's no longer necessary to forward hints to the replacement node when 
{{replace_address != broadcast_address}}, the replacement node does not need 
to inherit the same ID as the original node.

The replacing process remains unchanged when the replacement node has the same 
IP as the original node. If that's the case, I added a warn message so users 
know they need to run repair if the node is down for longer than 
{{max_hint_window_in_ms}}:
{noformat}
Writes will not be redirected to this node while it is performing replace 
because it has the same address as the node to be replaced ({}). 
If that node has been down for longer than max_hint_window_in_ms, repair must 
be run after the replacement process in order to make this node consistent.
{noformat}

I adapted the current dtests to test replace_address for both the old and the 
new path, and, when {{replace_address != broadcast_address}}, to make sure 
writes are redirected to the replacement node.

Initial patch and tests below (will provide 2.2+ patches after initial review):
||2.2||dtest||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:2.2-8523]|[branch|https://github.com/riptano/cassandra-dtest/compare/master...pauloricardomg:8523]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-8523-testall/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-8523-dtest/lastCompletedBuild/testReport/]|


was (Author: pauloricardomg):
Due to the limitations of forwarding writes to replacement nodes with the same 
IP, I propose initially adding this support only to replacement nodes with a 
different IP, since it's much simpler and we can do it in a backward-compatible 
way, so it can probably go into 2.2+.

After CASSANDRA-11559, we can extend this support to nodes with the same IP 
quite easily by setting an inactive flag on nodes being replaced and ignoring 
those nodes on reads.

The central idea is:
{quote}
* Add a new non-dead gossip state for replace, BOOT_REPLACE
* When receiving BOOT_REPLACE, other nodes add the replacing node as a 
bootstrapping endpoint
* Pending ranges are calculated, and writes are sent to the replacing node 
during the replace
* When the replacing node changes state to NORMAL, the old node is removed and 
the new node becomes a natural endpoint in TokenMetadata
* The final step is to change the original node's state to REMOVED_TOKEN so 
other nodes evict the original node from gossip
{quote}

Since it's no longer necessary to forward hints to the replacement node when 
{{replace_address != broadcast_address}}, the replacement node does not need 
to inherit the same ID as the original node.

The replacing process remains unchanged when the replacement node has the same 
IP as the original node. If that's the case, I added a warn message so users 
know they need to run repair if the node is down for longer than 
{{max_hint_window_in_ms}}:
{noformat}
Writes will not be redirected to this node while it is performing replace 
because it has the same address as the node to be replaced ({}). 
If that node has been down for longer than max_hint_window_in_ms, repair must 
be run after the replacement process in order to make this node consistent.
{noformat}

I adapted the current dtests to test replace_address for both the old and the 
new path, and, when {{replace_address != broadcast_address}}, to make sure 
writes are redirected to the replacement node.

Initial patch and tests below (will provide 2.2+ patches after initial review):
||2.2||dtest||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:2.2-8523]|[branch|https://github.com/riptano/cassandra-dtest/compare/master...pauloricardomg:8523]|

[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-06-20 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340771#comment-15340771
 ] 

Paulo Motta commented on CASSANDRA-8523:


bq. The hints that you might care about are writes dropped during the 
replacement on the replacing node. 
Even these should not be a problem, because they are also forwarded to the 
replacement node, and if they fail they are hinted to the replacement node's ID.

I rebased and resubmitted the dtests and fixed a minor typo in the original 
patch. I also realized it's not necessary to change the original node's state 
to {{REMOVED_TOKEN}}, because it's already removed from gossip when the 
replacement node changes its state to {{NORMAL}}, so I removed that step.

I also added new dtests that test replace in a mixed-version environment to 
verify backward compatibility, and that check for the warning message when 
replacing a node with the same address.

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to our expectations, 
> especially since we know that writes ARE sent to nodes while they are being 
> bootstrapped into the ring.
> It seems like Cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for nodes that are 
> bootstrapping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many hours' worth of writes). It also leaves us more exposed to 
> consistency violations.





[jira] [Updated] (CASSANDRA-11957) Implement seek() of org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream

2016-06-20 Thread Imran Chaudhry (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Imran Chaudhry updated CASSANDRA-11957:
---
Attachment: 11957-trunk.txt

> Implement seek() of 
> org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream 
> --
>
> Key: CASSANDRA-11957
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11957
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination, Local Write-Read Paths
>Reporter: Imran Chaudhry
>Assignee: Imran Chaudhry
>Priority: Critical
> Fix For: 3.x
>
> Attachments: 11957-trunk.txt
>
>
> CDC needs the seek() method of 
> org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream implemented 
> (it currently throws an exception).
> Commit logs are read using this stream, and seek() needs to be implemented so 
> that mutations appended to the currently active commitlog can be read out in 
> real time.
>  
> Current implementation is:
>   
> public void seek(long position)
> {
> // implement this when we actually need it
> throw new UnsupportedOperationException();
> }





[jira] [Updated] (CASSANDRA-11957) Implement seek() of org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream

2016-06-20 Thread Imran Chaudhry (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Imran Chaudhry updated CASSANDRA-11957:
---
Attachment: (was: 11957-trunk.txt)

> Implement seek() of 
> org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream 
> --
>
> Key: CASSANDRA-11957
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11957
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination, Local Write-Read Paths
>Reporter: Imran Chaudhry
>Assignee: Imran Chaudhry
>Priority: Critical
> Fix For: 3.x
>
> Attachments: 11957-trunk.txt
>
>
> CDC needs the seek() method of 
> org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream implemented 
> (it currently throws an exception).
> Commit logs are read using this stream, and seek() needs to be implemented so 
> that mutations appended to the currently active commitlog can be read out in 
> real time.
>  
> Current implementation is:
>   
> public void seek(long position)
> {
> // implement this when we actually need it
> throw new UnsupportedOperationException();
> }





[jira] [Commented] (CASSANDRA-12032) Update to Netty 4.0.37

2016-06-20 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340548#comment-15340548
 ] 

Jeremiah Jordan commented on CASSANDRA-12032:
-

Patch is missing updates to the license files and the build dependencies for 
pom generation.

> Update to Netty 4.0.37
> --
>
> Key: CASSANDRA-12032
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12032
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Robert Stupp
>Assignee: Robert Stupp
> Fix For: 3.x
>
>
> Update Netty to 4.0.37
> (no C* code changes in this ticket)





[jira] [Updated] (CASSANDRA-11957) Implement seek() of org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream

2016-06-20 Thread Imran Chaudhry (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Imran Chaudhry updated CASSANDRA-11957:
---
Attachment: 11957-trunk.txt

> Implement seek() of 
> org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream 
> --
>
> Key: CASSANDRA-11957
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11957
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination, Local Write-Read Paths
>Reporter: Imran Chaudhry
>Assignee: Imran Chaudhry
>Priority: Critical
> Fix For: 3.x
>
> Attachments: 11957-trunk.txt
>
>
> CDC needs the seek() method of 
> org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream implemented 
> (it currently throws an exception).
> Commit logs are read using this stream, and seek() needs to be implemented so 
> that mutations appended to the currently active commitlog can be read out in 
> real time.
>  
> Current implementation is:
>   
> public void seek(long position)
> {
> // implement this when we actually need it
> throw new UnsupportedOperationException();
> }





[jira] [Updated] (CASSANDRA-11957) Implement seek() of org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream

2016-06-20 Thread Imran Chaudhry (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Imran Chaudhry updated CASSANDRA-11957:
---
Reproduced In: 3.x
   Tester: Joshua McKenzie
   Status: Patch Available  (was: In Progress)

1) Added seek() to EncryptedFileSegmentInputStream 
2) Added a unit test for seek() to SegmentReaderTest 
3) Modified CommitLogReader to use EncryptedFileSegmentInputStream.seek()
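For context, a seek() over an encrypted segment typically has two cases: the
target offset lands inside the chunk currently decrypted in memory, or it
requires repositioning and re-reading. A simplified sketch with invented field
names (not the attached patch):

{code}
import java.nio.ByteBuffer;

// Simplified sketch of seek() over a stream that exposes decrypted chunks
// of an encrypted commitlog segment one buffer at a time.
abstract class SeekableSegmentStream
{
    protected ByteBuffer buffer; // current decrypted chunk
    protected long bufferStart;  // position (in the logical stream) of buffer[0]

    // Re-read and decrypt the chunk containing `position`, leaving
    // bufferStart <= position < bufferStart + buffer.limit().
    protected abstract void reBuffer(long position);

    public void seek(long position)
    {
        if (position < bufferStart || position >= bufferStart + buffer.limit())
            reBuffer(position);                      // reposition + decrypt
        buffer.position((int) (position - bufferStart)); // land inside the chunk
    }
}
{code}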

> Implement seek() of 
> org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream 
> --
>
> Key: CASSANDRA-11957
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11957
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination, Local Write-Read Paths
>Reporter: Imran Chaudhry
>Assignee: Imran Chaudhry
>Priority: Critical
> Fix For: 3.x
>
>
> CDC needs the seek() method of 
> org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream implemented 
> (it currently throws an exception).
> Commit logs are read using this stream, and seek() needs to be implemented so 
> that mutations appended to the currently active commitlog can be read out in 
> real time.
>  
> Current implementation is:
>   
> public void seek(long position)
> {
> // implement this when we actually need it
> throw new UnsupportedOperationException();
> }





[jira] [Issue Comment Deleted] (CASSANDRA-8844) Change Data Capture (CDC)

2016-06-20 Thread Adi Kancherla (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adi Kancherla updated CASSANDRA-8844:
-
Comment: was deleted

(was: Thanks Carl. Is there a ref impl for a daemon or client that reads the 
cdc logs and pushes changes to an external system?)

> Change Data Capture (CDC)
> -
>
> Key: CASSANDRA-8844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination, Local Write-Read Paths
>Reporter: Tupshin Harper
>Assignee: Joshua McKenzie
>Priority: Critical
> Fix For: 3.8
>
>
> "In databases, change data capture (CDC) is a set of software design patterns 
> used to determine (and track) the data that has changed so that action can be 
> taken using the changed data. Also, Change data capture (CDC) is an approach 
> to data integration that is based on the identification, capture and delivery 
> of the changes made to enterprise data sources."
> -Wikipedia
> As Cassandra is increasingly being used as the Source of Record (SoR) for 
> mission critical data in large enterprises, it is increasingly being called 
> upon to act as the central hub of traffic and data flow to other systems. In 
> order to try to address the general need, we (cc [~brianmhess]), propose 
> implementing a simple data logging mechanism to enable per-table CDC patterns.
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its 
> Consistency Level semantics, and in order to treat it as the single 
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable 
> (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the developmental and operational burden of users so that they 
> don't have to do dual writes to other systems.
> # For users that are currently doing batch export from a Cassandra system, 
> give them the opportunity to make that realtime with a minimum of coding.
> h2. The mechanism:
> We propose a durable logging mechanism that functions similarly to a 
> commitlog, with the following nuances:
> - Takes place on every node, not just the coordinator, so RF number of copies 
> are logged.
> - Separate log per table.
> - Per-table configuration. Only tables that are specified as CDC_LOG would do 
> any logging.
> - Per DC. We are trying to keep the complexity to a minimum to make this an 
> easy enhancement, but most likely use cases would prefer to only implement 
> CDC logging in one (or a subset) of the DCs that are being replicated to
> - In the critical path of ConsistencyLevel acknowledgment. Just as with the 
> commitlog, failure to write to the CDC log should fail that node's write. If 
> that means the requested consistency level was not met, then clients *should* 
> experience UnavailableExceptions.
> - Be written in a Row-centric manner such that it is easy for consumers to 
> reconstitute rows atomically.
> - Written in a simple format designed to be consumed *directly* by daemons 
> written in non JVM languages
> h2. Nice-to-haves
> I strongly suspect that the following features will be asked for, but I also 
> believe that they can be deferred to a subsequent release, in order to gauge 
> actual interest.
> - Multiple logs per table. This would make it easy to have multiple 
> "subscribers" to a single table's changes. A workaround would be to create a 
> forking daemon listener, but that's not a great answer.
> - Log filtering. Being able to apply filters, including UDF-based filters, 
> would make Cassandra a much more versatile feeder into other systems, and, 
> again, reduce complexity that would otherwise need to be built into the 
> daemons.
> h2. Format and Consumption
> - Cassandra would only write to the CDC log, and never delete from it. 
> - Cleaning up consumed logfiles would be the client daemon's responsibility
> - Logfile size should probably be configurable.
> - Logfiles should be named with a predictable naming schema, making it 
> trivial to process them in order.
> - Daemons should be able to checkpoint their work, and resume from where they 
> left off. This means they would have to leave some file artifact in the CDC 
> log's directory.
> - A sophisticated daemon should be able to be written that could 
> -- Catch up, in written-order, even when it is multiple logfiles behind in 
> processing
> -- Be able to continuously "tail" the most recent logfile and get 
> low-latency(ms?) access to the data as it is written.
> h2. Alternate approach
> In order to make consuming a change log easy and efficient to do with low 
> latency, the following could supplement the approach outlined above
> - Instead of 

[jira] [Commented] (CASSANDRA-12035) Structure for tpstats output (JSON, YAML)

2016-06-20 Thread Mahdi Mohammadi (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340474#comment-15340474
 ] 

Mahdi Mohammadi commented on CASSANDRA-12035:
-

Please assign the ticket to yourself.

> Structure for tpstats output (JSON, YAML)
> -
>
> Key: CASSANDRA-12035
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12035
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Hiroyuki Nishi
>Priority: Minor
> Attachments: CASSANDRA-12035-trunk.patch, 
> tablestats_sample_result.json, tablestats_sample_result.txt, 
> tablestats_sample_result.yaml, tpstats_sample_result.json, 
> tpstats_sample_result.txt, tpstats_sample_result.yaml
>
>
> In CASSANDRA-5977, extra output formats such as JSON and YAML were added for 
> nodetool tablestats. 
> Similarly, I would like to add these output formats to nodetool tpstats.
> I have also refactored tablestats' output-format code so that the existing 
> code and the new code share a single implementation.
> Please review the attached patch.





[jira] [Commented] (CASSANDRA-11988) NullPointerExpception when reading/compacting table

2016-06-20 Thread Nimi Wariboko Jr. (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340433#comment-15340433
 ] 

Nimi Wariboko Jr. commented on CASSANDRA-11988:
---

Also, for anyone that comes across this: it looks like I temporarily pushed 
out the issue by increasing gc_grace_seconds.

> NullPointerExpception when reading/compacting table
> ---
>
> Key: CASSANDRA-11988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11988
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Nimi Wariboko Jr.
>Assignee: Carl Yeksigian
> Fix For: 3.6
>
>
> I have a table that suddenly refuses to be read or compacted. Issuing a read 
> on the table causes a NPE.
> On compaction, it returns the error
> {code}
> ERROR [CompactionExecutor:6] 2016-06-09 17:10:15,724 CassandraDaemon.java:213 
> - Exception in thread Thread[CompactionExecutor:6,1,main]
> java.lang.NullPointerException: null
>   at 
> org.apache.cassandra.db.transform.UnfilteredRows.isEmpty(UnfilteredRows.java:38)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:64)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:24)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:76)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:226)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:182)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:82)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:264)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_45]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_45]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> {code}
> Schema:
> {code}
> CREATE TABLE cmpayments.report_payments (
> reportid timeuuid,
> userid timeuuid,
> adjustedearnings decimal,
> deleted set static,
> earnings map,
> gross map,
> organizationid text,
> payall timestamp static,
> status text,
> PRIMARY KEY (reportid, userid)
> ) WITH CLUSTERING ORDER BY (userid ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99PERCENTILE';
> {code}





[jira] [Commented] (CASSANDRA-11988) NullPointerExpception when reading/compacting table

2016-06-20 Thread Nimi Wariboko Jr. (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340416#comment-15340416
 ] 

Nimi Wariboko Jr. commented on CASSANDRA-11988:
---

Okay, good to know.

I just did an initial test, and while the CQL representation seems to work 
fine, the JSON output returns some errors.

Nimi

> NullPointerExpception when reading/compacting table
> ---
>
> Key: CASSANDRA-11988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11988
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Nimi Wariboko Jr.
>Assignee: Carl Yeksigian
> Fix For: 3.6
>
>
> I have a table that suddenly refuses to be read or compacted. Issuing a read 
> on the table causes a NPE.
> On compaction, it returns the error
> {code}
> ERROR [CompactionExecutor:6] 2016-06-09 17:10:15,724 CassandraDaemon.java:213 
> - Exception in thread Thread[CompactionExecutor:6,1,main]
> java.lang.NullPointerException: null
>   at 
> org.apache.cassandra.db.transform.UnfilteredRows.isEmpty(UnfilteredRows.java:38)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:64)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:24)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:76)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:226)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:182)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:82)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:264)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_45]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_45]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> {code}
> Schema:
> {code}
> CREATE TABLE cmpayments.report_payments (
> reportid timeuuid,
> userid timeuuid,
> adjustedearnings decimal,
> deleted set static,
> earnings map,
> gross map,
> organizationid text,
> payall timestamp static,
> status text,
> PRIMARY KEY (reportid, userid)
> ) WITH CLUSTERING ORDER BY (userid ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99PERCENTILE';
> {code}





[jira] [Commented] (CASSANDRA-11988) NullPointerExpception when reading/compacting table

2016-06-20 Thread Carl Yeksigian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340413#comment-15340413
 ] 

Carl Yeksigian commented on CASSANDRA-11988:


[~nimi]: Is there currently an issue with sstabledump?

I believe this issue only affects the read path where the static row has been 
tombstoned, and sstabledump doesn't use the same read path. If you are having 
issues, that might be related, or it might be something else.

> NullPointerExpception when reading/compacting table
> ---
>
> Key: CASSANDRA-11988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11988
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Nimi Wariboko Jr.
>Assignee: Carl Yeksigian
> Fix For: 3.6
>
>
> I have a table that suddenly refuses to be read or compacted. Issuing a read 
> on the table causes a NPE.
> On compaction, it returns the error
> {code}
> ERROR [CompactionExecutor:6] 2016-06-09 17:10:15,724 CassandraDaemon.java:213 
> - Exception in thread Thread[CompactionExecutor:6,1,main]
> java.lang.NullPointerException: null
>   at 
> org.apache.cassandra.db.transform.UnfilteredRows.isEmpty(UnfilteredRows.java:38)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:64)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:24)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:76)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:226)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:182)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:82)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:264)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_45]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_45]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> {code}
> Schema:
> {code}
> CREATE TABLE cmpayments.report_payments (
> reportid timeuuid,
> userid timeuuid,
> adjustedearnings decimal,
> deleted set static,
> earnings map,
> gross map,
> organizationid text,
> payall timestamp static,
> status text,
> PRIMARY KEY (reportid, userid)
> ) WITH CLUSTERING ORDER BY (userid ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99PERCENTILE';
> {code}





[jira] [Commented] (CASSANDRA-11971) More uses of DataOutputBuffer.RECYCLER

2016-06-20 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340414#comment-15340414
 ] 

T Jake Luciani commented on CASSANDRA-11971:


+1

> More uses of DataOutputBuffer.RECYCLER
> --
>
> Key: CASSANDRA-11971
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11971
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.x
>
>
> There are a few more possible use cases for {{DataOutputBuffer.RECYCLER}}, 
> which would prevent a couple of (larger) allocations.
> (Will provide a patch soon.)
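For readers unfamiliar with the pattern: {{DataOutputBuffer.RECYCLER}} pools
large serialization buffers so they can be reused instead of reallocated. A
generic illustration of the idea follows (not Cassandra's actual
implementation, which I believe is built on Netty's Recycler):

{code}
import java.io.ByteArrayOutputStream;
import java.util.ArrayDeque;

// Generic illustration of the recycling pattern: reuse large, short-lived
// serialization buffers instead of reallocating them on every call.
final class BufferRecycler
{
    private static final ThreadLocal<ArrayDeque<ByteArrayOutputStream>> POOL =
            ThreadLocal.withInitial(ArrayDeque::new);

    static ByteArrayOutputStream get()
    {
        ByteArrayOutputStream buf = POOL.get().poll();
        return buf != null ? buf : new ByteArrayOutputStream(64 * 1024);
    }

    static void recycle(ByteArrayOutputStream buf)
    {
        buf.reset();            // keep the backing array, drop the contents
        POOL.get().push(buf);
    }
}
{code}

Usage follows the take/use/return shape: take a buffer with get(), serialize
into it, then hand it back with recycle() in a finally block.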





[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2016-06-20 Thread Carl Yeksigian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340395#comment-15340395
 ] 

Carl Yeksigian commented on CASSANDRA-8844:
---

[~finda...@gmail.com]: No, but I've created CASSANDRA-12041 to track progress 
on it.

> Change Data Capture (CDC)
> -
>
> Key: CASSANDRA-8844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination, Local Write-Read Paths
>Reporter: Tupshin Harper
>Assignee: Joshua McKenzie
>Priority: Critical
> Fix For: 3.8
>
>
> "In databases, change data capture (CDC) is a set of software design patterns 
> used to determine (and track) the data that has changed so that action can be 
> taken using the changed data. Also, Change data capture (CDC) is an approach 
> to data integration that is based on the identification, capture and delivery 
> of the changes made to enterprise data sources."
> -Wikipedia
> As Cassandra is increasingly being used as the Source of Record (SoR) for 
> mission critical data in large enterprises, it is increasingly being called 
> upon to act as the central hub of traffic and data flow to other systems. In 
> order to try to address the general need, we (cc [~brianmhess]), propose 
> implementing a simple data logging mechanism to enable per-table CDC patterns.
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its 
> Consistency Level semantics, and in order to treat it as the single 
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable 
> (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the developmental and operational burden of users so that they 
> don't have to do dual writes to other systems.
> # For users that are currently doing batch export from a Cassandra system, 
> give them the opportunity to make that realtime with a minimum of coding.
> h2. The mechanism:
> We propose a durable logging mechanism that functions similarly to a commitlog, 
> with the following nuances:
> - Takes place on every node, not just the coordinator, so RF number of copies 
> are logged.
> - Separate log per table.
> - Per-table configuration. Only tables that are specified as CDC_LOG would do 
> any logging.
> - Per DC. We are trying to keep the complexity to a minimum to make this an 
> easy enhancement, but most likely use cases would prefer to only implement 
> CDC logging in one (or a subset) of the DCs that are being replicated to
> - In the critical path of ConsistencyLevel acknowledgment. Just as with the 
> commitlog, failure to write to the CDC log should fail that node's write. If 
> that means the requested consistency level was not met, then clients *should* 
> experience UnavailableExceptions.
> - Be written in a Row-centric manner such that it is easy for consumers to 
> reconstitute rows atomically.
> - Written in a simple format designed to be consumed *directly* by daemons 
> written in non JVM languages
> h2. Nice-to-haves
> I strongly suspect that the following features will be asked for, but I also 
> believe that they can be deferred for a subsequent release, and to gauge 
> actual interest.
> - Multiple logs per table. This would make it easy to have multiple 
> "subscribers" to a single table's changes. A workaround would be to create a 
> forking daemon listener, but that's not a great answer.
> - Log filtering. Being able to apply filters, including UDF-based filters 
> would make Cassandra a much more versatile feeder into other systems, and 
> again, reduce complexity that would otherwise need to be built into the 
> daemons.
> h2. Format and Consumption
> - Cassandra would only write to the CDC log, and never delete from it. 
> - Cleaning up consumed logfiles would be the client daemon's responsibility
> - Logfile size should probably be configurable.
> - Logfiles should be named with a predictable naming schema, making it 
> trivial to process them in order.
> - Daemons should be able to checkpoint their work, and resume from where they 
> left off. This means they would have to leave some file artifact in the CDC 
> log's directory.
> - A sophisticated daemon should be able to be written that could 
> -- Catch up, in written-order, even when it is multiple logfiles behind in 
> processing
> -- Be able to continuously "tail" the most recent logfile and get 
> low-latency(ms?) access to the data as it is written.
> h2. Alternate approach
> In order to make consuming a change log easy and efficient to do with low 
> latency, the following could supplement the approach outlined above
> - Instead of writing to a logfile, by 

[jira] [Commented] (CASSANDRA-10202) simplify CommitLogSegmentManager

2016-06-20 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340404#comment-15340404
 ] 

Benedict commented on CASSANDRA-10202:
--

Have you looked at the code that came prior to #3578 wrt this logic? It was a 
long time ago, but I remember it being pretty damned bad back then too.

bq. battle-tested concurrent collection

Like I say, this attitude caused us to introduce bugs, either because we didn't 
understand these collections' behaviour, or because the increased complexity of 
hammering the given screw was too unclear.  It *looked* easier to understand, 
but in fact it was not, and in being so duplicitous it fooled us.  By taking 
ownership of the complexity, the alternatives avoided these pitfalls.

And, further, we have these custom algorithms everywhere in the codebase.  The 
problem we have is that whenever the good programming practices of abstraction, 
separation and isolation of concerns (and better testability) are employed to 
get a handle on the complexity, the result inherently gets _labelled_ a custom 
algorithm and is anathema.  Thus, again, we simply _disguise_ the complexity, 
and make ourselves feel better without achieving any positive end result.

Anyway, I'm going to go back to feeling sorry for myself.  I've said plenty on 
this topic, here and in the distant past.

P.S. #7282 is a completely different topic, one of algorithmic complexity and 
long-term puzzle-piecing rather than correctness/safety.  But we've already had 
that discussion, and I've no interest in working it into Cassandra anymore.

> simplify CommitLogSegmentManager
> 
>
> Key: CASSANDRA-10202
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10202
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Jonathan Ellis
>Assignee: Branimir Lambov
>Priority: Minor
>
> Now that we only keep one active segment around we can simplify this from the 
> old recycling design.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2016-06-20 Thread Adi Kancherla (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340403#comment-15340403
 ] 

Adi Kancherla commented on CASSANDRA-8844:
--

Thanks Carl. Is there a ref impl for a daemon or client that reads the cdc logs 
and pushes changes to an external system?

> Change Data Capture (CDC)
> -
>
> Key: CASSANDRA-8844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination, Local Write-Read Paths
>Reporter: Tupshin Harper
>Assignee: Joshua McKenzie
>Priority: Critical
> Fix For: 3.8
>
>
> "In databases, change data capture (CDC) is a set of software design patterns 
> used to determine (and track) the data that has changed so that action can be 
> taken using the changed data. Also, Change data capture (CDC) is an approach 
> to data integration that is based on the identification, capture and delivery 
> of the changes made to enterprise data sources."
> -Wikipedia
> As Cassandra is increasingly being used as the Source of Record (SoR) for 
> mission critical data in large enterprises, it is increasingly being called 
> upon to act as the central hub of traffic and data flow to other systems. In 
> order to try to address the general need, we (cc [~brianmhess]), propose 
> implementing a simple data logging mechanism to enable per-table CDC patterns.
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its 
> Consistency Level semantics, and in order to treat it as the single 
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable 
> (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the developmental and operational burden of users so that they 
> don't have to do dual writes to other systems.
> # For users that are currently doing batch export from a Cassandra system, 
> give them the opportunity to make that realtime with a minimum of coding.
> h2. The mechanism:
> We propose a durable logging mechanism that functions similarly to a commitlog, 
> with the following nuances:
> - Takes place on every node, not just the coordinator, so RF number of copies 
> are logged.
> - Separate log per table.
> - Per-table configuration. Only tables that are specified as CDC_LOG would do 
> any logging.
> - Per DC. We are trying to keep the complexity to a minimum to make this an 
> easy enhancement, but most likely use cases would prefer to only implement 
> CDC logging in one (or a subset) of the DCs that are being replicated to
> - In the critical path of ConsistencyLevel acknowledgment. Just as with the 
> commitlog, failure to write to the CDC log should fail that node's write. If 
> that means the requested consistency level was not met, then clients *should* 
> experience UnavailableExceptions.
> - Be written in a Row-centric manner such that it is easy for consumers to 
> reconstitute rows atomically.
> - Written in a simple format designed to be consumed *directly* by daemons 
> written in non JVM languages
> h2. Nice-to-haves
> I strongly suspect that the following features will be asked for, but I also 
> believe that they can be deferred for a subsequent release, and to gauge 
> actual interest.
> - Multiple logs per table. This would make it easy to have multiple 
> "subscribers" to a single table's changes. A workaround would be to create a 
> forking daemon listener, but that's not a great answer.
> - Log filtering. Being able to apply filters, including UDF-based filters 
> would make Cassandra a much more versatile feeder into other systems, and 
> again, reduce complexity that would otherwise need to be built into the 
> daemons.
> h2. Format and Consumption
> - Cassandra would only write to the CDC log, and never delete from it. 
> - Cleaning up consumed logfiles would be the client daemon's responsibility
> - Logfile size should probably be configurable.
> - Logfiles should be named with a predictable naming schema, making it 
> trivial to process them in order.
> - Daemons should be able to checkpoint their work, and resume from where they 
> left off. This means they would have to leave some file artifact in the CDC 
> log's directory.
> - A sophisticated daemon should be able to be written that could 
> -- Catch up, in written-order, even when it is multiple logfiles behind in 
> processing
> -- Be able to continuously "tail" the most recent logfile and get 
> low-latency(ms?) access to the data as it is written.
> h2. Alternate approach
> In order to make consuming a change log easy and efficient to do with low 
> latency, the following could supplement the approach outlined above
> - 

[jira] [Updated] (CASSANDRA-11575) Add out-of-process testing for CDC

2016-06-20 Thread Carl Yeksigian (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Yeksigian updated CASSANDRA-11575:
---
Attachment: 11575.tgz

Given that there were a few changes to the API between when this was written 
and when CDC was committed, I've updated the source to reflect that.

> Add out-of-process testing for CDC
> --
>
> Key: CASSANDRA-11575
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11575
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Coordination, Local Write-Read Paths
>Reporter: Carl Yeksigian
>Assignee: Carl Yeksigian
> Fix For: 3.x
>
> Attachments: 11575.tgz, 11575.tgz
>
>
> There are currently no dtests for the new cdc feature. We should have some, 
> at least to ensure that the cdc files have a lifecycle that makes sense, and 
> make sure that things like a continually cleaning daemon and a lazy daemon 
> have the properties we expect; for this, we don't need to actually process 
> the files, but make sure they fit the characteristics we expect from them. A 
> more complex daemon would need to be written in Java.
> I already hit a problem where, if CDC is over capacity, it properly throws 
> the WTE, but it will not reset after the overflow directory is back under the 
> size limit. It is supposed to correct the size within 250ms and allow more 
> writes.
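
As a rough illustration of the checkpointing behaviour such dtests would verify, a consumer might record its resume point as in the sketch below. This is hypothetical; the directory layout, marker-file name, and class are not from the patch.

{code}
// Hypothetical sketch of a checkpointing CDC consumer: it remembers the
// last log file processed by leaving a marker file in the cdc_raw directory.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class CdcCheckpointer
{
    private final Path cdcDir;
    private final Path checkpoint;

    CdcCheckpointer(Path cdcDir)
    {
        this.cdcDir = cdcDir;
        this.checkpoint = cdcDir.resolve("consumer.checkpoint");
    }

    void consumeNewFiles() throws IOException
    {
        String last = Files.exists(checkpoint) ? Files.readString(checkpoint) : "";
        List<Path> logs;
        try (Stream<Path> s = Files.list(cdcDir))
        {
            logs = s.sorted().collect(Collectors.toList()); // predictable naming => ordered
        }
        for (Path log : logs)
        {
            String name = log.getFileName().toString();
            if (name.endsWith(".checkpoint") || name.compareTo(last) <= 0)
                continue;                        // already processed
            process(log);                        // placeholder for real consumption
            Files.writeString(checkpoint, name); // durable resume point
        }
    }

    private void process(Path log) { /* read and forward mutations */ }
}
{code}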



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11575) Add out-of-process testing for CDC

2016-06-20 Thread Carl Yeksigian (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Yeksigian updated CASSANDRA-11575:
---
Status: Patch Available  (was: Open)

> Add out-of-process testing for CDC
> --
>
> Key: CASSANDRA-11575
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11575
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Coordination, Local Write-Read Paths
>Reporter: Carl Yeksigian
>Assignee: Carl Yeksigian
> Fix For: 3.x
>
> Attachments: 11575.tgz, 11575.tgz
>
>
> There are currently no dtests for the new cdc feature. We should have some, 
> at least to ensure that the cdc files have a lifecycle that makes sense, and 
> make sure that things like a continually cleaning daemon and a lazy daemon 
> have the properties we expect; for this, we don't need to actually process 
> the files, but make sure they fit the characteristics we expect from them. A 
> more complex daemon would need to be written in Java.
> I already hit a problem where, if CDC is over capacity, it properly throws 
> the WTE, but it will not reset after the overflow directory is back under the 
> size limit. It is supposed to correct the size within 250ms and allow more 
> writes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12041) Add CDC to describe table

2016-06-20 Thread Carl Yeksigian (JIRA)
Carl Yeksigian created CASSANDRA-12041:
--

 Summary: Add CDC to describe table
 Key: CASSANDRA-12041
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12041
 Project: Cassandra
  Issue Type: Sub-task
  Components: Tools
Reporter: Carl Yeksigian
Assignee: Joshua McKenzie


Currently we do not output CDC with {{DESCRIBE TABLE}}, but should include that 
for 3.8+ tables.
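
For illustration, once implemented the table-options section of the output would presumably carry the flag alongside the other properties, roughly as sketched below (abridged; the exact formatting is an assumption):

{code}
-- Sketch of expected DESCRIBE TABLE output once implemented (abridged):
CREATE TABLE ks.tbl (
    ...
) WITH cdc = true
    AND gc_grace_seconds = 864000
    ...;
{code}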



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11988) NullPointerExpception when reading/compacting table

2016-06-20 Thread Nimi Wariboko Jr. (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340383#comment-15340383
 ] 

Nimi Wariboko Jr. commented on CASSANDRA-11988:
---

Carl Yeksigian,

Are you aware whether your fix will work with sstabledump? I need to find a 
way to export/dump this data. This bug also prevents me from bootstrapping new 
nodes (they fail to stream this table), and obviously I can't work with the 
problematic partitions in the meantime.

> NullPointerExpception when reading/compacting table
> ---
>
> Key: CASSANDRA-11988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11988
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Nimi Wariboko Jr.
>Assignee: Carl Yeksigian
> Fix For: 3.6
>
>
> I have a table that suddenly refuses to be read or compacted. Issuing a read 
> on the table causes a NPE.
> On compaction, it returns the error
> {code}
> ERROR [CompactionExecutor:6] 2016-06-09 17:10:15,724 CassandraDaemon.java:213 
> - Exception in thread Thread[CompactionExecutor:6,1,main]
> java.lang.NullPointerException: null
>   at 
> org.apache.cassandra.db.transform.UnfilteredRows.isEmpty(UnfilteredRows.java:38)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:64)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.partitions.PurgeFunction.applyToPartition(PurgeFunction.java:24)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:76)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionIterator.hasNext(CompactionIterator.java:226)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:182)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:82)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:264)
>  ~[apache-cassandra-3.6.jar:3.6]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_45]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_45]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> {code}
> Schema:
> {code}
> CREATE TABLE cmpayments.report_payments (
> reportid timeuuid,
> userid timeuuid,
> adjustedearnings decimal,
> deleted set static,
> earnings map,
> gross map,
> organizationid text,
> payall timestamp static,
> status text,
> PRIMARY KEY (reportid, userid)
> ) WITH CLUSTERING ORDER BY (userid ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99PERCENTILE';
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2016-06-20 Thread Adi Kancherla (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340382#comment-15340382
 ] 

Adi Kancherla commented on CASSANDRA-8844:
--

Yes, I have it enabled in the yaml and I see the cdc_raw directory. I also see 
a few entries in the CL under cdc_raw for the inserts I made to the table (I 
read whatever I could, although the format is not human readable). So I guess 
CDC is enabled on the table. Is the cqlsh update to DESCRIBE TABLE being worked 
on?

> Change Data Capture (CDC)
> -
>
> Key: CASSANDRA-8844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination, Local Write-Read Paths
>Reporter: Tupshin Harper
>Assignee: Joshua McKenzie
>Priority: Critical
> Fix For: 3.8
>
>
> "In databases, change data capture (CDC) is a set of software design patterns 
> used to determine (and track) the data that has changed so that action can be 
> taken using the changed data. Also, Change data capture (CDC) is an approach 
> to data integration that is based on the identification, capture and delivery 
> of the changes made to enterprise data sources."
> -Wikipedia
> As Cassandra is increasingly being used as the Source of Record (SoR) for 
> mission critical data in large enterprises, it is increasingly being called 
> upon to act as the central hub of traffic and data flow to other systems. In 
> order to try to address the general need, we (cc [~brianmhess]), propose 
> implementing a simple data logging mechanism to enable per-table CDC patterns.
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its 
> Consistency Level semantics, and in order to treat it as the single 
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable 
> (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the developmental and operational burden of users so that they 
> don't have to do dual writes to other systems.
> # For users that are currently doing batch export from a Cassandra system, 
> give them the opportunity to make that realtime with a minimum of coding.
> h2. The mechanism:
> We propose a durable logging mechanism that functions similarly to a commitlog, 
> with the following nuances:
> - Takes place on every node, not just the coordinator, so RF number of copies 
> are logged.
> - Separate log per table.
> - Per-table configuration. Only tables that are specified as CDC_LOG would do 
> any logging.
> - Per DC. We are trying to keep the complexity to a minimum to make this an 
> easy enhancement, but most likely use cases would prefer to only implement 
> CDC logging in one (or a subset) of the DCs that are being replicated to
> - In the critical path of ConsistencyLevel acknowledgment. Just as with the 
> commitlog, failure to write to the CDC log should fail that node's write. If 
> that means the requested consistency level was not met, then clients *should* 
> experience UnavailableExceptions.
> - Be written in a Row-centric manner such that it is easy for consumers to 
> reconstitute rows atomically.
> - Written in a simple format designed to be consumed *directly* by daemons 
> written in non JVM languages
> h2. Nice-to-haves
> I strongly suspect that the following features will be asked for, but I also 
> believe that they can be deferred for a subsequent release, and to gauge 
> actual interest.
> - Multiple logs per table. This would make it easy to have multiple 
> "subscribers" to a single table's changes. A workaround would be to create a 
> forking daemon listener, but that's not a great answer.
> - Log filtering. Being able to apply filters, including UDF-based filters 
> would make Cassandra a much more versatile feeder into other systems, and 
> again, reduce complexity that would otherwise need to be built into the 
> daemons.
> h2. Format and Consumption
> - Cassandra would only write to the CDC log, and never delete from it. 
> - Cleaning up consumed logfiles would be the client daemon's responsibility
> - Logfile size should probably be configurable.
> - Logfiles should be named with a predictable naming schema, making it 
> trivial to process them in order.
> - Daemons should be able to checkpoint their work, and resume from where they 
> left off. This means they would have to leave some file artifact in the CDC 
> log's directory.
> - A sophisticated daemon should be able to be written that could 
> -- Catch up, in written-order, even when it is multiple logfiles behind in 
> processing
> -- Be able to continuously "tail" the most recent logfile and get 
> low-latency(ms?) access to the data as it 

[jira] [Commented] (CASSANDRA-4663) Streaming sends one file at a time serially.

2016-06-20 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340379#comment-15340379
 ] 

Jason Brown commented on CASSANDRA-4663:


FWIW, the work I'm doing in #8457 (move internode messaging to netty) will need 
to pull the streaming subsystem along to netty as well. I'm currently working 
on that (it can and will be a separate ticket from #8457), and I've been 
thinking about the entire streaming workflow, including sending multiple files 
in parallel. In my estimation, there are assumptions baked into the existing 
streaming workflow that make sending files in parallel a non-trivial task; 
however, that does not mean it's impossible or without potential benefit, and 
my work is targeting 4.0. 

> Streaming sends one file at a time serially. 
> -
>
> Key: CASSANDRA-4663
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4663
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: sankalp kohli
>Priority: Minor
>
> This is not fast enough when someone is using SSDs and perhaps a 10G link. We 
> should try to create multiple connections and send multiple files in 
> parallel. 
> The current approach underutilizes the link (even 1G).
> This change will improve the bootstrapping time of a node. 
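
To make the suggestion concrete, parallel sends might be shaped roughly like the sketch below. This is illustrative only; the class and method names are hypothetical and not the streaming subsystem's API.

{code}
// Illustrative sketch: stream files over several connections in parallel
// rather than one file at a time over a single connection.
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ParallelFileSender
{
    private final ExecutorService pool = Executors.newFixedThreadPool(4); // one connection per worker

    void sendAll(List<String> files)
    {
        for (String file : files)
            pool.submit(() -> sendOverOwnConnection(file)); // files proceed concurrently
        pool.shutdown();
    }

    private void sendOverOwnConnection(String file)
    {
        // placeholder: open a dedicated connection and stream the file
    }
}
{code}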



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10202) simplify CommitLogSegmentManager

2016-06-20 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340371#comment-15340371
 ] 

Joshua McKenzie commented on CASSANDRA-10202:
-

bq. I can assure you the prior implementation was no less custom
I (probably a bit too snarkily) was alluding to that when I stated "which is 
saying something".

bq. by gaining this it becomes a target for criticism
Honestly, having repeatedly run into races and timing issues with tests and 
changes for CDC, the segment allocation logic in the CommitLog is a target for 
criticism in my mind as the trade-off between complexity and value gained from 
this implementation falls on the side of "not worth it" to me, specifically 
w/regards to new file allocation and swapping.

bq. on that front I think your arguments are pretty fundamentally flawed
To enumerate them, my arguments are:
* I'm -1 on us including our own implementation of a concurrent linked list 
unless we can strongly justify the inclusion of that complexity.
** I stand by this. While I don't love what we *have* from a subtlety / 
side-effect management perspective, at least it's been in there for a while and 
had some bugs flushed out.
* We have to maintain this code
** This is less an argument and more something I think we need to remind 
ourselves of when debating adding a new, custom implementation of a relatively 
statefully complex customized collection to the code-base.
* I find this container even more unnecessarily complex to reason about than 
our current CommitLogSegmentManager.advanceAllocatingFrom
** Key here is "unnecessarily complex", and I mean this specifically w/regards 
to the segment allocation logic. Going back and looking at the #'s from 
CASSANDRA-3578, it's clear there's a marked improvement in CPU utilization and 
ops/sec throughput, however I'm skeptical as to how much of that is due to the 
logic surrounding new segment creation and signalling vs. the multi-threaded 
CommitLogSegment.Allocation logic.
* This reminds me a lot of CASSANDRA-7282 where we're taking out a 
battle-tested concurrent collection in favor of writing our own from scratch
** I don't mean to pick at old wounds or re-start old battles, but the marginal 
gains in performance we get from these changes seems *heavily* outweighed by 
the developer man-hours that go into maintaining them and fixing the subtle and 
complex bugs that come along with these types of implementations.

bq. the project is still ideologically opposed to custom algorithms
I think it's less that the project is ideologically opposed to custom 
algorithms, and more that specific vocal people (I am completely guilty of 
this) are very complexity averse in a code-base of this size and existing 
complexity and want very strong justifications for decisions that appear to be 
adding more complexity than the perceived performance benefits they grant.

> simplify CommitLogSegmentManager
> 
>
> Key: CASSANDRA-10202
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10202
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Jonathan Ellis
>Assignee: Branimir Lambov
>Priority: Minor
>
> Now that we only keep one active segment around we can simplify this from the 
> old recycling design.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2016-06-20 Thread Carl Yeksigian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340366#comment-15340366
 ] 

Carl Yeksigian commented on CASSANDRA-8844:
---

It is a per-table setting, but it also requires that the node has it enabled 
in its yaml. There needs to be a corresponding update to cqlsh's 
{{DESCRIBE TABLE}}, which was overlooked in this patch.

> Change Data Capture (CDC)
> -
>
> Key: CASSANDRA-8844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination, Local Write-Read Paths
>Reporter: Tupshin Harper
>Assignee: Joshua McKenzie
>Priority: Critical
> Fix For: 3.8
>
>
> "In databases, change data capture (CDC) is a set of software design patterns 
> used to determine (and track) the data that has changed so that action can be 
> taken using the changed data. Also, Change data capture (CDC) is an approach 
> to data integration that is based on the identification, capture and delivery 
> of the changes made to enterprise data sources."
> -Wikipedia
> As Cassandra is increasingly being used as the Source of Record (SoR) for 
> mission critical data in large enterprises, it is increasingly being called 
> upon to act as the central hub of traffic and data flow to other systems. In 
> order to try to address the general need, we (cc [~brianmhess]), propose 
> implementing a simple data logging mechanism to enable per-table CDC patterns.
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its 
> Consistency Level semantics, and in order to treat it as the single 
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable 
> (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the developmental and operational burden of users so that they 
> don't have to do dual writes to other systems.
> # For users that are currently doing batch export from a Cassandra system, 
> give them the opportunity to make that realtime with a minimum of coding.
> h2. The mechanism:
> We propose a durable logging mechanism that functions similarly to a commitlog, 
> with the following nuances:
> - Takes place on every node, not just the coordinator, so RF number of copies 
> are logged.
> - Separate log per table.
> - Per-table configuration. Only tables that are specified as CDC_LOG would do 
> any logging.
> - Per DC. We are trying to keep the complexity to a minimum to make this an 
> easy enhancement, but most likely use cases would prefer to only implement 
> CDC logging in one (or a subset) of the DCs that are being replicated to
> - In the critical path of ConsistencyLevel acknowledgment. Just as with the 
> commitlog, failure to write to the CDC log should fail that node's write. If 
> that means the requested consistency level was not met, then clients *should* 
> experience UnavailableExceptions.
> - Be written in a Row-centric manner such that it is easy for consumers to 
> reconstitute rows atomically.
> - Written in a simple format designed to be consumed *directly* by daemons 
> written in non JVM languages
> h2. Nice-to-haves
> I strongly suspect that the following features will be asked for, but I also 
> believe that they can be deferred for a subsequent release, and to gauge 
> actual interest.
> - Multiple logs per table. This would make it easy to have multiple 
> "subscribers" to a single table's changes. A workaround would be to create a 
> forking daemon listener, but that's not a great answer.
> - Log filtering. Being able to apply filters, including UDF-based filters 
> would make Cassandra a much more versatile feeder into other systems, and 
> again, reduce complexity that would otherwise need to be built into the 
> daemons.
> h2. Format and Consumption
> - Cassandra would only write to the CDC log, and never delete from it. 
> - Cleaning up consumed logfiles would be the client daemon's responsibility
> - Logfile size should probably be configurable.
> - Logfiles should be named with a predictable naming schema, making it 
> trivial to process them in order.
> - Daemons should be able to checkpoint their work, and resume from where they 
> left off. This means they would have to leave some file artifact in the CDC 
> log's directory.
> - A sophisticated daemon should be able to be written that could 
> -- Catch up, in written-order, even when it is multiple logfiles behind in 
> processing
> -- Be able to continuously "tail" the most recent logfile and get 
> low-latency(ms?) access to the data as it is written.
> h2. Alternate approach
> In order to make consuming a change log easy and efficient to do 

[jira] [Commented] (CASSANDRA-11272) NullPointerException (NPE) during bootstrap startup in StorageService.java

2016-06-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340361#comment-15340361
 ] 

ASF GitHub Bot commented on CASSANDRA-11272:


Github user zhiyanshao closed the pull request at:

https://github.com/apache/cassandra/pull/71


> NullPointerException (NPE) during bootstrap startup in StorageService.java
> --
>
> Key: CASSANDRA-11272
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11272
> Project: Cassandra
>  Issue Type: Bug
>  Components: Lifecycle
> Environment: debian jesse up to date
>Reporter: Jason Kania
>Assignee: Alex Petrov
> Fix For: 2.2.7, 3.7, 3.0.7, 3.8
>
>
> After bootstrapping fails due to stream closed error, the following error 
> results:
> {code}
> Feb 27, 2016 8:06:38 PM com.google.common.util.concurrent.ExecutionList 
> executeListener
> SEVERE: RuntimeException while executing runnable 
> com.google.common.util.concurrent.Futures$6@3d61813b with executor INSTANCE
> java.lang.NullPointerException
> at 
> org.apache.cassandra.service.StorageService$2.onFailure(StorageService.java:1284)
> at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)
> at 
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
> at 
> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
> at 
> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
> at 
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
> at 
> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:210)
> at 
> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:186)
> at 
> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:430)
> at 
> org.apache.cassandra.streaming.StreamSession.onError(StreamSession.java:525)
> at 
> org.apache.cassandra.streaming.StreamSession.doRetry(StreamSession.java:645)
> at 
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:70)
> at 
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:39)
> at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:59)
> at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:261)
> at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2016-06-20 Thread Adi Kancherla (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340359#comment-15340359
 ] 

Adi Kancherla commented on CASSANDRA-8844:
--

How can I confirm that CDC is enabled on a CF? I did ALTER TABLE  WITH 
CDC = true; and a DESCRIBE TABLE ; and there is nothing related 
to CDC.
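
One way to check is sketched below (keyspace and table names are placeholders; this assumes the cdc flag introduced with this feature is exposed in system_schema.tables):

{code}
-- Enable CDC on a table, then confirm via the schema tables:
ALTER TABLE my_ks.my_table WITH cdc = true;

SELECT cdc FROM system_schema.tables
WHERE keyspace_name = 'my_ks' AND table_name = 'my_table';
{code}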

> Change Data Capture (CDC)
> -
>
> Key: CASSANDRA-8844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination, Local Write-Read Paths
>Reporter: Tupshin Harper
>Assignee: Joshua McKenzie
>Priority: Critical
> Fix For: 3.8
>
>
> "In databases, change data capture (CDC) is a set of software design patterns 
> used to determine (and track) the data that has changed so that action can be 
> taken using the changed data. Also, Change data capture (CDC) is an approach 
> to data integration that is based on the identification, capture and delivery 
> of the changes made to enterprise data sources."
> -Wikipedia
> As Cassandra is increasingly being used as the Source of Record (SoR) for 
> mission critical data in large enterprises, it is increasingly being called 
> upon to act as the central hub of traffic and data flow to other systems. In 
> order to try to address the general need, we (cc [~brianmhess]), propose 
> implementing a simple data logging mechanism to enable per-table CDC patterns.
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its 
> Consistency Level semantics, and in order to treat it as the single 
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable 
> (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the developmental and operational burden of users so that they 
> don't have to do dual writes to other systems.
> # For users that are currently doing batch export from a Cassandra system, 
> give them the opportunity to make that realtime with a minimum of coding.
> h2. The mechanism:
> We propose a durable logging mechanism that functions similarly to a commitlog, 
> with the following nuances:
> - Takes place on every node, not just the coordinator, so RF number of copies 
> are logged.
> - Separate log per table.
> - Per-table configuration. Only tables that are specified as CDC_LOG would do 
> any logging.
> - Per DC. We are trying to keep the complexity to a minimum to make this an 
> easy enhancement, but most likely use cases would prefer to only implement 
> CDC logging in one (or a subset) of the DCs that are being replicated to
> - In the critical path of ConsistencyLevel acknowledgment. Just as with the 
> commitlog, failure to write to the CDC log should fail that node's write. If 
> that means the requested consistency level was not met, then clients *should* 
> experience UnavailableExceptions.
> - Be written in a Row-centric manner such that it is easy for consumers to 
> reconstitute rows atomically.
> - Written in a simple format designed to be consumed *directly* by daemons 
> written in non JVM languages
> h2. Nice-to-haves
> I strongly suspect that the following features will be asked for, but I also 
> believe that they can be deferred for a subsequent release, and to gauge 
> actual interest.
> - Multiple logs per table. This would make it easy to have multiple 
> "subscribers" to a single table's changes. A workaround would be to create a 
> forking daemon listener, but that's not a great answer.
> - Log filtering. Being able to apply filters, including UDF-based filters 
> would make Cassandra a much more versatile feeder into other systems, and 
> again, reduce complexity that would otherwise need to be built into the 
> daemons.
> h2. Format and Consumption
> - Cassandra would only write to the CDC log, and never delete from it. 
> - Cleaning up consumed logfiles would be the client daemon's responsibility
> - Logfile size should probably be configurable.
> - Logfiles should be named with a predictable naming schema, making it 
> trivial to process them in order.
> - Daemons should be able to checkpoint their work, and resume from where they 
> left off. This means they would have to leave some file artifact in the CDC 
> log's directory.
> - A sophisticated daemon should be able to be written that could 
> -- Catch up, in written-order, even when it is multiple logfiles behind in 
> processing
> -- Be able to continuously "tail" the most recent logfile and get 
> low-latency(ms?) access to the data as it is written.
> h2. Alternate approach
> In order to make consuming a change log easy and efficient to do with low 
> latency, the following could supplement the approach 

[jira] [Comment Edited] (CASSANDRA-10202) simplify CommitLogSegmentManager

2016-06-20 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340354#comment-15340354
 ] 

Benedict edited comment on CASSANDRA-10202 at 6/20/16 8:40 PM:
---

We also ideally want lock-free swapping-in of the new segment, no?  Currently 
we don't have it, but until we reach _pure_-TPC (probably never), en route fewer 
application threads expose us to a higher risk of gumming up the system.

But yes, we could do full mutex, but it is still significantly safer to move it 
all into one structure where that is well managed.  The prior art has it 
clumsily littered amongst all the other code.  Thing is, once you do that you 
essentially have the new algorithm, just with one of the methods wrapped in an 
unnecessary mutex call.

I do agree the code should be tested better, but that is true of everything - 
the current code is trusted only on the word of commitlog-stress, making this 
as trustworthy, but it is always better to improve that.  

I would however reiterate I don't necessarily think the patch entirely warrants 
inclusion, I just want the discussion to be a bit more informed.

On the topic of generic linked-lists, I have two viewpoints: 1) I've attempted 
to integrate any number of generic linked-lists, and they are universally 
rejected\*, so I gave up and tried to stick to hyper-safety-oriented structures 
that have functionality hamstrung as far as possible in light of the use case 
constraints; 2) those constraints matter for readability and function, too, and 
you can end up with a more powerful linked-list for your situation despite a 
less powerful overall structure, as well as one that tells you more about the 
behaviour of its users.

I'd point out that this whole code area is massively concurrent, as is the 
whole project.  This linked-list is by far the easiest part of this code, and 
most of the project, to reason about concurrency-wise.  If we do not trust 
ourselves to write it, we should probably start introspecting about what that 
means.

NB: I must admit I haven't read the code in question for a while, and am typing 
this all from memory, in bed recovering from flu, so I might just be delirious. 
 It could all be terrible.

\* Notably, I can recall at least two serious bugs that would have been avoided 
with one of these structures had they been included when proffered. One 
occurred in this code, the other was down to a pathological and unexpected 
behaviour in ConcurrentLinkedQueue, the most battle-tested structure around (it 
was an understood behaviour by the author, just undocumented and unexpected).


was (Author: benedict):
We also ideally want lock-free swapping-in of the new segment, no?  Currently 
we don't have it, but until we reach _pure_-TPC (probably never) fen route ewer 
application threads exposes us to a higher risk of gumming up the system.

But yes, we could do full mutex, but it is still significantly safer to move it 
all into one structure where that is well managed.  The prior art has it 
clumsily littered amongst all the other code.  Thing is, once you do that you 
essentially have the new algorithm, just with one of the methods wrapped in an 
unnecessary mutex call.

I do agree the code should be tested better, but that is true of everything - 
the current code is trusted only on the word of commitlog-stress, making this 
as trustworthy, but it is always better to improve that.  

I would however reiterate I don't necessarily think the patch entirely warrants 
inclusion, I just want the discussion to be a bit more informed.

On the topic of generic linked-lists, I have two view points: 1) I've attempted 
to integrate any number of generic linked-lists, and they are universally 
rejected\*, so I gave up and tried to stick to hyper-safety-oriented structures 
that have functionality hamstrung as far as possible in light of the use case 
constraints; 2) those constraints matter for readability and function, too, and 
you can end up with a more powerful linked-list for your situation despite a 
less powerful overall structure, as well as one that tells you more about the 
behaviour of its users.

I'd point out that this whole code area is massively concurrent, as is the 
whole project.  This linked-list is by far the easiest part of this code, and 
most of the project, to reason about concurrency-wise.  If we do not trust 
ourselves to write it, we should probably start introspecting about what that 
means.

NB: I must admit I haven't read the code in question for a while, and am typing 
this all from memory, in bed recovering from flu, so I might just be delirious. 
 It could all be terrible.

\* Notably, I can recall at least two serious bugs that would have been avoided 
with one of these structures has they been included when proferred. One 
occurred in this code, the other was down to a pathological and unexpected 
behaviour 

[jira] [Commented] (CASSANDRA-10202) simplify CommitLogSegmentManager

2016-06-20 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340354#comment-15340354
 ] 

Benedict commented on CASSANDRA-10202:
--

We also ideally want lock-free swapping-in of the new segment, no?  Currently 
we don't have it, but until we reach _pure_-TPC (probably never), en route fewer 
application threads expose us to a higher risk of gumming up the system.

But yes, we could do full mutex, but it is still significantly safer to move it 
all into one structure where that is well managed.  The prior art has it 
clumsily littered amongst all the other code.  Thing is, once you do that you 
essentially have the new algorithm, just with one of the methods wrapped in an 
unnecessary mutex call.

I do agree the code should be tested better, but that is true of everything - 
the current code is trusted only on the word of commitlog-stress, making this 
as trustworthy, but it is always better to improve that.  

I would however reiterate I don't necessarily think the patch entirely warrants 
inclusion, I just want the discussion to be a bit more informed.

On the topic of generic linked-lists, I have two viewpoints: 1) I've attempted 
to integrate any number of generic linked-lists, and they are universally 
rejected\*, so I gave up and tried to stick to hyper-safety-oriented structures 
that have functionality hamstrung as far as possible in light of the use case 
constraints; 2) those constraints matter for readability and function, too, and 
you can end up with a more powerful linked-list for your situation despite a 
less powerful overall structure, as well as one that tells you more about the 
behaviour of its users.

I'd point out that this whole code area is massively concurrent, as is the 
whole project.  This linked-list is by far the easiest part of this code, and 
most of the project, to reason about concurrency-wise.  If we do not trust 
ourselves to write it, we should probably start introspecting about what that 
means.

NB: I must admit I haven't read the code in question for a while, and am typing 
this all from memory, in bed recovering from flu, so I might just be delirious. 
 It could all be terrible.

\* Notably, I can recall at least two serious bugs that would have been avoided 
with one of these structures had they been included when proffered. One 
occurred in this code, the other was down to a pathological and unexpected 
behaviour in ConcurrentLinkedQueue, the most battle-tested structure around (it 
was an understood behaviour by the author, just undocumented and unexpected).
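
To make the trade-off concrete, the kind of hamstrung, single-purpose structure under discussion might look like the sketch below: only append is lock-free, and everything else is assumed to run under external mutual exclusion. This is illustrative only, not the patch's code.

{code}
// Illustrative sketch, not the patch's implementation: an append-only
// intrusive list where only add() is lock-free; traversal and removal
// are assumed to happen under an external mutex.
import java.util.concurrent.atomic.AtomicReference;

final class AppendOnlyList<T>
{
    static final class Node<V>
    {
        final V value;
        volatile Node<V> next;
        Node(V value) { this.value = value; }
    }

    private final Node<T> head = new Node<>(null);
    private final AtomicReference<Node<T>> tail = new AtomicReference<>(head);

    // The only lock-free operation: atomically swap the tail, then link.
    void add(T value)
    {
        Node<T> node = new Node<>(value);
        Node<T> prev = tail.getAndSet(node);
        prev.next = node; // publishes the new node to readers
    }
}
{code}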

> simplify CommitLogSegmentManager
> 
>
> Key: CASSANDRA-10202
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10202
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Jonathan Ellis
>Assignee: Branimir Lambov
>Priority: Minor
>
> Now that we only keep one active segment around we can simplify this from the 
> old recycling design.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12040) If a level compaction fails due to no space it should schedule the next one

2016-06-20 Thread sankalp kohli (JIRA)
sankalp kohli created CASSANDRA-12040:
-

 Summary:   If a level compaction fails due to no space it should 
schedule the next one
 Key: CASSANDRA-12040
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12040
 Project: Cassandra
  Issue Type: Improvement
Reporter: sankalp kohli
Priority: Minor


If a level compaction fails the space check, it aborts but next time the 
compactions are scheduled it will attempt the same one. It should skip it and 
go to the next so it can find smaller compactions to do.
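
A sketch of the proposed selection change (names are hypothetical, not the actual LeveledCompactionStrategy API): rather than aborting, skip candidates that fail the space check and take the first one that fits.

{code}
// Hypothetical sketch of the proposed logic; real compaction-strategy
// code differs. Walk candidates in preference order, skipping any whose
// estimated output would not fit in the available space.
import java.util.List;

final class CompactionPicker
{
    interface CandidateTask
    {
        long estimatedWriteSize();
    }

    static CandidateTask pickFirstThatFits(List<CandidateTask> candidates, long freeBytes)
    {
        for (CandidateTask task : candidates)
        {
            if (task.estimatedWriteSize() <= freeBytes)
                return task;   // a smaller compaction we can actually run
        }
        return null;           // nothing fits right now; try again later
    }
}
{code}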



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-10202) simplify CommitLogSegmentManager

2016-06-20 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340331#comment-15340331
 ] 

Ariel Weisberg edited comment on CASSANDRA-10202 at 6/20/16 8:23 PM:
-

Why can't we use mutual exclusion the entire time? The only lock free access we 
really need is to allocate from the current active segment right? After that 
can we use double checked locking to deal with swapping in a new segment?

I am not against implementing our own, but I think we should set the testing 
bar somewhere that makes it look like it does what it is supposed to do under 
concurrent access. Maybe 
[jcstress|http://openjdk.java.net/projects/code-tools/jcstress/] can help with 
testing this kind of thing.

I am also not a fan of doing the work to implement an intrusive linked list 
without doing it in a generic, reusable way. What I don't want to see is a 
proliferation of custom implementations that are inlined into single-use areas. 
It would be fine if it didn't implement the entire List API to start.


was (Author: aweisberg):
Why can't we use mutual exclusion the entire time? The only lock free access we 
really need is to allocate from the current active segment right? After that 
can we use double checked locking to deal with swapping in a new segment?

> simplify CommitLogSegmentManager
> 
>
> Key: CASSANDRA-10202
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10202
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Jonathan Ellis
>Assignee: Branimir Lambov
>Priority: Minor
>
> Now that we only keep one active segment around we can simplify this from the 
> old recycling design.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10202) simplify CommitLogSegmentManager

2016-06-20 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340331#comment-15340331
 ] 

Ariel Weisberg commented on CASSANDRA-10202:


Why can't we use mutual exclusion the entire time? The only lock free access we 
really need is to allocate from the current active segment right? After that 
can we use double checked locking to deal with swapping in a new segment?
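
For reference, the suggested shape might look roughly like the sketch below: lock-free allocation from the active segment, with a lock and a double-check only on the swap path. Illustrative only; this is not the CommitLogSegmentManager code.

{code}
// Sketch of the suggestion: reads of the active segment are lock-free;
// mutual exclusion (with a double-check) guards only the segment swap.
import java.util.concurrent.atomic.AtomicReference;

final class SegmentHolder
{
    static final class Segment
    {
        // placeholder: in reality this would carve space out of the segment
        Object tryAllocate(int size) { return new Object(); }
    }

    private final AtomicReference<Segment> active = new AtomicReference<>(new Segment());
    private final Object swapLock = new Object();

    Object allocate(int size)
    {
        while (true)
        {
            Segment s = active.get();          // lock-free fast path
            Object a = s.tryAllocate(size);
            if (a != null)
                return a;
            synchronized (swapLock)            // slow path: segment is full
            {
                if (active.get() == s)         // double-check under the lock
                    active.set(new Segment()); // only one thread swaps
            }
        }
    }
}
{code}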

> simplify CommitLogSegmentManager
> 
>
> Key: CASSANDRA-10202
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10202
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Jonathan Ellis
>Assignee: Branimir Lambov
>Priority: Minor
>
> Now that we only keep one active segment around we can simplify this from the 
> old recycling design.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11870) Consider allocating direct buffers bypassing ByteBuffer.allocateDirect

2016-06-20 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-11870:
---
Status: Open  (was: Patch Available)

> Consider allocating direct buffers bypassing ByteBuffer.allocateDirect
> --
>
> Key: CASSANDRA-11870
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11870
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.x
>
>
> As outlined in CASSANDRA-11818, {{ByteBuffer.allocateDirect}} uses 
> {{Bits.reserveMemory}}, which is there to respect the JVM setting 
> {{-XX:MaxDirectMemorySize=...}}.
> {{Bits.reserveMemory}} first tries an "optimistic" {{tryReserveMemory}} and 
> exits immediately on success. However, if that somehow doesn't succeed, it 
> triggers a {{System.gc()}}, which is bad IMO (however, kind of how direct 
> buffers work in Java). After that GC it sleeps and tries to reserve the 
> memory up to 9 times - up to 511 ms - and then throws 
> {{OutOfMemoryError("Direct buffer memory")}}.
> This is unnecessary for us since we always immediately "free" direct buffers 
> as soon as we no longer need them.
> Proposal: Manage direct-memory reservations in our own code and skip 
> {{Bits.reserveMemory}} that way.
> (However, Netty direct buffers are not under our control.)
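
A hedged sketch of the proposal's shape (illustrative, not Cassandra's code): track reservations against our own cap and fail immediately, instead of going through {{Bits.reserveMemory}}'s GC-and-retry path.

{code}
// Illustrative sketch: manage a direct-memory budget ourselves so a
// failed reservation fails fast rather than triggering System.gc().
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicLong;

final class DirectMemoryBudget
{
    private final long capacityBytes;
    private final AtomicLong reserved = new AtomicLong();

    DirectMemoryBudget(long capacityBytes) { this.capacityBytes = capacityBytes; }

    ByteBuffer allocate(int size)
    {
        while (true)
        {
            long cur = reserved.get();
            if (cur + size > capacityBytes)
                throw new OutOfMemoryError("direct buffer budget exhausted"); // fail fast
            if (reserved.compareAndSet(cur, cur + size))
                break;
        }
        // sketch only: the ticket proposes bypassing allocateDirect entirely
        return ByteBuffer.allocateDirect(size);
    }

    void release(int size)
    {
        reserved.addAndGet(-size); // caller frees the buffer immediately after use
    }
}
{code}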



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11870) Consider allocating direct buffers bypassing ByteBuffer.allocateDirect

2016-06-20 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340326#comment-15340326
 ] 

T Jake Luciani commented on CASSANDRA-11870:


I'm trying to weigh the value of this approach: is this something people 
experience in the wild?  It seems that if you hit this you would simply bump 
the JVM flag.  You also have your own flag that you need to size appropriately, 
so how is this really any better operationally?

This might not be appropriate for this issue, but looking at the patch it seems 
we should try to consolidate calls to ByteBuffer.allocateDirect() with 
BufferPool.tryGet()/put(), which gracefully handles running out of direct 
memory, and recycling. But it is currently sized by 
DatabaseDescriptor.getFileCacheSizeInMB(), which seems wrong to me, since it's 
a generic buffer pool.

> Consider allocating direct buffers bypassing ByteBuffer.allocateDirect
> --
>
> Key: CASSANDRA-11870
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11870
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.x
>
>
> As outlined in CASSANDRA-11818, {{ByteBuffer.allocateDirect}} uses 
> {{Bits.reserveMemory}}, which is there to respect the JVM setting 
> {{-XX:MaxDirectMemorySize=...}}.
> {{Bits.reserveMemory}} first tries an "optimistic" {{tryReserveMemory}} and 
> exits immediately on success. However, if that somehow doesn't succeed, it 
> triggers a {{System.gc()}}, which is bad IMO (however, kind of how direct 
> buffers work in Java). After that GC it sleeps and tries to reserve the 
> memory up to 9 times - up to 511 ms - and then throws 
> {{OutOfMemoryError("Direct buffer memory")}}.
> This is unnecessary for us since we always immediately "free" direct buffers 
> as soon as we no longer need them.
> Proposal: Manage direct-memory reservations in our own code and skip 
> {{Bits.reserveMemory}} that way.
> (However, Netty direct buffers are not under our control.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12036) dtest failure in repair_tests.repair_test.TestRepair.repair_after_upgrade_test

2016-06-20 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340320#comment-15340320
 ] 

Paulo Motta commented on CASSANDRA-12036:
-

This is probably due to this test upgrading from 2.1.9, which is not properly 
supported on Windows.

I changed it to upgrade from 2.2.4 instead, which should probably fix it on 
Windows. 

Is it possible to trigger a custom win CI build with [this 
branch|https://github.com/pauloricardomg/cassandra-dtest/tree/12036]? (I was 
not able to do it via cassci parameters)

> dtest failure in repair_tests.repair_test.TestRepair.repair_after_upgrade_test
> --
>
> Key: CASSANDRA-12036
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12036
> Project: Cassandra
>  Issue Type: Test
>Reporter: Craig Kodman
>Assignee: DS Test Eng
>  Labels: dtest
>
> example failure:
> http://cassci.datastax.com/job/cassandra-3.0_dtest_win32/257/testReport/repair_tests.repair_test/TestRepair/repair_after_upgrade_test
> Failed on CassCI build cassandra-3.0_dtest_win32 #257



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10202) simplify CommitLogSegmentManager

2016-06-20 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340314#comment-15340314
 ] 

Benedict commented on CASSANDRA-10202:
--

For the record - the alternative may appear more difficult to follow at first 
blush, but I can assure you the prior implementation was no less custom - and 
in its history in fact had many subtle bugs that were missed despite its 
apparent ease to follow.  The difference is that the new version named the 
structure for what it was and abstracted it, and by doing so it became a target 
for criticism.

The prior approach has multiple lists that are effectively chained together, 
clumsily, into a new composite linked list, which concurrency-wise is much more 
dangerous. Managing atomicity between these lists is difficult, as they are not 
designed to know of each other. These transitions were buggy at various times.  
It has no stress tests besides the commitlog stress test.

On the safety of the concurrent algorithm for maintaining this logical linked 
list, I'm pretty certain the new algorithm was easier to verify.  I've written 
a *lot* of variants of concurrent linked lists. This one kept concurrency to an 
absolute minimum - much less than the prior one - by permitting only a very few 
operations to access it in a lock-free manner.  The remainder used mutual 
exclusion.  The nice bit here is that those instances of mutual exclusion were 
not on the critical path (but *are* in the prior version - which is possibly a 
meaningful difference, especially partway to thread-per-core, where stalling 
threads might cause more of a latency spike).

So on that front I think your arguments are pretty fundamentally flawed.  

A reasonable rejection might be that we have a "battle tested custom linked 
list" already so why replace it with another? But we have a lot of battle 
tested things that are very broken, so I don't think that works as an argument 
either.  Our battles don't appear to have high information value.

However, it did not simplify the rest of the CommitLog very much at all, and so 
on that front it was a failure.  Possibly it made the rest worse; I can't 
recall. I was on the fence about the changes when I made them, so I have no 
problem with you dismissing them, but I do have an issue with you doing it for 
entirely the wrong reasons.

I think it is a real shame the project is still ideologically opposed to custom 
algorithms.  A core piece of infrastructure with weird characteristics like 
Cassandra needs them.  Whether or not we need this one, I have no strong 
opinion, but this seems like a blindly dogmatic or emotive rejection rather 
than a well considered one.
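
To make the design point concrete, here is a toy illustration (not the 
CommitLog code) of a list where only the hot-path append is lock-free and 
everything else takes a mutex off the critical path:

{code}
import java.util.concurrent.atomic.AtomicReference;

// Toy sketch: lock-free append, mutual exclusion for everything else.
final class MostlyLockedList<T>
{
    private static final class Node<T>
    {
        final T value;
        volatile Node<T> next;
        Node(T value) { this.value = value; }
    }

    private final Node<T> head = new Node<>(null);           // sentinel
    private final AtomicReference<Node<T>> tail = new AtomicReference<>(head);
    private final Object mutex = new Object();

    // The only operation on the critical path: a single atomic swap.
    void append(T value)
    {
        Node<T> node = new Node<>(value);
        Node<T> prev = tail.getAndSet(node); // claim our slot
        prev.next = node;                    // publish the link
    }

    // Rare maintenance operations run under mutual exclusion and only
    // ever observe a consistent prefix of the list.
    int size()
    {
        synchronized (mutex)
        {
            int n = 0;
            for (Node<T> cur = head.next; cur != null; cur = cur.next)
                n++;
            return n;
        }
    }
}
{code}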




> simplify CommitLogSegmentManager
> 
>
> Key: CASSANDRA-10202
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10202
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Jonathan Ellis
>Assignee: Branimir Lambov
>Priority: Minor
>
> Now that we only keep one active segment around we can simplify this from the 
> old recycling design.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-11919) Failure in nodetool decommission

2016-06-20 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta resolved CASSANDRA-11919.
-
Resolution: Duplicate

> Failure in nodetool decommission
> 
>
> Key: CASSANDRA-11919
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11919
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
> Environment: Centos 6.6 x86_64, Cassandra 2.2.4
>Reporter: vin01
>Priority: Minor
> Fix For: 2.2.x
>
>
> I keep getting an exception while attempting "nodetool decommission".
> {code}
> ERROR [STREAM-IN-/[NODE_ON_WHICH_DECOMMISSION_RUNNING]] 2016-05-29 
> 13:08:39,040 StreamSession.java:524 - [Stream 
> #b2039080-25c2-11e6-bd92-d71331aaf180] Streaming error occurred
> java.lang.IllegalArgumentException: Unknown type 0
> at 
> org.apache.cassandra.streaming.messages.StreamMessage$Type.get(StreamMessage.java:96)
>  ~[apache-cassandra-2.2.4.jar:2.2.4]
> at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:57)
>  ~[apache-cassandra-2.2.4.jar:2.2.4]
> at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:261)
>  ~[apache-cassandra-2.2.4.jar:2.2.4]
> {code}
> Because of these, decommission process is not succeeding.
> Is interrupting the decommission process safe? Seems like i will have to 
> retry to make it work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11919) Failure in nodetool decommission

2016-06-20 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340255#comment-15340255
 ] 

Paulo Motta commented on CASSANDRA-11919:
-

Since this is 2.2.4, I'm quite confident this is a duplicate of 
CASSANDRA-10448, which was fixed in 2.2.5 by CASSANDRA-10961. Please reopen if 
this is reproducible on 2.2.5+.

> Failure in nodetool decommission
> 
>
> Key: CASSANDRA-11919
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11919
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
> Environment: Centos 6.6 x86_64, Cassandra 2.2.4
>Reporter: vin01
>Priority: Minor
> Fix For: 2.2.x
>
>
> I keep getting an exception while attempting "nodetool decommission".
> {code}
> ERROR [STREAM-IN-/[NODE_ON_WHICH_DECOMMISSION_RUNNING]] 2016-05-29 
> 13:08:39,040 StreamSession.java:524 - [Stream 
> #b2039080-25c2-11e6-bd92-d71331aaf180] Streaming error occurred
> java.lang.IllegalArgumentException: Unknown type 0
> at 
> org.apache.cassandra.streaming.messages.StreamMessage$Type.get(StreamMessage.java:96)
>  ~[apache-cassandra-2.2.4.jar:2.2.4]
> at 
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:57)
>  ~[apache-cassandra-2.2.4.jar:2.2.4]
> at 
> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:261)
>  ~[apache-cassandra-2.2.4.jar:2.2.4]
> {code}
> Because of these, decommission process is not succeeding.
> Is interrupting the decommission process safe? Seems like i will have to 
> retry to make it work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-4663) Streaming sends one file at a time serially.

2016-06-20 Thread Anubhav Kale (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340256#comment-15340256
 ] 

Anubhav Kale commented on CASSANDRA-4663:
-

Agree with Paulo. I don't like SSTables blowing up. I will spend some time on 
sending multiple files at a time, and see what it offers. 

> Streaming sends one file at a time serially. 
> -
>
> Key: CASSANDRA-4663
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4663
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: sankalp kohli
>Priority: Minor
>
> This is not fast enough when someone is using SSDs and perhaps a 10G link. We 
> should try to create multiple connections and send multiple files in 
> parallel. 
> The current approach underutilizes the link (even 1G).
> This change will improve the bootstrapping time of a node. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8700) replace the wiki with docs in the git repo

2016-06-20 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340216#comment-15340216
 ] 

T Jake Luciani commented on CASSANDRA-8700:
---

Monitoring/Metrics PR https://github.com/pcmanus/cassandra/pull/60

> replace the wiki with docs in the git repo
> --
>
> Key: CASSANDRA-8700
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8700
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation and Website
>Reporter: Jon Haddad
>Assignee: Sylvain Lebresne
>Priority: Blocker
> Fix For: 3.8
>
> Attachments: TombstonesAndGcGrace.md, bloom_filters.md, 
> compression.md, contributing.zip, getting_started.zip, hardware.md
>
>
> The wiki as it stands is pretty terrible.  It takes several minutes to apply 
> a single update, and as a result, it's almost never updated.  The information 
> there has very little context as to what version it applies to.  Most people 
> I've talked to that try to use the information they find there find it is 
> more confusing than helpful.
> I'd like to propose that instead of using the wiki, the doc directory in the 
> cassandra repo be used for docs (already used for CQL3 spec) in a format that 
> can be built to a variety of output formats like HTML / epub / etc.  I won't 
> start the bikeshedding on which markup format is preferable - but there are 
> several options that can work perfectly fine.  I've personally use sphinx w/ 
> restructured text, and markdown.  Both can build easily and as an added bonus 
> be pushed to readthedocs (or something similar) automatically.  For an 
> example, see cqlengine's documentation, which I think is already 
> significantly better than the wiki: 
> http://cqlengine.readthedocs.org/en/latest/
> In addition to being overall easier to maintain, putting the documentation in 
> the git repo adds context, since it evolves with the versions of Cassandra.
> If the wiki were kept even remotely up to date, I wouldn't bother with this, 
> but not having at least some basic documentation in the repo, or anywhere 
> associated with the project, is frustrating.
> For reference, the last 3 updates were:
> 1/15/15 - updating committers list
> 1/08/15 - updating contributors and how to contribute
> 12/16/14 - added a link to CQL docs from wiki frontpage (by me)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-4663) Streaming sends one file at a time serially.

2016-06-20 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340211#comment-15340211
 ] 

Jeff Jirsa commented on CASSANDRA-4663:
---

This seems very dangerous until CASSANDRA-11303 is merged (which seems to be 
Patch Available).

 


> Streaming sends one file at a time serially. 
> -
>
> Key: CASSANDRA-4663
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4663
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: sankalp kohli
>Priority: Minor
>
> This is not fast enough when someone is using SSDs and perhaps a 10G link. We 
> should try to create multiple connections and send multiple files in 
> parallel. 
> The current approach underutilizes the link (even 1G).
> This change will improve the bootstrapping time of a node. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-06-20 Thread Joel Knighton (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Knighton updated CASSANDRA-8523:
-
Comment: was deleted

(was: Assigning you to review [~rlow] - just saw the comment from two days ago. 
Let me know if anything has changed there; I'm available to review as well.)

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like Cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for nodes that are 
> bootstrapping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-06-20 Thread Joel Knighton (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340204#comment-15340204
 ] 

Joel Knighton commented on CASSANDRA-8523:
--

Assigning you to review [~rlow] - just saw the comment from two days ago. Let 
me know if anything has changed there; I'm available to review as well.

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like Cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for nodes that are 
> bootstrapping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-06-20 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340200#comment-15340200
 ] 

Joshua McKenzie commented on CASSANDRA-8523:


Missed that comment above - [~rlow] to review. Thanks!

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like Cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for nodes that are 
> bootstrapping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-06-20 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-8523:
---
Reviewer: Richard Low  (was: Richard Low)

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like Cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for nodes that are 
> bootstrapping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-06-20 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-8523:
---
Reviewer: Richard Low  (was: Joel Knighton)

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like Cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for nodes that are 
> bootstrapping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10821) OOM Killer terminates Cassandra when Compactions use too much memory then won't restart

2016-06-20 Thread Thom Bartold (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340195#comment-15340195
 ] 

Thom Bartold commented on CASSANDRA-10821:
--

Months later, in a new environment we are running into the OOM problem again.

We have upgraded to version 2.2.5, so the 'saved_caches_directory' problem is 
gone.

Our new configuration is 24 i2.xlarge AWS instances in 3 availability zones. 
Our main data set is about 1 billion records stored 3 times across the cluster, 
using EC2 Snitch and Network topology 3 x us-east-1.

After an initial 'repair -pr' that ran from June 14 to June 17:

INFO  [Thread-5215] 2016-06-14 19:59:02,938 RepairRunnable.java:124 - Starting 
repair command #1, repairing keyspace overlordprod with repair options 
(parallelism: parallel, primary range: true, incremental: true, job threads: 1, 
ColumnFamilies: [], dataCenters: [], hosts: [], # of ranges: 256)
INFO  [CompactionExecutor:663] 2016-06-17 01:59:25,742 RepairRunnable.java:309 
- Repair command #1 finished in 2 days 6 hours 0 minutes 22 seconds
INFO  [Thread-10147] 2016-06-17 01:59:25,853 RepairRunnable.java:124 - Starting 
repair command #2, repairing keyspace system_traces with repair options 
(parallelism: parallel, primary range: true, incremental: true, job threads: 1, 
ColumnFamilies: [], dataCenters: [], hosts: [], # of ranges: 256)
INFO  [InternalResponseStage:70] 2016-06-17 02:25:40,721 
RepairRunnable.java:309 - Repair command #2 finished in 26 minutes 14 seconds

The number of compactions has gone up and down in the range 800-1500, but is 
now staying at around 800.

In the last two days, Cassandra has been OOM killed twice:

Jun 19 15:22:27 ip-10-242-145-240 kernel: [5329487.278454] Killed process 16391 
(java) total-vm:749879160kB, anon-rss:28776424kB, file-rss:85040kB
Jun 20 07:36:18 ip-10-242-145-240 kernel: [5387918.072086] Killed process 15716 
(java) total-vm:766699696kB, anon-rss:28955732kB, file-rss:118336kB

We believe it is because it is working on one really big compaction:

pending tasks: 811
   id                                     compaction type   keyspace       table      completed      total          unit    progress
   c8ff1fb0-3716-11e6-917f-93c4050661a3   Compaction        overlordprod   document   4875839081     6753090774     bytes   72.20%
   f3344ec0-36b9-11e6-917f-93c4050661a3   Compaction        overlordprod   document   225617209930   436848813618   bytes   51.65%
   fcec5120-3717-11e6-917f-93c4050661a3   Compaction        overlordprod   document   2025919777     16765177436    bytes   12.08%
   d2a960b0-3717-11e6-917f-93c4050661a3   Compaction        overlordprod   document   2459846227     6033377128     bytes   40.77%
Active compaction remaining time :   3h49m53s

We should be able to verify that the same compaction does not complete after 
the next OOM kill.

Note that the i2.xlarge instances have only 800GB of disk, and this one is 
currently 85% full, which may be part of the problem. Other nodes in the cluster 
are not using more than 70%.

Filesystem 1K-blocks  Used Available Use% Mounted on
/dev/xvdb  781029612 660898356 120131256  85% /data



> OOM Killer terminates Cassandra when Compactions use too much memory then 
> won't restart
> ---
>
> Key: CASSANDRA-10821
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10821
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
> Environment: EC2 32 x i2.xlarge split between us-east-1a,c and 
> us-west 2a,b
> Linux  4.1.10-17.31.amzn1.x86_64 #1 SMP Sat Oct 24 01:31:37 UTC 2015 x86_64 
> x86_64 x86_64 GNU/Linux
> Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
> Java HotSpot(TM) 64-Bit Server VM (build 25.65-b01, mixed mode)
> Cassandra version: 2.2.3
>Reporter: Thom Bartold
>
> We were writing to the DB from EC2 instances in us-east-1 at a rate of about 
> 3000 per second, replication us-east:2 us-west:2, LeveledCompaction and 
> DeflateCompressor.
> After about 48 hours some nodes had over 800 pending compactions and a few of 
> them started getting killed for Linux OOM. Priam attempts to restart the 
> nodes, but they fail because of corrupted saved_caches files.
> Loading has finished, and the cluster is mostly idle, but 6 of the nodes were 
> killed again last night by OOM.
> This is the log message where the node won't restart:
> ERROR [main] 2015-12-05 13:59:13,754 CassandraDaemon.java:635 - Detected 
> unreadable sstables /media/ephemeral0/cassandra/saved_caches/KeyCache-ca.db, 
> please check NEWS.txt and ensure that you have upgraded through all required 
> intermediate versions, running upgradesstables
> This is the dmesg where the node is terminated:
> [360803.234422] Out of memory: Kill process 10809 (java) score 949 

[jira] [Updated] (CASSANDRA-12025) dtest failure in paging_test.TestPagingData.test_paging_with_filtering_on_counter_columns

2016-06-20 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-12025:

Assignee: Alex Petrov

> dtest failure in 
> paging_test.TestPagingData.test_paging_with_filtering_on_counter_columns
> -
>
> Key: CASSANDRA-12025
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12025
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sean McCarthy
>Assignee: Alex Petrov
>  Labels: dtest
> Fix For: 3.x
>
> Attachments: node1.log, node1_debug.log, node1_gc.log, node2.log, 
> node2_debug.log, node2_gc.log, node3.log, node3_debug.log, node3_gc.log
>
>
> example failure:
> http://cassci.datastax.com/job/trunk_dtest/1276/testReport/paging_test/TestPagingData/test_paging_with_filtering_on_counter_columns
> Failed on CassCI build trunk_dtest #1276
> {code}
> Error Message
> Lists differ: [[4, 7, 8, 9], [4, 9, 10, 11]] != [[4, 7, 8, 9], [4, 8, 9, 10], 
> ...
> First differing element 1:
> [4, 9, 10, 11]
> [4, 8, 9, 10]
> Second list contains 1 additional elements.
> First extra element 2:
> [4, 9, 10, 11]
> - [[4, 7, 8, 9], [4, 9, 10, 11]]
> + [[4, 7, 8, 9], [4, 8, 9, 10], [4, 9, 10, 11]]
> ?+++  
> {code}
> {code}
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/automaton/cassandra-dtest/tools.py", line 288, in wrapped
> f(obj)
>   File "/home/automaton/cassandra-dtest/paging_test.py", line 1148, in 
> test_paging_with_filtering_on_counter_columns
> self._test_paging_with_filtering_on_counter_columns(session, True)
>   File "/home/automaton/cassandra-dtest/paging_test.py", line 1107, in 
> _test_paging_with_filtering_on_counter_columns
> [4, 9, 10, 11]])
>   File "/usr/lib/python2.7/unittest/case.py", line 513, in assertEqual
> assertion_func(first, second, msg=msg)
>   File "/usr/lib/python2.7/unittest/case.py", line 742, in assertListEqual
> self.assertSequenceEqual(list1, list2, msg, seq_type=list)
>   File "/usr/lib/python2.7/unittest/case.py", line 724, in assertSequenceEqual
> self.fail(msg)
>   File "/usr/lib/python2.7/unittest/case.py", line 410, in fail
> raise self.failureException(msg)
> {code}
> Logs are attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8523) Writes should be sent to a replacement node while it is streaming in data

2016-06-20 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-8523:
---
Reviewer: Joel Knighton

> Writes should be sent to a replacement node while it is streaming in data
> -
>
> Key: CASSANDRA-8523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8523
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Richard Wagner
>Assignee: Paulo Motta
> Fix For: 2.1.x
>
>
> In our operations, we make heavy use of replace_address (or 
> replace_address_first_boot) in order to replace broken nodes. We now realize 
> that writes are not sent to the replacement nodes while they are in hibernate 
> state and streaming in data. This runs counter to what our expectations were, 
> especially since we know that writes ARE sent to nodes when they are 
> bootstrapped into the ring.
> It seems like Cassandra should arrange to send writes to a node that is in 
> the process of replacing another node, just like it does for nodes that are 
> bootstrapping. I hesitate to phrase this as "we should send writes to a node 
> in hibernate" because the concept of hibernate may be useful in other 
> contexts, as per CASSANDRA-8336. Maybe a new state is needed here?
> Among other things, the fact that we don't get writes during this period 
> makes subsequent repairs more expensive, proportional to the number of writes 
> that we miss (and depending on the amount of data that needs to be streamed 
> during replacement and the time it may take to rebuild secondary indexes, we 
> could miss many many hours worth of writes). It also leaves us more exposed 
> to consistency violations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12039) Add a "post bootstrap task" to the index machinery

2016-06-20 Thread Sergio Bossa (JIRA)
Sergio Bossa created CASSANDRA-12039:


 Summary: Add a "post bootstrap task" to the index machinery
 Key: CASSANDRA-12039
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12039
 Project: Cassandra
  Issue Type: New Feature
Reporter: Sergio Bossa
Assignee: Sergio Bossa


Custom index implementations might need to be notified when the node finishes 
bootstrapping in order to execute some blocking tasks before the node itself 
goes into NORMAL state.

This is a proposal to add such functionality, which should roughly require the 
following:
1) Add a {{getPostBootstrapTask}} callback to the {{Index}} interface.
2) Add an {{executePostBootstrapBlockingTasks}} method to 
{{SecondaryIndexManager}} calling into the previously mentioned callback.
3) Hook that into {{StorageService#joinTokenRing}}.

Thoughts?
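
A rough sketch of what (1) and (2) might look like - names follow the proposal 
above and are not committed API; the real Index and SecondaryIndexManager carry 
many more methods:

{code}
import java.util.List;
import java.util.concurrent.Callable;

// Proposal sketch only.
interface Index
{
    // Proposed: a blocking task to run after bootstrap completes but
    // before the node goes NORMAL; null if the index needs none.
    default Callable<?> getPostBootstrapTask()
    {
        return null;
    }
}

final class SecondaryIndexManager
{
    private final List<Index> indexes;

    SecondaryIndexManager(List<Index> indexes)
    {
        this.indexes = indexes;
    }

    // Proposed: called from StorageService#joinTokenRing once streaming
    // has finished, before announcing NORMAL.
    void executePostBootstrapBlockingTasks()
    {
        for (Index index : indexes)
        {
            Callable<?> task = index.getPostBootstrapTask();
            if (task == null)
                continue;
            try
            {
                task.call(); // blocking by design: don't go NORMAL early
            }
            catch (Exception e)
            {
                throw new RuntimeException("Post-bootstrap index task failed", e);
            }
        }
    }
}
{code}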



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11969) Prevent duplicate ctx.channel().attr() call

2016-06-20 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-11969:

Reviewer: T Jake Luciani

> Prevent duplicate ctx.channel().attr() call
> ---
>
> Key: CASSANDRA-11969
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11969
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Trivial
> Fix For: 3.x
>
>
> In {{Frame}} we can save one call to 
> {{ctx.channel().attr(Connection.attributeKey)}}.
> (Will provide a patch soon)
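
For illustration, the micro-optimization amounts to reusing the Attribute 
reference instead of looking it up twice - a sketch with a stand-in attribute 
key and connection type, not the actual Frame code:

{code}
import io.netty.channel.ChannelHandlerContext;
import io.netty.util.Attribute;
import io.netty.util.AttributeKey;

final class AttributeLookup
{
    static final AttributeKey<Object> CONNECTION = AttributeKey.valueOf("connection");

    static Object connection(ChannelHandlerContext ctx)
    {
        // One attr() lookup instead of two: keep the Attribute reference
        // and use it for both get() and set().
        Attribute<Object> attr = ctx.channel().attr(CONNECTION);
        Object connection = attr.get();
        if (connection == null)
        {
            connection = new Object(); // stand-in for the real Connection
            attr.set(connection);
        }
        return connection;
    }
}
{code}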



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11971) More uses of DataOutputBuffer.RECYCLER

2016-06-20 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-11971:

Reviewer: T Jake Luciani

> More uses of DataOutputBuffer.RECYCLER
> --
>
> Key: CASSANDRA-11971
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11971
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.x
>
>
> There are a few more possible use cases for {{DataOutputBuffer.RECYCLER}}, 
> which prevents a couple of (larger) allocations.
> (Will provide a patch soon)
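
The general pattern behind the ticket, sketched with a plain queue standing in 
for the actual RECYCLER (whose API this does not reproduce): borrow a buffer, 
serialize into it, and return it so the large backing array is reused rather 
than discarded:

{code}
import java.io.ByteArrayOutputStream;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch of the recycle pattern only; not the DataOutputBuffer API.
final class BufferRecycler
{
    private static final ConcurrentLinkedQueue<ByteArrayOutputStream> POOL =
            new ConcurrentLinkedQueue<>();

    static ByteArrayOutputStream get()
    {
        ByteArrayOutputStream buf = POOL.poll();
        return buf != null ? buf : new ByteArrayOutputStream(64 * 1024);
    }

    static void recycle(ByteArrayOutputStream buf)
    {
        buf.reset();     // keeps the grown backing array for reuse
        POOL.offer(buf);
    }
}
{code}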



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11970) Reuse DataOutputBuffer from ColumnIndex

2016-06-20 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-11970:

Reviewer: T Jake Luciani

> Reuse DataOutputBuffer from ColumnIndex
> ---
>
> Key: CASSANDRA-11970
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11970
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.x
>
>
> With a simple change, the {{DataOutputBuffer}} used in {{ColumnIndex}} can be 
> reused. This saves a couple of (larger) object allocations.
> (Will provide a patch soon)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11968) More metrics on native protocol requests & responses

2016-06-20 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-11968:

Reviewer: Sam Tunnicliffe

> More metrics on native protocol requests & responses
> 
>
> Key: CASSANDRA-11968
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11968
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.x
>
>
> Proposal to add more metrics to the native protocol:
> - number of requests per request-type
> - number of responses by response-type
> - size of request messages in bytes
> - size of response messages in bytes
> - number of in-flight requests (from request arrival to response)
> (Will provide a patch soon)
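
One possible shape for these metrics, sketched with the codahale classes 
Cassandra already uses - the metric names and layout here are assumptions, not 
the patch:

{code}
import com.codahale.metrics.Counter;
import com.codahale.metrics.Histogram;
import com.codahale.metrics.Meter;
import com.codahale.metrics.MetricRegistry;

final class NativeProtocolMetrics
{
    final Meter[] requestsByType;    // indexed by request opcode
    final Meter[] responsesByType;   // indexed by response opcode
    final Histogram requestBytes;    // size of request messages
    final Histogram responseBytes;   // size of response messages
    final Counter inFlight;          // inc on arrival, dec on response

    NativeProtocolMetrics(MetricRegistry registry, int opcodes)
    {
        requestsByType = new Meter[opcodes];
        responsesByType = new Meter[opcodes];
        for (int i = 0; i < opcodes; i++)
        {
            requestsByType[i] = registry.meter("requests-" + i);
            responsesByType[i] = registry.meter("responses-" + i);
        }
        requestBytes = registry.histogram("request-size-bytes");
        responseBytes = registry.histogram("response-size-bytes");
        inFlight = registry.counter("in-flight-requests");
    }
}
{code}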



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11967) Export metrics for prometheus in its native format

2016-06-20 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-11967:

Reviewer: Sam Tunnicliffe

> Export metrics for prometheus in its native format
> --
>
> Key: CASSANDRA-11967
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11967
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.x
>
>
> https://github.com/snazy/prometheus-metrics-exporter allows exporting 
> codahale metrics for prometheus.io. In order to integrate this, a minor 
> change to C* is necessary to load the library.
> This eliminates the need to use the additional graphite-exporter tool and 
> therefore also allows prometheus to track the up/down status of C*.
> (Will provide the patch soon)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11972) Use byte[] instead of object tree in Frame.Header

2016-06-20 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-11972:

Reviewer: Aleksey Yeschenko

> Use byte[] instead of object tree in Frame.Header
> -
>
> Key: CASSANDRA-11972
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11972
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.x
>
>
> Replacing the object tree/references in {{Frame.Header}} with {{byte[9]}} 
> saves a couple of object allocations. Also, not allocating the 9 bytes for 
> the header off-heap is less expensive.
> (will provide a patch soon)
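
For reference, the v3/v4 native-protocol header is exactly 9 bytes - version, 
flags, a 2-byte stream id, opcode, and a 4-byte body length - so a byte[9] can 
stand in for the header object tree. A sketch of the packing (the accessor 
names are made up):

{code}
// Field offsets follow the CQL binary protocol spec (v3/v4).
final class RawHeader
{
    static byte[] encode(int version, int flags, int streamId, int opcode, int bodyLength)
    {
        byte[] h = new byte[9];
        h[0] = (byte) version;
        h[1] = (byte) flags;
        h[2] = (byte) (streamId >> 8);
        h[3] = (byte) streamId;
        h[4] = (byte) opcode;
        h[5] = (byte) (bodyLength >> 24);
        h[6] = (byte) (bodyLength >> 16);
        h[7] = (byte) (bodyLength >> 8);
        h[8] = (byte) bodyLength;
        return h;
    }

    static int streamId(byte[] h)
    {
        return ((h[2] & 0xff) << 8) | (h[3] & 0xff);
    }

    static int bodyLength(byte[] h)
    {
        return ((h[5] & 0xff) << 24) | ((h[6] & 0xff) << 16)
             | ((h[7] & 0xff) << 8) | (h[8] & 0xff);
    }
}
{code}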



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12026) update NEWS.txt to explain system schema exceptions during partial cluster upgrade

2016-06-20 Thread Carl Yeksigian (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Yeksigian updated CASSANDRA-12026:
---
Status: Ready to Commit  (was: Patch Available)

> update NEWS.txt to explain system schema exceptions during partial cluster 
> upgrade
> --
>
> Key: CASSANDRA-12026
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12026
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Russ Hatch
>Assignee: Joshua McKenzie
> Attachments: 12026_v1.txt
>
>
> Upgrade tests found this exception occurring during upgrades:
> {noformat}
> node2: ERROR [MessagingService-Incoming-/127.0.0.1] 2016-06-16 20:14:59,268 
> CassandraDaemon.java:217 - Exception in thread 
> Thread[MessagingService-Incoming-/127.0.0.1,5,main]
> java.lang.RuntimeException: Unknown column cdc during deserialization
>   at 
> org.apache.cassandra.db.Columns$Serializer.deserialize(Columns.java:433) 
> ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.db.SerializationHeader$Serializer.deserializeForMessaging(SerializationHeader.java:407)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.deserializeHeader(UnfilteredRowIteratorSerializer.java:192)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:668)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:656)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:341)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:350)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.service.MigrationManager$MigrationsSerializer.deserialize(MigrationManager.java:610)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.service.MigrationManager$MigrationsSerializer.deserialize(MigrationManager.java:593)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at org.apache.cassandra.net.MessageIn.read(MessageIn.java:114) 
> ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:190)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:178)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:92)
>  ~[apache-cassandra-3.7.jar:3.7]
> {noformat}
> This is apparently normal and should subside after the full cluster has been 
> upgraded to post-3.0 versions.
> NEWS.txt needs an update to let users know this is not a problem during their 
> upgrade.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12026) update NEWS.txt to explain system schema exceptions during partial cluster upgrade

2016-06-20 Thread Carl Yeksigian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340147#comment-15340147
 ] 

Carl Yeksigian commented on CASSANDRA-12026:


+1

> update NEWS.txt to explain system schema exceptions during partial cluster 
> upgrade
> --
>
> Key: CASSANDRA-12026
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12026
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Russ Hatch
>Assignee: Joshua McKenzie
> Attachments: 12026_v1.txt
>
>
> Upgrade tests found this exception occurring during upgrades:
> {noformat}
> node2: ERROR [MessagingService-Incoming-/127.0.0.1] 2016-06-16 20:14:59,268 
> CassandraDaemon.java:217 - Exception in thread 
> Thread[MessagingService-Incoming-/127.0.0.1,5,main]
> java.lang.RuntimeException: Unknown column cdc during deserialization
>   at 
> org.apache.cassandra.db.Columns$Serializer.deserialize(Columns.java:433) 
> ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.db.SerializationHeader$Serializer.deserializeForMessaging(SerializationHeader.java:407)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.deserializeHeader(UnfilteredRowIteratorSerializer.java:192)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:668)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:656)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:341)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:350)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.service.MigrationManager$MigrationsSerializer.deserialize(MigrationManager.java:610)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.service.MigrationManager$MigrationsSerializer.deserialize(MigrationManager.java:593)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at org.apache.cassandra.net.MessageIn.read(MessageIn.java:114) 
> ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:190)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:178)
>  ~[apache-cassandra-3.7.jar:3.7]
>   at 
> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:92)
>  ~[apache-cassandra-3.7.jar:3.7]
> {noformat}
> This is apparently normal and should subside after the full cluster has been 
> upgraded to post-3.0 versions.
> NEWS.txt needs an update to let users know this is not a problem during their 
> upgrade.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11911) CQLSSTableWriter should allow for unset fields

2016-06-20 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-11911:

Reviewer: Benjamin Lerer

> CQLSSTableWriter should allow for unset fields
> --
>
> Key: CASSANDRA-11911
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11911
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cassandra 3.0.6
>Reporter: Matt Kopit
>Assignee: Alex Petrov
>  Labels: lhf
>
> If you are using CQLSSTableWriter to bulk load data into sstables the only 
> way to handle fields without values is by setting them to NULL, which results 
> in the generation of a tombstoned field in the resulting sstable. For a large 
> dataset this can result in a large number of tombstones.
> CQLSSTableWriter is currently instantiated with a single INSERT statement, so 
> it's not an option to modify the insert statement to specify different fields 
> on a per-row basis.
> Here are three potential solutions to this problem:
> 1. Change the default behavior of how NULLs are handled so those fields are 
> treated as UNSET and will never be written to the sstable.
> 2. Create a configuration option for CQLSSTableWriter that governs whether 
> NULLs should be ignored.
> 3. Invent a new constant that represents an UNSET value which can be used in 
> place of NULL.
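
For illustration, option 2 might look like this to a caller. The 
withNullsAsUnset(...) builder option is hypothetical - it does not exist today 
- while the rest follows current CQLSSTableWriter usage:

{code}
import org.apache.cassandra.io.sstable.CQLSSTableWriter;

public class UnsetExample
{
    public static void main(String[] args) throws Exception
    {
        CQLSSTableWriter writer = CQLSSTableWriter.builder()
                .inDirectory("/tmp/ks/tbl")
                .forTable("CREATE TABLE ks.tbl (id int PRIMARY KEY, a text, b text)")
                .using("INSERT INTO ks.tbl (id, a, b) VALUES (?, ?, ?)")
                // hypothetical option 2: treat nulls as UNSET, writing no cell
                //.withNullsAsUnset(true)
                .build();

        // Today, the null here produces a tombstone cell for column b:
        writer.addRow(1, "x", null);
        writer.close();
    }
}
{code}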



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12018) CDC follow-ups

2016-06-20 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-12018:

Description: 
h6. Platform independent implementation of DirectorySizeCalculator
On linux, simplify to 
{{Arrays.stream(path.listFiles()).mapToLong(File::length).sum();}}

h6. Refactor DirectorySizeCalculator
bq. I don't get the DirectorySizeCalculator. Why the alive and visited sets, 
the listFiles step? Either list the files and just loop through them, or do the 
walkFileTree operation – you are now doing the same work twice. Use a plain 
long instead of the atomic as the class is still thread-unsafe.

h6. TolerateErrorsInSection should not depend on previous SyncSegment status in 
CommitLogReader
bq. tolerateErrorsInSection &=: I don't think it was intended for the value to 
depend on previous iterations.

h6. Refactor interface of SimpleCachedBufferPool
bq. SimpleCachedBufferPool should provide getThreadLocalReusableBuffer(int 
size) which should automatically reallocate if the available size is less, and 
not expose a setter at all.

h6. Change CDC exception to WriteFailureException instead of 
WriteTimeoutException

h6. Remove unused CommitLogTest.testRecovery(byte[] logData)

h6. NoSpamLogger a message when at CDC capacity

  was:
h6. Platform independent implementation of DirectorySizeCalculator
On linux, simplify to 
{{Arrays.stream(path.listFiles()).mapToLong(File::length).sum();}}

h6. Refactor DirectorySizeCalculator
bq. I don't get the DirectorySizeCalculator. Why the alive and visited sets, 
the listFiles step? Either list the files and just loop through them, or do the 
walkFileTree operation – you are now doing the same work twice. Use a plain 
long instead of the atomic as the class is still thread-unsafe.

h6. TolerateErrorsInSection should not depend on previous SyncSegment status in 
CommitLogReader
bq. tolerateErrorsInSection &=: I don't think it was intended for the value to 
depend on previous iterations.

h6. Refactor interface of SimpleCachedBufferPool
bq. SimpleCachedBufferPool should provide getThreadLocalReusableBuffer(int 
size) which should automatically reallocate if the available size is less, and 
not expose a setter at all.

h6. Change CDC exception to WriteFailureException instead of 
WriteTimeoutException

h6. Remove unused CommitLogTest.testRecovery(byte[] logData)


> CDC follow-ups
> --
>
> Key: CASSANDRA-12018
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12018
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Joshua McKenzie
>Assignee: Joshua McKenzie
>Priority: Minor
>
> h6. Platform independent implementation of DirectorySizeCalculator
> On linux, simplify to 
> {{Arrays.stream(path.listFiles()).mapToLong(File::length).sum();}}
> h6. Refactor DirectorySizeCalculator
> bq. I don't get the DirectorySizeCalculator. Why the alive and visited sets, 
> the listFiles step? Either list the files and just loop through them, or do 
> the walkFileTree operation – you are now doing the same work twice. Use a 
> plain long instead of the atomic as the class is still thread-unsafe.
> h6. TolerateErrorsInSection should not depend on previous SyncSegment status 
> in CommitLogReader
> bq. tolerateErrorsInSection &=: I don't think it was intended for the value 
> to depend on previous iterations.
> h6. Refactor interface of SimpleCachedBufferPool
> bq. SimpleCachedBufferPool should provide getThreadLocalReusableBuffer(int 
> size) which should automatically reallocate if the available size is less, 
> and not expose a setter at all.
> h6. Change CDC exception to WriteFailureException instead of 
> WriteTimeoutException
> h6. Remove unused CommitLogTest.testRecovery(byte[] logData)
> h6. NoSpamLogger a message when at CDC capacity
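
For the first two items, one possible shape of the refactor - an assumption, 
not the committed patch - is a single walkFileTree pass accumulating into a 
plain long, with no alive/visited sets and no atomics:

{code}
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

final class DirectorySize
{
    // Single-pass, single-threaded directory size; platform independent.
    static long of(Path dir) throws IOException
    {
        final long[] total = { 0 };
        Files.walkFileTree(dir, new SimpleFileVisitor<Path>()
        {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
            {
                total[0] += attrs.size();
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException e)
            {
                return FileVisitResult.CONTINUE; // files may vanish mid-walk
            }
        });
        return total[0];
    }
}
{code}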



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10869) paging_test.py:TestPagingWithDeletions.test_failure_threshold_deletions dtest fails on 2.1

2016-06-20 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340132#comment-15340132
 ] 

Joshua McKenzie commented on CASSANDRA-10869:
-

[~ifesdjeen]: in the future, please reference the C* JIRA on the dtest PR so 
maintainers of that repo can set themselves as reviewers here and close things 
out if it's a dtest change only.

[~philipthompson] set as reviewer.

> paging_test.py:TestPagingWithDeletions.test_failure_threshold_deletions dtest 
> fails on 2.1
> --
>
> Key: CASSANDRA-10869
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10869
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jim Witschey
>Assignee: Alex Petrov
>  Labels: dtest
>
> This test is failing hard on 2.1. Here is its history on the JDK8 job for 
> cassandra-2.1:
> http://cassci.datastax.com/job/cassandra-2.1_dtest_jdk8/lastCompletedBuild/testReport/paging_test/TestPagingWithDeletions/test_failure_threshold_deletions/history/
> and on the JDK7 job:
> http://cassci.datastax.com/job/cassandra-2.1_dtest/lastCompletedBuild/testReport/paging_test/TestPagingWithDeletions/test_failure_threshold_deletions/history/
> It fails because a read times out after ~1.5 minutes. If this is a test 
> error, it's specific to 2.1, because the test passes consistently on newer 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10869) paging_test.py:TestPagingWithDeletions.test_failure_threshold_deletions dtest fails on 2.1

2016-06-20 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-10869:

   Resolution: Fixed
Fix Version/s: (was: 2.1.x)
   Status: Resolved  (was: Patch Available)

> paging_test.py:TestPagingWithDeletions.test_failure_threshold_deletions dtest 
> fails on 2.1
> --
>
> Key: CASSANDRA-10869
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10869
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jim Witschey
>Assignee: Alex Petrov
>  Labels: dtest
>
> This test is failing hard on 2.1. Here is its history on the JDK8 job for 
> cassandra-2.1:
> http://cassci.datastax.com/job/cassandra-2.1_dtest_jdk8/lastCompletedBuild/testReport/paging_test/TestPagingWithDeletions/test_failure_threshold_deletions/history/
> and on the JDK7 job:
> http://cassci.datastax.com/job/cassandra-2.1_dtest/lastCompletedBuild/testReport/paging_test/TestPagingWithDeletions/test_failure_threshold_deletions/history/
> It fails because a read times out after ~1.5 minutes. If this is a test 
> error, it's specific to 2.1, because the test passes consistently on newer 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10869) paging_test.py:TestPagingWithDeletions.test_failure_threshold_deletions dtest fails on 2.1

2016-06-20 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-10869:

Reviewer: Philip Thompson

> paging_test.py:TestPagingWithDeletions.test_failure_threshold_deletions dtest 
> fails on 2.1
> --
>
> Key: CASSANDRA-10869
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10869
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jim Witschey
>Assignee: Alex Petrov
>  Labels: dtest
> Fix For: 2.1.x
>
>
> This test is failing hard on 2.1. Here is its history on the JDK8 job for 
> cassandra-2.1:
> http://cassci.datastax.com/job/cassandra-2.1_dtest_jdk8/lastCompletedBuild/testReport/paging_test/TestPagingWithDeletions/test_failure_threshold_deletions/history/
> and on the JDK7 job:
> http://cassci.datastax.com/job/cassandra-2.1_dtest/lastCompletedBuild/testReport/paging_test/TestPagingWithDeletions/test_failure_threshold_deletions/history/
> It fails because a read times out after ~1.5 minutes. If this is a test 
> error, it's specific to 2.1, because the test passes consistently on newer 
> versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12034) Special handling for Netty's direct memory allocation failure

2016-06-20 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-12034:

Reviewer: T Jake Luciani

> Special handling for Netty's direct memory allocation failure
> -
>
> Key: CASSANDRA-12034
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12034
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Robert Stupp
>Assignee: Robert Stupp
> Fix For: 3.x
>
>
> With CASSANDRA-12032, Netty throws an 
> {{io.netty.util.internal.OutOfDirectMemoryError}} if there's not enough 
> off-heap memory for the response buffer. We can easily handle this situation 
> and return an error. This is not a condition that destabilizes the system and 
> should therefore not be passed to {{JVMStabilityInspector}}.
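
For illustration, the special-casing could be as small as this - a sketch; the 
real decision point lives wherever server-side exceptions are classified:

{code}
import io.netty.util.internal.OutOfDirectMemoryError;

final class ErrorClassification
{
    // Sketch: decide whether a throwable should reach JVMStabilityInspector.
    static boolean destabilizing(Throwable t)
    {
        if (t instanceof OutOfDirectMemoryError)
            return false; // transient: fail this response, keep the node up
        return t instanceof Error;
    }
}
{code}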



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12032) Update to Netty 4.0.37

2016-06-20 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-12032:

Reviewer: T Jake Luciani

> Update to Netty 4.0.37
> --
>
> Key: CASSANDRA-12032
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12032
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Robert Stupp
>Assignee: Robert Stupp
> Fix For: 3.x
>
>
> Update Netty to 4.0.37
> (no C* code changes in this ticket)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11870) Consider allocating direct buffers bypassing ByteBuffer.allocateDirect

2016-06-20 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-11870:

Reviewer: T Jake Luciani

> Consider allocating direct buffers bypassing ByteBuffer.allocateDirect
> --
>
> Key: CASSANDRA-11870
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11870
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.x
>
>
> As outlined in CASSANDRA-11818, {{ByteBuffer.allocateDirect}} uses 
> {{Bits.reserveMemory}}, which is there to respect the JVM setting 
> {{-XX:MaxDirectMemorySize=...}}.
> {{Bits.reserveMemory}} first tries an "optimistic" {{tryReserveMemory}} and 
> exits immediately on success. However, if that somehow doesn't succeed, it 
> triggers a {{System.gc()}}, which is bad IMO (however, kind of how direct 
> buffers work in Java). After that GC it sleeps and tries to reserve the 
> memory up to 9 times - up to 511 ms - and then throws 
> {{OutOfMemoryError("Direct buffer memory")}}.
> This is unnecessary for us since we always immediately "free" direct buffers 
> as soon as we no longer need them.
> Proposal: Manage direct-memory reservations in our own code and skip 
> {{Bits.reserveMemory}} that way.
> (However, Netty direct buffers are not under our control.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8700) replace the wiki with docs in the git repo

2016-06-20 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340003#comment-15340003
 ] 

Joshua McKenzie commented on CASSANDRA-8700:


Change data capture PR: https://github.com/pcmanus/cassandra/pull/59

> replace the wiki with docs in the git repo
> --
>
> Key: CASSANDRA-8700
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8700
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Documentation and Website
>Reporter: Jon Haddad
>Assignee: Sylvain Lebresne
>Priority: Blocker
> Fix For: 3.8
>
> Attachments: TombstonesAndGcGrace.md, bloom_filters.md, 
> compression.md, contributing.zip, getting_started.zip, hardware.md
>
>
> The wiki as it stands is pretty terrible.  It takes several minutes to apply 
> a single update, and as a result, it's almost never updated.  The information 
> there has very little context as to what version it applies to.  Most people 
> I've talked to who have tried to use the information there find it 
> more confusing than helpful.
> I'd like to propose that instead of using the wiki, the doc directory in the 
> cassandra repo be used for docs (already used for CQL3 spec) in a format that 
> can be built to a variety of output formats like HTML / epub / etc.  I won't 
> start the bikeshedding on which markup format is preferable - but there are 
> several options that can work perfectly fine.  I've personally used sphinx w/ 
> restructured text, and markdown.  Both can build easily and as an added bonus 
> be pushed to readthedocs (or something similar) automatically.  For an 
> example, see cqlengine's documentation, which I think is already 
> significantly better than the wiki: 
> http://cqlengine.readthedocs.org/en/latest/
> In addition to being overall easier to maintain, putting the documentation in 
> the git repo adds context, since it evolves with the versions of Cassandra.
> If the wiki were kept even remotely up to date, I wouldn't bother with this, 
> but not having at least some basic documentation in the repo, or anywhere 
> associated with the project, is frustrating.
> For reference, the last 3 updates were:
> 1/15/15 - updating committers list
> 1/08/15 - updating contributers and how to contribute
> 12/16/14 - added a link to CQL docs from wiki frontpage (by me)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12023) Schema upgrade bug with super columns

2016-06-20 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-12023:
--
   Resolution: Fixed
Fix Version/s: (was: 3.0.x)
   (was: 3.x)
   3.0.8
   3.8
   Status: Resolved  (was: Patch Available)

> Schema upgrade bug with super columns
> -
>
> Key: CASSANDRA-12023
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12023
> Project: Cassandra
>  Issue Type: Bug
>  Components: Distributed Metadata
>Reporter: Jeremiah Jordan
>Assignee: Aleksey Yeschenko
>Priority: Critical
> Fix For: 3.8, 3.0.8
>
>
> Doing some upgrade tests going from 2.0 to 2.1 to 3.0, we hit the following 
> bug, which prevents 3.0 nodes from starting.  Running the test a few times, 
> with different waits and sometimes flushing and sometimes not, I have seen 
> the following errors:
> {code}
> ERROR [main] 2016-06-17 10:42:40,112 CassandraDaemon.java:698 - Exception 
> encountered during startup
> org.apache.cassandra.serializers.MarshalException: cannot parse 'value' as 
> hex bytes
>   at 
> org.apache.cassandra.db.marshal.BytesType.fromString(BytesType.java:45) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.createColumnFromColumnRow(LegacySchemaMigrator.java:682)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.createColumnsFromColumnRows(LegacySchemaMigrator.java:641)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableMetadata(LegacySchemaMigrator.java:316)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTableMetadata(LegacySchemaMigrator.java:273)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTable(LegacySchemaMigrator.java:244)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readTables$7(LegacySchemaMigrator.java:237)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_66]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTables(LegacySchemaMigrator.java:237)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readKeyspace(LegacySchemaMigrator.java:186)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readSchema$4(LegacySchemaMigrator.java:177)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_66]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readSchema(LegacySchemaMigrator.java:177)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.migrate(LegacySchemaMigrator.java:77)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:229) 
> [apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:557)
>  [apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:685) 
> [apache-cassandra-3.0.7.jar:3.0.7]
> Caused by: java.lang.NumberFormatException: An hex string representing bytes 
> must have an even length
>   at org.apache.cassandra.utils.Hex.hexToBytes(Hex.java:57) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.db.marshal.BytesType.fromString(BytesType.java:41) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   ... 16 common frames omitted
> {code}
> {code}
> ERROR [main] 2016-06-17 10:49:21,326 CassandraDaemon.java:698 - Exception 
> encountered during startup
> java.lang.RuntimeException: org.codehaus.jackson.JsonParseException: 
> Unexpected character ('K' (code 75)): expected a valid value (number, String, 
> array, object, 'true', 'false' or 'null')
>  at [Source: java.io.StringReader@60d4475f; line: 1, column: 2]
>   at 
> org.apache.cassandra.utils.FBUtilities.fromJsonMap(FBUtilities.java:561) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableParams(LegacySchemaMigrator.java:442)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableMetadata(LegacySchemaMigrator.java:365)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTableMetadata(LegacySchemaMigrator.java:273)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> 

[jira] [Commented] (CASSANDRA-12023) Schema upgrade bug with super columns

2016-06-20 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15339920#comment-15339920
 ] 

Aleksey Yeschenko commented on CASSANDRA-12023:
---

Thanks. Test results consistent with 3.0 and trunk, so committed as 
[b671522d008017bc46e48ffee4c43375d96c4f26|https://github.com/apache/cassandra/commit/b671522d008017bc46e48ffee4c43375d96c4f26]
 to 3.0 and merged with trunk.

> Schema upgrade bug with super columns
> -
>
> Key: CASSANDRA-12023
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12023
> Project: Cassandra
>  Issue Type: Bug
>  Components: Distributed Metadata
>Reporter: Jeremiah Jordan
>Assignee: Aleksey Yeschenko
>Priority: Critical
> Fix For: 3.0.x, 3.x
>
>
> Doing some upgrade tests going from 2.0 to 2.1 to 3.0, we hit the following 
> bug, which prevents 3.0 nodes from starting.  Running the test a few times, 
> with different waits and sometimes flushing and sometimes not, I have seen 
> the following errors:
> {code}
> ERROR [main] 2016-06-17 10:42:40,112 CassandraDaemon.java:698 - Exception 
> encountered during startup
> org.apache.cassandra.serializers.MarshalException: cannot parse 'value' as 
> hex bytes
>   at 
> org.apache.cassandra.db.marshal.BytesType.fromString(BytesType.java:45) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.createColumnFromColumnRow(LegacySchemaMigrator.java:682)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.createColumnsFromColumnRows(LegacySchemaMigrator.java:641)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableMetadata(LegacySchemaMigrator.java:316)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTableMetadata(LegacySchemaMigrator.java:273)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTable(LegacySchemaMigrator.java:244)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readTables$7(LegacySchemaMigrator.java:237)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_66]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTables(LegacySchemaMigrator.java:237)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readKeyspace(LegacySchemaMigrator.java:186)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readSchema$4(LegacySchemaMigrator.java:177)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_66]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readSchema(LegacySchemaMigrator.java:177)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.migrate(LegacySchemaMigrator.java:77)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:229) 
> [apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:557)
>  [apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:685) 
> [apache-cassandra-3.0.7.jar:3.0.7]
> Caused by: java.lang.NumberFormatException: An hex string representing bytes 
> must have an even length
>   at org.apache.cassandra.utils.Hex.hexToBytes(Hex.java:57) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.db.marshal.BytesType.fromString(BytesType.java:41) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   ... 16 common frames omitted
> {code}
> {code}
> ERROR [main] 2016-06-17 10:49:21,326 CassandraDaemon.java:698 - Exception 
> encountered during startup
> java.lang.RuntimeException: org.codehaus.jackson.JsonParseException: 
> Unexpected character ('K' (code 75)): expected a valid value (number, String, 
> array, object, 'true', 'false' or 'null')
>  at [Source: java.io.StringReader@60d4475f; line: 1, column: 2]
>   at 
> org.apache.cassandra.utils.FBUtilities.fromJsonMap(FBUtilities.java:561) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableParams(LegacySchemaMigrator.java:442)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableMetadata(LegacySchemaMigrator.java:365)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> 

[2/3] cassandra git commit: Fix upgrading schema with super columns with non-text subcomparators

2016-06-20 Thread aleksey
Fix upgrading schema with super columns with non-text subcomparators

patch by Aleksey Yeschenko; reviewed by Jeremiah Jordan for
CASSANDRA-12023


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b671522d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b671522d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b671522d

Branch: refs/heads/trunk
Commit: b671522d008017bc46e48ffee4c43375d96c4f26
Parents: 0a0e97d
Author: Aleksey Yeschenko 
Authored: Mon Jun 20 16:06:56 2016 +0100
Committer: Aleksey Yeschenko 
Committed: Mon Jun 20 18:04:56 2016 +0100

--
 CHANGES.txt  |  1 +
 src/java/org/apache/cassandra/db/CompactTables.java  | 11 +++
 .../apache/cassandra/schema/LegacySchemaMigrator.java|  6 --
 3 files changed, 12 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b671522d/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 7873742..cc682c4 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.8
+ * Fix upgrading schema with super columns with non-text subcomparators 
(CASSANDRA-12023)
  * Add TimeWindowCompactionStrategy (CASSANDRA-9666)
 Merged from 2.2:
  * Don't send erroneous NEW_NODE notifications on restart (CASSANDRA-11038)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b671522d/src/java/org/apache/cassandra/db/CompactTables.java
--
diff --git a/src/java/org/apache/cassandra/db/CompactTables.java 
b/src/java/org/apache/cassandra/db/CompactTables.java
index a72e7f2..a73b865 100644
--- a/src/java/org/apache/cassandra/db/CompactTables.java
+++ b/src/java/org/apache/cassandra/db/CompactTables.java
@@ -91,12 +91,15 @@ public abstract class CompactTables
         return columns.regulars.getSimple(0);
     }
 
-    public static AbstractType<?> columnDefinitionComparator(ColumnDefinition.Kind kind, boolean isSuper, AbstractType<?> rawComparator, AbstractType<?> rawSubComparator)
+    public static AbstractType<?> columnDefinitionComparator(String kind, boolean isSuper, AbstractType<?> rawComparator, AbstractType<?> rawSubComparator)
     {
+        if ("compact_value".equals(kind))
+            return UTF8Type.instance;
+
         if (isSuper)
-            return kind == ColumnDefinition.Kind.REGULAR ? rawSubComparator : UTF8Type.instance;
-        else
-            return kind == ColumnDefinition.Kind.STATIC ? rawComparator : UTF8Type.instance;
+            return "regular".equals(kind) ? rawSubComparator : UTF8Type.instance;
+
+        return "static".equals(kind) ? rawComparator : UTF8Type.instance;
     }
 
     public static boolean hasEmptyCompactValue(CFMetaData metadata)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b671522d/src/java/org/apache/cassandra/schema/LegacySchemaMigrator.java
--
diff --git a/src/java/org/apache/cassandra/schema/LegacySchemaMigrator.java 
b/src/java/org/apache/cassandra/schema/LegacySchemaMigrator.java
index 7411b93..924bd7a 100644
--- a/src/java/org/apache/cassandra/schema/LegacySchemaMigrator.java
+++ b/src/java/org/apache/cassandra/schema/LegacySchemaMigrator.java
@@ -662,7 +662,9 @@ public final class LegacySchemaMigrator
                                                           boolean isStaticCompactTable,
                                                           boolean needsUpgrade)
     {
-        ColumnDefinition.Kind kind = deserializeKind(row.getString("type"));
+        String rawKind = row.getString("type");
+
+        ColumnDefinition.Kind kind = deserializeKind(rawKind);
         if (needsUpgrade && isStaticCompactTable && kind == ColumnDefinition.Kind.REGULAR)
             kind = ColumnDefinition.Kind.STATIC;
 
@@ -678,7 +680,7 @@ public final class LegacySchemaMigrator
         // we need to use the comparator fromString method
         AbstractType<?> comparator = isCQLTable
                                    ? UTF8Type.instance
-                                   : CompactTables.columnDefinitionComparator(kind, isSuper, rawComparator, rawSubComparator);
+                                   : CompactTables.columnDefinitionComparator(rawKind, isSuper, rawComparator, rawSubComparator);
         ColumnIdentifier name = ColumnIdentifier.getInterned(comparator.fromString(row.getString("column_name")), comparator);
 
         AbstractType<?> validator = parseType(row.getString("validator"));



[3/3] cassandra git commit: Merge branch 'cassandra-3.0' into trunk

2016-06-20 Thread aleksey
Merge branch 'cassandra-3.0' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fb781c99
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fb781c99
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fb781c99

Branch: refs/heads/trunk
Commit: fb781c99bb82395a0bca3aca41f621e00c3e3074
Parents: 88229a4 b671522
Author: Aleksey Yeschenko 
Authored: Mon Jun 20 18:08:32 2016 +0100
Committer: Aleksey Yeschenko 
Committed: Mon Jun 20 18:08:32 2016 +0100

--
 CHANGES.txt  |  2 ++
 src/java/org/apache/cassandra/db/CompactTables.java  | 11 +++
 .../apache/cassandra/schema/LegacySchemaMigrator.java|  6 --
 3 files changed, 13 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/fb781c99/CHANGES.txt
--
diff --cc CHANGES.txt
index d09cd5a,cc682c4..519856a
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,21 -1,5 +1,22 @@@
 -3.0.8
 +3.8
 + * SSTable tools mishandling LocalPartitioner (CASSANDRA-12002)
 + * When SEPWorker assigned work, set thread name to match pool 
(CASSANDRA-11966)
 + * Add cross-DC latency metrics (CASSANDRA-11596)
 + * Allow terms in selection clause (CASSANDRA-10783)
 + * Add bind variables to trace (CASSANDRA-11719)
 + * Switch counter shards' clock to timestamps (CASSANDRA-9811)
 + * Introduce HdrHistogram and response/service/wait separation to stress tool 
(CASSANDRA-11853)
 + * entry-weighers in QueryProcessor should respect partitionKeyBindIndexes 
field (CASSANDRA-11718)
 + * Support older ant versions (CASSANDRA-11807)
 + * Estimate compressed on disk size when deciding if sstable size limit 
reached (CASSANDRA-11623)
 + * cassandra-stress profiles should support case sensitive schemas 
(CASSANDRA-11546)
 + * Remove DatabaseDescriptor dependency from FileUtils (CASSANDRA-11578)
 + * Faster streaming (CASSANDRA-9766)
 + * Add prepared query parameter to trace for "Execute CQL3 prepared query" 
session (CASSANDRA-11425)
 + * Add repaired percentage metric (CASSANDRA-11503)
 + * Add Change-Data-Capture (CASSANDRA-8844)
 +Merged from 3.0:
+  * Fix upgrading schema with super columns with non-text subcomparators 
(CASSANDRA-12023)
   * Add TimeWindowCompactionStrategy (CASSANDRA-9666)
  Merged from 2.2:
   * Don't send erroneous NEW_NODE notifications on restart (CASSANDRA-11038)
@@@ -26,10 -10,8 +27,11 @@@ Merged from 2.1
   * Create interval tree over canonical sstables to avoid missing sstables 
during streaming (CASSANDRA-11886)
   * cqlsh COPY FROM: shutdown parent cluster after forking, to avoid 
corrupting SSL connections (CASSANDRA-11749)
  
+ 
 -3.0.7
 +3.7
 + * Support multiple folders for user defined compaction tasks 
(CASSANDRA-11765)
 + * Fix race in CompactionStrategyManager's pause/resume (CASSANDRA-11922)
 +Merged from 3.0:
   * Fix legacy serialization of Thrift-generated non-compound range tombstones
 when communicating with 2.x nodes (CASSANDRA-11930)
   * Fix Directories instantiations where CFS.initialDirectories should be used 
(CASSANDRA-11849)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/fb781c99/src/java/org/apache/cassandra/db/CompactTables.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/fb781c99/src/java/org/apache/cassandra/schema/LegacySchemaMigrator.java
--



[1/3] cassandra git commit: Fix upgrading schema with super columns with non-text subcomparators

2016-06-20 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 0a0e97df5 -> b671522d0
  refs/heads/trunk 88229a47a -> fb781c99b


Fix upgrading schema with super columns with non-text subcomparators

patch by Aleksey Yeschenko; reviewed by Jeremiah Jordan for
CASSANDRA-12023


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b671522d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b671522d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b671522d

Branch: refs/heads/cassandra-3.0
Commit: b671522d008017bc46e48ffee4c43375d96c4f26
Parents: 0a0e97d
Author: Aleksey Yeschenko 
Authored: Mon Jun 20 16:06:56 2016 +0100
Committer: Aleksey Yeschenko 
Committed: Mon Jun 20 18:04:56 2016 +0100

--
 CHANGES.txt  |  1 +
 src/java/org/apache/cassandra/db/CompactTables.java  | 11 +++
 .../apache/cassandra/schema/LegacySchemaMigrator.java|  6 --
 3 files changed, 12 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/b671522d/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 7873742..cc682c4 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.8
+ * Fix upgrading schema with super columns with non-text subcomparators 
(CASSANDRA-12023)
  * Add TimeWindowCompactionStrategy (CASSANDRA-9666)
 Merged from 2.2:
  * Don't send erroneous NEW_NODE notifications on restart (CASSANDRA-11038)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b671522d/src/java/org/apache/cassandra/db/CompactTables.java
--
diff --git a/src/java/org/apache/cassandra/db/CompactTables.java 
b/src/java/org/apache/cassandra/db/CompactTables.java
index a72e7f2..a73b865 100644
--- a/src/java/org/apache/cassandra/db/CompactTables.java
+++ b/src/java/org/apache/cassandra/db/CompactTables.java
@@ -91,12 +91,15 @@ public abstract class CompactTables
         return columns.regulars.getSimple(0);
     }
 
-    public static AbstractType<?> columnDefinitionComparator(ColumnDefinition.Kind kind, boolean isSuper, AbstractType<?> rawComparator, AbstractType<?> rawSubComparator)
+    public static AbstractType<?> columnDefinitionComparator(String kind, boolean isSuper, AbstractType<?> rawComparator, AbstractType<?> rawSubComparator)
     {
+        if ("compact_value".equals(kind))
+            return UTF8Type.instance;
+
         if (isSuper)
-            return kind == ColumnDefinition.Kind.REGULAR ? rawSubComparator : UTF8Type.instance;
-        else
-            return kind == ColumnDefinition.Kind.STATIC ? rawComparator : UTF8Type.instance;
+            return "regular".equals(kind) ? rawSubComparator : UTF8Type.instance;
+
+        return "static".equals(kind) ? rawComparator : UTF8Type.instance;
     }
 
     public static boolean hasEmptyCompactValue(CFMetaData metadata)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/b671522d/src/java/org/apache/cassandra/schema/LegacySchemaMigrator.java
--
diff --git a/src/java/org/apache/cassandra/schema/LegacySchemaMigrator.java 
b/src/java/org/apache/cassandra/schema/LegacySchemaMigrator.java
index 7411b93..924bd7a 100644
--- a/src/java/org/apache/cassandra/schema/LegacySchemaMigrator.java
+++ b/src/java/org/apache/cassandra/schema/LegacySchemaMigrator.java
@@ -662,7 +662,9 @@ public final class LegacySchemaMigrator
                                                           boolean isStaticCompactTable,
                                                           boolean needsUpgrade)
     {
-        ColumnDefinition.Kind kind = deserializeKind(row.getString("type"));
+        String rawKind = row.getString("type");
+
+        ColumnDefinition.Kind kind = deserializeKind(rawKind);
         if (needsUpgrade && isStaticCompactTable && kind == ColumnDefinition.Kind.REGULAR)
             kind = ColumnDefinition.Kind.STATIC;
 
@@ -678,7 +680,7 @@ public final class LegacySchemaMigrator
         // we need to use the comparator fromString method
         AbstractType<?> comparator = isCQLTable
                                    ? UTF8Type.instance
-                                   : CompactTables.columnDefinitionComparator(kind, isSuper, rawComparator, rawSubComparator);
+                                   : CompactTables.columnDefinitionComparator(rawKind, isSuper, rawComparator, rawSubComparator);
         ColumnIdentifier name = ColumnIdentifier.getInterned(comparator.fromString(row.getString("column_name")), comparator);
 
         AbstractType<?> validator = parseType(row.getString("validator"));



[jira] [Commented] (CASSANDRA-10786) Include hash of result set metadata in prepared statement id

2016-06-20 Thread Andy Tolbert (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15339842#comment-15339842
 ] 

Andy Tolbert commented on CASSANDRA-10786:
--

We (the java driver team) discussed this, and we agree with taking the approach 
we've previously taken: we can do a separate branch with this change, build a 
jar, and include it with C* (in the lib/ directory, as currently done).  We'll do 
some extra validation to make sure everything is good from the driver side.  
After a driver has been formally released with those changes, we would want to 
update C* to use that release.

A few questions though:

* What is the timeline of this change?  Will it be in 3.8 or 3.10?
* Will this be the only change in for protocol v5 or will some of the tickets 
in [CASSANDRA-9362] be included as well?

> Include hash of result set metadata in prepared statement id
> 
>
> Key: CASSANDRA-10786
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10786
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Olivier Michallat
>Assignee: Alex Petrov
>Priority: Minor
>  Labels: client-impacting, doc-impacting, protocolv5
> Fix For: 3.x
>
>
> This is a follow-up to CASSANDRA-7910, which was about invalidating a 
> prepared statement when the table is altered, to force clients to update 
> their local copy of the metadata.
> There's still an issue if multiple clients are connected to the same host. 
> The first client to execute the query after the cache was invalidated will 
> receive an UNPREPARED response, re-prepare, and update its local metadata. 
> But other clients might miss it entirely (the MD5 hasn't changed), and they 
> will keep using their old metadata. For example:
> # {{SELECT * ...}} statement is prepared in Cassandra with md5 abc123, 
> clientA and clientB both have a cache of the metadata (columns b and c) 
> locally
> # column a gets added to the table, C* invalidates its cache entry
> # clientA sends an EXECUTE request for md5 abc123, gets UNPREPARED response, 
> re-prepares on the fly and updates its local metadata to (a, b, c)
> # prepared statement is now in C*’s cache again, with the same md5 abc123
> # clientB sends an EXECUTE request for id abc123. Because the cache has been 
> populated again, the query succeeds. But clientB still has not updated its 
> metadata, it’s still (b,c)
> One solution that was suggested is to include a hash of the result set 
> metadata in the md5. This way the md5 would change at step 3, and any client 
> using the old md5 would get an UNPREPARED, regardless of whether another 
> client already reprepared.
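
For illustration, folding the result set metadata into the statement id could look like the sketch below (hypothetical names; not the actual C* implementation). With the metadata digest in the input, the id computed at step 3 differs from the one clientB holds, so clientB gets UNPREPARED:

{code}
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class StatementIds
{
    static byte[] statementId(String queryString, String keyspace, byte[] resultMetadataDigest)
    {
        try
        {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            md5.update(queryString.getBytes(StandardCharsets.UTF_8));
            md5.update(keyspace.getBytes(StandardCharsets.UTF_8));
            md5.update(resultMetadataDigest); // changes whenever columns are added or dropped
            return md5.digest();
        }
        catch (NoSuchAlgorithmException e)
        {
            throw new AssertionError(e); // MD5 is always present on the JVM
        }
    }
}
{code}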



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11957) Implement seek() of org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream

2016-06-20 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-11957:

Assignee: Imran Chaudhry  (was: Joshua McKenzie)

> Implement seek() of 
> org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream 
> --
>
> Key: CASSANDRA-11957
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11957
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination, Local Write-Read Paths
>Reporter: Imran Chaudhry
>Assignee: Imran Chaudhry
>Priority: Critical
> Fix For: 3.x
>
>
> CDC needs the seek() method of 
> org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream implemented 
> (it currently throws an exception).
> CommitLogs are read using this stream, and the seek() method needs to be 
> implemented so that mutations appended to the currently active 
> commitlog can be read out in real time.
>  
> The current implementation is:
>   
> public void seek(long position)
> {
>     // implement this when we actually need it
>     throw new UnsupportedOperationException();
> }
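
One plausible shape for that seek(), assuming decrypted bytes can only be reached by decrypting forward from the start of the segment (every name except seek() is hypothetical):

{code}
final class EncryptedSegmentSeekSketch
{
    private final long segmentOffset;   // position of this segment within the commit log
    private final long segmentLength;   // length of the decrypted view of this segment

    EncryptedSegmentSeekSketch(long segmentOffset, long segmentLength)
    {
        this.segmentOffset = segmentOffset;
        this.segmentLength = segmentLength;
    }

    public void seek(long position)
    {
        long relative = position - segmentOffset;
        if (relative < 0 || relative > segmentLength)
            throw new IllegalArgumentException("Position " + position + " is outside this segment");

        // Encrypted data is not byte-addressable on disk: rewind to the
        // start of the segment and decrypt forward to the target position.
        resetToSegmentStart();
        skipDecryptedBytes(relative);
    }

    private void resetToSegmentStart() { /* re-open the cipher stream at the segment start */ }
    private void skipDecryptedBytes(long n) { /* read and discard n decrypted bytes */ }
}
{code}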



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11957) Implement seek() of org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream

2016-06-20 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-11957:

Reviewer: Joshua McKenzie

> Implement seek() of 
> org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream 
> --
>
> Key: CASSANDRA-11957
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11957
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Coordination, Local Write-Read Paths
>Reporter: Imran Chaudhry
>Assignee: Imran Chaudhry
>Priority: Critical
> Fix For: 3.x
>
>
> CDC needs the seek() method of 
> org.apache.cassandra.db.commitlog.EncryptedFileSegmentInputStream implemented 
> (it currently throws an exception).
> CommitLogs are read using this stream, and the seek() method needs to be 
> implemented so that mutations appended to the currently active 
> commitlog can be read out in real time.
>  
> The current implementation is:
>   
> public void seek(long position)
> {
>     // implement this when we actually need it
>     throw new UnsupportedOperationException();
> }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12025) dtest failure in paging_test.TestPagingData.test_paging_with_filtering_on_counter_columns

2016-06-20 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-12025:

  Assignee: (was: DS Test Eng)
Issue Type: Bug  (was: Test)

Looks like a possible bug, passing to dev

> dtest failure in 
> paging_test.TestPagingData.test_paging_with_filtering_on_counter_columns
> -
>
> Key: CASSANDRA-12025
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12025
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sean McCarthy
>  Labels: dtest
> Fix For: 3.x
>
> Attachments: node1.log, node1_debug.log, node1_gc.log, node2.log, 
> node2_debug.log, node2_gc.log, node3.log, node3_debug.log, node3_gc.log
>
>
> example failure:
> http://cassci.datastax.com/job/trunk_dtest/1276/testReport/paging_test/TestPagingData/test_paging_with_filtering_on_counter_columns
> Failed on CassCI build trunk_dtest #1276
> {code}
> Error Message
> Lists differ: [[4, 7, 8, 9], [4, 9, 10, 11]] != [[4, 7, 8, 9], [4, 8, 9, 10], 
> ...
> First differing element 1:
> [4, 9, 10, 11]
> [4, 8, 9, 10]
> Second list contains 1 additional elements.
> First extra element 2:
> [4, 9, 10, 11]
> - [[4, 7, 8, 9], [4, 9, 10, 11]]
> + [[4, 7, 8, 9], [4, 8, 9, 10], [4, 9, 10, 11]]
> ?+++  
> {code}
> {code}
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/automaton/cassandra-dtest/tools.py", line 288, in wrapped
> f(obj)
>   File "/home/automaton/cassandra-dtest/paging_test.py", line 1148, in 
> test_paging_with_filtering_on_counter_columns
> self._test_paging_with_filtering_on_counter_columns(session, True)
>   File "/home/automaton/cassandra-dtest/paging_test.py", line 1107, in 
> _test_paging_with_filtering_on_counter_columns
> [4, 9, 10, 11]])
>   File "/usr/lib/python2.7/unittest/case.py", line 513, in assertEqual
> assertion_func(first, second, msg=msg)
>   File "/usr/lib/python2.7/unittest/case.py", line 742, in assertListEqual
> self.assertSequenceEqual(list1, list2, msg, seq_type=list)
>   File "/usr/lib/python2.7/unittest/case.py", line 724, in assertSequenceEqual
> self.fail(msg)
>   File "/usr/lib/python2.7/unittest/case.py", line 410, in fail
> raise self.failureException(msg)
> {code}
> Logs are attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12025) dtest failure in paging_test.TestPagingData.test_paging_with_filtering_on_counter_columns

2016-06-20 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-12025:

Fix Version/s: 3.x

> dtest failure in 
> paging_test.TestPagingData.test_paging_with_filtering_on_counter_columns
> -
>
> Key: CASSANDRA-12025
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12025
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Sean McCarthy
>  Labels: dtest
> Fix For: 3.x
>
> Attachments: node1.log, node1_debug.log, node1_gc.log, node2.log, 
> node2_debug.log, node2_gc.log, node3.log, node3_debug.log, node3_gc.log
>
>
> example failure:
> http://cassci.datastax.com/job/trunk_dtest/1276/testReport/paging_test/TestPagingData/test_paging_with_filtering_on_counter_columns
> Failed on CassCI build trunk_dtest #1276
> {code}
> Error Message
> Lists differ: [[4, 7, 8, 9], [4, 9, 10, 11]] != [[4, 7, 8, 9], [4, 8, 9, 10], 
> ...
> First differing element 1:
> [4, 9, 10, 11]
> [4, 8, 9, 10]
> Second list contains 1 additional elements.
> First extra element 2:
> [4, 9, 10, 11]
> - [[4, 7, 8, 9], [4, 9, 10, 11]]
> + [[4, 7, 8, 9], [4, 8, 9, 10], [4, 9, 10, 11]]
> ?+++  
> {code}
> {code}
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/automaton/cassandra-dtest/tools.py", line 288, in wrapped
> f(obj)
>   File "/home/automaton/cassandra-dtest/paging_test.py", line 1148, in 
> test_paging_with_filtering_on_counter_columns
> self._test_paging_with_filtering_on_counter_columns(session, True)
>   File "/home/automaton/cassandra-dtest/paging_test.py", line 1107, in 
> _test_paging_with_filtering_on_counter_columns
> [4, 9, 10, 11]])
>   File "/usr/lib/python2.7/unittest/case.py", line 513, in assertEqual
> assertion_func(first, second, msg=msg)
>   File "/usr/lib/python2.7/unittest/case.py", line 742, in assertListEqual
> self.assertSequenceEqual(list1, list2, msg, seq_type=list)
>   File "/usr/lib/python2.7/unittest/case.py", line 724, in assertSequenceEqual
> self.fail(msg)
>   File "/usr/lib/python2.7/unittest/case.py", line 410, in fail
> raise self.failureException(msg)
> {code}
> Logs are attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-12023) Schema upgrade bug with super columns

2016-06-20 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15339769#comment-15339769
 ] 

Jeremiah Jordan commented on CASSANDRA-12023:
-

+1 patch looks good to me and stuff doesn't break any more with my tests.

> Schema upgrade bug with super columns
> -
>
> Key: CASSANDRA-12023
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12023
> Project: Cassandra
>  Issue Type: Bug
>  Components: Distributed Metadata
>Reporter: Jeremiah Jordan
>Assignee: Aleksey Yeschenko
>Priority: Critical
> Fix For: 3.0.x, 3.x
>
>
> Doing some upgrade tests going from 2.0 to 2.1 to 3.0, we hit the following 
> bug, which prevents 3.0 nodes from starting.  Running the test a few times, 
> with different waits and sometimes flushing and sometimes not, I have seen 
> the following errors:
> {code}
> ERROR [main] 2016-06-17 10:42:40,112 CassandraDaemon.java:698 - Exception 
> encountered during startup
> org.apache.cassandra.serializers.MarshalException: cannot parse 'value' as 
> hex bytes
>   at 
> org.apache.cassandra.db.marshal.BytesType.fromString(BytesType.java:45) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.createColumnFromColumnRow(LegacySchemaMigrator.java:682)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.createColumnsFromColumnRows(LegacySchemaMigrator.java:641)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableMetadata(LegacySchemaMigrator.java:316)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTableMetadata(LegacySchemaMigrator.java:273)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTable(LegacySchemaMigrator.java:244)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readTables$7(LegacySchemaMigrator.java:237)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_66]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTables(LegacySchemaMigrator.java:237)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readKeyspace(LegacySchemaMigrator.java:186)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readSchema$4(LegacySchemaMigrator.java:177)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_66]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readSchema(LegacySchemaMigrator.java:177)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.migrate(LegacySchemaMigrator.java:77)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:229) 
> [apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:557)
>  [apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:685) 
> [apache-cassandra-3.0.7.jar:3.0.7]
> Caused by: java.lang.NumberFormatException: An hex string representing bytes 
> must have an even length
>   at org.apache.cassandra.utils.Hex.hexToBytes(Hex.java:57) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.db.marshal.BytesType.fromString(BytesType.java:41) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   ... 16 common frames omitted
> {code}
> {code}
> ERROR [main] 2016-06-17 10:49:21,326 CassandraDaemon.java:698 - Exception 
> encountered during startup
> java.lang.RuntimeException: org.codehaus.jackson.JsonParseException: 
> Unexpected character ('K' (code 75)): expected a valid value (number, String, 
> array, object, 'true', 'false' or 'null')
>  at [Source: java.io.StringReader@60d4475f; line: 1, column: 2]
>   at 
> org.apache.cassandra.utils.FBUtilities.fromJsonMap(FBUtilities.java:561) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableParams(LegacySchemaMigrator.java:442)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableMetadata(LegacySchemaMigrator.java:365)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTableMetadata(LegacySchemaMigrator.java:273)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTable(LegacySchemaMigrator.java:244)
>  

[jira] [Updated] (CASSANDRA-12023) Schema upgrade bug with super columns

2016-06-20 Thread Jeremiah Jordan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremiah Jordan updated CASSANDRA-12023:

Reviewer: Jeremiah Jordan  (was: Sylvain Lebresne)

> Schema upgrade bug with super columns
> -
>
> Key: CASSANDRA-12023
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12023
> Project: Cassandra
>  Issue Type: Bug
>  Components: Distributed Metadata
>Reporter: Jeremiah Jordan
>Assignee: Aleksey Yeschenko
>Priority: Critical
> Fix For: 3.0.x, 3.x
>
>
> Doing some upgrade tests going from 2.0 to 2.1 to 3.0, we hit the following 
> bug, which prevents 3.0 nodes from starting.  Running the test a few times, 
> with different waits and sometimes flushing and sometimes not, I have seen 
> the following errors:
> {code}
> ERROR [main] 2016-06-17 10:42:40,112 CassandraDaemon.java:698 - Exception 
> encountered during startup
> org.apache.cassandra.serializers.MarshalException: cannot parse 'value' as 
> hex bytes
>   at 
> org.apache.cassandra.db.marshal.BytesType.fromString(BytesType.java:45) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.createColumnFromColumnRow(LegacySchemaMigrator.java:682)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.createColumnsFromColumnRows(LegacySchemaMigrator.java:641)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableMetadata(LegacySchemaMigrator.java:316)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTableMetadata(LegacySchemaMigrator.java:273)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTable(LegacySchemaMigrator.java:244)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readTables$7(LegacySchemaMigrator.java:237)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_66]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTables(LegacySchemaMigrator.java:237)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readKeyspace(LegacySchemaMigrator.java:186)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readSchema$4(LegacySchemaMigrator.java:177)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_66]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readSchema(LegacySchemaMigrator.java:177)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.migrate(LegacySchemaMigrator.java:77)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:229) 
> [apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:557)
>  [apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:685) 
> [apache-cassandra-3.0.7.jar:3.0.7]
> Caused by: java.lang.NumberFormatException: An hex string representing bytes 
> must have an even length
>   at org.apache.cassandra.utils.Hex.hexToBytes(Hex.java:57) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.db.marshal.BytesType.fromString(BytesType.java:41) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   ... 16 common frames omitted
> {code}
> {code}
> ERROR [main] 2016-06-17 10:49:21,326 CassandraDaemon.java:698 - Exception 
> encountered during startup
> java.lang.RuntimeException: org.codehaus.jackson.JsonParseException: 
> Unexpected character ('K' (code 75)): expected a valid value (number, String, 
> array, object, 'true', 'false' or 'null')
>  at [Source: java.io.StringReader@60d4475f; line: 1, column: 2]
>   at 
> org.apache.cassandra.utils.FBUtilities.fromJsonMap(FBUtilities.java:561) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableParams(LegacySchemaMigrator.java:442)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableMetadata(LegacySchemaMigrator.java:365)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTableMetadata(LegacySchemaMigrator.java:273)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTable(LegacySchemaMigrator.java:244)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> 

[jira] [Updated] (CASSANDRA-12023) Schema upgrade bug with super columns

2016-06-20 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-12023:
--
   Reviewer: Sylvain Lebresne
Component/s: (was: Core)
 Distributed Metadata

> Schema upgrade bug with super columns
> -
>
> Key: CASSANDRA-12023
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12023
> Project: Cassandra
>  Issue Type: Bug
>  Components: Distributed Metadata
>Reporter: Jeremiah Jordan
>Assignee: Aleksey Yeschenko
>Priority: Critical
> Fix For: 3.0.x, 3.x
>
>
> Doing some upgrade tests going from 2.0 to 2.1 to 3.0, we hit the following 
> bug, which prevents 3.0 nodes from starting.  Running the test a few times, 
> with different waits and sometimes flushing and sometimes not, I have seen 
> the following errors:
> {code}
> ERROR [main] 2016-06-17 10:42:40,112 CassandraDaemon.java:698 - Exception 
> encountered during startup
> org.apache.cassandra.serializers.MarshalException: cannot parse 'value' as 
> hex bytes
>   at 
> org.apache.cassandra.db.marshal.BytesType.fromString(BytesType.java:45) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.createColumnFromColumnRow(LegacySchemaMigrator.java:682)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.createColumnsFromColumnRows(LegacySchemaMigrator.java:641)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableMetadata(LegacySchemaMigrator.java:316)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTableMetadata(LegacySchemaMigrator.java:273)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTable(LegacySchemaMigrator.java:244)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readTables$7(LegacySchemaMigrator.java:237)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_66]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTables(LegacySchemaMigrator.java:237)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readKeyspace(LegacySchemaMigrator.java:186)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readSchema$4(LegacySchemaMigrator.java:177)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_66]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readSchema(LegacySchemaMigrator.java:177)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.migrate(LegacySchemaMigrator.java:77)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:229) 
> [apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:557)
>  [apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:685) 
> [apache-cassandra-3.0.7.jar:3.0.7]
> Caused by: java.lang.NumberFormatException: An hex string representing bytes 
> must have an even length
>   at org.apache.cassandra.utils.Hex.hexToBytes(Hex.java:57) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.db.marshal.BytesType.fromString(BytesType.java:41) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   ... 16 common frames omitted
> {code}
> {code}
> ERROR [main] 2016-06-17 10:49:21,326 CassandraDaemon.java:698 - Exception 
> encountered during startup
> java.lang.RuntimeException: org.codehaus.jackson.JsonParseException: 
> Unexpected character ('K' (code 75)): expected a valid value (number, String, 
> array, object, 'true', 'false' or 'null')
>  at [Source: java.io.StringReader@60d4475f; line: 1, column: 2]
>   at 
> org.apache.cassandra.utils.FBUtilities.fromJsonMap(FBUtilities.java:561) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableParams(LegacySchemaMigrator.java:442)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableMetadata(LegacySchemaMigrator.java:365)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTableMetadata(LegacySchemaMigrator.java:273)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTable(LegacySchemaMigrator.java:244)
>  

[jira] [Commented] (CASSANDRA-11854) Remove finished streaming connections from MessagingService

2016-06-20 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15339732#comment-15339732
 ] 

Paulo Motta commented on CASSANDRA-11854:
-

Addressed nits, rebased and resubmitted tests.

||2.1||2.2||3.0||trunk||
|[branch|https://github.com/apache/cassandra/compare/cassandra-2.1...pauloricardomg:2.1-11854]|[branch|https://github.com/apache/cassandra/compare/cassandra-2.2...pauloricardomg:2.2-11854]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-11854]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-11854]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.1-11854-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-11854-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-11854-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-11854-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.1-11854-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-2.2-11854-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-11854-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-11854-dtest/lastCompletedBuild/testReport/]|

Commit info: there is a conflict merging from 2.1 to 2.2; the 3.0 patch is 
slightly different and merges to trunk.

> Remove finished streaming connections from MessagingService
> ---
>
> Key: CASSANDRA-11854
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11854
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Paulo Motta
>Assignee: Paulo Motta
> Attachments: oom.png
>
>
> When a new {{IncomingStreamingConnection}} is created, [we register it in the 
> connections 
> map|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/MessagingService.java#L1109]
>  of {{MessagingService}}, but we [only remove it if there is an 
> exception|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/IncomingStreamingConnection.java#L83]
>  while attaching the socket to the stream session.
> On nodes with SSL and a large number of vnodes, after many repair sessions 
> these old connections can accumulate and cause an OOM (heap dump attached).
> The connection should be removed from the connections map once it's 
> finished, so that it can be garbage collected.
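
A sketch of the fix (helper names are hypothetical, not the actual patch): deregister the connection in a finally block, so it leaves the map on normal completion as well as on error:

{code}
import java.io.IOException;
import java.net.Socket;
import java.util.Map;

final class IncomingStreamingConnectionSketch implements Runnable
{
    private final Socket socket;
    private final Map<Socket, Runnable> connections; // stands in for the MessagingService map

    IncomingStreamingConnectionSketch(Socket socket, Map<Socket, Runnable> connections)
    {
        this.socket = socket;
        this.connections = connections;
    }

    @Override
    public void run()
    {
        try
        {
            attachSocketToStreamSession(); // existing happy path
        }
        catch (IOException e)
        {
            // previously the only path that removed the connection
        }
        finally
        {
            connections.remove(socket); // now removed on completion too, allowing GC
        }
    }

    private void attachSocketToStreamSession() throws IOException { /* existing stream setup */ }
}
{code}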



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11315) Upgrade from 2.2.6 to 3.0.5 Fails with AssertionError

2016-06-20 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15339709#comment-15339709
 ] 

Aleksey Yeschenko commented on CASSANDRA-11315:
---

bq. Hi Aleksey Yeschenko, do you still have this ticket on your radar? 

Literally next on my list (:

> Upgrade from 2.2.6 to 3.0.5 Fails with AssertionError
> -
>
> Key: CASSANDRA-11315
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11315
> Project: Cassandra
>  Issue Type: Bug
> Environment: Ubuntu 14.04, Oracle Java 8, Apache Cassandra 2.2.5 -> 
> 3.0.3, Apache Cassandra 2.2.6 -> 3.0.5
>Reporter: Dominik Keil
>Assignee: Aleksey Yeschenko
>Priority: Blocker
> Fix For: 3.0.x, 3.x
>
>
> Hi,
> when trying to upgrade our development cluster from C* 2.2.5 to 3.0.3 
> Cassandra fails during startup.
> Here's the relevant log snippet:
> {noformat}
> [...]
> INFO  [main] 2016-03-08 11:42:01,291 ColumnFamilyStore.java:381 - 
> Initializing system.schema_triggers
> INFO  [main] 2016-03-08 11:42:01,302 ColumnFamilyStore.java:381 - 
> Initializing system.schema_usertypes
> INFO  [main] 2016-03-08 11:42:01,313 ColumnFamilyStore.java:381 - 
> Initializing system.schema_functions
> INFO  [main] 2016-03-08 11:42:01,324 ColumnFamilyStore.java:381 - 
> Initializing system.schema_aggregates
> INFO  [main] 2016-03-08 11:42:01,576 SystemKeyspace.java:1284 - Detected 
> version upgrade from 2.2.5 to 3.0.3, snapshotting system keyspace
> WARN  [main] 2016-03-08 11:42:01,911 CompressionParams.java:382 - The 
> sstable_compression option has been deprecated. You should use class instead
> WARN  [main] 2016-03-08 11:42:01,959 CompressionParams.java:333 - The 
> chunk_length_kb option has been deprecated. You should use chunk_length_in_kb 
> instead
> ERROR [main] 2016-03-08 11:42:02,638 CassandraDaemon.java:692 - Exception 
> encountered during startup
> java.lang.AssertionError: null
> at 
> org.apache.cassandra.db.CompactTables.getCompactValueColumn(CompactTables.java:90)
>  ~[apache-cassandra-3.0.3.jar:3.0.3]
> at 
> org.apache.cassandra.config.CFMetaData.rebuild(CFMetaData.java:315) 
> ~[apache-cassandra-3.0.3.jar:3.0.3]
> at org.apache.cassandra.config.CFMetaData.(CFMetaData.java:291) 
> ~[apache-cassandra-3.0.3.jar:3.0.3]
> at org.apache.cassandra.config.CFMetaData.create(CFMetaData.java:367) 
> ~[apache-cassandra-3.0.3.jar:3.0.3]
> at 
> org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableMetadata(LegacySchemaMigrator.java:337)
>  ~[apache-cassandra-3.0.3.jar:3.0.3]
> at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTableMetadata(LegacySchemaMigrator.java:273)
>  ~[apache-cassandra-3.0.3.jar:3.0.3]
> at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTable(LegacySchemaMigrator.java:244)
>  ~[apache-cassandra-3.0.3.jar:3.0.3]
> at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readTables$227(LegacySchemaMigrator.java:237)
>  ~[apache-cassandra-3.0.3.jar:3.0.3]
> at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_74]
> at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTables(LegacySchemaMigrator.java:237)
>  ~[apache-cassandra-3.0.3.jar:3.0.3]
> at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readKeyspace(LegacySchemaMigrator.java:186)
>  ~[apache-cassandra-3.0.3.jar:3.0.3]
> at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readSchema$224(LegacySchemaMigrator.java:177)
>  ~[apache-cassandra-3.0.3.jar:3.0.3]
> at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_74]
> at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readSchema(LegacySchemaMigrator.java:177)
>  ~[apache-cassandra-3.0.3.jar:3.0.3]
> at 
> org.apache.cassandra.schema.LegacySchemaMigrator.migrate(LegacySchemaMigrator.java:77)
>  ~[apache-cassandra-3.0.3.jar:3.0.3]
> at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:223) 
> [apache-cassandra-3.0.3.jar:3.0.3]
> at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:551)
>  [apache-cassandra-3.0.3.jar:3.0.3]
> at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:679) 
> [apache-cassandra-3.0.3.jar:3.0.3]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-12023) Schema upgrade bug with super columns

2016-06-20 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-12023:
--
Status: Patch Available  (was: In Progress)

> Schema upgrade bug with super columns
> -
>
> Key: CASSANDRA-12023
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12023
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jeremiah Jordan
>Assignee: Aleksey Yeschenko
>Priority: Critical
> Fix For: 3.0.x, 3.x
>
>
> Doing some upgrade tests going from 2.0 to 2.1 to 3.0, we hit the following 
> bug, which prevents 3.0 nodes from starting.  Running the test a few times, 
> with different waits and sometimes flushing and sometimes not, I have seen 
> the following errors:
> {code}
> ERROR [main] 2016-06-17 10:42:40,112 CassandraDaemon.java:698 - Exception 
> encountered during startup
> org.apache.cassandra.serializers.MarshalException: cannot parse 'value' as 
> hex bytes
>   at 
> org.apache.cassandra.db.marshal.BytesType.fromString(BytesType.java:45) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.createColumnFromColumnRow(LegacySchemaMigrator.java:682)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.createColumnsFromColumnRows(LegacySchemaMigrator.java:641)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableMetadata(LegacySchemaMigrator.java:316)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTableMetadata(LegacySchemaMigrator.java:273)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTable(LegacySchemaMigrator.java:244)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readTables$7(LegacySchemaMigrator.java:237)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_66]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTables(LegacySchemaMigrator.java:237)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readKeyspace(LegacySchemaMigrator.java:186)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readSchema$4(LegacySchemaMigrator.java:177)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_66]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readSchema(LegacySchemaMigrator.java:177)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.migrate(LegacySchemaMigrator.java:77)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:229) 
> [apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:557)
>  [apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:685) 
> [apache-cassandra-3.0.7.jar:3.0.7]
> Caused by: java.lang.NumberFormatException: An hex string representing bytes 
> must have an even length
>   at org.apache.cassandra.utils.Hex.hexToBytes(Hex.java:57) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.db.marshal.BytesType.fromString(BytesType.java:41) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   ... 16 common frames omitted
> {code}
> {code}
> ERROR [main] 2016-06-17 10:49:21,326 CassandraDaemon.java:698 - Exception 
> encountered during startup
> java.lang.RuntimeException: org.codehaus.jackson.JsonParseException: 
> Unexpected character ('K' (code 75)): expected a valid value (number, String, 
> array, object, 'true', 'false' or 'null')
>  at [Source: java.io.StringReader@60d4475f; line: 1, column: 2]
>   at 
> org.apache.cassandra.utils.FBUtilities.fromJsonMap(FBUtilities.java:561) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableParams(LegacySchemaMigrator.java:442)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableMetadata(LegacySchemaMigrator.java:365)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTableMetadata(LegacySchemaMigrator.java:273)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTable(LegacySchemaMigrator.java:244)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> 

[jira] [Comment Edited] (CASSANDRA-12023) Schema upgrade bug with super columns

2016-06-20 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15339707#comment-15339707
 ] 

Aleksey Yeschenko edited comment on CASSANDRA-12023 at 6/20/16 3:25 PM:


||branch||testall||dtest||
|[12023-3.0|https://github.com/iamaleksey/cassandra/tree/12023-3.0]|[testall|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-12023-3.0-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-12023-3.0-dtest]|
|[12023-trunk|https://github.com/iamaleksey/cassandra/tree/12023-trunk]|[testall|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-12023-trunk-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/iamaleksey/job/iamaleksey-12023-trunk-dtest]|

{{compact_value}} in 3.0 is (correctly) being turned into {{Kind.REGULAR}}, but 
the original kind is important when decoding the name for super columns. 
Without the patch, the {{compact_value}} column is being treated like a regular 
defined counter super column from {{column_metadata}}. After the patch, it's 
properly special-cased and correctly decoded using {{UTF8Type}}.
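
For illustration, a minimal sketch of that special-casing, assuming the 
Cassandra 3.0 classes are on the classpath; the class and method names here 
are hypothetical, not the actual patch:

{code}
import java.nio.ByteBuffer;

import org.apache.cassandra.db.marshal.AbstractType;
import org.apache.cassandra.db.marshal.UTF8Type;

public final class LegacyColumnNameDecoder
{
    /**
     * Converts a legacy column name (read as text from the schema tables)
     * into its byte representation. For the compact value of a super column
     * family the stored name ("value") is plain text, so parsing it with the
     * table comparator (e.g. BytesType, which expects hex) fails with
     * "cannot parse 'value' as hex bytes".
     */
    public static ByteBuffer decodeRawName(boolean isSuper,
                                           boolean isCompactValue,
                                           String name,
                                           AbstractType<?> comparator)
    {
        AbstractType<?> nameType = (isSuper && isCompactValue)
                                 ? UTF8Type.instance
                                 : comparator;
        return nameType.fromString(name);
    }
}
{code}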


was (Author: iamaleksey):
||branch||testall||dtest||
|[12023-trunk|https://github.com/12023-3.0/cassandra/tree/12023-trunk]|[testall|http://cassci.datastax.com/view/Dev/view/12023-3.0/job/12023-3.0-12023-trunk-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/12023-3.0/job/12023-3.0-12023-trunk-dtest]|

{{compact_value}} in 3.0 is (correctly) being turned into {{Kind.REGULAR}}, but 
the original kind is important when decoding the name for super columns. 
Without the patch, the {{compact_value}} column is being treated like a regular 
defined counter super column from {{column_metadata}}. After the patch, it's 
properly special-cased and correctly decoded using {{UTF8Type}}.

> Schema upgrade bug with super columns
> -
>
> Key: CASSANDRA-12023
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12023
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jeremiah Jordan
>Assignee: Aleksey Yeschenko
>Priority: Critical
> Fix For: 3.0.x, 3.x
>
>
> Doing some upgrade tests going from 2.0 to 2.1 to 3.0, we hit the following 
> bug, which prevents 3.0 nodes from starting.  Running the test a few times with 
> different waits, and sometimes flushing, I have seen the following 
> errors:
> {code}
> ERROR [main] 2016-06-17 10:42:40,112 CassandraDaemon.java:698 - Exception 
> encountered during startup
> org.apache.cassandra.serializers.MarshalException: cannot parse 'value' as 
> hex bytes
>   at 
> org.apache.cassandra.db.marshal.BytesType.fromString(BytesType.java:45) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.createColumnFromColumnRow(LegacySchemaMigrator.java:682)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.createColumnsFromColumnRows(LegacySchemaMigrator.java:641)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableMetadata(LegacySchemaMigrator.java:316)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTableMetadata(LegacySchemaMigrator.java:273)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTable(LegacySchemaMigrator.java:244)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readTables$7(LegacySchemaMigrator.java:237)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_66]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTables(LegacySchemaMigrator.java:237)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readKeyspace(LegacySchemaMigrator.java:186)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readSchema$4(LegacySchemaMigrator.java:177)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_66]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readSchema(LegacySchemaMigrator.java:177)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.migrate(LegacySchemaMigrator.java:77)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:229) 
> [apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:557)
>  

[jira] [Commented] (CASSANDRA-12023) Schema upgrade bug with super columns

2016-06-20 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15339707#comment-15339707
 ] 

Aleksey Yeschenko commented on CASSANDRA-12023:
---

||branch||testall||dtest||
|[12023-trunk|https://github.com/12023-3.0/cassandra/tree/12023-trunk]|[testall|http://cassci.datastax.com/view/Dev/view/12023-3.0/job/12023-3.0-12023-trunk-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/12023-3.0/job/12023-3.0-12023-trunk-dtest]|

{{compact_value}} in 3.0 is (correctly) being turned into {{Kind.REGULAR}}, but 
the original kind is important when decoding the name for super columns. 
Without the patch, the {{compact_value}} column is being treated like a regular 
defined counter super column from {{column_metadata}}. After the patch, it's 
properly special-cased and correctly decoded using {{UTF8Type}}.

> Schema upgrade bug with super columns
> -
>
> Key: CASSANDRA-12023
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12023
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Jeremiah Jordan
>Assignee: Aleksey Yeschenko
>Priority: Critical
> Fix For: 3.0.x, 3.x
>
>
> Doing some upgrade tests going from 2.0 to 2.1 to 3.0, we hit the following 
> bug, which prevents 3.0 nodes from starting.  Running the test a few times with 
> different waits, and sometimes flushing, I have seen the following 
> errors:
> {code}
> ERROR [main] 2016-06-17 10:42:40,112 CassandraDaemon.java:698 - Exception 
> encountered during startup
> org.apache.cassandra.serializers.MarshalException: cannot parse 'value' as 
> hex bytes
>   at 
> org.apache.cassandra.db.marshal.BytesType.fromString(BytesType.java:45) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.createColumnFromColumnRow(LegacySchemaMigrator.java:682)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.createColumnsFromColumnRows(LegacySchemaMigrator.java:641)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.decodeTableMetadata(LegacySchemaMigrator.java:316)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTableMetadata(LegacySchemaMigrator.java:273)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTable(LegacySchemaMigrator.java:244)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readTables$7(LegacySchemaMigrator.java:237)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_66]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readTables(LegacySchemaMigrator.java:237)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readKeyspace(LegacySchemaMigrator.java:186)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.lambda$readSchema$4(LegacySchemaMigrator.java:177)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at java.util.ArrayList.forEach(ArrayList.java:1249) ~[na:1.8.0_66]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.readSchema(LegacySchemaMigrator.java:177)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.schema.LegacySchemaMigrator.migrate(LegacySchemaMigrator.java:77)
>  ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:229) 
> [apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:557)
>  [apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:685) 
> [apache-cassandra-3.0.7.jar:3.0.7]
> Caused by: java.lang.NumberFormatException: An hex string representing bytes 
> must have an even length
>   at org.apache.cassandra.utils.Hex.hexToBytes(Hex.java:57) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   at 
> org.apache.cassandra.db.marshal.BytesType.fromString(BytesType.java:41) 
> ~[apache-cassandra-3.0.7.jar:3.0.7]
>   ... 16 common frames omitted
> {code}
> {code}
> ERROR [main] 2016-06-17 10:49:21,326 CassandraDaemon.java:698 - Exception 
> encountered during startup
> java.lang.RuntimeException: org.codehaus.jackson.JsonParseException: 
> Unexpected character ('K' (code 75)): expected a valid value (number, String, 
> array, object, 'true', 'false' or 'null')
>  at [Source: java.io.StringReader@60d4475f; line: 1, column: 2]
>   at 
> org.apache.cassandra.utils.FBUtilities.fromJsonMap(FBUtilities.java:561) 
> 

[jira] [Commented] (CASSANDRA-11723) Cassandra upgrade from 2.1.11 to 3.0.5 leads to unstable nodes (jemalloc to blame)

2016-06-20 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15339684#comment-15339684
 ] 

Robert Stupp commented on CASSANDRA-11723:
--

Sorry for taking so long to get back to you. Any recent version of 
jemalloc should work.

> Cassandra upgrade from 2.1.11 to 3.0.5 leads to unstable nodes (jemalloc to 
> blame)
> --
>
> Key: CASSANDRA-11723
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11723
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Stefano Ortolani
> Fix For: 3.0.x
>
>
> Upgrade seems fine, but any restart of the node might lead to a situation 
> where the node just dies after 30 seconds to 1 minute. 
> Nothing appears in the logs besides many "FailureDetector.java:456 - Ignoring 
> interval time of 3000892567 for /10.12.a.x" messages every second (against all 
> other nodes) in debug.log, plus some spurious GraphiteErrors/ReadRepair 
> notifications:
> {code:xml}
> DEBUG [GossipStage:1] 2016-05-05 22:29:03,921 FailureDetector.java:456 - 
> Ignoring interval time of 2373187360 for /10.12.a.x
> DEBUG [GossipStage:1] 2016-05-05 22:29:03,921 FailureDetector.java:456 - 
> Ignoring interval time of 2000276196 for /10.12.a.y
> DEBUG [ReadRepairStage:24] 2016-05-05 22:29:03,990 ReadCallback.java:234 - 
> Digest mismatch:
> org.apache.cassandra.service.DigestMismatchException: Mismatch for key 
> DecoratedKey(-152946356843306763, e859fdd2f264485f42030ce261e4e12e) 
> (d6e617ece3b7bec6138b52b8974b8cab vs 31becca666a62b3c4b2fc0bab9902718)
>   at 
> org.apache.cassandra.service.DigestResolver.resolve(DigestResolver.java:85) 
> ~[apache-cassandra-3.0.5.jar:3.0.5]
>   at 
> org.apache.cassandra.service.ReadCallback$AsyncRepairRunner.run(ReadCallback.java:225)
>  ~[apache-cassandra-3.0.5.jar:3.0.5]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_60]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_60]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
> DEBUG [GossipStage:1] 2016-05-05 22:29:04,841 FailureDetector.java:456 - 
> Ignoring interval time of 3000299340 for /10.12.33.5
> ERROR [metrics-graphite-reporter-1-thread-1] 2016-05-05 22:29:05,692 
> ScheduledReporter.java:119 - RuntimeException thrown from 
> GraphiteReporter#report. Exception was suppressed.
> java.lang.IllegalStateException: Unable to compute ceiling for max when 
> histogram overflowed
>   at 
> org.apache.cassandra.utils.EstimatedHistogram.rawMean(EstimatedHistogram.java:231)
>  ~[apache-cassandra-3.0.5.jar:3.0.5]
>   at 
> org.apache.cassandra.metrics.EstimatedHistogramReservoir$HistogramSnapshot.getMean(EstimatedHistogramReservoir.java:103)
>  ~[apache-cassandra-3.0.5.jar:3.0.5]
>   at 
> com.codahale.metrics.graphite.GraphiteReporter.reportHistogram(GraphiteReporter.java:252)
>  ~[metrics-graphite-3.1.0.jar:3.1.0]
>   at 
> com.codahale.metrics.graphite.GraphiteReporter.report(GraphiteReporter.java:166)
>  ~[metrics-graphite-3.1.0.jar:3.1.0]
>   at 
> com.codahale.metrics.ScheduledReporter.report(ScheduledReporter.java:162) 
> ~[metrics-core-3.1.0.jar:3.1.0]
>   at 
> com.codahale.metrics.ScheduledReporter$1.run(ScheduledReporter.java:117) 
> ~[metrics-core-3.1.0.jar:3.1.0]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_60]
>   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 
> [na:1.8.0_60]
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>  [na:1.8.0_60]
>   at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>  [na:1.8.0_60]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [na:1.8.0_60]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_60]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_60]
> {code}
> I know this is not much, but nothing else shows up in dmesg or in any other 
> log. Any suggestions on how to debug this further?
> I upgraded two nodes so far, and it happened on both nodes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12038) dtest failure in batch_test.TestBatch.logged_batch_compatibility_3_test

2016-06-20 Thread Craig Kodman (JIRA)
Craig Kodman created CASSANDRA-12038:


 Summary: dtest failure in 
batch_test.TestBatch.logged_batch_compatibility_3_test
 Key: CASSANDRA-12038
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12038
 Project: Cassandra
  Issue Type: Test
Reporter: Craig Kodman
Assignee: DS Test Eng


example failure:

http://cassci.datastax.com/job/cassandra-3.0_novnode_dtest/252/testReport/batch_test/TestBatch/logged_batch_compatibility_3_test

Failed on CassCI build cassandra-3.0_novnode_dtest #252



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11960) Hints are not seekable

2016-06-20 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15339665#comment-15339665
 ] 

Branimir Lambov commented on CASSANDRA-11960:
-

Is it possible to save both the file offset (from {{getSourcePosition}}) as 
well as the buffer offset? This would let us resume from exactly where we 
stopped, wouldn't it?

I believe I removed the seeking because I saw it can't work correctly for 
compressed files (we store the uncompressed offset but have no way to map it 
to a compressed chunk and offset), and after that I wasn't very thorough in 
checking that it isn't actually used.
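
As a sketch of that idea (plain Java with illustrative names, not the actual 
hints API), the dispatch position could be persisted as a pair of offsets:

{code}
// Hypothetical sketch: keep both the file offset returned by
// getSourcePosition() and the offset of the next hint inside the current
// buffer, so that dispatch can resume exactly where it stopped.
public final class InputPosition
{
    public final long sourcePosition; // file offset of the current buffer
    public final int bufferOffset;    // offset of the next hint in the buffer

    public InputPosition(long sourcePosition, int bufferOffset)
    {
        this.sourcePosition = sourcePosition;
        this.bufferOffset = bufferOffset;
    }

    // For uncompressed files the absolute resume point is simply the sum.
    // For compressed files, sourcePosition would have to identify a chunk
    // boundary, since a plain uncompressed offset cannot be mapped back
    // into the file.
    public long uncompressedResumePoint()
    {
        return sourcePosition + bufferOffset;
    }
}
{code}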

> Hints are not seekable
> --
>
> Key: CASSANDRA-11960
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11960
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Robert Stupp
>Assignee: Stefan Podkowinski
>
> Got the following error message on trunk. No idea how to reproduce. But the 
> only thing the (not overridden) seek method does is throwing this exception.
> {code}
> ERROR [HintsDispatcher:2] 2016-06-05 18:51:09,397 CassandraDaemon.java:222 - 
> Exception in thread Thread[HintsDispatcher:2,1,main]
> java.lang.UnsupportedOperationException: Hints are not seekable.
>   at org.apache.cassandra.hints.HintsReader.seek(HintsReader.java:114) 
> ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatcher.seek(HintsDispatcher.java:79) 
> ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:257)
>  ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242)
>  ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220)
>  ~[main/:na]
>   at 
> org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199)
>  ~[main/:na]
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_91]
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> ~[na:1.8.0_91]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_91]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [na:1.8.0_91]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_91]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-12037) dtest failure in repair_tests.repair_test.TestRepair.repair_after_upgrade_test

2016-06-20 Thread Craig Kodman (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Kodman resolved CASSANDRA-12037.
--
Resolution: Duplicate

Duplicate of CASSANDRA-12036.

> dtest failure in repair_tests.repair_test.TestRepair.repair_after_upgrade_test
> --
>
> Key: CASSANDRA-12037
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12037
> Project: Cassandra
>  Issue Type: Test
>Reporter: Craig Kodman
>Assignee: DS Test Eng
>  Labels: dtest
>
> example failure:
> http://cassci.datastax.com/job/cassandra-3.0_dtest_win32/257/testReport/repair_tests.repair_test/TestRepair/repair_after_upgrade_test
> Failed on CassCI build cassandra-3.0_dtest_win32 #257



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12037) dtest failure in repair_tests.repair_test.TestRepair.repair_after_upgrade_test

2016-06-20 Thread Craig Kodman (JIRA)
Craig Kodman created CASSANDRA-12037:


 Summary: dtest failure in 
repair_tests.repair_test.TestRepair.repair_after_upgrade_test
 Key: CASSANDRA-12037
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12037
 Project: Cassandra
  Issue Type: Test
Reporter: Craig Kodman
Assignee: DS Test Eng


example failure:

http://cassci.datastax.com/job/cassandra-3.0_dtest_win32/257/testReport/repair_tests.repair_test/TestRepair/repair_after_upgrade_test

Failed on CassCI build cassandra-3.0_dtest_win32 #257



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-12036) dtest failure in repair_tests.repair_test.TestRepair.repair_after_upgrade_test

2016-06-20 Thread Craig Kodman (JIRA)
Craig Kodman created CASSANDRA-12036:


 Summary: dtest failure in 
repair_tests.repair_test.TestRepair.repair_after_upgrade_test
 Key: CASSANDRA-12036
 URL: https://issues.apache.org/jira/browse/CASSANDRA-12036
 Project: Cassandra
  Issue Type: Test
Reporter: Craig Kodman
Assignee: DS Test Eng


example failure:

http://cassci.datastax.com/job/cassandra-3.0_dtest_win32/257/testReport/repair_tests.repair_test/TestRepair/repair_after_upgrade_test

Failed on CassCI build cassandra-3.0_dtest_win32 #257



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11845) Hanging repair in cassandra 2.2.4

2016-06-20 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15339631#comment-15339631
 ] 

Paulo Motta commented on CASSANDRA-11845:
-

[~vin01] you may want to have a look at tuning your [tcp_keepalive 
settings|https://docs.datastax.com/en/cassandra/2.1/cassandra/troubleshooting/trblshootIdleFirewall.html]
 and see if that helps.
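
For reference, that kind of tuning is done with the standard Linux keepalive 
knobs; the values below are only an example and need to be adapted to the 
environment:

{code}
# Example only -- tune per environment. Send the first keepalive probe after
# 60s of idle time, then probe every 10s, and drop the connection after 3
# failed probes:
sysctl -w net.ipv4.tcp_keepalive_time=60
sysctl -w net.ipv4.tcp_keepalive_intvl=10
sysctl -w net.ipv4.tcp_keepalive_probes=3
{code}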

That said, this will ultimately be fixed by CASSANDRA-11841, which adds 
keep-alive at the application layer. You are also probably seeing the effects 
of CASSANDRA-10992, which causes the stream sessions of compressed tables to 
hang after a connection reset exception and should be fixed soon.

> Hanging repair in cassandra 2.2.4
> -
>
> Key: CASSANDRA-11845
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11845
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
> Environment: Centos 6
>Reporter: vin01
>Priority: Minor
> Attachments: cassandra-2.2.4.error.log
>
>
> So after increasing the streaming_timeout_in_ms value to 3 hours, I was able 
> to avoid the socketTimeout errors I was getting earlier 
> (https://issues.apache.org/jira/browse/CASSANDRA-11826), but now the issue 
> is that repair just stays stuck.
> Current status:
> [2016-05-19 05:52:50,835] Repair session a0e590e1-1d99-11e6-9d63-b717b380ffdd 
> for range (-3309358208555432808,-3279958773585646585] finished (progress: 54%)
> [2016-05-19 05:53:09,446] Repair session a0e590e3-1d99-11e6-9d63-b717b380ffdd 
> for range (8149151263857514385,8181801084802729407] finished (progress: 55%)
> [2016-05-19 05:53:13,808] Repair session a0e5b7f1-1d99-11e6-9d63-b717b380ffdd 
> for range (3372779397996730299,3381236471688156773] finished (progress: 55%)
> [2016-05-19 05:53:27,543] Repair session a0e5b7f3-1d99-11e6-9d63-b717b380ffdd 
> for range (-4182952858113330342,-4157904914928848809] finished (progress: 55%)
> [2016-05-19 05:53:41,128] Repair session a0e5df00-1d99-11e6-9d63-b717b380ffdd 
> for range (6499366179019889198,6523760493740195344] finished (progress: 55%)
> And it's 10:46:25 now, almost 5 hours since it has been stuck right there.
> Earlier I could see the repair session progressing in system.log, but there are 
> no logs coming in right now; all I get are the regular index summary 
> redistribution logs.
> The last repair entries I saw in the logs were:
> INFO  [RepairJobTask:5] 2016-05-19 05:53:41,125 RepairJob.java:152 - [repair 
> #a0e5df00-1d99-11e6-9d63-b717b380ffdd] TABLE_NAME is fully synced
> INFO  [RepairJobTask:5] 2016-05-19 05:53:41,126 RepairSession.java:279 - 
> [repair #a0e5df00-1d99-11e6-9d63-b717b380ffdd] Session completed successfully
> INFO  [RepairJobTask:5] 2016-05-19 05:53:41,126 RepairRunnable.java:232 - 
> Repair session a0e5df00-1d99-11e6-9d63-b717b380ffdd for range 
> (6499366179019889198,6523760493740195344] finished
> It's an incremental repair, and in the "nodetool netstats" output I can see 
> entries like:
> Repair e3055fb0-1d9d-11e6-9d63-b717b380ffdd
> /Node-2
> Receiving 8 files, 1093461 bytes total. Already received 8 files, 
> 1093461 bytes total
> 
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80872-big-Data.db
>  399475/399475 bytes(100%) received from idx:0/Node-2
> 
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80879-big-Data.db
>  53809/53809 bytes(100%) received from idx:0/Node-2
> 
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80878-big-Data.db
>  89955/89955 bytes(100%) received from idx:0/Node-2
> 
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80881-big-Data.db
>  168790/168790 bytes(100%) received from idx:0/Node-2
> 
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80886-big-Data.db
>  107785/107785 bytes(100%) received from idx:0/Node-2
> 
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80880-big-Data.db
>  52889/52889 bytes(100%) received from idx:0/Node-2
> 
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80884-big-Data.db
>  148882/148882 bytes(100%) received from idx:0/Node-2
> 
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/tmp-la-80883-big-Data.db
>  71876/71876 bytes(100%) received from idx:0/Node-2
> Sending 5 files, 863321 bytes total. Already sent 5 files, 863321 
> bytes total
> 
> /data/cassandra/data/KEYSPACE_NAME/TABLE_NAME-01ad9750723e11e4bfe0d3887930a87c/la-73168-big-Data.db
>  161895/161895 bytes(100%) sent to idx:0/Node-2
> 
> 

[jira] [Updated] (CASSANDRA-11854) Remove finished streaming connections from MessagingService

2016-06-20 Thread Joshua McKenzie (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joshua McKenzie updated CASSANDRA-11854:

Reviewer: Marcus Eriksson  (was: Yuki Morishita)

> Remove finished streaming connections from MessagingService
> ---
>
> Key: CASSANDRA-11854
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11854
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Paulo Motta
>Assignee: Paulo Motta
> Attachments: oom.png
>
>
> When a new {{IncomingStreamingConnection}} is created, [we register it in the 
> connections 
> map|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/MessagingService.java#L1109]
>  of {{MessagingService}}, but we [only remove it if there is an 
> exception|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/IncomingStreamingConnection.java#L83]
>  while attaching the socket to the stream session.
> On nodes with SSL and a large number of vnodes, these old connections can 
> accumulate after many repair sessions and cause an OOM (heap dump attached).
> The connection should be removed from the connections map once it is 
> finished, so that it can be garbage collected.
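
A minimal sketch of that lifecycle (plain Java with illustrative names, not the 
actual MessagingService API): register the connection when it is accepted, and 
always deregister it when handling finishes, not only on error:

{code}
import java.net.InetAddress;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public final class ConnectionRegistry
{
    private final ConcurrentMap<InetAddress, Object> connections = new ConcurrentHashMap<>();

    public void handle(InetAddress peer, Object connection, Runnable attachToStreamSession)
    {
        connections.put(peer, connection);
        try
        {
            attachToStreamSession.run(); // may throw, as on the exception path today
        }
        finally
        {
            // The proposed fix: remove on completion as well as on failure, so
            // finished connections become unreachable and can be collected.
            connections.remove(peer, connection);
        }
    }
}
{code}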



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

