[jira] [Commented] (CASSANDRA-10406) Nodetool supports to rebuild from specific ranges.

2016-04-14 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242309#comment-15242309
 ] 

Yuki Morishita commented on CASSANDRA-10406:


Thanks!

Unfortunately, 2.1 now accepts critical bug fixes only, and new features 
like this go to trunk only.
So I took your patch, converted it to trunk, and pushed it as below.

||branch||testall||dtest||
|[10406-trunk|https://github.com/yukim/cassandra/tree/10406-trunk]|[testall|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-10406-trunk-testall/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-10406-trunk-dtest/lastCompletedBuild/testReport/]|

I modified the part that converts the given range representation to a {{Range}} 
object to use a regexp, so that we don't get a confusing error message when 
parsing fails.

Have a look; since the tests look good, I will commit if nothing seems wrong.
Then we can work together on the dtest in CASSANDRA-10526.

> Nodetool supports to rebuild from specific ranges.
> --
>
> Key: CASSANDRA-10406
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10406
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Dikang Gu
>Assignee: Dikang Gu
> Fix For: 2.1.x
>
> Attachments: 0001-nodetool-rebuild-support-range-tokens.patch
>
>
> Add the 'nodetool rebuildrange' command, so that if `nodetool rebuild` 
> fails, we do not need to rebuild all the ranges and can just rebuild the 
> failed ones.
> Should be easily ported to all versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11549) cqlsh: COPY FROM ignores NULL values in conversion

2016-04-14 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242297#comment-15242297
 ] 

Stefania commented on CASSANDRA-11549:
--

Sorry about that, I must have run the initial CI job on trunk without the 
correct dtest branch. The problem was with the test code: it needs to check for 
null indicators before calling {{format_value}}.

I fixed the test code and, to be on the safe side, restarted all 4 CI jobs to 
make sure the test code works with all C* branches. Results are still pending.
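A minimal sketch of the test-code fix described above: check for null indicators before formatting. The helper names ({{format_for_comparison}}, {{NULL_INDICATORS}}) are illustrative stand-ins, not the real dtest helpers:

```python
# Hypothetical sketch only -- not the actual dtest code. The fix is to
# short-circuit on null indicators instead of passing them to the formatter.
NULL_INDICATORS = ('null', 'None', '')

def format_for_comparison(value, format_value):
    """Format a cell for comparison, treating null indicators as None."""
    if value is None or value in NULL_INDICATORS:
        return None  # don't run null markers through format_value
    return format_value(value)
```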

> cqlsh: COPY FROM ignores NULL values in conversion
> --
>
> Key: CASSANDRA-11549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11549
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Stefania
>Assignee: Stefania
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
>
> COPY FROM fails to import empty values. 
> For example:
> {code}
> $ cat test.csv
> a,10,20
> b,30,
> c,50,60
> $ cqlsh
> cqlsh> create keyspace if not exists test with replication = {'class': 
> 'SimpleStrategy', 'replication_factor':1};
> cqlsh> create table if not exists test.test (t text primary key, i1 int, i2 
> int);
> cqlsh> copy test.test (t,i1,i2) from 'test.csv';
> {code}
> Imports:
> {code}
> cqlsh> select * from test.test;
>  t | i1 | i2
> ---+----+----
>  a | 10 | 20
>  c | 50 | 60
> (2 rows)
> {code}
> and generates a {{ParseError - invalid literal for int() with base 10: '',  
> given up without retries}} for the row with an empty value.
> It should import the empty value as a {{null}} and there should be no error:
> {code}
> cqlsh> select * from test.test;
>  t | i1 | i2
> ---+----+------
>  a | 10 |   20
>  c | 50 |   60
>  b | 30 | null
> (3 rows)
> {code}





[jira] [Commented] (CASSANDRA-11437) Make number of cores used by cqlsh COPY visible to testing code

2016-04-14 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242267#comment-15242267
 ] 

Stefania commented on CASSANDRA-11437:
--

Thank you for the review and for testing the patch, marking this as ready to 
commit.

> Make number of cores used by cqlsh COPY visible to testing code
> ---
>
> Key: CASSANDRA-11437
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11437
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Testing
>Reporter: Jim Witschey
>Assignee: Stefania
>Priority: Minor
>  Labels: lhf
> Fix For: 3.x
>
>
> As per this conversation with [~Stefania]:
> https://github.com/riptano/cassandra-dtest/pull/869#issuecomment-200597829
> we don't currently have a way to verify that the test environment variable 
> {{CQLSH_COPY_TEST_NUM_CORES}} actually affects the behavior of {{COPY}} in 
> the intended way. If this were added, we could make our tests of the one-core 
> edge case a little stricter.
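As a sketch of what such a hook might look like, assuming the environment variable simply overrides the detected core count (the actual cqlsh COPY code may differ):

```python
import os

# Illustrative only: a test hook like CQLSH_COPY_TEST_NUM_CORES could
# override the number of worker processes COPY would otherwise detect.
def get_num_processes(detected_cores):
    override = os.environ.get('CQLSH_COPY_TEST_NUM_CORES')
    if override is not None:
        return int(override)
    return detected_cores
```

With a hook like this, a test can set the variable to {{1}} and then assert that COPY actually ran with one worker, which is the verification the ticket asks for.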





[jira] [Updated] (CASSANDRA-11437) Make number of cores used by cqlsh COPY visible to testing code

2016-04-14 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-11437:
-
Status: Ready to Commit  (was: Patch Available)

> Make number of cores used by cqlsh COPY visible to testing code
> ---
>
> Key: CASSANDRA-11437
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11437
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Testing
>Reporter: Jim Witschey
>Assignee: Stefania
>Priority: Minor
>  Labels: lhf
> Fix For: 3.x
>
>
> As per this conversation with [~Stefania]:
> https://github.com/riptano/cassandra-dtest/pull/869#issuecomment-200597829
> we don't currently have a way to verify that the test environment variable 
> {{CQLSH_COPY_TEST_NUM_CORES}} actually affects the behavior of {{COPY}} in 
> the intended way. If this were added, we could make our tests of the one-core 
> edge case a little stricter.





[jira] [Updated] (CASSANDRA-11437) Make number of cores used by cqlsh COPY visible to testing code

2016-04-14 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-11437:
-
Summary: Make number of cores used by cqlsh COPY visible to testing code  
(was: Make number of cores used for copy tasks visible)

> Make number of cores used by cqlsh COPY visible to testing code
> ---
>
> Key: CASSANDRA-11437
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11437
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Testing
>Reporter: Jim Witschey
>Assignee: Stefania
>Priority: Minor
>  Labels: lhf
> Fix For: 3.x
>
>
> As per this conversation with [~Stefania]:
> https://github.com/riptano/cassandra-dtest/pull/869#issuecomment-200597829
> we don't currently have a way to verify that the test environment variable 
> {{CQLSH_COPY_TEST_NUM_CORES}} actually affects the behavior of {{COPY}} in 
> the intended way. If this were added, we could make our tests of the one-core 
> edge case a little stricter.





[jira] [Commented] (CASSANDRA-11474) cqlsh: COPY FROM should use regular inserts for single statement batches

2016-04-14 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242254#comment-15242254
 ] 

Stefania commented on CASSANDRA-11474:
--

Thanks for the review.

I've removed the single-insert special case from the patch on trunk but kept 
the fix for reporting errors during the initialization phase of worker 
processes. I've also prepared the ticket for commit by editing CHANGES.txt and 
the commit messages; please check whether the description seems reasonable.

I've restarted CI, results are still pending.

Also, here is the dtest [pull 
request|https://github.com/riptano/cassandra-dtest/pull/928].

> cqlsh: COPY FROM should use regular inserts for single statement batches
> 
>
> Key: CASSANDRA-11474
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11474
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Stefania
>Assignee: Stefania
>Priority: Minor
>  Labels: lhf
> Fix For: 2.2.x, 3.0.x, 3.x
>
>
> I haven't reproduced it with a test yet but, from code inspection, if CQL 
> rows are larger than {{batch_size_fail_threshold_in_kb}} and this parameter 
> cannot be changed, then data import will fail.
> Users can control the batch size by setting MAXBATCHSIZE.
> If a batch contains a single statement, there is no need to use a batch and 
> we should use normal inserts instead or, alternatively, we should skip the 
> batch size check for unlogged batches with only one statement.
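The idea in the description can be sketched as follows (illustrative only, not cqlsh's actual query-building code): wrap statements in an unlogged batch only when there is more than one, so single rows bypass the server-side batch size check.

```python
# Hypothetical sketch: single statements are sent as-is; only multi-statement
# groups are wrapped in an UNLOGGED BATCH, avoiding the batch size threshold
# for rows that would exceed batch_size_fail_threshold_in_kb on their own.
def make_query(statements):
    if len(statements) == 1:
        return statements[0]
    return "BEGIN UNLOGGED BATCH\n%s;\nAPPLY BATCH" % ";\n".join(statements)
```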





[jira] [Commented] (CASSANDRA-11560) dtest failure in user_types_test.TestUserTypes.udt_subfield_test

2016-04-14 Thread Michael Shuler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242196#comment-15242196
 ] 

Michael Shuler commented on CASSANDRA-11560:


Merged. Thanks, Tyler!

> dtest failure in user_types_test.TestUserTypes.udt_subfield_test
> 
>
> Key: CASSANDRA-11560
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11560
> Project: Cassandra
>  Issue Type: Test
>Reporter: Michael Shuler
>Assignee: Tyler Hobbs
>  Labels: dtest
>
> example failure:
> http://cassci.datastax.com/job/trunk_dtest/1125/testReport/user_types_test/TestUserTypes/udt_subfield_test
> Failed on CassCI build trunk_dtest #1125
> Appears to be a test problem:
> {noformat}
> Error Message
> 'NoneType' object is not iterable
>  >> begin captured logging << 
> dtest: DEBUG: cluster ccm directory: /mnt/tmp/dtest-Kzg9Sk
> dtest: DEBUG: Custom init_config not found. Setting defaults.
> dtest: DEBUG: Done setting configuration options:
> {   'initial_token': None,
> 'num_tokens': '32',
> 'phi_convict_threshold': 5,
> 'range_request_timeout_in_ms': 1,
> 'read_request_timeout_in_ms': 1,
> 'request_timeout_in_ms': 1,
> 'truncate_request_timeout_in_ms': 1,
> 'write_request_timeout_in_ms': 1}
> - >> end captured logging << -
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/automaton/cassandra-dtest/tools.py", line 253, in wrapped
> f(obj)
>   File "/home/automaton/cassandra-dtest/user_types_test.py", line 767, in 
> udt_subfield_test
> self.assertEqual(listify(rows[0]), [[None]])
>   File "/home/automaton/cassandra-dtest/user_types_test.py", line 25, in 
> listify
> for i in item:
> "'NoneType' object is not iterable\n >> begin captured 
> logging << \ndtest: DEBUG: cluster ccm directory: 
> /mnt/tmp/dtest-Kzg9Sk\ndtest: DEBUG: Custom init_config not found. Setting 
> defaults.\ndtest: DEBUG: Done setting configuration options:\n{   
> 'initial_token': None,\n'num_tokens': '32',\n'phi_convict_threshold': 
> 5,\n'range_request_timeout_in_ms': 1,\n
> 'read_request_timeout_in_ms': 1,\n'request_timeout_in_ms': 1,\n   
>  'truncate_request_timeout_in_ms': 1,\n'write_request_timeout_in_ms': 
> 1}\n- >> end captured logging << 
> -"
> {noformat}





[jira] [Commented] (CASSANDRA-11452) Cache implementation using LIRS eviction for in-process page cache

2016-04-14 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242175#comment-15242175
 ] 

Benedict commented on CASSANDRA-11452:
--

bq. I'd expect the collision to be flushed out by the eviction when we detect 
that the victim's and candidate's hash codes are equal. To me the victim means 
the item that the eviction policy selected, so the jittered LRU is selecting 
the guard. It might also make for simpler code, since that method is long in 
order to handle the various edge cases.

I think we may be suffering from the ambiguities of the written word.  I 
thought you meant to change the jitter to select the victim rather than the 
guard, i.e. to remove something other than the LRU.  If you just mean to 
calculate the guard earlier, then I was raising an invalid contention.

I must admit that, since you specifically raise the hash comparison, I don't 
entirely follow its logic (I apologise if this is my density; I've not put as 
much thought into it as I could).  It seems to me that if the LRU and MRU are 
colliding, for instance, then we hit the problem and the comparison does 
nothing to stop it.  And it doesn't stop two collisions entering the map unless 
the collision appears only when the LRU collides with it on admission.  I 
haven't looked closely at the test cases so I'm not sure what it's meant to be 
stopping, but I suspect the jitter is a stronger, more general solution.

bq. Sorry, this is existing code in the sketch, as suggested by Thomas Mueller 
(H2). That was to protect against hash collision attacks exploiting the hash 
function.

Ah. Personally I don't see any harm in regularising the bits over the address 
space with a random seed - a bit of variance never hurt anybody, and since only 
tests have a reliable data distribution, only our benchmarks are likely to 
notice it in any functional sense.

bq. Unfortunately good traces are also hard to find.

_Any_ traces are hard to find.  The main thing that stopped me exploring some 
of these ideas myself over the past few years was the perceived impossibility 
of finding a suite of good quality traces.  As much as I am impressed by the 
paper, when I first encountered W-TinyLFU I was most excited to see a suite of 
readily available traces with a simulator.  Of course, given the bar has been 
raised for making use of the idea, the time investment for doing something 
useful has gone up correspondingly, but at least it's on the fun side of the 
equation.

bq. I'm really interested to see what other avenues people take to exploit 
sketches in a cache policy

Yup.  Me too.

> Cache implementation using LIRS eviction for in-process page cache
> --
>
> Key: CASSANDRA-11452
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11452
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>
> Following up from CASSANDRA-5863, to make best use of caching and to avoid 
> having to explicitly mark compaction accesses as non-cacheable, we need a 
> cache implementation that uses an eviction algorithm that can better handle 
> non-recurring accesses.





[jira] [Updated] (CASSANDRA-11560) dtest failure in user_types_test.TestUserTypes.udt_subfield_test

2016-04-14 Thread Tyler Hobbs (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tyler Hobbs updated CASSANDRA-11560:

Status: Patch Available  (was: In Progress)

> dtest failure in user_types_test.TestUserTypes.udt_subfield_test
> 
>
> Key: CASSANDRA-11560
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11560
> Project: Cassandra
>  Issue Type: Test
>Reporter: Michael Shuler
>Assignee: Tyler Hobbs
>  Labels: dtest
>
> example failure:
> http://cassci.datastax.com/job/trunk_dtest/1125/testReport/user_types_test/TestUserTypes/udt_subfield_test
> Failed on CassCI build trunk_dtest #1125
> Appears to be a test problem:
> {noformat}
> Error Message
> 'NoneType' object is not iterable
>  >> begin captured logging << 
> dtest: DEBUG: cluster ccm directory: /mnt/tmp/dtest-Kzg9Sk
> dtest: DEBUG: Custom init_config not found. Setting defaults.
> dtest: DEBUG: Done setting configuration options:
> {   'initial_token': None,
> 'num_tokens': '32',
> 'phi_convict_threshold': 5,
> 'range_request_timeout_in_ms': 1,
> 'read_request_timeout_in_ms': 1,
> 'request_timeout_in_ms': 1,
> 'truncate_request_timeout_in_ms': 1,
> 'write_request_timeout_in_ms': 1}
> - >> end captured logging << -
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/automaton/cassandra-dtest/tools.py", line 253, in wrapped
> f(obj)
>   File "/home/automaton/cassandra-dtest/user_types_test.py", line 767, in 
> udt_subfield_test
> self.assertEqual(listify(rows[0]), [[None]])
>   File "/home/automaton/cassandra-dtest/user_types_test.py", line 25, in 
> listify
> for i in item:
> "'NoneType' object is not iterable\n >> begin captured 
> logging << \ndtest: DEBUG: cluster ccm directory: 
> /mnt/tmp/dtest-Kzg9Sk\ndtest: DEBUG: Custom init_config not found. Setting 
> defaults.\ndtest: DEBUG: Done setting configuration options:\n{   
> 'initial_token': None,\n'num_tokens': '32',\n'phi_convict_threshold': 
> 5,\n'range_request_timeout_in_ms': 1,\n
> 'read_request_timeout_in_ms': 1,\n'request_timeout_in_ms': 1,\n   
>  'truncate_request_timeout_in_ms': 1,\n'write_request_timeout_in_ms': 
> 1}\n- >> end captured logging << 
> -"
> {noformat}





[jira] [Commented] (CASSANDRA-11560) dtest failure in user_types_test.TestUserTypes.udt_subfield_test

2016-04-14 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242164#comment-15242164
 ] 

Tyler Hobbs commented on CASSANDRA-11560:
-

It was indeed just a test problem.  dtest pull request to fix it here: 
https://github.com/riptano/cassandra-dtest/pull/927

> dtest failure in user_types_test.TestUserTypes.udt_subfield_test
> 
>
> Key: CASSANDRA-11560
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11560
> Project: Cassandra
>  Issue Type: Test
>Reporter: Michael Shuler
>Assignee: Tyler Hobbs
>  Labels: dtest
>
> example failure:
> http://cassci.datastax.com/job/trunk_dtest/1125/testReport/user_types_test/TestUserTypes/udt_subfield_test
> Failed on CassCI build trunk_dtest #1125
> Appears to be a test problem:
> {noformat}
> Error Message
> 'NoneType' object is not iterable
>  >> begin captured logging << 
> dtest: DEBUG: cluster ccm directory: /mnt/tmp/dtest-Kzg9Sk
> dtest: DEBUG: Custom init_config not found. Setting defaults.
> dtest: DEBUG: Done setting configuration options:
> {   'initial_token': None,
> 'num_tokens': '32',
> 'phi_convict_threshold': 5,
> 'range_request_timeout_in_ms': 1,
> 'read_request_timeout_in_ms': 1,
> 'request_timeout_in_ms': 1,
> 'truncate_request_timeout_in_ms': 1,
> 'write_request_timeout_in_ms': 1}
> - >> end captured logging << -
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/automaton/cassandra-dtest/tools.py", line 253, in wrapped
> f(obj)
>   File "/home/automaton/cassandra-dtest/user_types_test.py", line 767, in 
> udt_subfield_test
> self.assertEqual(listify(rows[0]), [[None]])
>   File "/home/automaton/cassandra-dtest/user_types_test.py", line 25, in 
> listify
> for i in item:
> "'NoneType' object is not iterable\n >> begin captured 
> logging << \ndtest: DEBUG: cluster ccm directory: 
> /mnt/tmp/dtest-Kzg9Sk\ndtest: DEBUG: Custom init_config not found. Setting 
> defaults.\ndtest: DEBUG: Done setting configuration options:\n{   
> 'initial_token': None,\n'num_tokens': '32',\n'phi_convict_threshold': 
> 5,\n'range_request_timeout_in_ms': 1,\n
> 'read_request_timeout_in_ms': 1,\n'request_timeout_in_ms': 1,\n   
>  'truncate_request_timeout_in_ms': 1,\n'write_request_timeout_in_ms': 
> 1}\n- >> end captured logging << 
> -"
> {noformat}





[jira] [Commented] (CASSANDRA-11452) Cache implementation using LIRS eviction for in-process page cache

2016-04-14 Thread Ben Manes (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242133#comment-15242133
 ] 

Ben Manes commented on CASSANDRA-11452:
---

{quote}
I think it's better for the jitter to not affect the victim, since if there is 
a collision that doesn't get flushed out that would permit the cache efficiency 
to remain degraded indefinitely
{quote}

I'd expect the collision to be flushed out by the eviction when we detect 
that the victim's and candidate's hash codes are equal. To me the victim means 
the item that the eviction policy selected, so the jittered LRU is selecting 
the guard. It might also make for simpler code, since that method is long in 
order to handle the various edge cases.

{quote}
I don't recall that suggestion, and don't see a corresponding change in the 
codebase; remind me?
{quote}

Sorry, this is existing code in the sketch, as suggested by Thomas Mueller (H2). 
That was to protect against hash collision attacks exploiting the hash 
function. I know this is a bit weak, since Java originally tried that and 
switched to red-black tree bins instead. It provides a little unpredictability 
in the sketch, which might be a good thing.

{quote}
There's a wealth of possible avenues to explore.
{quote}

I'm really interested to see what other avenues people take to exploit sketches 
in a cache policy. The two citations of the original paper were dismissive; I 
think the revision has more weight due to the comparative analysis. There seem 
to be a lot of optimization tricks to explore. Unfortunately, good traces are 
also hard to find.

> Cache implementation using LIRS eviction for in-process page cache
> --
>
> Key: CASSANDRA-11452
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11452
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>
> Following up from CASSANDRA-5863, to make best use of caching and to avoid 
> having to explicitly mark compaction accesses as non-cacheable, we need a 
> cache implementation that uses an eviction algorithm that can better handle 
> non-recurring accesses.





[jira] [Commented] (CASSANDRA-11452) Cache implementation using LIRS eviction for in-process page cache

2016-04-14 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242109#comment-15242109
 ] 

Benedict commented on CASSANDRA-11452:
--

Breaking out of nesting hell...

bq. I'd probably apply the jitter as part of the selection of the victim near 
the top of the loop

I think it's better for the jitter to not affect the victim, since if there is 
a collision that doesn't get flushed out, that would permit the cache efficiency 
to remain degraded indefinitely (with far fewer admissions, and so lower 
adaptation of the cache and greater reliance on the smaller LRU and the access 
behaviours there).

As far as I can tell, the actual value used for determining admission doesn't in 
any way need to be coupled to the victim.  It's possible there are plenty of 
other ways of arriving at a good threshold, and perhaps they should even be 
explored.  For instance, just riffing here: if one were to massively increase 
the size of the sketches, lengthen their lifecycle, and shrink the main LRU, 
raising the threshold may raise the efficiency of the cache overall by only 
admitting elements with a very high chance of reuse, even if they have less 
available space.  Obviously this is highly dependent on the data distribution.  
It's possible that the best strategies could even be calculated from the 
statistics one could infer from the sketches.  There's a wealth of possible 
avenues to explore.

bq. Do you think the random seed used by the sketch is still a good addition?

I don't recall that suggestion, and don't see a corresponding change in the 
codebase; remind me?

bq.  I won't have the bandwidth to test this until the evening.

No worries - I'm certainly not rushing you.  This is just a fun little 
distraction for me.

> Cache implementation using LIRS eviction for in-process page cache
> --
>
> Key: CASSANDRA-11452
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11452
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>
> Following up from CASSANDRA-5863, to make best use of caching and to avoid 
> having to explicitly mark compaction accesses as non-cacheable, we need a 
> cache implementation that uses an eviction algorithm that can better handle 
> non-recurring accesses.





[jira] [Commented] (CASSANDRA-11354) PrimaryKeyRestrictionSet should be refactored

2016-04-14 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242083#comment-15242083
 ] 

Tyler Hobbs commented on CASSANDRA-11354:
-

The patch looks good to me overall.  The new class structure is much more 
understandable. I think the only remaining confusing part is the difference 
between {{PartitionKeyRestrictions}} and {{PartitionKeyRestrictionSet}}, which 
is not obvious at first.  I think perhaps renaming 
{{PartitionKeyRestrictionSet}} to {{PartitionKeySingleRestrictionSet}} may make 
this clearer (although the new name is long).  Editing the javadocs to clarify 
that {{PartitionKeyRestrictionSet}} specifically doesn't include token 
restrictions would also help.

Other than that, I only have a couple of nitpicks:
* In {{ClusteringColumnRestrictions}}, most of the checks in the private 
constructor can be moved to {{mergeWith()}} to make it more clear what is going 
on.
* {{Restrictions}} has two redundant methods that are also declared in 
{{Restriction}}: {{getColumnDefs()}} and {{getFunctions()}}

With those fixed, +1

> PrimaryKeyRestrictionSet should be refactored
> -
>
> Key: CASSANDRA-11354
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11354
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>
> While reviewing CASSANDRA-11310 I realized that the code of 
> {{PrimaryKeyRestrictionSet}} was really confusing.
> The 2 main issues are:
> * the fact that it is used for both partition key and clustering column 
> restrictions, whereas those types of columns require different processing
> * the {{isEQ}}, {{isSlice}}, {{isIN}} and {{isContains}} methods should not 
> be there, as the set of restrictions might not match any of those categories 
> when secondary indexes are used.





[jira] [Comment Edited] (CASSANDRA-7186) alter table add column not always propogating

2016-04-14 Thread Uttam Phalnikar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242063#comment-15242063
 ] 

Uttam Phalnikar edited comment on CASSANDRA-7186 at 4/14/16 10:40 PM:
--

Actually, restarting a node fixed the problem on that node. Is there any cache 
flush I can do to achieve the same result without a restart?
PS: nodetool repair system and nodetool flush system didn't help


was (Author: uttam1105):
Actually restart of a node fixed problem on that node. Is there any cache flush 
that I need to achieve same results without restart?

> alter table add column not always propogating
> -
>
> Key: CASSANDRA-7186
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7186
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Martin Meyer
>Assignee: Philip Thompson
> Fix For: 2.0.12
>
>
> I've seen many times in Cassandra 2.0.6 that adding columns to existing 
> tables seems to not fully propagate to our entire cluster. We add an extra 
> column to various tables maybe 0-2 times a week, and so far many of these 
> ALTERs have resulted in at least one node showing the old table description a 
> pretty long time (~30 mins) after the original ALTER command was issued.
> We originally identified this issue when a connected client would complain 
> that a column it issued a SELECT for wasn't a known column, at which point we 
> have to ask each node to describe the most recently altered table. One of 
> them will not know about the newly added field. Issuing the original ALTER 
> statement on that node makes everything work correctly.
> We have seen this issue on multiple tables (we don't always alter the same 
> one). It has affected various nodes in the cluster (it is not always the same 
> one that fails to get the mutation propagated). No new nodes have been added 
> to the cluster recently. All nodes are homogeneous (hardware and software), 
> running 2.0.6. We don't see any particular errors or exceptions on the node 
> that didn't get the schema update, only the later error from a Java client 
> about asking for an unknown column in a SELECT. We have to check each node 
> manually to find the offender. The tables we have seen this on are under 
> fairly heavy read and write load, but we haven't altered any tables that are 
> not, so that might not be important.





[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2016-04-14 Thread Joshua McKenzie (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242064#comment-15242064
 ] 

Joshua McKenzie commented on CASSANDRA-8844:


bq. On recovery, we are going to delete the CDC Commit Logs instead of moving 
them to the CDC Overflow folder; we use ACLSM#deleteUntrackedCommitLogSegment, 
which isn't overridden for the CDC case
Fixed.

bq. Right now, there is no way to avoid getting a failed allocation even if the 
consumer is matching the speed of the CDC overflow logic. CDC keyspaces will 
have at least 250ms of failure after it has written up to capacity files, even 
though the space could have been reclaimed. I suggest that in 
CommitLogSegmentManagerCDC#discard, we also call maybeUpdateCDCSizeCounterAsync 
so that we update the size, but not too quickly
Fixed, though I also augmented that signature to allow for bypassing the sleep 
interval. Rather than forcing a 250ms wait / throttling like we need on 
mutation application, I think it's reasonable to have no sleep on the discard 
path and immediately recover any unknown free space in the counter if a 
consumer is live.
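The throttled-recalculation-with-bypass idea can be sketched like this (names and structure are hypothetical, not the actual CommitLogSegmentManagerCDC API): mutation-path callers go through the throttle, while the discard path bypasses it to recover freed space immediately.

```python
import time

# Illustrative sketch only. Recalculation normally runs at most once per
# interval (the 250ms throttle discussed above); the discard path passes
# bypass_throttle=True so freed segment space is reflected right away.
class CDCSizeTracker:
    def __init__(self, interval=0.25):
        self.interval = interval
        self.last_calc = 0.0
        self.size = 0

    def maybe_update(self, compute_size, bypass_throttle=False):
        now = time.monotonic()
        if bypass_throttle or now - self.last_calc >= self.interval:
            self.size = compute_size()  # e.g. walk the CDC directory
            self.last_calc = now
        return self.size
```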

bq. The reset of recalculating in CLSMCDC#updateCDCDirectorySize should happen 
inside of a finally; for example, if we get an IOException, we will never be 
able to recalculate the CDC directory size. If this is intentional, we should 
make sure that we explicitly flag that decision
Good catch. Changed to put the manager wake and CAS in finally so it shouldn't 
be exposed to a potential hang there.

bq. We aren't actually splitting the space between the regular Commit Log and 
the CDC log, so I'd think we should use the same space for the commit log and 
the CDC log
Not sure I understand here. The data/cdc_overflow and data/cdc directories are 
split on disk, but we don't necessarily have independent allocation space for 
each directory. The same goes for cdc and commitlog. I'd actually be more in 
favor of allowing tuning of all three rather than glomming cdc together with 
commitlog. Thoughts?

bq. In DropKeyspaceStatement#announceMigration, we should keep the catching of 
the exception as we had before; this check is not sufficient, as it is the same 
as in the validate step. Even though we've passed validation, we could still 
get an exception when we try to update the schema
Reverted. I dislike the flow of the code in this method and I'm fairly sure 
{{ifExists && oldKsm == null}} better reflects the logical intent of what we 
were going for before (ConfigurationException on non-existent w/ifExists is 
ok), but I concede the point that the new code isn't strictly necessary in 
terms of this patch and is also subtly behaviorally different.

bq. The FileUtils.createDirectory calls should be in the checks for cdc being 
empty; right now, it only works if saved_caches hasn't been specified
Not sure I follow. It's also in DatabaseDescriptor.createAllDirectories. Could 
you clarify the context of this point a bit?

bq. In Parser.g, do we need to use anything in the value of the map? or can we 
just use a null value?
Done

bq. In Config.java, the change in name from 
commitlog_max_compression_buffers_in_pool to 
commitlog_max_compression_buffers_per_pool isn't compatible for users who used 
that option; we need a NEWS entry for it
Keeping the rename, noted in NEWS.txt. This was an undocumented variable in the 
.yaml, so I suspect overrides are limited in the wild. I also added a more formal 
NEWS.txt entry and a CHANGES.txt entry for CDC as a feature.

bq. In PropertyDefinitions#getSet, can we just use the keySet instead of 
creating a new HashSet for it?
Fixed. Missed the forest for the trees while implementing that one.

bq. In AbstractCommitLogSegmentManager#start, we should include the type in the 
name of the Thread so that we can tell whether the thread is for the standard 
CL or the CDC CL
Added.

bq. Would be good to add a flag to CommitLogReadErrorReason to tell whether the 
error is recoverable or not; this would explain whether we will check the 
return value or not in CR#readMutation
I augmented the enum names to indicate which are recoverable and which are not 
and extended the interface to support that. I didn't like having those 2 
concepts (recoverable and unrecoverable errors) living in the same method since 
it was rather misleading to have a "shouldStopOnX" with a caller that didn't 
care about your return. In the case of the CommitLogReplayer, it will continue 
to pass that into a single method for logical purposes, but subsequent 
implementers can take more granular action.
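A minimal sketch of what encoding recoverability in the reason enum itself looks like; the constant names below are illustrative, not the actual names in the patch:

```java
public enum CommitLogReadErrorReason {
    // Recoverable: skip the bad mutation and keep reading the segment
    RECOVERABLE_DESERIALIZATION_ERROR(true),
    // Unrecoverable: the segment itself cannot be read any further
    UNRECOVERABLE_DESCRIPTOR_ERROR(false),
    UNRECOVERABLE_UNEXPECTED_EOF(false);

    public final boolean recoverable;

    CommitLogReadErrorReason(boolean recoverable) {
        this.recoverable = recoverable;
    }
}
```

A handler can then branch on `reason.recoverable` instead of inferring behavior from a misleadingly named "shouldStopOnX" method.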

bq. Not sure if there is a reason to keep MutationInitiator, it serves a 
similar role to the new ICommitLogReadHandler
Similar, but different enough (hijacking the futures operations for mocking in 
tests) that I'd prefer leaving that to a future effort if we choose to go that 
route.

bq. Don't understand the immediate use case in the comment above 

[jira] [Commented] (CASSANDRA-7186) alter table add column not always propogating

2016-04-14 Thread Uttam Phalnikar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242063#comment-15242063
 ] 

Uttam Phalnikar commented on CASSANDRA-7186:


Actually, restarting the node fixed the problem on that node. Is there any cache 
flush I can perform to achieve the same result without a restart?

> alter table add column not always propogating
> -
>
> Key: CASSANDRA-7186
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7186
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Martin Meyer
>Assignee: Philip Thompson
> Fix For: 2.0.12
>
>
> I've seen many times in Cassandra 2.0.6 that adding columns to existing 
> tables seems to not fully propagate to our entire cluster. We add an extra 
> column to various tables maybe 0-2 times a week, and so far many of these 
> ALTERs have resulted in at least one node showing the old table description 
> for a pretty long time (~30 mins) after the original ALTER command was issued.
> We originally identified this issue when a connected client would complain 
> that a column it issued a SELECT for wasn't a known column, at which point we 
> have to ask each node to describe the most recently altered table. One of 
> them will not know about the newly added field. Issuing the original ALTER 
> statement on that node makes everything work correctly.
> We have seen this issue on multiple tables (we don't always alter the same 
> one). It has affected various nodes in the cluster (it is not always the same 
> one that misses the mutation). No new nodes have been added to the 
> cluster recently. All nodes are homogeneous (hardware and software), running 
> 2.0.6. We don't see any particular errors or exceptions on the node that 
> didn't get the schema update, only the later error from a Java client about 
> asking for an unknown column in a SELECT. We have to check each node manually 
> to find the offender. The tables we have seen this on are under fairly heavy 
> read and write load, but we haven't altered any tables that are not, so that 
> might not be important.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9555) Don't let offline tools run while cassandra is running

2016-04-14 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241994#comment-15241994
 ] 

Robert Stupp commented on CASSANDRA-9555:
-

+1 on the "I know what I am doing" flag.

Would it be ok to target 3.0.x? I think executing tools like sstablescrub and 
sstableupgrade (all those that update/delete sstables) while C* is running is not 
intended, and such an operational mistake will lead to issues on that node.

My idea to detect whether a node is running is to try to connect to 
jmx/gossip/native/thrift ports (using the config in the environment/.yaml) - if 
at least one of these ports is open, the node would be considered running.
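That probe could look roughly like the following. This is a sketch only; a real check would read the host and ports from cassandra.yaml / the environment, whereas the default port numbers assumed here (JMX 7199, gossip 7000, native 9042, thrift 9160) are for illustration:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class RunningNodeProbe {
    // Assumed defaults for illustration; the real values come from the node's config.
    static final int[] DEFAULT_PORTS = { 7199, 7000, 9042, 9160 };

    // Returns true if any port accepts a TCP connection, i.e. the node appears
    // to be running and the offline tool should refuse to start.
    static boolean nodeAppearsRunning(String host, int... ports) {
        for (int port : ports) {
            try (Socket s = new Socket()) {
                s.connect(new InetSocketAddress(host, port), 200); // short timeout
                return true; // something is listening on this port
            } catch (IOException e) {
                // closed or unreachable; try the next port
            }
        }
        return false;
    }
}
```

An "I know what I am doing" flag would simply skip this probe.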

> Don't let offline tools run while cassandra is running
> --
>
> Key: CASSANDRA-9555
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9555
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Eriksson
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.x
>
>
> We should not let offline tools that modify sstables run while Cassandra is 
> running. 





[jira] [Created] (CASSANDRA-11580) remove DatabaseDescriptor dependency from SegmentedFile

2016-04-14 Thread Yuki Morishita (JIRA)
Yuki Morishita created CASSANDRA-11580:
--

 Summary: remove DatabaseDescriptor dependency from SegmentedFile
 Key: CASSANDRA-11580
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11580
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Yuki Morishita


Several configurable parameters are pulled from {{DatabaseDescriptor}} by 
{{SegmentedFile}} and its subclasses.





[jira] [Commented] (CASSANDRA-11549) cqlsh: COPY FROM ignores NULL values in conversion

2016-04-14 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241983#comment-15241983
 ] 

Paulo Motta commented on CASSANDRA-11549:
-

I tried executing the dtest locally on trunk and found some errors (they do not 
fail on 2.1), so I re-triggered another [dtest 
run|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11549-dtest/lastCompletedBuild/testReport/]
 with the new tests so you can have a look.

> cqlsh: COPY FROM ignores NULL values in conversion
> --
>
> Key: CASSANDRA-11549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11549
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Stefania
>Assignee: Stefania
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
>
> COPY FROM fails to import empty values. 
> For example:
> {code}
> $ cat test.csv
> a,10,20
> b,30,
> c,50,60
> $ cqlsh
> cqlsh> create keyspace if not exists test with replication = {'class': 
> 'SimpleStrategy', 'replication_factor':1};
> cqlsh> create table if not exists test.test (t text primary key, i1 int, i2 
> int);
> cqlsh> copy test.test (t,i1,i2) from 'test.csv';
> {code}
> Imports:
> {code}
> cqlsh> select * from test.test;
>  t | i1 | i2
> ---++
>  a | 10 | 20
>  c | 50 | 60
> (2 rows)
> {code}
> and generates a {{ParseError - invalid literal for int() with base 10: '',  
> given up without retries}} for the row with an empty value.
> It should import the empty value as a {{null}} and there should be no error:
> {code}
> cqlsh> select * from test.test;
>  t | i1 | i2
> ---++--
>  a | 10 |   20
>  c | 50 |   60
>  b | 30 | null
> (3 rows)
> {code}
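The expected behavior amounts to treating an empty field as null before type conversion; a minimal sketch of that rule (illustrative only, not cqlsh's actual converter code):

```java
public class CsvFieldConverter {
    // An empty CSV field should become null rather than being handed to the
    // int parser, which would throw "invalid literal for int()".
    static Integer toIntOrNull(String field) {
        return (field == null || field.isEmpty()) ? null : Integer.valueOf(field);
    }
}
```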





[jira] [Updated] (CASSANDRA-11579) remove DatabaseDescriptor dependency from SequentialWriter

2016-04-14 Thread Yuki Morishita (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita updated CASSANDRA-11579:
---
Status: Patch Available  (was: Open)

||branch||testall||dtest||
|[11192-sequentialwriter|https://github.com/yukim/cassandra/tree/11192-sequentialwriter]|[testall|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-11192-sequentialwriter-testall/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-11192-sequentialwriter-dtest/lastCompletedBuild/testReport/]|

The patch introduces {{SequentialWriterOption}} to configure {{SequentialWriter}}, 
rather than having it directly access {{DatabaseDescriptor}}.
The patch also fixes a _potential_ bug where {{lastFlushOffset}} could be wrong 
after {{resetAndTruncate}}, with a brief unit test.

Before the patch, {{SequentialWriter}} always pulled {{trickleFsync}} and 
{{trickleFsyncByteInterval}} from DD; in my patch those are skipped for SWs with 
a small buffer size.
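The shape of such an options object might look like the following builder sketch. The field and method names here are illustrative assumptions, not the committed {{SequentialWriterOption}} API:

```java
public class WriterOption {
    final int bufferSizeBytes;
    final boolean trickleFsync;
    final int trickleFsyncByteInterval;

    private WriterOption(Builder b) {
        this.bufferSizeBytes = b.bufferSizeBytes;
        this.trickleFsync = b.trickleFsync;
        this.trickleFsyncByteInterval = b.trickleFsyncByteInterval;
    }

    static class Builder {
        private int bufferSizeBytes = 64 * 1024;          // sensible default
        private boolean trickleFsync = false;             // off for small buffers
        private int trickleFsyncByteInterval = 10 * 1024 * 1024;

        Builder bufferSizeBytes(int v) { bufferSizeBytes = v; return this; }
        Builder trickleFsync(boolean v) { trickleFsync = v; return this; }
        Builder trickleFsyncByteInterval(int v) { trickleFsyncByteInterval = v; return this; }
        WriterOption build() { return new WriterOption(this); }
    }
}
```

A writer would then take a WriterOption in its constructor instead of reading DatabaseDescriptor statics, which is what makes it usable offline and in tests.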

> remove DatabaseDescriptor dependency from SequentialWriter
> --
>
> Key: CASSANDRA-11579
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11579
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Yuki Morishita
>Assignee: Yuki Morishita
>Priority: Minor
>
> {{SequentialWriter}} and its subclasses are widely used in Cassandra, mainly 
> from SSTables. Removing the dependency on {{DatabaseDescriptor}} improves the 
> reusability of this class.





[jira] [Assigned] (CASSANDRA-9555) Don't let offline tools run while cassandra is running

2016-04-14 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp reassigned CASSANDRA-9555:
---

Assignee: Robert Stupp

> Don't let offline tools run while cassandra is running
> --
>
> Key: CASSANDRA-9555
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9555
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Marcus Eriksson
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.x
>
>
> We should not let offline tools that modify sstables run while Cassandra is 
> running. 





[jira] [Commented] (CASSANDRA-11206) Support large partitions on the 3.0 sstable format

2016-04-14 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241975#comment-15241975
 ] 

Robert Stupp commented on CASSANDRA-11206:
--

bq. need to change the version of sstable
The change does not change the index sstable format - just the format of the 
saved key cache.

bq. AutoSavingCache change require a step on the users part
No, all that happens is that you lose the contents of the old saved key cache. 
This is because the change requires some more information for shallow indexed 
entries (the offset in the index file).

bq. 0,1,2 magic bytes
Made these constants and pushed a commit for this.
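A sketch of what replacing the magic bytes with named constants looks like; the names below are hypothetical, not the identifiers used in the actual commit:

```java
public final class IndexEntrySerializationType {
    // Hypothetical names for the 0,1,2 magic bytes that encode which kind
    // of index entry is being (de)serialized.
    public static final byte NON_INDEXED = 0;
    public static final byte INDEXED = 1;
    public static final byte SHALLOW_INDEXED = 2;

    private IndexEntrySerializationType() {} // constants holder, not instantiable
}
```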

bq. dtests/unit test with column_index_cache_size_in_kb: 0
I've setup a new branch {{11206-large-part-0kb-trunk}} and triggered CI for 
this. 
[testall|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-11206-large-part-0kb-trunk-testall/lastBuild/]
 
[dtest|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-11206-large-part-0kb-trunk-dtest/lastBuild/]


> Support large partitions on the 3.0 sstable format
> --
>
> Key: CASSANDRA-11206
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11206
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jonathan Ellis
>Assignee: Robert Stupp
> Fix For: 3.x
>
> Attachments: 11206-gc.png, trunk-gc.png
>
>
> Cassandra saves a sample of IndexInfo objects that store the offset within 
> each partition of every 64KB (by default) range of rows.  To find a row, we 
> binary search this sample, then scan the partition of the appropriate range.
> The problem is that this scales poorly as partitions grow: on a cache miss, 
> we deserialize the entire set of IndexInfo, which both creates a lot of GC 
> overhead (as noted in CASSANDRA-9754) but is also non-negligible i/o activity 
> (relative to reading a single 64KB row range) as partitions get truly large.
> We introduced an "offset map" in CASSANDRA-10314 that allows us to perform 
> the IndexInfo bsearch while only deserializing IndexInfo that we need to 
> compare against, i.e. log(N) deserializations.





[jira] [Created] (CASSANDRA-11579) remove DatabaseDescriptor dependency from SequentialWriter

2016-04-14 Thread Yuki Morishita (JIRA)
Yuki Morishita created CASSANDRA-11579:
--

 Summary: remove DatabaseDescriptor dependency from SequentialWriter
 Key: CASSANDRA-11579
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11579
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Yuki Morishita
Assignee: Yuki Morishita
Priority: Minor


{{SequentialWriter}} and its subclasses are widely used in Cassandra, mainly from 
SSTables. Removing the dependency on {{DatabaseDescriptor}} improves the 
reusability of this class.





[jira] [Updated] (CASSANDRA-11578) remove DatabaseDescriptor dependency from FileUtil

2016-04-14 Thread Yuki Morishita (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita updated CASSANDRA-11578:
---
Status: Patch Available  (was: Open)

||branch||testall||dtest||
|[11192-fileutil|https://github.com/yukim/cassandra/tree/11192-fileutil]|[testall|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-11192-fileutil-testall/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-11192-fileutil-dtest/lastCompletedBuild/testReport/]|

The patch adds {{FSErrorHandler}} and uses a default implementation so that 
{{StorageService}} etc. are accessed only from {{CassandraDaemon}}.

> remove DatabaseDescriptor dependency from FileUtil
> --
>
> Key: CASSANDRA-11578
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11578
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Yuki Morishita
>Assignee: Yuki Morishita
>Priority: Minor
>
> {{FileUtil}} has dependencies on {{DatabaseDescriptor}} and other 
> online-related classes like {{StorageService}} when handling FS errors.
> It is used for error handling in SSTable as well, so anyone who wants to use 
> SSTableReader/Writer offline risks initializing unnecessary stuff on error.





[jira] [Created] (CASSANDRA-11578) remove DatabaseDescriptor dependency from FileUtil

2016-04-14 Thread Yuki Morishita (JIRA)
Yuki Morishita created CASSANDRA-11578:
--

 Summary: remove DatabaseDescriptor dependency from FileUtil
 Key: CASSANDRA-11578
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11578
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Yuki Morishita
Assignee: Yuki Morishita
Priority: Minor


{{FileUtil}} has dependencies on {{DatabaseDescriptor}} and other online-related 
classes like {{StorageService}} when handling FS errors.

It is used for error handling in SSTable as well, so anyone who wants to use 
SSTableReader/Writer offline risks initializing unnecessary stuff on error.





[jira] [Updated] (CASSANDRA-11192) remove DatabaseDescriptor dependency from o.a.c.io.util package

2016-04-14 Thread Yuki Morishita (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita updated CASSANDRA-11192:
---
Issue Type: Improvement  (was: Sub-task)
Parent: (was: CASSANDRA-11191)

> remove DatabaseDescriptor dependency from o.a.c.io.util package
> ---
>
> Key: CASSANDRA-11192
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11192
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Yuki Morishita
>
> DatabaseDescriptor is the source of all configuration in Cassandra, but because 
> of its static initialization from Config/cassandra.yaml, it is hard to configure 
> programmatically. Also, unless {{Config.setClientMode(true)}} is set, 
> DatabaseDescriptor creates/initializes tons of unnecessary things just for 
> reading SSTables.
> Since o.a.c.io.util is the core of accessing files, they should be as 
> independent as possible.





[jira] [Created] (CASSANDRA-11577) Traces persist for longer than 24 hours

2016-04-14 Thread Josh Wickman (JIRA)
Josh Wickman created CASSANDRA-11577:


 Summary: Traces persist for longer than 24 hours
 Key: CASSANDRA-11577
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11577
 Project: Cassandra
  Issue Type: Bug
Reporter: Josh Wickman
Priority: Minor


My deployment currently has clusters on both Cassandra 1.2 (1.2.19) and 2.1 
(2.1.11) with tracing on.  On 2.1, the trace records persist for longer than 
the [documented 24 
hours|https://docs.datastax.com/en/cql/3.3/cql/cql_reference/tracing_r.html]:

{noformat}
cqlsh> select started_at from system_traces.sessions limit 10;

 started_at
--
 2016-03-11 23:28:40+
 2016-03-14 21:09:07+
 2016-03-14 16:42:25+
 2016-03-14 16:13:13+
 2016-03-14 19:12:11+
 2016-03-14 21:25:57+
 2016-03-29 22:45:28+
 2016-03-14 19:56:27+
 2016-03-09 23:31:41+
 2016-03-10 23:08:44+

(10 rows)
{noformat}

My systems on 1.2 do not exhibit this problem:

{noformat}
cqlsh> select started_at from system_traces.sessions limit 10;

 started_at
--
 2016-04-13 22:49:31+
 2016-04-14 18:06:45+
 2016-04-14 07:57:00+
 2016-04-14 04:35:05+
 2016-04-14 03:54:20+
 2016-04-14 10:54:38+
 2016-04-14 18:34:04+
 2016-04-14 12:56:57+
 2016-04-14 01:57:20+
 2016-04-13 21:36:01+
{noformat}

The event records also persist alongside the session records, for example:

{noformat}
cqlsh> select session_id, dateOf(event_id) from system_traces.events where 
session_id = fc8c1e80-e7e0-11e5-a2fb-1968ff3c067b;

 session_id   | dateOf(event_id)
--+--
 fc8c1e80-e7e0-11e5-a2fb-1968ff3c067b | 2016-03-11 23:28:40+
{noformat}

Between these versions, the table parameter {{default_time_to_live}} was 
introduced.  The {{system_traces}} tables report the default value of 0:

{noformat}
cqlsh> desc table system_traces.sessions

CREATE TABLE system_traces.sessions (
session_id uuid PRIMARY KEY,
coordinator inet,
duration int,
parameters map<text, text>,
request text,
started_at timestamp
) WITH bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = 'traced sessions'
AND compaction = {'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 
'org.apache.cassandra.io.compress.SnappyCompressor'}
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 0
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
{noformat}

I suspect that {{default_time_to_live}} is superseding the mechanism used in 
1.2 to expire the trace records.  Evidently I cannot change this parameter for 
this table:

{noformat}
cqlsh> alter table system_traces.sessions with default_time_to_live = 86400;
Unauthorized: code=2100 [Unauthorized] message="Cannot ALTER "
{noformat}

I realize Cassandra 1.2 is no longer supported, but the problem is being 
manifested in Cassandra 2.1 for me (I included 1.2 only for comparison).  Since 
I couldn't find an existing ticket addressing this issue, I'm concerned that it 
may be present in more recent versions of Cassandra as well, but I have not 
tested these.

The persistent trace records are contributing to disk filling, and more 
importantly, making it more difficult to analyze the trace data.  Is there a 
workaround for this?





[jira] [Assigned] (CASSANDRA-11117) ColUpdateTimeDeltaHistogram histogram overflow

2016-04-14 Thread Joel Knighton (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Knighton reassigned CASSANDRA-11117:
-

Assignee: Joel Knighton

> ColUpdateTimeDeltaHistogram histogram overflow
> --
>
> Key: CASSANDRA-11117
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11117
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Chris Lohfink
>Assignee: Joel Knighton
>Priority: Minor
> Fix For: 3.0.x, 3.x
>
>
> {code}
> getting attribute Mean of 
> org.apache.cassandra.metrics:type=ColumnFamily,name=ColUpdateTimeDeltaHistogram
>  threw an exceptionjavax.management.RuntimeMBeanException: 
> java.lang.IllegalStateException: Unable to compute ceiling for max when 
> histogram overflowed
> {code}
> Given that this histogram already has 164 buckets, I wonder if there is 
> something weird with the computation that's causing this to be so large. It 
> appears to be coming from updates to system.local
> {code}
> org.apache.cassandra.metrics:type=Table,keyspace=system,scope=local,name=ColUpdateTimeDeltaHistogram
> {code}





[jira] [Created] (CASSANDRA-11576) Add support for JNA mlockall(2) on POWER

2016-04-14 Thread Rei Odaira (JIRA)
Rei Odaira created CASSANDRA-11576:
--

 Summary: Add support for JNA mlockall(2) on POWER
 Key: CASSANDRA-11576
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11576
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: POWER architecture
Reporter: Rei Odaira
Priority: Minor
 Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x


org.apache.cassandra.utils.CLibrary contains hard-coded C-macro values to be 
passed to system calls through JNA. These values are system-dependent, and as 
far as I investigated, Linux and AIX on the IBM POWER architecture define 
{{MCL_CURRENT}} and {{MCL_FUTURE}} (for mlockall(2)) as different values than 
the current hard-coded values.  As a result, mlockall(2) fails on these 
platforms.
{code}
WARN  18:51:51 Unknown mlockall error 22
{code}
I am going to provide a patch to support JNA mlockall(2) on POWER.
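One shape such a fix could take is selecting the flag values by architecture instead of hard-coding them. This is a hypothetical sketch, not the submitted patch; the POWER values (0x2000 / 0x4000) are taken from Linux's asm/mman.h for powerpc and should be verified against the target platform's headers:

```java
public class MlockallFlags {
    // x86/ARM Linux define MCL_CURRENT=1, MCL_FUTURE=2; Linux on POWER
    // defines them as 0x2000 and 0x4000 (assumption: see asm/mman.h).
    static int[] mclFlags(String osArch) {
        if (osArch.startsWith("ppc") || osArch.startsWith("power"))
            return new int[] { 0x2000, 0x4000 }; // MCL_CURRENT, MCL_FUTURE
        return new int[] { 1, 2 };               // MCL_CURRENT, MCL_FUTURE
    }
}
```

The values returned here would then be passed to mlockall(2) via JNA rather than the current hard-coded constants.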





[jira] [Commented] (CASSANDRA-11206) Support large partitions on the 3.0 sstable format

2016-04-14 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241925#comment-15241925
 ] 

T Jake Luciani commented on CASSANDRA-11206:


* You need to change the sstable version since this change alters the Index 
component.
* Please run dtests/unit tests with column_index_cache_size_in_kb: 0.
* Does the AutoSavingCache change require a step on the user's part, or will it 
naturally skip the saved cache on startup?
* The 0,1,2 magic bytes that encode what type of index entry this is should be 
made constants.

> Support large partitions on the 3.0 sstable format
> --
>
> Key: CASSANDRA-11206
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11206
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jonathan Ellis
>Assignee: Robert Stupp
> Fix For: 3.x
>
> Attachments: 11206-gc.png, trunk-gc.png
>
>
> Cassandra saves a sample of IndexInfo objects that store the offset within 
> each partition of every 64KB (by default) range of rows.  To find a row, we 
> binary search this sample, then scan the partition of the appropriate range.
> The problem is that this scales poorly as partitions grow: on a cache miss, 
> we deserialize the entire set of IndexInfo, which both creates a lot of GC 
> overhead (as noted in CASSANDRA-9754) but is also non-negligible i/o activity 
> (relative to reading a single 64KB row range) as partitions get truly large.
> We introduced an "offset map" in CASSANDRA-10314 that allows us to perform 
> the IndexInfo bsearch while only deserializing IndexInfo that we need to 
> compare against, i.e. log(N) deserializations.





[jira] [Commented] (CASSANDRA-11452) Cache implementation using LIRS eviction for in-process page cache

2016-04-14 Thread Ben Manes (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241872#comment-15241872
 ] 

Ben Manes commented on CASSANDRA-11452:
---

Thanks. I won't have the bandwidth to test this until the evening. Roy flew 
into SF for a conference (from Israel) so we're going to meet. If you have any 
questions for me to discuss with him I'll proxy.

At a quick glance, your trick has a nice distribution. A 1M-iteration run into a 
multiset showed
[0 x 750485, 1 x 186958, 2 x 46910, 3 x 11731, 4 x 2901, 5 x 776, 6 x 171, 7 x 
49, 8 x 15, 9 x 3, 11]

I'd probably apply the jitter at the selection of the victim near the top of the 
loop and add a check to handle zero-weight entries. I'll take care of that part.

It seems like we'd need both your jitter and the hash check added in the prior 
commit. It does sound that the combination would be an effective guard against 
this type of attack. Do you think the random seed used by the sketch is still a 
good addition?
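For reference, a distribution of the shape quoted above (roughly P(k) = 3/4 · (1/4)^k) can be produced cheaply from trailing zero bits of a random int. This is a sketch of one way to generate such a jitter, under that assumption about the distribution; it is not the code under review:

```java
import java.util.Random;

public class EvictionJitter {
    // Geometric jitter: returns 0 with probability ~3/4, 1 with ~3/16, 2 with
    // ~3/64, etc., by counting pairs of trailing zero bits. The OR mask caps
    // the result (here at 11) so a zero draw cannot produce a huge offset.
    static int jitter(Random rnd) {
        return Integer.numberOfTrailingZeros(rnd.nextInt() | (1 << 22)) / 2;
    }
}
```

The victim-selection loop would then skip `jitter(rnd)` entries past the nominal eviction candidate, making the victim unpredictable to an attacker.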

> Cache implementation using LIRS eviction for in-process page cache
> --
>
> Key: CASSANDRA-11452
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11452
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>
> Following up from CASSANDRA-5863, to make best use of caching and to avoid 
> having to explicitly marking compaction accesses as non-cacheable, we need a 
> cache implementation that uses an eviction algorithm that can better handle 
> non-recurring accesses.





[jira] [Created] (CASSANDRA-11575) Add out-of-process testing for CDC

2016-04-14 Thread Carl Yeksigian (JIRA)
Carl Yeksigian created CASSANDRA-11575:
--

 Summary: Add out-of-process testing for CDC
 Key: CASSANDRA-11575
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11575
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Carl Yeksigian
Assignee: Carl Yeksigian


There are currently no dtests for the new cdc feature. We should have some, at 
least to ensure that the cdc files have a lifecycle that makes sense, and make 
sure that things like a continually cleaning daemon and a lazy daemon have the 
properties we expect; for this, we don't need to actually process the files, 
but make sure they fit the characteristics we expect from them. A more complex 
daemon would need to be written in Java.

I already hit a problem where, if the cdc is over capacity, the cdc properly 
throws the WTE, but it will not reset after the overflow directory is under the 
size limit again. It is supposed to correct the size within 250ms and allow more 
writes.





[jira] [Commented] (CASSANDRA-11437) Make number of cores used for copy tasks visible

2016-04-14 Thread Jim Witschey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241796#comment-15241796
 ] 

Jim Witschey commented on CASSANDRA-11437:
--

Sorry for the delay, and thank you for the ping. I'm +1, looks great. I ran it 
locally and on a parameterized job here:

http://cassci.datastax.com/view/Parameterized/job/parameterized_dtest_multiplexer/67/

> Make number of cores used for copy tasks visible
> 
>
> Key: CASSANDRA-11437
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11437
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Testing
>Reporter: Jim Witschey
>Assignee: Stefania
>Priority: Minor
>  Labels: lhf
> Fix For: 3.x
>
>
> As per this conversation with [~Stefania]:
> https://github.com/riptano/cassandra-dtest/pull/869#issuecomment-200597829
> we don't currently have a way to verify that the test environment variable 
> {{CQLSH_COPY_TEST_NUM_CORES}} actually affects the behavior of {{COPY}} in 
> the intended way. If this were added, we could make our tests of the one-core 
> edge case a little stricter.





[jira] [Updated] (CASSANDRA-11485) ArithmeticException in avgFunctionForDecimal

2016-04-14 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-11485:
-
 Assignee: Robert Stupp
Fix Version/s: 3.0.x
   Status: Patch Available  (was: Open)

1 divided by 3 - eh, yea.

I've changed the avg() for decimal to use RoundingMode.HALF_EVEN in the patch.

Since we cannot pass a "parameter" to an aggregation (the API doesn't support 
that at the moment), the way to use another rounding mode would be to implement 
a UDA.
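The underlying Java behavior, and the effect of switching to HALF_EVEN, can be seen in a small standalone example (illustrative only, not the patch itself):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class AvgRoundingDemo {
    // divide() with no rounding mode throws ArithmeticException for 1/3 because
    // the decimal expansion never terminates; supplying a scale and
    // RoundingMode.HALF_EVEN makes the division terminate.
    static String averageOf(BigDecimal sum, long count) {
        return sum.divide(BigDecimal.valueOf(count), 10, RoundingMode.HALF_EVEN)
                  .toPlainString();
    }

    public static void main(String[] args) {
        try {
            BigDecimal.ONE.divide(BigDecimal.valueOf(3)); // non-terminating expansion
        } catch (ArithmeticException e) {
            System.out.println("unrounded: " + e.getMessage());
        }
        System.out.println("HALF_EVEN: " + averageOf(BigDecimal.ONE, 3));
    }
}
```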


|3.0|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...snazy:11485-avg-decimal-round-3.0?expand=1]|[testall|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-11485-avg-decimal-round-3.0-testall/lastBuild/]|[dtest|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-11485-avg-decimal-round-3.0-dtest/lastBuild/]
|trunk|[branch|https://github.com/apache/cassandra/compare/trunk...snazy:11485-avg-decimal-round-trunk?expand=1]|[testall|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-11485-avg-decimal-round-trunk-testall/lastBuild/]|[dtest|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-11485-avg-decimal-round-trunk-dtest/lastBuild/]


> ArithmeticException in avgFunctionForDecimal
> 
>
> Key: CASSANDRA-11485
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11485
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Nico Haller
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.0.x
>
>
> I am running into issues when using avg in queries on decimal values.
> It throws an ArithmeticException in 
> org/apache/cassandra/cql3/functions/AggregateFcts.java (Line 184).
> So whenever an exact representation of the quotient is not possible it will 
> throw that error and it never returns to the querying client.
> I am not so sure if this is intended behavior or a bug, but in my opinion if 
> an exact representation of the value is not possible, it should automatically 
> round the value.
> Specifying a rounding mode when calling the divide function should solve the 
> issue





[jira] [Updated] (CASSANDRA-11553) hadoop.cql3.CqlRecordWriter does not close cluster on reconnect

2016-04-14 Thread Jeremiah Jordan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremiah Jordan updated CASSANDRA-11553:

Status: Patch Available  (was: Open)

> hadoop.cql3.CqlRecordWriter does not close cluster on reconnect
> ---
>
> Key: CASSANDRA-11553
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11553
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Artem Aliev
>Assignee: Artem Aliev
> Fix For: 2.2.x, 3.0.x, 3.x
>
> Attachments: CASSANDRA-11553-2.2.txt
>
>
> CASSANDRA-10058 added session and cluster close calls to all places in hadoop 
> except one place, on reconnection.
> The writer uses one connection per new cluster, so I added a cluster.close() 
> call to the sessionClose() method.





[jira] [Commented] (CASSANDRA-5977) Structure for cfstats output (JSON, YAML, or XML)

2016-04-14 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241751#comment-15241751
 ] 

Yuki Morishita commented on CASSANDRA-5977:
---

Thanks, and I like the change to snake case.

I uploaded your patch and am running tests. If the tests are good, I will commit.

||branch||testall||dtest||
|[5977|https://github.com/yukim/cassandra/tree/5977]|[testall|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-5977-testall/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/yukim/job/yukim-5977-dtest/lastCompletedBuild/testReport/]|

I changed the code a bit for styling.
If we can upgrade jackson to 2.0 and have annotations, it will be much cleaner, 
but for now, the patch is sufficient.

> Structure for cfstats output (JSON, YAML, or XML)
> -
>
> Key: CASSANDRA-5977
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5977
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Alyssa Kwan
>Assignee: Shogo Hoshii
>Priority: Minor
>  Labels: Tools
> Fix For: 3.x
>
> Attachments: CASSANDRA-5977-trunk.patch, CASSANDRA-5977-trunk.patch, 
> sample_result.zip, sample_result.zip, tablestats_sample_result.json, 
> tablestats_sample_result.txt, tablestats_sample_result.yaml, 
> trunk-tablestats.patch, trunk-tablestats.patch
>
>
> nodetool cfstats should take a --format arg that structures the output in 
> JSON, YAML, or XML.  This would be useful for piping into another script that 
> can easily parse this and act on it.  It would also help those of us who use 
> things like MCollective gather aggregate stats across clusters/nodes.
> Thoughts?  I can submit a patch.
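A toy sketch of the requested {{--format}} behaviour (illustrative only; per the comment above, the actual patch serializes via jackson rather than by hand). The key uses the snake-case naming mentioned in the review.

```java
import java.util.Map;

// Illustrative only: a tiny emitter showing what a --format option
// could produce for a flat stats map; real nodetool uses jackson.
class StatsFormatter {
    static String format(Map<String, Object> stats, String fmt) {
        StringBuilder sb = new StringBuilder();
        switch (fmt) {
            case "json":
                sb.append('{');
                String sep = "";
                for (Map.Entry<String, Object> e : stats.entrySet()) {
                    sb.append(sep).append('"').append(e.getKey()).append("\": ")
                      // numbers unquoted, everything else quoted
                      .append(e.getValue() instanceof Number
                              ? e.getValue() : "\"" + e.getValue() + "\"");
                    sep = ", ";
                }
                sb.append('}');
                break;
            case "yaml":
                for (Map.Entry<String, Object> e : stats.entrySet())
                    sb.append(e.getKey()).append(": ").append(e.getValue()).append('\n');
                break;
            default:
                throw new IllegalArgumentException("unsupported format: " + fmt);
        }
        return sb.toString();
    }
}
```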





[jira] [Resolved] (CASSANDRA-10547) Updating a CQL List many times creates many tombstones

2016-04-14 Thread Alex Petrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov resolved CASSANDRA-10547.
-
   Resolution: Resolved
Fix Version/s: 2.1.13

> Updating a CQL List many times creates many tombstones 
> ---
>
> Key: CASSANDRA-10547
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10547
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 2.1.9, Java driver 2.1.5
>Reporter: James Bishop
>Assignee: Alex Petrov
> Fix For: 2.1.13
>
> Attachments: tombstone.snippet
>
>
> We encountered a TombstoneOverwhelmingException in cassandra system.log which 
> caused some of our CQL queries to fail.
> We are able to reproduce this issue by updating a CQL List column many times. 
> The number of tombstones created seems to be related to (number of list items 
> * number of list updates). We update the entire list on each update using the 
> java driver. (see attached code for details)
> Running nodetool compact does not help, but nodetool flush does. It appears 
> that the tombstones are being accumulated in memory. 
> For example if we update a list of 100 items 1000 times, this creates more  
> than 100,000 tombstones and exceeds the default tombstone_failure_threshold.
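The arithmetic in the report can be checked directly; a back-of-the-envelope sketch (the threshold constant is the cassandra.yaml default; everything else is illustrative):

```java
class TombstoneEstimate {
    // Default tombstone_failure_threshold in cassandra.yaml.
    static final int FAILURE_THRESHOLD = 100_000;

    // Overwriting a whole list tombstones every existing element, so
    // repeated full updates accumulate roughly items * updates
    // tombstones until a flush moves them out of the memtable.
    static long estimate(int listItems, int fullUpdates) {
        return (long) listItems * fullUpdates;
    }
}
```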





[jira] [Comment Edited] (CASSANDRA-10547) Updating a CQL List many times creates many tombstones

2016-04-14 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241742#comment-15241742
 ] 

Alex Petrov edited comment on CASSANDRA-10547 at 4/14/16 7:08 PM:
--

You can upgrade to (at least) 2.1.13; the issue no longer appears there. 
I've run similar tests against 2.1.5 and 2.1.13.

2.1.5:
{code}
Read 1 live and 23 tombstoned cells [SharedPool-Worker-3] | 2016-04-14 
21:05:09.391000 | 127.0.0.1 |
{code}

2.1.13
{code}
Read 1 live and 0 tombstone cells [SharedPool-Worker-3] | 2016-04-14 
21:01:01.666000 | 127.0.0.1 |
{code}

Issue doesn't appear on {{3.x}} either. 


was (Author: ifesdjeen):
You can upgrade to (at least) 2.1.13; the issue no longer appears there. 
I've run similar tests against 2.1.5 and 2.1.13.

2.1.5:
{code}
Read 1 live and 23 tombstoned cells [SharedPool-Worker-3] | 2016-04-14 
21:05:09.391000 | 127.0.0.1 |
{code}

2.1.13
{code}
Read 1 live and 0 tombstone cells [SharedPool-Worker-3] | 2016-04-14 
21:01:01.666000 | 127.0.0.1 |
{code}

Issue doesn't appear on {3.x} either. 

> Updating a CQL List many times creates many tombstones 
> ---
>
> Key: CASSANDRA-10547
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10547
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 2.1.9, Java driver 2.1.5
>Reporter: James Bishop
>Assignee: Alex Petrov
> Fix For: 2.1.13
>
> Attachments: tombstone.snippet
>
>
> We encountered a TombstoneOverwhelmingException in cassandra system.log which 
> caused some of our CQL queries to fail.
> We are able to reproduce this issue by updating a CQL List column many times. 
> The number of tombstones created seems to be related to (number of list items 
> * number of list updates). We update the entire list on each update using the 
> java driver. (see attached code for details)
> Running nodetool compact does not help, but nodetool flush does. It appears 
> that the tombstones are being accumulated in memory. 
> For example if we update a list of 100 items 1000 times, this creates more  
> than 100,000 tombstones and exceeds the default tombstone_failure_threshold.





[jira] [Commented] (CASSANDRA-10547) Updating a CQL List many times creates many tombstones

2016-04-14 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241742#comment-15241742
 ] 

Alex Petrov commented on CASSANDRA-10547:
-

You can upgrade to (at least) 2.1.13; the issue no longer appears there. 
I've run similar tests against 2.1.5 and 2.1.13.

2.1.5:
{code}
Read 1 live and 23 tombstoned cells [SharedPool-Worker-3] | 2016-04-14 
21:05:09.391000 | 127.0.0.1 |
{code}

2.1.13
{code}
Read 1 live and 0 tombstone cells [SharedPool-Worker-3] | 2016-04-14 
21:01:01.666000 | 127.0.0.1 |
{code}

Issue doesn't appear on {{3.x}} either. 

> Updating a CQL List many times creates many tombstones 
> ---
>
> Key: CASSANDRA-10547
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10547
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 2.1.9, Java driver 2.1.5
>Reporter: James Bishop
>Assignee: Alex Petrov
> Attachments: tombstone.snippet
>
>
> We encountered a TombstoneOverwhelmingException in cassandra system.log which 
> caused some of our CQL queries to fail.
> We are able to reproduce this issue by updating a CQL List column many times. 
> The number of tombstones created seems to be related to (number of list items 
> * number of list updates). We update the entire list on each update using the 
> java driver. (see attached code for details)
> Running nodetool compact does not help, but nodetool flush does. It appears 
> that the tombstones are being accumulated in memory. 
> For example if we update a list of 100 items 1000 times, this creates more  
> than 100,000 tombstones and exceeds the default tombstone_failure_threshold.





[jira] [Commented] (CASSANDRA-7186) alter table add column not always propogating

2016-04-14 Thread Uttam Phalnikar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241725#comment-15241725
 ] 

Uttam Phalnikar commented on CASSANDRA-7186:


We are experiencing a similar issue intermittently. It usually happens when the 
table has some data (m+ records).

Steps to reproduce:
- Alter table add column
- nodetool describecluster to verify the nodes are in sync
- desc table from any node to verify column is added to the table
- select * from table limit 1 doesn't show the column
- insert into table (id, ...) values ('some-id', ...

> alter table add column not always propogating
> -
>
> Key: CASSANDRA-7186
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7186
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Martin Meyer
>Assignee: Philip Thompson
> Fix For: 2.0.12
>
>
> I've seen many times in Cassandra 2.0.6 that adding columns to existing 
> tables seems to not fully propagate to our entire cluster. We add an extra 
> column to various tables maybe 0-2 times a week, and so far many of these 
> ALTERs have resulted in at least one node showing the old table description a 
> pretty long time (~30 mins) after the original ALTER command was issued.
> We originally identified this issue when a connected client would complain 
> that a column it issued a SELECT for wasn't a known column, at which point we 
> have to ask each node to describe the most recently altered table. One of 
> them will not know about the newly added field. Issuing the original ALTER 
> statement on that node makes everything work correctly.
> We have seen this issue on multiple tables (we don't always alter the same 
> one). It has affected various nodes in the cluster (not always the same one 
> is not getting the mutation propagated). No new nodes have been added to the 
> cluster recently. All nodes are homogenous (hardware and software), running 
> 2.0.6. We don't see any particular errors or exceptions on the node that 
> didn't get the schema update, only the later error from a Java client about 
> asking for an unknown column in a SELECT. We have to check each node manually 
> to find the offender. The tables we have seen this on are under fairly heavy 
> read and write load, but we haven't altered any tables that are not, so that 
> might not be important.





[jira] [Updated] (CASSANDRA-11552) Reduce amount of logging calls from ColumnFamilyStore.selectAndReference

2016-04-14 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-11552:
-
Status: Patch Available  (was: Open)

Changed the code to use {{NoSpamLogger}}. The patch is against 2.1 and merges 
cleanly up to trunk.
Normally this piece of code is completely irrelevant, but if referencing the 
sstables fails, it will log many megabytes per second, effectively rotating the 
messages of the root cause out of the log files.

||branch||testall||dtest||
|[2.1|https://github.com/apache/cassandra/compare/cassandra-2.1...snazy:11552-selectAndRef-spin-spam-2.1?expand=1]|[testall|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-11552-selectAndRef-spin-spam-2.1-testall/lastBuild/]|[dtest|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-11552-selectAndRef-spin-spam-2.1-dtest/lastBuild/]
|[2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...snazy:11552-selectAndRef-spin-spam-2.2?expand=1]|[testall|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-11552-selectAndRef-spin-spam-2.2-testall/lastBuild/]|[dtest|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-11552-selectAndRef-spin-spam-2.2-dtest/lastBuild/]
|[3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...snazy:11552-selectAndRef-spin-spam-3.0?expand=1]|[testall|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-11552-selectAndRef-spin-spam-3.0-testall/lastBuild/]|[dtest|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-11552-selectAndRef-spin-spam-3.0-dtest/lastBuild/]
|[trunk|https://github.com/apache/cassandra/compare/trunk...snazy:11552-selectAndRef-spin-spam-trunk?expand=1]|[testall|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-11552-selectAndRef-spin-spam-trunk-testall/lastBuild/]|[dtest|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-11552-selectAndRef-spin-spam-trunk-dtest/lastBuild/]


> Reduce amount of logging calls from ColumnFamilyStore.selectAndReference
> 
>
> Key: CASSANDRA-11552
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11552
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>
> {{org.apache.cassandra.db.ColumnFamilyStore#selectAndReference}} logs two 
> messages at _info_ level "as fast as it can" if it waits for more than 100ms.
> The following code is executed in a while-true fashion in this case:
> {code}
> logger.info("Spinning trying to capture released readers {}", 
> released);
> logger.info("Spinning trying to capture all readers {}", 
> view.sstables);
> {code}
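A minimal sketch of what rate-limited logging buys here (an illustrative class, not Cassandra's actual {{NoSpamLogger}}): the message is emitted at most once per interval, no matter how hot the spin loop is.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of rate-limited logging in the spirit of NoSpamLogger.
class RateLimitedLogger {
    private final long intervalNanos;
    // Next time a log line is permitted; MIN_VALUE = always permit first.
    private final AtomicLong nextPermit = new AtomicLong(Long.MIN_VALUE);

    RateLimitedLogger(long interval, TimeUnit unit) {
        this.intervalNanos = unit.toNanos(interval);
    }

    /** @return true if the message was actually logged */
    boolean info(String message) {
        long now = System.nanoTime();
        long next = nextPermit.get();
        if (now < next || !nextPermit.compareAndSet(next, now + intervalNanos))
            return false; // suppressed: still within the quiet interval
        System.out.println(message);
        return true;
    }
}
```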





[jira] [Assigned] (CASSANDRA-11552) Reduce amount of logging calls from ColumnFamilyStore.selectAndReference

2016-04-14 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp reassigned CASSANDRA-11552:


Assignee: Robert Stupp

> Reduce amount of logging calls from ColumnFamilyStore.selectAndReference
> 
>
> Key: CASSANDRA-11552
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11552
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>
> {{org.apache.cassandra.db.ColumnFamilyStore#selectAndReference}} logs two 
> messages at _info_ level "as fast as it can" if it waits for more than 100ms.
> The following code is executed in a while-true fashion in this case:
> {code}
> logger.info("Spinning trying to capture released readers {}", 
> released);
> logger.info("Spinning trying to capture all readers {}", 
> view.sstables);
> {code}





[jira] [Updated] (CASSANDRA-11555) Make prepared statement cache size configurable

2016-04-14 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-11555:
-
Status: Patch Available  (was: Open)

[branch|https://github.com/apache/cassandra/compare/trunk...snazy:11555-pstmt-cache-config-trunk]
[testall|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-11555-pstmt-cache-config-trunk-testall/lastBuild/]
[dtest|http://cassci.datastax.com/view/Dev/view/snazy/job/snazy-11555-pstmt-cache-config-trunk-dtest/lastBuild/]


> Make prepared statement cache size configurable
> ---
>
> Key: CASSANDRA-11555
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11555
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
>
> The prepared statement caches in {{org.apache.cassandra.cql3.QueryProcessor}} 
> are configured using the formula {{Runtime.getRuntime().maxMemory() / 256}}. 
> Sometimes applications may need more than that. Proposal is to make that 
> value configurable - probably also distinguish thrift and native CQL3 queries 
> (new applications don't need the thrift stuff).
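The current sizing formula works out as follows (a sketch; the divisor 256 is the hard-coded value quoted above, the heap size is an example):

```java
class PstmtCacheSize {
    // Current hard-wired sizing: 1/256th of the JVM max heap,
    // i.e. Runtime.getRuntime().maxMemory() / 256.
    static long defaultCacheBytes(long maxHeapBytes) {
        return maxHeapBytes / 256;
    }

    public static void main(String[] args) {
        long heap = 8L * 1024 * 1024 * 1024; // e.g. -Xmx8G
        // 8 GB / 256 = 32 MB for the prepared statement cache
        System.out.println(defaultCacheBytes(heap) / (1024 * 1024) + " MB");
    }
}
```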





[jira] [Assigned] (CASSANDRA-11555) Make prepared statement cache size configurable

2016-04-14 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp reassigned CASSANDRA-11555:


Assignee: Robert Stupp

> Make prepared statement cache size configurable
> ---
>
> Key: CASSANDRA-11555
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11555
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
>
> The prepared statement caches in {{org.apache.cassandra.cql3.QueryProcessor}} 
> are configured using the formula {{Runtime.getRuntime().maxMemory() / 256}}. 
> Sometimes applications may need more than that. Proposal is to make that 
> value configurable - probably also distinguish thrift and native CQL3 queries 
> (new applications don't need the thrift stuff).





[jira] [Updated] (CASSANDRA-9830) Option to disable bloom filter in highest level of LCS sstables

2016-04-14 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-9830:
---
Status: Open  (was: Patch Available)

> Option to disable bloom filter in highest level of LCS sstables
> ---
>
> Key: CASSANDRA-9830
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9830
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction
>Reporter: Jonathan Ellis
>Assignee: Paulo Motta
>Priority: Minor
>  Labels: performance
> Fix For: 3.x
>
>
> We expect about 90% of data to be in the highest level of LCS in a fully 
> populated series.  (See also CASSANDRA-9829.)
> Thus if the user is primarily asking for data (partitions) that has actually 
> been inserted, the bloom filter on the highest level only helps reject 
> sstables about 10% of the time.
> We should add an option that suppresses bloom filter creation on top-level 
> sstables.  This will dramatically reduce memory usage for LCS and may even 
> improve performance as we no longer check a low-value filter.
> (This is also an idea from RocksDB.)





[jira] [Updated] (CASSANDRA-11422) Eliminate temporary object[] allocations in ColumnDefinition::hashCode

2016-04-14 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-11422:
---
Status: Patch Available  (was: Open)

> Eliminate temporary object[] allocations in ColumnDefinition::hashCode
> --
>
> Key: CASSANDRA-11422
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11422
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Nitsan Wakart
>Assignee: Nitsan Wakart
>
> ColumnDefinition::hashCode currently calls Objects.hashCode(Object...)
> This triggers the allocation of a short lived Object[] which is not 
> eliminated by EscapeAnalysis. I have implemented a fix by inlining the 
> hashcode logic and also adding a cached hashcode field. This improved 
> performance on the read workload.
> Fix is available here:
> https://github.com/nitsanw/cassandra/tree/objects-hashcode-fix
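A sketch of the two-part fix described above: inline the hash computation (no varargs {{Object[]}} allocation) and cache the result. Field names are illustrative, not {{ColumnDefinition}}'s actual ones.

```java
import java.util.Objects;

class ColumnDef {
    final String ksName;
    final String cfName;
    final String name;
    private int cachedHash; // 0 = not yet computed (benign race: recomputing is idempotent)

    ColumnDef(String ks, String cf, String name) {
        this.ksName = ks; this.cfName = cf; this.name = name;
    }

    @Override
    public int hashCode() {
        int h = cachedHash;
        if (h == 0) {
            // Inlined instead of Objects.hash(ksName, cfName, name),
            // which would allocate a temporary Object[3] on every call.
            h = 31 * (31 * Objects.hashCode(ksName) + Objects.hashCode(cfName))
                + Objects.hashCode(name);
            cachedHash = h;
        }
        return h;
    }
}
```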





[jira] [Updated] (CASSANDRA-11423) Eliminate Pair allocations for default DataType conversions

2016-04-14 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-11423:
---
Status: Patch Available  (was: Open)

> Eliminate Pair allocations for default DataType conversions
> ---
>
> Key: CASSANDRA-11423
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11423
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Nitsan Wakart
>Assignee: Nitsan Wakart
>
> The method DataType::fromType returns a Pair. The common path through the 
> method is:
> {code}
> {
>    DataType dt = dataTypeMap.get(type);
>    return new Pair(dt, null);
> }
> {code}
> This results in many redundant allocations and is easy to fix by adding a 
> DataType field to cache this result per DataType and replacing the last line 
> with:
> {code}
>   return dt.pair;
> {code}
> see fix:
> https://github.com/nitsanw/cassandra/tree/data-type-dafault-pair
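The caching described above, sketched with minimal stand-in types (not the driver's actual {{DataType}}/{{Pair}} classes): the common-path Pair is allocated once per DataType and reused.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal stand-in for the driver's Pair.
class Pair<A, B> {
    final A left; final B right;
    Pair(A a, B b) { left = a; right = b; }
}

class DataType {
    final String name;
    // Allocated once; fromType returns this instead of a fresh Pair.
    final Pair<DataType, Object> pair;

    DataType(String name) {
        this.name = name;
        this.pair = new Pair<>(this, null);
    }

    static final Map<Class<?>, DataType> dataTypeMap = new HashMap<>();

    static Pair<DataType, Object> fromType(Class<?> type) {
        DataType dt = dataTypeMap.get(type);
        return dt.pair; // no per-call allocation on the common path
    }
}
```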





[jira] [Commented] (CASSANDRA-11452) Cache implementation using LIRS eviction for in-process page cache

2016-04-14 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241560#comment-15241560
 ] 

Benedict commented on CASSANDRA-11452:
--

Something like [this|https://github.com/belliottsmith/caffeine/tree/random-hack]

I haven't checked it works as I haven't time to get it all compiling etc, but 
it should clearly demonstrate what I'm talking about.

It looks like Caffeine has changed a great deal since I last looked at it! I'll 
have to have a poke around when I have time.

> Cache implementation using LIRS eviction for in-process page cache
> --
>
> Key: CASSANDRA-11452
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11452
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>
> Following up from CASSANDRA-5863, to make best use of caching and to avoid 
> having to explicitly marking compaction accesses as non-cacheable, we need a 
> cache implementation that uses an eviction algorithm that can better handle 
> non-recurring accesses.





[jira] [Commented] (CASSANDRA-11416) No longer able to load backups into new cluster if there was a dropped column

2016-04-14 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241558#comment-15241558
 ] 

Jeremiah Jordan commented on CASSANDRA-11416:
-

Maybe we should just log a warning/error about the columns instead of throwing 
an exception?  And then ignore them?  Aka assume they are there because someone 
dropped them in a previous life.

> No longer able to load backups into new cluster if there was a dropped column
> -
>
> Key: CASSANDRA-11416
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11416
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jeremiah Jordan
>Assignee: Aleksey Yeschenko
> Fix For: 3.0.x, 3.x
>
>
> The following change to the sstableloader test works in 2.1/2.2 but fails in 
> 3.0+
> https://github.com/JeremiahDJordan/cassandra-dtest/commit/7dc66efb8d24239f0a488ec5a613240531aeb7db
> {code}
> CREATE TABLE test_drop (key text PRIMARY KEY, c1 text, c2 text, c3 text, c4 
> text)
> ...insert data...
> ALTER TABLE test_drop DROP c4
> ...insert more data...
> {code}
> Make a snapshot and save off a describe to backup table test_drop.
> Decide to restore the snapshot to a new cluster.   First restore the schema 
> from describe. (column c4 isn't there)
> {code}
> CREATE TABLE test_drop (key text PRIMARY KEY, c1 text, c2 text, c3 text)
> {code}
> sstableload the snapshot data.
> Works in 2.1/2.2.  Fails in 3.0+ with:
> {code}
> java.lang.RuntimeException: Unknown column c4 during deserialization
> java.lang.RuntimeException: Failed to list files in 
> /var/folders/t4/rlc2b6450qbg92762l9l4mt8gn/T/dtest-3eKv_g/test/node1/data1_copy/ks/drop_one-bcef5280f11b11e5825a43f0253f18b5
>   at 
> org.apache.cassandra.db.lifecycle.LogAwareFileLister.list(LogAwareFileLister.java:53)
>   at 
> org.apache.cassandra.db.lifecycle.LifecycleTransaction.getFiles(LifecycleTransaction.java:544)
>   at 
> org.apache.cassandra.io.sstable.SSTableLoader.openSSTables(SSTableLoader.java:76)
>   at 
> org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:165)
>   at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:104)
> Caused by: java.lang.RuntimeException: Unknown column c4 during 
> deserialization
>   at 
> org.apache.cassandra.db.SerializationHeader$Component.toHeader(SerializationHeader.java:331)
>   at 
> org.apache.cassandra.io.sstable.format.SSTableReader.openForBatch(SSTableReader.java:430)
>   at 
> org.apache.cassandra.io.sstable.SSTableLoader.lambda$openSSTables$193(SSTableLoader.java:121)
>   at 
> org.apache.cassandra.db.lifecycle.LogAwareFileLister.lambda$innerList$184(LogAwareFileLister.java:75)
>   at 
> java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174)
>   at 
> java.util.TreeMap$EntrySpliterator.forEachRemaining(TreeMap.java:2965)
>   at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>   at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>   at 
> java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
>   at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>   at 
> java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
>   at 
> org.apache.cassandra.db.lifecycle.LogAwareFileLister.innerList(LogAwareFileLister.java:77)
>   at 
> org.apache.cassandra.db.lifecycle.LogAwareFileLister.list(LogAwareFileLister.java:49)
>   ... 4 more
> {code}





[jira] [Commented] (CASSANDRA-11264) Repair scheduling - Failure handling and retry

2016-04-14 Thread Marcus Olsson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241559#comment-15241559
 ] 

Marcus Olsson commented on CASSANDRA-11264:
---

bq. After having a look at your original patch I saw that a failed task will be 
re-prioritized against other scheduled jobs/tasks with a high priority (given 
its last run time will not be updated), so that's already a retry mechanism in 
itself.
While this is true, I believe that this part should probably be reworked a bit. 
If we have a scenario where one particular job will always fail, we will end up 
in a loop where that job is retried constantly, which leads to starvation 
of other jobs. One option is to keep it simple and only run it once (by 
removing the retry logic) and also add a flag for the job which is used to 
determine when the job is allowed to run again. Something like:
{code}
execute()
{
 runTasks();
 if (allTasksWasSuccessful())
 {
  nextRun = -1
  lastRunTime = now;
 }
 else
 {
  nextRun = now + defaultWaitTime;
 }
}
{code}
Then that flag would be used to avoid prioritizing the failing job against the 
other jobs until the {{defaultWaitTime}} has elapsed. This flag could also work 
nicely with the rejection policies (assuming that they estimate the time until 
the job can actually be run), especially if we would be able to reject repairs 
on a specific table rather than all tables. WDYT?
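The flag-based backoff described above could look roughly like this (all names, including {{defaultWaitTime}}, come from the sketch in this thread, not actual scheduler code):

```java
class RepairJob {
    long nextRun = -1;      // -1: eligible for normal prioritization
    long lastRunTime;
    final long defaultWaitTimeMillis;

    RepairJob(long defaultWaitTimeMillis) {
        this.defaultWaitTimeMillis = defaultWaitTimeMillis;
    }

    void execute(boolean allTasksSuccessful, long now) {
        if (allTasksSuccessful) {
            nextRun = -1;
            lastRunTime = now;
        } else {
            // Back off instead of retrying immediately, so a job that
            // always fails cannot starve the other scheduled jobs.
            nextRun = now + defaultWaitTimeMillis;
        }
    }

    // The scheduler would skip this job while it is not runnable.
    boolean runnable(long now) {
        return nextRun == -1 || now >= nextRun;
    }
}
```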

bq. Rather than cluttering the scheduled repair mechanism with retry logic, I 
think that it's better to add a retry option to (non-scheduled) repair job, and 
do more fine grained retry on individual steps such as validation and sync, 
since this will be more effective against transient failures rather than 
retrying the whole task and potentially losing work of non-failed tasks.
Great idea! If e.g. a validation would fail on one node, would we clean up the 
resources on that node by CASSANDRA-11190 (specifically about cleaning up 
resources, so that we can safely retry it) or would we need a separate way of 
doing that? 

bq. We can of course log warns and gather statistics when a scheduled task 
fails, but I think we should add retry support to repair independently of this. 
WDYT?
Sounds good!

> Repair scheduling - Failure handling and retry
> --
>
> Key: CASSANDRA-11264
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11264
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Marcus Olsson
>Assignee: Marcus Olsson
>Priority: Minor
>
> Make it possible for repairs to be run again if they fail and clean up the 
> associated resources (validations and streaming sessions) before retrying. 
> Log a warning for each re-attempt and an error if it can't complete in X 
> times. The number of retries before considering the repair a failure could be 
> configurable.





[jira] [Updated] (CASSANDRA-11523) server side exception on secondary index query through thrift

2016-04-14 Thread Sam Tunnicliffe (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe updated CASSANDRA-11523:

Status: Patch Available  (was: In Progress)

The problem is that when the indexed column is not covered by the Thrift 
query's {{slice_predicate}} it isn't included in the data retrieved from the 
base table using the key obtained from the index lookup. The NPE is coming from 
the staleness check, which expects that data to be present in the base table. It 
only affects Thrift queries, not CQL queries against the same table. 

I've pushed branches off 3.0 & trunk with a simple fix to {{KeysSearcher}} 
that essentially mimics the pre-2.2 behaviour, which used {{ExtendedFilter}} 
to add any columns required for the read to the filter before pruning them from 
the results returned to the user. 
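The shape of that fix in a generic sketch (stand-in types and names, not the actual {{KeysSearcher}} internals): fetch a superset that includes the indexed column so the staleness check can run, then prune it before returning to the client.

```java
import java.util.HashSet;
import java.util.Set;

class IndexReadFilter {
    // Widen the read: the indexed column must be present internally
    // even when the client's slice_predicate doesn't cover it.
    static Set<String> columnsToFetch(Set<String> requested, String indexedColumn) {
        Set<String> fetch = new HashSet<>(requested);
        fetch.add(indexedColumn); // needed for the staleness check
        return fetch;
    }

    // Then narrow the result: drop columns the client never asked for.
    static Set<String> pruneForClient(Set<String> fetched, Set<String> requested) {
        Set<String> out = new HashSet<>(fetched);
        out.retainAll(requested);
        return out;
    }
}
```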

[~yngwiie], [this 
patch|https://github.com/beobal/cassandra/commit/517e2e78a618d0e9c6225f9b27ed837450bdcc80.patch]
 applies cleanly to 3.0.4. If possible, would you mind applying it and checking 
that it works for you?

Pull request to add a new dtest for this issue is 
[here|https://github.com/riptano/cassandra-dtest/pull/926]


||branch||testall||dtest||
|[11523-3.0|https://github.com/beobal/cassandra/tree/11523-3.0]|[testall|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11523-3.0-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11523-3.0-dtest]|
|[11523-trunk|https://github.com/beobal/cassandra/tree/11523-trunk]|[testall|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11523-trunk-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11523-trunk-dtest]|


> server side exception on secondary index query through thrift
> -
>
> Key: CASSANDRA-11523
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11523
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: linux opensuse 13.2, jdk8
>Reporter: Ivan Georgiev
>Assignee: Sam Tunnicliffe
> Fix For: 3.0.x, 3.x
>
>
> Trying to upgrade from 2.x to 3.x, using 3.0.4 for the purpose. We are using 
> thrift interface for the time being. Everything works fine except for 
> secondary index queries. 
> When doing a get_range_slices call with row_filter set in the KeyRange we get 
> a server side exception. Here is a trace of the exception:
> INFO   | jvm 1| 2016/04/07 14:56:35 | 14:56:35.401 [Thrift:12] DEBUG 
> o.a.cassandra.service.ReadCallback - Failed; received 0 of 1 responses
> INFO   | jvm 1| 2016/04/07 14:56:35 | 14:56:35.401 [SharedPool-Worker-1] 
> WARN  o.a.c.c.AbstractLocalAwareExecutorService - Uncaught exception on 
> thread Thread[SharedPool-Worker-1,5,main]: {}
> INFO   | jvm 1| 2016/04/07 14:56:35 | java.lang.RuntimeException: 
> java.lang.NullPointerException
> INFO   | jvm 1| 2016/04/07 14:56:35 | at 
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2450)
>  ~[apache-cassandra-3.0.4.jar:3.0.4]
> INFO   | jvm 1| 2016/04/07 14:56:35 | at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> ~[na:1.8.0_72]
> INFO   | jvm 1| 2016/04/07 14:56:35 | at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164)
>  ~[apache-cassandra-3.0.4.jar:3.0.4]
> INFO   | jvm 1| 2016/04/07 14:56:35 | at 
> org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [apache-cassandra-3.0.4.jar:3.0.4]
> INFO   | jvm 1| 2016/04/07 14:56:35 | at 
> java.lang.Thread.run(Thread.java:745) [na:1.8.0_72]
> INFO   | jvm 1| 2016/04/07 14:56:35 | Caused by: 
> java.lang.NullPointerException: null
> INFO   | jvm 1| 2016/04/07 14:56:35 | at 
> org.apache.cassandra.index.internal.keys.KeysSearcher.filterIfStale(KeysSearcher.java:155)
>  ~[apache-cassandra-3.0.4.jar:3.0.4]
> INFO   | jvm 1| 2016/04/07 14:56:35 | at 
> org.apache.cassandra.index.internal.keys.KeysSearcher.access$300(KeysSearcher.java:36)
>  ~[apache-cassandra-3.0.4.jar:3.0.4]
> INFO   | jvm 1| 2016/04/07 14:56:35 | at 
> org.apache.cassandra.index.internal.keys.KeysSearcher$1.prepareNext(KeysSearcher.java:104)
>  ~[apache-cassandra-3.0.4.jar:3.0.4]
> INFO   | jvm 1| 2016/04/07 14:56:35 | at 
> org.apache.cassandra.index.internal.keys.KeysSearcher$1.hasNext(KeysSearcher.java:70)
>  ~[apache-cassandra-3.0.4.jar:3.0.4]
> INFO   | jvm 1| 2016/04/07 14:56:35 | at 
> org.apache.cassandra.db.transform.BasePartitions.hasNext(BasePartitions.java:72)
>  ~[apache-cassandra-3.0.4.jar:3.0.4]
> INFO   | jvm 1| 2016/04/07 14:56:35 | at 
> 

[jira] [Commented] (CASSANDRA-11452) Cache implementation using LIRS eviction for in-process page cache

2016-04-14 Thread Ben Manes (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241524#comment-15241524
 ] 

Ben Manes commented on CASSANDRA-11452:
---

Sorry if I'm being a bit obtuse. If you write a short snippet I can try 
applying that approach.

> Cache implementation using LIRS eviction for in-process page cache
> --
>
> Key: CASSANDRA-11452
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11452
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>
> Following up from CASSANDRA-5863, to make best use of caching and to avoid 
> having to explicitly mark compaction accesses as non-cacheable, we need a 
> cache implementation that uses an eviction algorithm that can better handle 
> non-recurring accesses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11452) Cache implementation using LIRS eviction for in-process page cache

2016-04-14 Thread Ben Manes (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241521#comment-15241521
 ] 

Ben Manes commented on CASSANDRA-11452:
---

I assumed that it would be acceptable to reduce the penalty when a clash was 
detected. The current version ejects the victim so that the candidates flow 
through the probation space. I think that should be similar to your >= 
approach, without reducing the hit rate in the small traces. Can you review the 
[patch|https://github.com/ben-manes/caffeine/commit/22ce6339ec91fd7eadfb462fcb176aac69aeb47f]?

> Cache implementation using LIRS eviction for in-process page cache
> --
>
> Key: CASSANDRA-11452
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11452
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>
> Following up from CASSANDRA-5863, to make best use of caching and to avoid 
> having to explicitly mark compaction accesses as non-cacheable, we need a 
> cache implementation that uses an eviction algorithm that can better handle 
> non-recurring accesses.





[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2016-04-14 Thread Carl Yeksigian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241510#comment-15241510
 ] 

Carl Yeksigian commented on CASSANDRA-8844:
---

While working on figuring out a separate issue related to being over the CDC 
limit, I realized that currently the keyspace could have CDC DCs and have 
{{durable_writes=false}}. This would mean that we would not be writing to the 
CDC logs in all of our DCs. We can either:
# Add the CDC local DC check in {{Mutation#apply()}}, where we currently only 
check whether the keyspace has durable writes
# Validate that CDC isn't used with {{durable_writes=false}} keyspaces

Option 1 seems more in line with CDC, since it confines the performance impact 
to operations in a single datacenter. However, we would then also probably have 
to replay the CDC logs on startup even though {{durable_writes=false}}; 
otherwise there would be data in the CDC log that doesn't exist in the cluster.
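
The check described in option 1 can be sketched as follows. This is an illustrative Python model only, not Cassandra's actual (Java) write path: the class and function names here are hypothetical, and the point is simply that the CDC-log decision is made independently of the {{durable_writes}} commitlog decision.

```python
from dataclasses import dataclass

@dataclass
class Keyspace:
    durable_writes: bool

@dataclass
class Mutation:
    table_has_cdc: bool

def logs_to_write(mutation: Mutation, keyspace: Keyspace,
                  local_dc_does_cdc: bool) -> list:
    """Return which durable logs this mutation must hit before it is acked."""
    logs = []
    # Existing behavior: commitlog writes are skipped when durable_writes=false.
    if keyspace.durable_writes:
        logs.append("commitlog")
    # Option 1: check CDC independently of durable_writes, so a
    # durable_writes=false keyspace still feeds the CDC log in CDC DCs.
    if mutation.table_has_cdc and local_dc_does_cdc:
        logs.append("cdc_log")
    return logs

# A durable_writes=false keyspace with a CDC-enabled table still writes the CDC log:
print(logs_to_write(Mutation(True), Keyspace(False), True))   # ['cdc_log']
print(logs_to_write(Mutation(True), Keyspace(True), True))    # ['commitlog', 'cdc_log']
```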

> Change Data Capture (CDC)
> -
>
> Key: CASSANDRA-8844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination, Local Write-Read Paths
>Reporter: Tupshin Harper
>Assignee: Joshua McKenzie
>Priority: Critical
> Fix For: 3.x
>
>
> "In databases, change data capture (CDC) is a set of software design patterns 
> used to determine (and track) the data that has changed so that action can be 
> taken using the changed data. Also, Change data capture (CDC) is an approach 
> to data integration that is based on the identification, capture and delivery 
> of the changes made to enterprise data sources."
> -Wikipedia
> As Cassandra is increasingly being used as the Source of Record (SoR) for 
> mission critical data in large enterprises, it is increasingly being called 
> upon to act as the central hub of traffic and data flow to other systems. In 
> order to try to address the general need, we (cc [~brianmhess]), propose 
> implementing a simple data logging mechanism to enable per-table CDC patterns.
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its 
> Consistency Level semantics, and in order to treat it as the single 
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable 
> (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the developmental and operational burden of users so that they 
> don't have to do dual writes to other systems.
> # For users that are currently doing batch export from a Cassandra system, 
> give them the opportunity to make that realtime with a minimum of coding.
> h2. The mechanism:
> We propose a durable logging mechanism that functions similar to a commitlog, 
> with the following nuances:
> - Takes place on every node, not just the coordinator, so RF number of copies 
> are logged.
> - Separate log per table.
> - Per-table configuration. Only tables that are specified as CDC_LOG would do 
> any logging.
> - Per DC. We are trying to keep the complexity to a minimum to make this an 
> easy enhancement, but most likely use cases would prefer to only implement 
> CDC logging in one (or a subset) of the DCs that are being replicated to
> - In the critical path of ConsistencyLevel acknowledgment. Just as with the 
> commitlog, failure to write to the CDC log should fail that node's write. If 
> that means the requested consistency level was not met, then clients *should* 
> experience UnavailableExceptions.
> - Be written in a Row-centric manner such that it is easy for consumers to 
> reconstitute rows atomically.
> - Written in a simple format designed to be consumed *directly* by daemons 
> written in non JVM languages
> h2. Nice-to-haves
> I strongly suspect that the following features will be asked for, but I also 
> believe that they can be deferred for a subsequent release, and to gauge 
> actual interest.
> - Multiple logs per table. This would make it easy to have multiple 
> "subscribers" to a single table's changes. A workaround would be to create a 
> forking daemon listener, but that's not a great answer.
> - Log filtering. Being able to apply filters, including UDF-based filters 
> would make Cassandra a much more versatile feeder into other systems, and 
> again, reduce complexity that would otherwise need to be built into the 
> daemons.
> h2. Format and Consumption
> - Cassandra would only write to the CDC log, and never delete from it. 
> Cleaning up consumed logfiles would be the client daemon's responsibility
> - Logfile size should probably be configurable.
> - Logfiles should be named with a predictable naming schema, making it 
> trivial to process them 

[jira] [Commented] (CASSANDRA-11416) No longer able to load backups into new cluster if there was a dropped column

2016-04-14 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241503#comment-15241503
 ] 

Aleksey Yeschenko commented on CASSANDRA-11416:
---

True. I'm looking into options, and not a fan of any of them tbh. The easiest 
would be to include {{ALTER TABLE DROP}} output in {{DESCRIBE}}, and have a 
variant of it that accepts the timestamp of drop.

> No longer able to load backups into new cluster if there was a dropped column
> -
>
> Key: CASSANDRA-11416
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11416
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jeremiah Jordan
>Assignee: Aleksey Yeschenko
> Fix For: 3.0.x, 3.x
>
>
> The following change to the sstableloader test works in 2.1/2.2 but fails in 
> 3.0+
> https://github.com/JeremiahDJordan/cassandra-dtest/commit/7dc66efb8d24239f0a488ec5a613240531aeb7db
> {code}
> CREATE TABLE test_drop (key text PRIMARY KEY, c1 text, c2 text, c3 text, c4 
> text)
> ...insert data...
> ALTER TABLE test_drop DROP c4
> ...insert more data...
> {code}
> Make a snapshot and save off a describe to backup table test_drop.
> Decide to restore the snapshot to a new cluster.   First restore the schema 
> from describe. (column c4 isn't there)
> {code}
> CREATE TABLE test_drop (key text PRIMARY KEY, c1 text, c2 text, c3 text)
> {code}
> sstableload the snapshot data.
> Works in 2.1/2.2.  Fails in 3.0+ with:
> {code}
> java.lang.RuntimeException: Unknown column c4 during deserialization
> java.lang.RuntimeException: Failed to list files in 
> /var/folders/t4/rlc2b6450qbg92762l9l4mt8gn/T/dtest-3eKv_g/test/node1/data1_copy/ks/drop_one-bcef5280f11b11e5825a43f0253f18b5
>   at 
> org.apache.cassandra.db.lifecycle.LogAwareFileLister.list(LogAwareFileLister.java:53)
>   at 
> org.apache.cassandra.db.lifecycle.LifecycleTransaction.getFiles(LifecycleTransaction.java:544)
>   at 
> org.apache.cassandra.io.sstable.SSTableLoader.openSSTables(SSTableLoader.java:76)
>   at 
> org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:165)
>   at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:104)
> Caused by: java.lang.RuntimeException: Unknown column c4 during 
> deserialization
>   at 
> org.apache.cassandra.db.SerializationHeader$Component.toHeader(SerializationHeader.java:331)
>   at 
> org.apache.cassandra.io.sstable.format.SSTableReader.openForBatch(SSTableReader.java:430)
>   at 
> org.apache.cassandra.io.sstable.SSTableLoader.lambda$openSSTables$193(SSTableLoader.java:121)
>   at 
> org.apache.cassandra.db.lifecycle.LogAwareFileLister.lambda$innerList$184(LogAwareFileLister.java:75)
>   at 
> java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174)
>   at 
> java.util.TreeMap$EntrySpliterator.forEachRemaining(TreeMap.java:2965)
>   at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>   at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>   at 
> java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
>   at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>   at 
> java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
>   at 
> org.apache.cassandra.db.lifecycle.LogAwareFileLister.innerList(LogAwareFileLister.java:77)
>   at 
> org.apache.cassandra.db.lifecycle.LogAwareFileLister.list(LogAwareFileLister.java:49)
>   ... 4 more
> {code}





[jira] [Commented] (CASSANDRA-11428) Eliminate Allocations

2016-04-14 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241487#comment-15241487
 ] 

T Jake Luciani commented on CASSANDRA-11428:


Actually, it looks like we can remove the changes you made to decodeString: the 
CASSANDRA-8101 workaround is no longer needed (fixed in Netty 4.0.35).

> Eliminate Allocations
> -
>
> Key: CASSANDRA-11428
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11428
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: T Jake Luciani
>Assignee: Nitsan Wakart
>Priority: Minor
> Fix For: 3.0.x
>
> Attachments: benchmarks.tar.gz, pom.xml
>
>
> Linking relevant issues under this master ticket.  For small changes I'd like 
> to test and commit these in bulk 





[jira] [Updated] (CASSANDRA-11535) Add dtests for PER PARTITION LIMIT queries with paging

2016-04-14 Thread Alex Petrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-11535:

Resolution: Fixed
Status: Resolved  (was: Ready to Commit)

> Add dtests for PER PARTITION LIMIT queries with paging
> --
>
> Key: CASSANDRA-11535
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11535
> Project: Cassandra
>  Issue Type: Test
>  Components: Testing
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Minor
>
> [#7017|https://issues.apache.org/jira/browse/CASSANDRA-7017] introduces {{PER 
> PARTITION LIMIT}} queries. In order to ensure they work with paging, with 
> partitions containing only static columns, we need to add {{dtests}} to it.





[jira] [Updated] (CASSANDRA-11535) Add dtests for PER PARTITION LIMIT queries with paging

2016-04-14 Thread Alex Petrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-11535:

Status: Ready to Commit  (was: Patch Available)

> Add dtests for PER PARTITION LIMIT queries with paging
> --
>
> Key: CASSANDRA-11535
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11535
> Project: Cassandra
>  Issue Type: Test
>  Components: Testing
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Minor
>
> [#7017|https://issues.apache.org/jira/browse/CASSANDRA-7017] introduces {{PER 
> PARTITION LIMIT}} queries. In order to ensure they work with paging, with 
> partitions containing only static columns, we need to add {{dtests}} to it.





[jira] [Commented] (CASSANDRA-11535) Add dtests for PER PARTITION LIMIT queries with paging

2016-04-14 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241473#comment-15241473
 ] 

Alex Petrov commented on CASSANDRA-11535:
-

Merged as 
[2e8b5f7b80ddf4c59bffb2f259fc992b79287028|https://github.com/riptano/cassandra-dtest/commit/2e8b5f7b80ddf4c59bffb2f259fc992b79287028]

> Add dtests for PER PARTITION LIMIT queries with paging
> --
>
> Key: CASSANDRA-11535
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11535
> Project: Cassandra
>  Issue Type: Test
>  Components: Testing
>Reporter: Alex Petrov
>Assignee: Alex Petrov
>Priority: Minor
>
> [#7017|https://issues.apache.org/jira/browse/CASSANDRA-7017] introduces {{PER 
> PARTITION LIMIT}} queries. In order to ensure they work with paging, with 
> partitions containing only static columns, we need to add {{dtests}} to it.





[jira] [Assigned] (CASSANDRA-11560) dtest failure in user_types_test.TestUserTypes.udt_subfield_test

2016-04-14 Thread Tyler Hobbs (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tyler Hobbs reassigned CASSANDRA-11560:
---

Assignee: Tyler Hobbs  (was: DS Test Eng)

> dtest failure in user_types_test.TestUserTypes.udt_subfield_test
> 
>
> Key: CASSANDRA-11560
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11560
> Project: Cassandra
>  Issue Type: Test
>Reporter: Michael Shuler
>Assignee: Tyler Hobbs
>  Labels: dtest
>
> example failure:
> http://cassci.datastax.com/job/trunk_dtest/1125/testReport/user_types_test/TestUserTypes/udt_subfield_test
> Failed on CassCI build trunk_dtest #1125
> Appears to be a test problem:
> {noformat}
> Error Message
> 'NoneType' object is not iterable
>  >> begin captured logging << 
> dtest: DEBUG: cluster ccm directory: /mnt/tmp/dtest-Kzg9Sk
> dtest: DEBUG: Custom init_config not found. Setting defaults.
> dtest: DEBUG: Done setting configuration options:
> {   'initial_token': None,
> 'num_tokens': '32',
> 'phi_convict_threshold': 5,
> 'range_request_timeout_in_ms': 1,
> 'read_request_timeout_in_ms': 1,
> 'request_timeout_in_ms': 1,
> 'truncate_request_timeout_in_ms': 1,
> 'write_request_timeout_in_ms': 1}
> - >> end captured logging << -
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/automaton/cassandra-dtest/tools.py", line 253, in wrapped
> f(obj)
>   File "/home/automaton/cassandra-dtest/user_types_test.py", line 767, in 
> udt_subfield_test
> self.assertEqual(listify(rows[0]), [[None]])
>   File "/home/automaton/cassandra-dtest/user_types_test.py", line 25, in 
> listify
> for i in item:
> "'NoneType' object is not iterable\n >> begin captured 
> logging << \ndtest: DEBUG: cluster ccm directory: 
> /mnt/tmp/dtest-Kzg9Sk\ndtest: DEBUG: Custom init_config not found. Setting 
> defaults.\ndtest: DEBUG: Done setting configuration options:\n{   
> 'initial_token': None,\n'num_tokens': '32',\n'phi_convict_threshold': 
> 5,\n'range_request_timeout_in_ms': 1,\n
> 'read_request_timeout_in_ms': 1,\n'request_timeout_in_ms': 1,\n   
>  'truncate_request_timeout_in_ms': 1,\n'write_request_timeout_in_ms': 
> 1}\n- >> end captured logging << 
> -"
> {noformat}





[jira] [Commented] (CASSANDRA-11428) Eliminate Allocations

2016-04-14 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241443#comment-15241443
 ] 

T Jake Luciani commented on CASSANDRA-11428:


Ah, thanks for that. I force-pushed with those changes and will restart the tests.

> Eliminate Allocations
> -
>
> Key: CASSANDRA-11428
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11428
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: T Jake Luciani
>Assignee: Nitsan Wakart
>Priority: Minor
> Fix For: 3.0.x
>
> Attachments: benchmarks.tar.gz, pom.xml
>
>
> Linking relevant issues under this master ticket.  For small changes I'd like 
> to test and commit these in bulk 





[jira] [Commented] (CASSANDRA-11428) Eliminate Allocations

2016-04-14 Thread Nitsan Wakart (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241440#comment-15241440
 ] 

Nitsan Wakart commented on CASSANDRA-11428:
---

You seem to have dropped the Pair allocation change?
Also, in CBUtil, if we take the Netty approach, the thread-local ByteBuffer and 
encoder are not needed.

> Eliminate Allocations
> -
>
> Key: CASSANDRA-11428
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11428
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: T Jake Luciani
>Assignee: Nitsan Wakart
>Priority: Minor
> Fix For: 3.0.x
>
> Attachments: benchmarks.tar.gz, pom.xml
>
>
> Linking relevant issues under this master ticket.  For small changes I'd like 
> to test and commit these in bulk 





[jira] [Commented] (CASSANDRA-11428) Eliminate Allocations

2016-04-14 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241430#comment-15241430
 ] 

T Jake Luciani commented on CASSANDRA-11428:


Combined the sub-tickets into one patch and removed the copied Netty util now 
that CASSANDRA-11567 is in. I also changed the ThreadLocals in CBUtil to be 
FastThreadLocals, since they are accessed from Netty FastThreadLocalThreads; I 
can see a slight improvement in performance.

[branch| http://github.com/tjake/cassandra/tree/rm-allocations]
[testall| 
https://cassci.datastax.com/view/trunk/job/tjake-rm-allocations-testall]
[dtest| https://cassci.datastax.com/view/trunk/job/tjake-rm-allocations-dtest]

> Eliminate Allocations
> -
>
> Key: CASSANDRA-11428
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11428
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: T Jake Luciani
>Assignee: Nitsan Wakart
>Priority: Minor
> Fix For: 3.0.x
>
> Attachments: benchmarks.tar.gz, pom.xml
>
>
> Linking relevant issues under this master ticket.  For small changes I'd like 
> to test and commit these in bulk 





[jira] [Commented] (CASSANDRA-11452) Cache implementation using LIRS eviction for in-process page cache

2016-04-14 Thread Ben Manes (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241420#comment-15241420
 ] 

Ben Manes commented on CASSANDRA-11452:
---

For large traces the difference is marginal, with s3 showing a 2% loss. For 
small traces the difference can be substantial:

db: 51.29 -> 51.52
s3: 51.10 -> 49.12
oltp: 37.91 -> 38.10
multi1: 55.59 -> 50.50
gli: 34.16 -> 16.11
cs: 30.31 -> 26.74

> Cache implementation using LIRS eviction for in-process page cache
> --
>
> Key: CASSANDRA-11452
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11452
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>
> Following up from CASSANDRA-5863, to make best use of caching and to avoid 
> having to explicitly mark compaction accesses as non-cacheable, we need a 
> cache implementation that uses an eviction algorithm that can better handle 
> non-recurring accesses.





[jira] [Updated] (CASSANDRA-10988) isInclusive and boundsAsComposites in Restriction take bounds in different order

2016-04-14 Thread Alex Petrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-10988:

Summary: isInclusive and boundsAsComposites in Restriction take bounds in 
different order  (was: isInclusive and boundsAsComposites take bounds in 
different order)

> isInclusive and boundsAsComposites in Restriction take bounds in different 
> order
> 
>
> Key: CASSANDRA-10988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10988
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Vassil Hristov
>Assignee: Alex Petrov
> Fix For: 2.2.x
>
>
> After we've upgraded our cluster to version 2.1.11, we started getting the 
> below exceptions for some of our queries. The issue seems to be very similar to 
> CASSANDRA-7284.
> Code to reproduce:
> {code:java}
> createTable("CREATE TABLE %s (" +
> "a text," +
> "b int," +
> "PRIMARY KEY (a, b)" +
> ") WITH COMPACT STORAGE" +
> "AND CLUSTERING ORDER BY (b DESC)");
> execute("insert into %s (a, b) values ('a', 2)");
> execute("SELECT * FROM %s WHERE a = 'a' AND b > 0");
> {code}
> {code:java}
> java.lang.ClassCastException: 
> org.apache.cassandra.db.composites.Composites$EmptyComposite cannot be cast 
> to org.apache.cassandra.db.composites.CellName
> at 
> org.apache.cassandra.db.composites.AbstractCellNameType.cellFromByteBuffer(AbstractCellNameType.java:188)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.composites.AbstractSimpleCellNameType.makeCellName(AbstractSimpleCellNameType.java:125)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.composites.AbstractCellNameType.makeCellName(AbstractCellNameType.java:254)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.makeExclusiveSliceBound(SelectStatement.java:1197)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.applySliceRestriction(SelectStatement.java:1205)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.processColumnFamily(SelectStatement.java:1283)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:1250)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:299)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:276)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:224)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:67)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:238)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:493)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:138)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439)
>  [apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335)
>  [apache-cassandra-2.1.11.jar:2.1.11]
> at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_66]
> at 
> org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
>  [apache-cassandra-2.1.11.jar:2.1.11]
> at 

[jira] [Updated] (CASSANDRA-10988) isInclusive and boundsAsComposites take bounds in different order

2016-04-14 Thread Alex Petrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-10988:

Summary: isInclusive and boundsAsComposites take bounds in different order  
(was: ClassCastException in SelectStatement)

> isInclusive and boundsAsComposites take bounds in different order
> -
>
> Key: CASSANDRA-10988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10988
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Vassil Hristov
>Assignee: Alex Petrov
> Fix For: 2.2.x
>
>
> After we've upgraded our cluster to version 2.1.11, we started getting the 
> below exceptions for some of our queries. The issue seems to be very similar to 
> CASSANDRA-7284.
> Code to reproduce:
> {code:java}
> createTable("CREATE TABLE %s (" +
> "a text," +
> "b int," +
> "PRIMARY KEY (a, b)" +
> ") WITH COMPACT STORAGE" +
> "AND CLUSTERING ORDER BY (b DESC)");
> execute("insert into %s (a, b) values ('a', 2)");
> execute("SELECT * FROM %s WHERE a = 'a' AND b > 0");
> {code}
> {code:java}
> java.lang.ClassCastException: 
> org.apache.cassandra.db.composites.Composites$EmptyComposite cannot be cast 
> to org.apache.cassandra.db.composites.CellName
> at 
> org.apache.cassandra.db.composites.AbstractCellNameType.cellFromByteBuffer(AbstractCellNameType.java:188)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.composites.AbstractSimpleCellNameType.makeCellName(AbstractSimpleCellNameType.java:125)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.composites.AbstractCellNameType.makeCellName(AbstractCellNameType.java:254)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.makeExclusiveSliceBound(SelectStatement.java:1197)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.applySliceRestriction(SelectStatement.java:1205)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.processColumnFamily(SelectStatement.java:1283)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:1250)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:299)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:276)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:224)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:67)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:238)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:493)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:138)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439)
>  [apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335)
>  [apache-cassandra-2.1.11.jar:2.1.11]
> at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_66]
> at 
> org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
>  [apache-cassandra-2.1.11.jar:2.1.11]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> 

[jira] [Commented] (CASSANDRA-10091) Integrated JMX authn & authz

2016-04-14 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241276#comment-15241276
 ] 

T Jake Luciani commented on CASSANDRA-10091:


Sorry, I didn't follow up. +1, assuming CI looks good.

> Integrated JMX authn & authz
> 
>
> Key: CASSANDRA-10091
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10091
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jan Karlsson
>Assignee: Sam Tunnicliffe
>Priority: Minor
> Fix For: 3.x
>
>
> It would be useful to authenticate with JMX through Cassandra's internal 
> authentication. This would reduce the overhead of keeping passwords in files 
> on the machine and would consolidate passwords to one location. It would also 
> allow the possibility to handle JMX permissions in Cassandra.
> It could be done by creating our own JMX server and setting custom classes 
> for the authenticator and authorizer. We could then add some parameters where 
> the user could specify what authenticator and authorizer to use in case they 
> want to make their own.
> This could also be done by creating a premain method which creates a jmx 
> server. This would give us the feature without changing the Cassandra code 
> itself. However I believe this would be a good feature to have in Cassandra.
> I am currently working on a solution which creates a JMX server and uses a 
custom authenticator and authorizer. It is currently built as a premain, 
> however it would be great if we could put this in Cassandra instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11574) COPY FROM command in cqlsh throws error

2016-04-14 Thread Mahafuzur Rahman (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahafuzur Rahman updated CASSANDRA-11574:
-
Description: 
Any COPY FROM command in cqlsh is throwing the following error:

"get_num_processes() takes no keyword arguments"

Example command: 

COPY inboxdata 
(to_user_id,to_user_network,created_time,attachments,from_user_id,from_user_name,from_user_network,id,message,to_user_name,updated_time)
 FROM 'inbox.csv';

Similar commands worked perfectly in previous versions such as 3.0.4

  was:
Any COPY FROM command in cqlsh is throwing the following error:

"get_num_processes() takes no keyword arguments"

Example command: 

COPY inboxdata 
(to_user_id,to_user_network,created_time,attachments,from_user_id,from_user_name,from_user_network,id,message,to_user_name,updated_time)
 FROM 'inbox.csv';




> COPY FROM command in cqlsh throws error
> ---
>
> Key: CASSANDRA-11574
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11574
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
> Environment: Operating System: Ubuntu Server 14.04
> JDK: Oracle JDK 8 update 77
> Python: 2.7.6
>Reporter: Mahafuzur Rahman
> Fix For: 3.0.6
>
>
> Any COPY FROM command in cqlsh is throwing the following error:
> "get_num_processes() takes no keyword arguments"
> Example command: 
> COPY inboxdata 
> (to_user_id,to_user_network,created_time,attachments,from_user_id,from_user_name,from_user_network,id,message,to_user_name,updated_time)
>  FROM 'inbox.csv';
> Similar commands worked perfectly in previous versions such as 3.0.4



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-11567) Update netty version

2016-04-14 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani resolved CASSANDRA-11567.

Resolution: Fixed
  Reviewer: Jason Brown

committed {{a0d070764ab9cf0a1eb16d7ffd7d57cbcefd2a82}}

> Update netty version
> 
>
> Key: CASSANDRA-11567
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11567
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
> Fix For: 3.6
>
>
> Mainly for prereq to CASSANDRA-11421. 
> Netty 4.0.34 -> 4.0.36.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-11574) COPY FROM command in cqlsh throws error

2016-04-14 Thread Mahafuzur Rahman (JIRA)
Mahafuzur Rahman created CASSANDRA-11574:


 Summary: COPY FROM command in cqlsh throws error
 Key: CASSANDRA-11574
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11574
 Project: Cassandra
  Issue Type: Bug
  Components: CQL
 Environment: Operating System: Ubuntu Server 14.04
JDK: Oracle JDK 8 update 77
Python: 2.7.6
Reporter: Mahafuzur Rahman
 Fix For: 3.0.6


Any COPY FROM command in cqlsh is throwing the following error:

"get_num_processes() takes no keyword arguments"

Example command: 

COPY inboxdata 
(to_user_id,to_user_network,created_time,attachments,from_user_id,from_user_name,from_user_network,id,message,to_user_name,updated_time)
 FROM 'inbox.csv';





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


cassandra git commit: upgrade netty to 4.0.36

2016-04-14 Thread jake
Repository: cassandra
Updated Branches:
  refs/heads/trunk ccacf7d1a -> a0d070764


upgrade netty to 4.0.36

patch by tjake; reviewed by Jason Brown for CASSANDRA-11567


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a0d07076
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a0d07076
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a0d07076

Branch: refs/heads/trunk
Commit: a0d070764ab9cf0a1eb16d7ffd7d57cbcefd2a82
Parents: ccacf7d
Author: T Jake Luciani 
Authored: Wed Apr 13 13:49:53 2016 -0400
Committer: T Jake Luciani 
Committed: Thu Apr 14 10:31:53 2016 -0400

--
 CHANGES.txt|   1 +
 build.xml  |   2 +-
 lib/netty-all-4.0.34.Final.jar | Bin 2144516 -> 0 bytes
 lib/netty-all-4.0.36.Final.jar | Bin 0 -> 2195921 bytes
 4 files changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/a0d07076/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 329e55c..43d1c3c 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.6
+ * Update Netty to 4.0.36 (CASSANDRA-11567)
  * Fix PER PARTITION LIMIT for queries requiring post-query ordering 
(CASSANDRA-11556)
  * Allow instantiation of UDTs and tuples in UDFs (CASSANDRA-10818)
  * Support UDT in CQLSSTableWriter (CASSANDRA-10624)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a0d07076/build.xml
--
diff --git a/build.xml b/build.xml
index c6b2246..034fb29 100644
--- a/build.xml
+++ b/build.xml
@@ -411,7 +411,7 @@
   
   
   
-  
+  
   
   
   

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a0d07076/lib/netty-all-4.0.34.Final.jar
--
diff --git a/lib/netty-all-4.0.34.Final.jar b/lib/netty-all-4.0.34.Final.jar
deleted file mode 100644
index 590b429..000
Binary files a/lib/netty-all-4.0.34.Final.jar and /dev/null differ

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a0d07076/lib/netty-all-4.0.36.Final.jar
--
diff --git a/lib/netty-all-4.0.36.Final.jar b/lib/netty-all-4.0.36.Final.jar
new file mode 100644
index 000..5e278c4
Binary files /dev/null and b/lib/netty-all-4.0.36.Final.jar differ



[jira] [Commented] (CASSANDRA-11264) Repair scheduling - Failure handling and retry

2016-04-14 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241220#comment-15241220
 ] 

Paulo Motta commented on CASSANDRA-11264:
-

After having a look at your original patch I saw that a failed task will be 
re-prioritized against other scheduled jobs/tasks with a high priority (given 
its last run time will not be updated), so that's already a retry mechanism in 
itself.

Rather than cluttering the scheduled repair mechanism with retry logic, I think 
it's better to add a retry option to the (non-scheduled) repair job, and do 
more fine-grained retry on individual steps such as validation and sync, since 
this will be more effective against transient failures than retrying the 
whole task and potentially losing the work of non-failed tasks.

We can of course log warnings and gather statistics when a scheduled task fails, 
but I think we should add retry support to repair independently of this. WDYT?
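The per-step retry idea above (retry an individual step such as validation or sync a bounded number of times before surfacing the failure) can be sketched as follows. This is an illustrative helper, not Cassandra's actual repair code; the class name {{Retry}} and its signature are hypothetical.

```java
import java.util.concurrent.Callable;

// Hypothetical bounded-retry helper for an individual repair step,
// in the spirit of retrying validation/sync rather than the whole task.
public final class Retry
{
    public static <T> T withRetries(Callable<T> step, int maxAttempts) throws Exception
    {
        if (maxAttempts < 1)
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                return step.call();
            }
            catch (Exception e)
            {
                last = e; // transient failure: a real implementation would log a warning here
            }
        }
        throw last; // retries exhausted: surface the last failure
    }

    public static void main(String[] args) throws Exception
    {
        int[] calls = {0};
        // A step that fails twice before succeeding on the third attempt.
        String result = withRetries(() -> {
            if (++calls[0] < 3)
                throw new RuntimeException("transient");
            return "ok";
        }, 5);
        System.out.println(result); // prints ok
    }
}
```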

> Repair scheduling - Failure handling and retry
> --
>
> Key: CASSANDRA-11264
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11264
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Marcus Olsson
>Assignee: Marcus Olsson
>Priority: Minor
>
> Make it possible for repairs to be run again if they fail and clean up the 
> associated resources (validations and streaming sessions) before retrying. 
> Log a warning for each re-attempt and an error if it can't complete in X 
> times. The number of retries before considering the repair a failure could be 
> configurable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2016-04-14 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241183#comment-15241183
 ] 

Branimir Lambov commented on CASSANDRA-8844:


First round of comments (I haven't looked at the read/replay part yet):

- I was a fan of the {{ReplayPosition}} name. It stands for a more general 
concept which happens to be the commit log position for us. Further to this, it 
should be a {{CommitLogPosition}} rather than {{..SegmentPosition}} as it does 
not just specify a position within a given segment but an overall position in 
the log (for a specific keyspace). I am also wondering if it should not include 
a keyspace id / reference now that it is keyspace-specific to be able to fail 
fast on mismatch.
- I'd prefer to throw the {{WriteTimeoutException}} directly from {{allocate}} 
(instead of catching null in {{CommitLog}} and doing the same). Doing the check 
inside the {{while}} loop will avoid the over-allocation and do less work in 
the common case.
- Do we really need to have separate buffer pools per manager? Static (or not) 
shared will offer slightly better cache locality, and it's better to block both 
commit logs if we're running beyond allowed memory (we may want to double the 
default limit).
- [{{segmentManagers}} 
array|https://github.com/apache/cassandra/compare/trunk...josh-mckenzie:8844_review#diff-05c1e4fd86fea19b8e0552b1f289be85R119]:
 An {{EnumMap}} (which boils down to the same thing) would be cleaner and 
should not have any performance impact.
- 
[{{shutdownBlocking}}|https://github.com/apache/cassandra/compare/trunk...josh-mckenzie:8844_review#diff-05c1e4fd86fea19b8e0552b1f289be85R465]:
 Better shutdown in parallel, i.e. initiate and await termination separately.
- [{{reCalculating}} cas in 
{{maybeUpdateCDCSizeCounterAsync}}|https://github.com/apache/cassandra/compare/trunk...josh-mckenzie:8844_review#diff-878dc31866184d5ef750ccd9befc8382R72]
 is fishy: it makes you think the flag would be cleared on an exception in the 
running update, which isn't the case. The {{updateCDCDirectorySize}} body should 
be wrapped in {{try ... finally}} as well to ensure that. 
- You could use a scheduled executor to avoid the explicit delays. Or a 
{{RateLimiter}} (we'd prefer to update ASAP when triggered, but not too often) 
instead of the delay.
- 
[{{updateCDCOverflowSize}}|https://github.com/apache/cassandra/compare/trunk...josh-mckenzie:8844_review#diff-878dc31866184d5ef750ccd9befc8382R227]:
 use {{while (!reCalculating.compareAndSet(false, true)) {};}}. You should 
reset the value afterwards.
- I don't get the {{DirectorySizeCalculator}}. Why the {{alive}} and 
{{visited}} sets, the {{listFiles}} step? Either list the files and just loop 
through them, or do the {{walkFileTree}} operation -- you are now doing the 
same work twice. Use a plain long instead of the atomic as the class is still 
thread-unsafe.
- {{CDCSizeCalculator.calculateSize}} should return the size, and maybe made 
synchronized for a bit of additional safety.
- [Scrubber 
change|https://github.com/apache/cassandra/compare/trunk...josh-mckenzie:8844_review#diff-30afe7671ae9073cb81bb7c364d37f3fR327]
 should be reverted.
- "Permissible" changed to "permissable" at some places in the code; the latter 
is a misspelling.
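The CAS-guard pattern suggested in the points above (spin on {{compareAndSet}} to win the flag, then always reset it in a {{finally}} block so an exception cannot leave the flag stuck) can be sketched like this. The class name {{SizeTracker}} and the plain field update are illustrative stand-ins, not Cassandra's actual CDC code.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative sketch of the guarded-recalculation pattern discussed above.
// SizeTracker and updateSize are hypothetical names, not Cassandra's code.
public class SizeTracker
{
    private final AtomicBoolean recalculating = new AtomicBoolean(false);
    private volatile long size = 0;

    public void updateSize(long newSize)
    {
        // Spin until we win the CAS, as the review comment suggests.
        while (!recalculating.compareAndSet(false, true)) {}
        try
        {
            size = newSize; // stand-in for the real directory-size walk
        }
        finally
        {
            // Always reset, even if the update throws, so later callers
            // are never locked out by a stuck flag.
            recalculating.set(false);
        }
    }

    public long size()
    {
        return size;
    }

    public static void main(String[] args)
    {
        SizeTracker t = new SizeTracker();
        t.updateSize(42L);
        System.out.println(t.size()); // prints 42
    }
}
```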

> Change Data Capture (CDC)
> -
>
> Key: CASSANDRA-8844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination, Local Write-Read Paths
>Reporter: Tupshin Harper
>Assignee: Joshua McKenzie
>Priority: Critical
> Fix For: 3.x
>
>
> "In databases, change data capture (CDC) is a set of software design patterns 
> used to determine (and track) the data that has changed so that action can be 
> taken using the changed data. Also, Change data capture (CDC) is an approach 
> to data integration that is based on the identification, capture and delivery 
> of the changes made to enterprise data sources."
> -Wikipedia
> As Cassandra is increasingly being used as the Source of Record (SoR) for 
> mission critical data in large enterprises, it is increasingly being called 
> upon to act as the central hub of traffic and data flow to other systems. In 
> order to try to address the general need, we (cc [~brianmhess]), propose 
> implementing a simple data logging mechanism to enable per-table CDC patterns.
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its 
> Consistency Level semantics, and in order to treat it as the single 
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable 
> (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the 

[jira] [Commented] (CASSANDRA-11562) "Could not retrieve endpoint ranges" for sstableloader

2016-04-14 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241177#comment-15241177
 ] 

Yuki Morishita commented on CASSANDRA-11562:


Can you try sstableloader from 2.1.13?
This should be fixed by CASSANDRA-10700.

> "Could not retrieve endpoint ranges" for sstableloader
> --
>
> Key: CASSANDRA-11562
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11562
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: $ uname -a
> Linux bigdb-100 3.2.0-99-virtual #139-Ubuntu SMP Mon Feb 1 23:52:21 UTC 2016 
> x86_64 x86_64 x86_64 GNU/Linux
> I am using Datastax Enterprise 4.7.5-1 which is based on 2.1.11.
>Reporter: Jens Rantil
>
> I am setting up a second datacenter and have a very slow and shaky VPN 
> connection to my old datacenter. To speed up import process I am trying to 
> seed the new datacenter with a backup (that has been transferred encrypted 
> out of bands from the VPN). When this is done I will issue a final 
> clusterwide repair.
> However...sstableloader crashes with the following:
> {noformat}
> sstableloader -v --nodes XXX --username MYUSERNAME --password MYPASSWORD 
> --ignore YYY,ZZZ ./backupdir/MYKEYSPACE/MYTABLE/
> Could not retrieve endpoint ranges:
> java.lang.IllegalArgumentException
> java.lang.RuntimeException: Could not retrieve endpoint ranges:
> at 
> org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:338)
> at 
> org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:156)
> at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:106)
> Caused by: java.lang.IllegalArgumentException
> at java.nio.Buffer.limit(Buffer.java:267)
> at 
> org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:543)
> at 
> org.apache.cassandra.serializers.CollectionSerializer.readValue(CollectionSerializer.java:124)
> at 
> org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:101)
> at 
> org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:30)
> at 
> org.apache.cassandra.serializers.CollectionSerializer.deserialize(CollectionSerializer.java:50)
> at 
> org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:68)
> at 
> org.apache.cassandra.cql3.UntypedResultSet$Row.getMap(UntypedResultSet.java:287)
> at 
> org.apache.cassandra.config.CFMetaData.fromSchemaNoTriggers(CFMetaData.java:1833)
> at 
> org.apache.cassandra.config.CFMetaData.fromThriftCqlRow(CFMetaData.java:1126)
> at 
> org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:330)
> ... 2 more
> {noformat}
> (where YYY,ZZZ are nodes in the old DC)
> The files in ./backupdir/MYKEYSPACE/MYTABLE/ are an exact copy of a snapshot 
> from the older datacenter that has been taken with the exact same version of 
> Datastax Enterprise/Cassandra. The backup was taken 2-3 days ago.
> Question: ./backupdir/MYKEYSPACE/MYTABLE/ contains the non-"*.db" file  
> "manifest.json". Is that an issue?
> My workaround for my quest will probably be to copy the snapshot directories 
> out to the nodes of the new datacenter and do a DC-local repair+cleanup.
> Let me know if I can assist in debugging this further.
> References:
>  * This _might_ be a duplicate of 
> https://issues.apache.org/jira/browse/CASSANDRA-10629.
>  * http://stackoverflow.com/q/34757922/260805. 
> http://stackoverflow.com/a/35213418/260805 claims this could happen when 
> dropping a column, but don't think I've dropped any column for this column 
> ever.
>  * http://stackoverflow.com/q/28632555/260805
>  * http://stackoverflow.com/q/34487567/260805



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-11572) SStableloader does not stream data if the Cassandra table was altered to drop some column

2016-04-14 Thread Yuki Morishita (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita resolved CASSANDRA-11572.

Resolution: Duplicate

This should be fixed in 2.1.13 by CASSANDRA-10700.

> SStableloader does not stream data if the Cassandra table was altered to drop 
> some column
> -
>
> Key: CASSANDRA-11572
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11572
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: manuj singh
>
> sstableloader stops working whenever the Cassandra table is altered to drop 
> a column. 
> the following error shows:
> Error:
> Could not retrieve endpoint ranges:
> java.lang.IllegalArgumentException
> java.lang.RuntimeException: Could not retrieve endpoint ranges:
> at 
> org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:338)
> at 
> org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:156)
> at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:106)
> Caused by: java.lang.IllegalArgumentException
> at java.nio.Buffer.limit(Buffer.java:275)
> at 
> org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:543)
> at 
> org.apache.cassandra.serializers.CollectionSerializer.readValue(CollectionSerializer.java:124)
> at 
> org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:101)
> at 
> org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:30)
> at 
> org.apache.cassandra.serializers.CollectionSerializer.deserialize(CollectionSerializer.java:50)
> at 
> org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:68)
> at 
> org.apache.cassandra.cql3.UntypedResultSet$Row.getMap(UntypedResultSet.java:287)
> at 
> org.apache.cassandra.config.CFMetaData.fromSchemaNoTriggers(CFMetaData.java:1833)
> at 
> org.apache.cassandra.config.CFMetaData.fromThriftCqlRow(CFMetaData.java:1126)
> at 
> org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:330)
> ... 2 more
> The only solution is then to drop the table and create it again. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-9625) GraphiteReporter not reporting

2016-04-14 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241167#comment-15241167
 ] 

T Jake Luciani edited comment on CASSANDRA-9625 at 4/14/16 1:41 PM:


[~ruoranwang] thanks for all the info.  This seems to be some kind of 
compaction bug which is affecting the graphite reporter.  Can you reproduce 
this? How are you calling repair? Can you attach logs from the node where this 
happens?


was (Author: tjake):
[~ruoranwang] thanks for all the info.  This seems to be some kind of 
compaction bug which is affecting the graphite reporter.  Can you reproduce 
this? How are you calling repair?

> GraphiteReporter not reporting
> --
>
> Key: CASSANDRA-9625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9625
> Project: Cassandra
>  Issue Type: Bug
> Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3
>Reporter: Eric Evans
>Assignee: T Jake Luciani
> Attachments: Screen Shot 2016-04-13 at 10.40.58 AM.png, metrics.yaml, 
> thread-dump.log
>
>
> When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops 
> working.  The usual startup is logged, and one batch of samples is sent, but 
> the reporting interval comes and goes, and no other samples are ever sent.  
> The logs are free from errors.
> Frustratingly, metrics reporting works in our smaller (staging) environment 
> on 2.1.6; We are able to reproduce this on all 6 of production nodes, but not 
> on a 3 node (otherwise identical) staging cluster (maybe it takes a certain 
> level of concurrency?).
> Attached is a thread dump, and our metrics.yaml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11258) Repair scheduling - Resource locking API

2016-04-14 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-11258:

Reviewer: Paulo Motta

> Repair scheduling - Resource locking API
> 
>
> Key: CASSANDRA-11258
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11258
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Marcus Olsson
>Assignee: Marcus Olsson
>Priority: Minor
>
> Create a resource locking API & implementation that is able to lock a 
> resource in a specified data center. It should handle priorities to avoid 
> node starvation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9625) GraphiteReporter not reporting

2016-04-14 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241167#comment-15241167
 ] 

T Jake Luciani commented on CASSANDRA-9625:
---

[~ruoranwang] thanks for all the info.  This seems to be some kind of 
compaction bug which is affecting the graphite reporter.  Can you reproduce 
this? How are you calling repair?

> GraphiteReporter not reporting
> --
>
> Key: CASSANDRA-9625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9625
> Project: Cassandra
>  Issue Type: Bug
> Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3
>Reporter: Eric Evans
>Assignee: T Jake Luciani
> Attachments: Screen Shot 2016-04-13 at 10.40.58 AM.png, metrics.yaml, 
> thread-dump.log
>
>
> When upgrading from 2.1.3 to 2.1.6, the Graphite metrics reporter stops 
> working.  The usual startup is logged, and one batch of samples is sent, but 
> the reporting interval comes and goes, and no other samples are ever sent.  
> The logs are free from errors.
> Frustratingly, metrics reporting works in our smaller (staging) environment 
> on 2.1.6; We are able to reproduce this on all 6 of production nodes, but not 
> on a 3 node (otherwise identical) staging cluster (maybe it takes a certain 
> level of concurrency?).
> Attached is a thread dump, and our metrics.yaml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11522) batch_size_fail_threshold_in_kb shouldn't only apply to batch

2016-04-14 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241123#comment-15241123
 ] 

Paulo Motta commented on CASSANDRA-11522:
-

Perhaps we should rename the properties to 
{{multi_partition_batch_size_warn_threshold}} and 
{{multi_partition_batch_size_fail_threshold}} to avoid confusion?
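Under the renaming proposed above, the cassandra.yaml entries might look like the following. The names are taken verbatim from the suggestion in this comment, and the values are illustrative only; this is not an accepted configuration change.

```yaml
# Hypothetical renamed options per the suggestion above (values illustrative):
multi_partition_batch_size_warn_threshold: 5
multi_partition_batch_size_fail_threshold: 50
```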

> batch_size_fail_threshold_in_kb shouldn't only apply to batch
> -
>
> Key: CASSANDRA-11522
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11522
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Giampaolo
>Priority: Minor
>  Labels: lhf
>
> I can buy that C* is not good at dealing with large (in bytes) inserts and 
> that it makes sense to provide a user configurable protection against inserts 
> larger than a certain size, but it doesn't make sense to limit this to 
> batches. It's absolutely possible to insert a single very large row and 
> internally a batch with a single statement is exactly the same as a single 
> similar insert, so rejecting the former and not the latter is confusing and 
> well, wrong.
> Note that I get that batches are more likely to get big and that's where the 
> protection is most often useful, but limiting the option to batch is still 
> less useful (it's a hole in the protection) and it's going to confuse users 
> in thinking that batches to a single partition are different from single 
> inserts.
> Of course that also means that we should rename that option to 
> {{write_size_fail_threshold_in_kb}}. Which means we probably want to add this 
> new option and just deprecate {{batch_size_fail_threshold_in_kb}} for now 
> (with removal in 4.0).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11474) cqlsh: COPY FROM should use regular inserts for single statement batches

2016-04-14 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241121#comment-15241121
 ] 

Paulo Motta commented on CASSANDRA-11474:
-

LGTM, but given that CASSANDRA-10876 removed this limitation for single 
partition batches on trunk, special-casing single-mutation batches in COPY FROM 
doesn't bring much additional benefit, so I think we should include this 
only on 2.2 and 3.0 for code simplicity.

Could you provide a trunk patch without the single insert optimization, but 
only with the error report and empty chunk fixes?

> cqlsh: COPY FROM should use regular inserts for single statement batches
> 
>
> Key: CASSANDRA-11474
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11474
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Stefania
>Assignee: Stefania
>Priority: Minor
>  Labels: lhf
> Fix For: 2.2.x, 3.0.x, 3.x
>
>
> I haven't reproduced it with a test yet but, from code inspection, if CQL 
> rows are larger than {{batch_size_fail_threshold_in_kb}} and this parameter 
> cannot be changed, then data import will fail.
> Users can control the batch size by setting MAXBATCHSIZE.
> If a batch contains a single statement, there is no need to use a batch and 
> we should use normal inserts instead or, alternatively, we should skip the 
> batch size check for unlogged batches with only one statement.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11522) batch_size_fail_threshold_in_kb shouldn't only apply to batch

2016-04-14 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241105#comment-15241105
 ] 

Paulo Motta commented on CASSANDRA-11522:
-

I just noticed that CASSANDRA-10876 effectively removed this protection for 
single partition batches, given they do not have the same concerns as 
multi-partition batches (as discussed on CASSANDRA-8011). So I'm not sure we 
should introduce this limitation to single inserts.

> batch_size_fail_threshold_in_kb shouldn't only apply to batch
> -
>
> Key: CASSANDRA-11522
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11522
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Giampaolo
>Priority: Minor
>  Labels: lhf
>
> I can buy that C* is not good at dealing with large (in bytes) inserts and 
> that it makes sense to provide a user configurable protection against inserts 
> larger than a certain size, but it doesn't make sense to limit this to 
> batches. It's absolutely possible to insert a single very large row and 
> internally a batch with a single statement is exactly the same as a single 
> similar insert, so rejecting the former and not the latter is confusing and 
> well, wrong.
> Note that I get that batches are more likely to get big and that's where the 
> protection is most often useful, but limiting the option to batch is still 
> less useful (it's a hole in the protection) and it's going to confuse users 
> in thinking that batches to a single partition are different from single 
> inserts.
> Of course that also means that we should rename that option to 
> {{write_size_fail_threshold_in_kb}}. Which means we probably want to add this 
> new option and just deprecate {{batch_size_fail_threshold_in_kb}} for now 
> (with removal in 4.0).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10853) deb package migration to dh_python2

2016-04-14 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241099#comment-15241099
 ] 

Igor Galić commented on CASSANDRA-10853:


Ansible has already fixed this with {{dh-python | python-support}} in their 
{{Depends}} and {{Build-Depends}}: 
https://github.com/ansible/ansible/pull/15031/files

> deb package migration to dh_python2
> ---
>
> Key: CASSANDRA-10853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10853
> Project: Cassandra
>  Issue Type: Task
>  Components: Packaging
>Reporter: Michael Shuler
>Assignee: Michael Shuler
> Fix For: 3.0.x, 3.x
>
>
> I'm working on a deb job in jenkins, and I had forgotten to open a bug for 
> this. There is no urgent need, since {{python-support}} is in Jessie, but 
> this package is currently in transition to be removed.
> http://deb.li/dhs2p
> During deb build:
> {noformat}
> dh_pysupport: This program is deprecated, you should use dh_python2 instead. 
> Migration guide: http://deb.li/dhs2p
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11570) Concurrent execution of prepared statement returns invalid JSON as result

2016-04-14 Thread Alexander Ryabets (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Ryabets updated CASSANDRA-11570:
--
Description: 
When I use a prepared statement for async execution of multiple statements I get 
JSON with broken data. The keys are totally corrupted, while the values seem to 
be normal.

First I encountered this issue while performing stress testing of our 
project using a custom script. We are using the DataStax C++ driver and execute 
statements from different fibers.

Then I tried to isolate the problem and wrote a simple C# program which starts 
multiple Tasks in a loop. Each task uses the once-created prepared statement to 
read data from the database. As you can see, the results are a total mess.

I've attached an archive with a console C# project (a single .cs file) which 
just prints the resulting JSON. 
Here is the main part of the C# code.

{noformat}
static void Main(string[] args)
{
  const int task_count = 300;

  using(var cluster = Cluster.Builder().AddContactPoints(/*contact points 
here*/).Build())
  {
using(var session = cluster.Connect())
{
  var prepared = session.Prepare("select json * from test_neptunao.ubuntu 
where id=?");
  var tasks = new Task[task_count];
  for(int i = 0; i < task_count; i++)
  {
tasks[i] = Query(prepared, session);
  }
  Task.WaitAll(tasks);
}
  }
  Console.ReadKey();
}

private static Task Query(PreparedStatement prepared, ISession session)
{
  string id = GetIdOfRandomRow();
  var stmt = prepared.Bind(id);
  stmt.SetConsistencyLevel(ConsistencyLevel.One);
  return session.ExecuteAsync(stmt).ContinueWith(tr =>
  {
foreach(var row in tr.Result)
{
  var value = row.GetValue<string>(0);
  //some kind of output
}
  });
}
{noformat}

I also attached a CQL script with the test DB schema.

{noformat}
CREATE KEYSPACE IF NOT EXISTS test_neptunao
WITH replication = {
'class' : 'SimpleStrategy',
'replication_factor' : 3
};

use test_neptunao;

create table if not exists ubuntu (
id timeuuid PRIMARY KEY,
precise_pangolin text,
trusty_tahr text,
wily_werewolf text, 
vivid_vervet text,
saucy_salamander text,
lucid_lynx text
);
{noformat}

  was:
When I use a prepared statement for async execution of multiple statements I get 
JSON with broken data. The keys get totally corrupted while the values seem to 
be normal.

I first encountered this issue while performing stress testing of our project 
with a custom script. We are using the DataStax C++ driver and execute 
statements from different fibers.

I then tried to isolate the problem and wrote a simple C# program which starts 
multiple Tasks in a loop. Each task uses the prepared statement, created once, 
to read data from the database. As you can see, the results are a total mess.

I've attached an archive with a console C# project (one .cs file) which just 
prints the resulting JSON.
Here is the main part of the C# code.

{noformat}
static void Main(string[] args)
{
  const int task_count = 300;

  using(var cluster = Cluster.Builder().AddContactPoints("127.0.0.1").Build())
  {
using(var session = cluster.Connect())
{
      var prepared = session.Prepare("select json * from test_neptunao.ubuntu");
  var tasks = new Task[task_count];
  for(int i = 0; i < task_count; i++)
  {
tasks[i] = Query(prepared, session);
  }
  Task.WaitAll(tasks);
}
  }
  Console.ReadKey();
}

private static Task Query(PreparedStatement prepared, ISession session)
{
  var stmt = prepared.Bind();
  stmt.SetConsistencyLevel(ConsistencyLevel.One);
  return session.ExecuteAsync(stmt).ContinueWith(tr =>
  {
foreach(var row in tr.Result)
{
  var value = row.GetValue<string>(0);
  Console.WriteLine(value);
}
  });
}
{noformat}

I also attached a CQL script with the test DB schema.

{noformat}
CREATE KEYSPACE IF NOT EXISTS test_neptunao
WITH replication = {
'class' : 'SimpleStrategy',
'replication_factor' : 3
};

use test_neptunao;

create table if not exists ubuntu (
id timeuuid PRIMARY KEY,
precise_pangolin text,
trusty_tahr text,
wily_werewolf text, 
vivid_vervet text,
saucy_salamander text,
lucid_lynx text
);
{noformat}


> Concurrent execution of prepared statement returns invalid JSON as result
> -
>
> Key: CASSANDRA-11570
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11570
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.2, C++ or C# driver
>Reporter: Alexander Ryabets
> Attachments: CassandraPreparedStatementsTest.zip, 

[jira] [Commented] (CASSANDRA-11522) batch_size_fail_threshold_in_kb shouldn't only apply to batch

2016-04-14 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241042#comment-15241042
 ] 

Paulo Motta commented on CASSANDRA-11522:
-

Yes, since this is an improvement, it normally goes in trunk.

> batch_size_fail_threshold_in_kb shouldn't only apply to batch
> -
>
> Key: CASSANDRA-11522
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11522
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Giampaolo
>Priority: Minor
>  Labels: lhf
>
> I can buy that C* is not good at dealing with large (in bytes) inserts and 
> that it makes sense to provide a user-configurable protection against inserts 
> larger than a certain size, but it doesn't make sense to limit this to 
> batches. It's absolutely possible to insert a single very large row, and 
> internally a batch with a single statement is exactly the same as a single 
> similar insert, so rejecting the former and not the latter is confusing and, 
> well, wrong.
> Note that I get that batches are more likely to get big and that's where the 
> protection is most often useful, but limiting the option to batches is still 
> less useful (it's a hole in the protection) and it's going to confuse users 
> into thinking that batches to a single partition are different from single 
> inserts.
> Of course that also means that we should rename that option to 
> {{write_size_fail_threshold_in_kb}}. Which means we probably want to add this 
> new option and just deprecate {{batch_size_fail_threshold_in_kb}} for now 
> (with removal in 4.0).
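Concretely, the proposal above would leave cassandra.yaml looking roughly like this (a sketch of the suggested end state, not committed configuration; the value shown is only the current default of the batch option):

{noformat}
# Deprecated alias, to be removed in 4.0:
batch_size_fail_threshold_in_kb: 50

# Proposed replacement, applied to every write, batched or not:
write_size_fail_threshold_in_kb: 50
{noformat}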



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11452) Cache implementation using LIRS eviction for in-process page cache

2016-04-14 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15241025#comment-15241025
 ] 

Benedict commented on CASSANDRA-11452:
--

Also, just to clarify, I'm not proposing a random _eviction_, just a random 
selection of who to compare against for _admission_ - the eviction candidate 
would still be the LRU.  Thus the collision would always be removed within a 
short number of steps after reaching the LRU spot, and ordinarily rapidly after.

It's also worth noting that an RNF whose average walk distance was only a little 
larger than 1 (so that it usually compared against the eviction candidate) 
would more than suffice: if the chance of each distance were 1/4 of the prior 
distance, the average walk length would be only 1.33, yet it would still take 
only a few comparisons for the eviction to unblock, and a few more for multiple 
such collisions to be resolved.
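The 1.33 figure can be sanity-checked with a quick simulation of such a walk (a sketch; the function names and the geometric stopping rule are my illustration of the comment above, not code from the patch):

```python
import random

def walk_length(p_continue=0.25, rng=random.random):
    """Number of steps in a walk that starts at the eviction candidate and,
    with probability p_continue, moves one entry further on each step."""
    steps = 1
    while rng() < p_continue:
        steps += 1
    return steps

def average_walk_length(p_continue=0.25, trials=200_000, seed=42):
    """Estimate the mean walk length over many sampled walks."""
    random.seed(seed)
    return sum(walk_length(p_continue) for _ in range(trials)) / trials

# Each distance is 1/4 as likely as the previous one, so the mean is
# 1 / (1 - 1/4) = 4/3, i.e. roughly the 1.33 quoted above.
```

With p_continue = 1/4 the simulated mean comes out near 1.33, so such a walk almost always compares against the eviction candidate itself.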

> Cache implementation using LIRS eviction for in-process page cache
> --
>
> Key: CASSANDRA-11452
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11452
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>
> Following up from CASSANDRA-5863, to make best use of caching and to avoid 
> having to explicitly marking compaction accesses as non-cacheable, we need a 
> cache implementation that uses an eviction algorithm that can better handle 
> non-recurring accesses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11339) WHERE clause in SELECT DISTINCT can be ignored

2016-04-14 Thread Benjamin Lerer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-11339:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed in 2.2 at 69edeaa46b78bb168f7e9d0b1c991c07b90f41ca.
Committed in 3.0 at 6ad874509d6c7edd53bb3a4b897477d6a2753c19 and merged into 
trunk.

> WHERE clause in SELECT DISTINCT can be ignored
> --
>
> Key: CASSANDRA-11339
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11339
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Philip Thompson
>Assignee: Alex Petrov
> Fix For: 2.2.x, 3.x
>
> Attachments: 
> 0001-Add-validation-for-distinct-queries-disallowing-quer.patch
>
>
> I've tested this out on 2.1-head. I'm not sure if it's the same behavior on 
> newer versions.
> For a given table t, with {{PRIMARY KEY (id, v)}} the following two queries 
> return the same result:
> {{SELECT DISTINCT id FROM t WHERE v > X ALLOW FILTERING}}
> {{SELECT DISTINCT id FROM t}}
> The WHERE clause in the former is silently ignored, and all {{id}} values are 
> returned, regardless of the value of {{v}} in any row.
> It seems like this has been a known issue for a while:
> http://stackoverflow.com/questions/26548788/select-distinct-cql-ignores-where-clause
> However, if we don't support filtering on anything but the partition key, we 
> should reject the query rather than silently dropping the WHERE clause.
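With the validation added by the attached patch, the first query is rejected instead of having its WHERE clause dropped (error text taken from the patch; the table and value are illustrative):

{noformat}
cqlsh> SELECT DISTINCT id FROM t WHERE v > 0 ALLOW FILTERING;
InvalidRequest: SELECT DISTINCT with WHERE clause only supports restriction by partition key.
{noformat}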



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11452) Cache implementation using LIRS eviction for in-process page cache

2016-04-14 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240956#comment-15240956
 ] 

Branimir Lambov commented on CASSANDRA-11452:
-

I'm sorry, I don't see how this helps. Once both the hot key and its collision 
are in the main area (this check is not enough to guarantee that won't happen, 
though it probably manages to do so for this specific test), this path is no 
longer triggered.

I think we should be looking for a way to eject an offending entry after it has 
entered. An ideal test would verify that CLASH is among the cached keys 
before "// Now run a repeating sequence ...", but no longer there after the 
loop has finished.

Did you run traces with candidate preference on equality? Is it still bad?
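As a concrete illustration of the mechanism being debated here (frequency-based admission compared against a randomly walk-selected resident, with eviction still at the LRU end), here is a toy sketch; the class, names, and policy are my reading of the discussion, not Cassandra's or Caffeine's actual implementation:

```python
import random
from collections import OrderedDict

class AdmissionLRU:
    """Toy LRU cache: on a miss with a full cache, the new key is admitted
    only if its access count beats that of a resident chosen by a short
    random walk from the LRU end; the evicted entry is still the LRU."""

    def __init__(self, capacity, p_step=0.25, rng=None):
        self.capacity = capacity
        self.p_step = p_step        # chance the walk takes one more step
        self.rng = rng or random.Random(0)
        self.data = OrderedDict()   # key -> None, front of order = LRU
        self.freq = {}              # rough access counts for all seen keys

    def _pick_comparison(self):
        """Walk from the eviction candidate toward hotter entries."""
        keys = list(self.data)      # index 0 is the eviction candidate
        i = 0
        while i + 1 < len(keys) and self.rng.random() < self.p_step:
            i += 1
        return keys[i]

    def access(self, key):
        """Record an access; return True on a cache hit."""
        self.freq[key] = self.freq.get(key, 0) + 1
        if key in self.data:
            self.data.move_to_end(key)      # hit: mark most-recently-used
            return True
        if len(self.data) < self.capacity:
            self.data[key] = None
            return False
        rival = self._pick_comparison()
        if self.freq[key] > self.freq[rival]:
            self.data.popitem(last=False)   # evict the actual LRU entry
            self.data[key] = None           # admit the newcomer
        return False
```

In this toy version a colliding newcomer is only admitted once its count beats the walk-selected resident, so an entry parked at the LRU spot is displaced within a few comparisons rather than blocking eviction indefinitely.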

> Cache implementation using LIRS eviction for in-process page cache
> --
>
> Key: CASSANDRA-11452
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11452
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>
> Following up from CASSANDRA-5863, to make best use of caching and to avoid 
> having to explicitly marking compaction accesses as non-cacheable, we need a 
> cache implementation that uses an eviction algorithm that can better handle 
> non-recurring accesses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/4] cassandra git commit: Allow only DISTINCT queries with partition keys restrictions

2016-04-14 Thread blerer
Repository: cassandra
Updated Branches:
  refs/heads/trunk 9a0eb9a31 -> ccacf7d1a


Allow only DISTINCT queries with partition keys restrictions

patch by Alex Petrov; reviewed by Benjamin Lerer for CASSANDRA-11339


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/69edeaa4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/69edeaa4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/69edeaa4

Branch: refs/heads/trunk
Commit: 69edeaa46b78bb168f7e9d0b1c991c07b90f41ca
Parents: 19b4b63
Author: Alex Petrov 
Authored: Thu Apr 14 12:26:52 2016 +0200
Committer: Benjamin Lerer 
Committed: Thu Apr 14 12:26:52 2016 +0200

--
 CHANGES.txt |  1 +
 .../restrictions/StatementRestrictions.java |  9 
 .../cql3/statements/SelectStatement.java|  3 ++
 .../cql3/validation/operations/SelectTest.java  | 45 
 4 files changed, 58 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/69edeaa4/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 54013a3..c72b6cb 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.2.6
+ * Allow only DISTINCT queries with partition keys restrictions 
(CASSANDRA-11339)
  * CqlConfigHelper no longer requires both a keystore and truststore to work 
(CASSANDRA-11532)
  * Make deprecated repair methods backward-compatible with previous 
notification service (CASSANDRA-11430)
  * IncomingStreamingConnection version check message wrong (CASSANDRA-11462)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/69edeaa4/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
--
diff --git 
a/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java 
b/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
index e0cf743..3934f33 100644
--- a/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
+++ b/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
@@ -279,6 +279,15 @@ public final class StatementRestrictions
 }
 
 /**
+     * Checks if the restrictions contain any non-primary key restrictions
+     * @return true if the restrictions contain any non-primary key restrictions, false otherwise.
+     */
+    public boolean hasNonPrimaryKeyRestrictions()
+    {
+        return !nonPrimaryKeyRestrictions.isEmpty();
+    }
+
+    /**
  * Returns the partition key components that are not restricted.
  * @return the partition key components that are not restricted.
  */

http://git-wip-us.apache.org/repos/asf/cassandra/blob/69edeaa4/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
--
diff --git a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java 
b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
index 291e3e4..7bba330 100644
--- a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
+++ b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
@@ -885,6 +885,9 @@ public class SelectStatement implements CQLStatement
                                                  StatementRestrictions restrictions)
                                                  throws InvalidRequestException
     {
+        checkFalse(restrictions.hasClusteringColumnsRestriction() || restrictions.hasNonPrimaryKeyRestrictions(),
+                   "SELECT DISTINCT with WHERE clause only supports restriction by partition key.");
+
         Collection<ColumnDefinition> requestedColumns = selection.getColumns();
         for (ColumnDefinition def : requestedColumns)
             checkFalse(!def.isPartitionKey() && !def.isStatic(),

http://git-wip-us.apache.org/repos/asf/cassandra/blob/69edeaa4/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java
--
diff --git 
a/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java 
b/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java
index d8cd3c3..d444fde 100644
--- a/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java
+++ b/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java
@@ -1253,6 +1253,51 @@ public class SelectTest extends CQLTester
 Assert.assertEquals(9, rows.length);
 }
 
+@Test
+public void testSelectDistinctWithWhereClause() throws Throwable {
+createTable("CREATE TABLE %s (k int, a int, b int, PRIMARY 

[4/4] cassandra git commit: Merge branch cassandra-3.0 into trunk

2016-04-14 Thread blerer
Merge branch cassandra-3.0 into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ccacf7d1
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ccacf7d1
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ccacf7d1

Branch: refs/heads/trunk
Commit: ccacf7d1a94875c2da10bacbe63f99b630030fdf
Parents: 9a0eb9a 6ad8745
Author: Benjamin Lerer 
Authored: Thu Apr 14 12:38:04 2016 +0200
Committer: Benjamin Lerer 
Committed: Thu Apr 14 12:38:04 2016 +0200

--
 CHANGES.txt |  1 +
 .../restrictions/StatementRestrictions.java |  9 +++
 .../cql3/statements/SelectStatement.java|  4 ++
 .../cql3/validation/operations/SelectTest.java  | 72 
 4 files changed, 86 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/ccacf7d1/CHANGES.txt
--
diff --cc CHANGES.txt
index 443c8bc,3b4d473..329e55c
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,48 -1,5 +1,49 @@@
 -3.0.6
 +3.6
 + * Fix PER PARTITION LIMIT for queries requiring post-query ordering 
(CASSANDRA-11556)
 + * Allow instantiation of UDTs and tuples in UDFs (CASSANDRA-10818)
 + * Support UDT in CQLSSTableWriter (CASSANDRA-10624)
 + * Support for non-frozen user-defined types, updating
 +   individual fields of user-defined types (CASSANDRA-7423)
 + * Make LZ4 compression level configurable (CASSANDRA-11051)
 + * Allow per-partition LIMIT clause in CQL (CASSANDRA-7017)
 + * Make custom filtering more extensible with UserExpression (CASSANDRA-11295)
 + * Improve field-checking and error reporting in cassandra.yaml 
(CASSANDRA-10649)
 + * Print CAS stats in nodetool proxyhistograms (CASSANDRA-11507)
 + * More user friendly error when providing an invalid token to nodetool 
(CASSANDRA-9348)
 + * Add static column support to SASI index (CASSANDRA-11183)
 + * Support EQ/PREFIX queries in SASI CONTAINS mode without tokenization 
(CASSANDRA-11434)
 + * Support LIKE operator in prepared statements (CASSANDRA-11456)
 + * Add a command to see if a Materialized View has finished building 
(CASSANDRA-9967)
 + * Log endpoint and port associated with streaming operation (CASSANDRA-8777)
 + * Print sensible units for all log messages (CASSANDRA-9692)
 + * Upgrade Netty to version 4.0.34 (CASSANDRA-11096)
 + * Break the CQL grammar into separate Parser and Lexer (CASSANDRA-11372)
 + * Compress only inter-dc traffic by default (CASSANDRA-)
 + * Add metrics to track write amplification (CASSANDRA-11420)
 + * cassandra-stress: cannot handle "value-less" tables (CASSANDRA-7739)
 + * Add/drop multiple columns in one ALTER TABLE statement (CASSANDRA-10411)
 + * Add require_endpoint_verification opt for internode encryption 
(CASSANDRA-9220)
 + * Add auto import java.util for UDF code block (CASSANDRA-11392)
 + * Add --hex-format option to nodetool getsstables (CASSANDRA-11337)
 + * sstablemetadata should print sstable min/max token (CASSANDRA-7159)
 + * Do not wrap CassandraException in TriggerExecutor (CASSANDRA-9421)
 + * COPY TO should have higher double precision (CASSANDRA-11255)
 + * Stress should exit with non-zero status after failure (CASSANDRA-10340)
 + * Add client to cqlsh SHOW_SESSION (CASSANDRA-8958)
 + * Fix nodetool tablestats keyspace level metrics (CASSANDRA-11226)
 + * Store repair options in parent_repair_history (CASSANDRA-11244)
 + * Print current leveling in sstableofflinerelevel (CASSANDRA-9588)
 + * Change repair message for keyspaces with RF 1 (CASSANDRA-11203)
 + * Remove hard-coded SSL cipher suites and protocols (CASSANDRA-10508)
 + * Improve concurrency in CompactionStrategyManager (CASSANDRA-10099)
 + * (cqlsh) interpret CQL type for formatting blobs (CASSANDRA-11274)
 + * Refuse to start and print txn log information in case of disk
 +   corruption (CASSANDRA-10112)
 + * Resolve some eclipse-warnings (CASSANDRA-11086)
 + * (cqlsh) Show static columns in a different color (CASSANDRA-11059)
 + * Allow to remove TTLs on table with default_time_to_live (CASSANDRA-11207)
 +Merged from 3.0:
+  * Allow only DISTINCT queries with partition keys or static columns 
restrictions (CASSANDRA-11339)
   * LogAwareFileLister should only use OLD sstable files in current folder to 
determine disk consistency (CASSANDRA-11470)
   * Notify indexers of expired rows during compaction (CASSANDRA-11329)
   * Properly respond with ProtocolError when a v1/v2 native protocol

http://git-wip-us.apache.org/repos/asf/cassandra/blob/ccacf7d1/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
--


[2/4] cassandra git commit: Merge branch cassandra-2.2 into cassandra-3.0

2016-04-14 Thread blerer
Merge branch cassandra-2.2 into cassandra-3.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0818e1b1
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0818e1b1
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0818e1b1

Branch: refs/heads/trunk
Commit: 0818e1b16af36adb2fbbd3dffacdccc2ecf60a9a
Parents: fd24b7c 69edeaa
Author: Benjamin Lerer 
Authored: Thu Apr 14 12:32:56 2016 +0200
Committer: Benjamin Lerer 
Committed: Thu Apr 14 12:33:05 2016 +0200

--

--




[3/4] cassandra git commit: Allow only DISTINCT queries with partition keys or static columns restrictions

2016-04-14 Thread blerer
Allow only DISTINCT queries with partition keys or static columns restrictions

patch by Alex Petrov; reviewed by Benjamin Lerer for CASSANDRA-11339


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6ad87450
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6ad87450
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6ad87450

Branch: refs/heads/trunk
Commit: 6ad874509d6c7edd53bb3a4b897477d6a2753c19
Parents: 0818e1b
Author: Alex Petrov 
Authored: Thu Apr 14 12:35:07 2016 +0200
Committer: Benjamin Lerer 
Committed: Thu Apr 14 12:35:07 2016 +0200

--
 CHANGES.txt |  1 +
 .../restrictions/StatementRestrictions.java |  9 +++
 .../cql3/statements/SelectStatement.java|  4 ++
 .../cql3/validation/operations/SelectTest.java  | 72 
 4 files changed, 86 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/6ad87450/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index ed4c412..3b4d473 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.6
+ * Allow only DISTINCT queries with partition keys or static columns 
restrictions (CASSANDRA-11339)
  * LogAwareFileLister should only use OLD sstable files in current folder to 
determine disk consistency (CASSANDRA-11470)
  * Notify indexers of expired rows during compaction (CASSANDRA-11329)
  * Properly respond with ProtocolError when a v1/v2 native protocol

http://git-wip-us.apache.org/repos/asf/cassandra/blob/6ad87450/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
--
diff --git 
a/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java 
b/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
index 797b8e4..763a7be 100644
--- a/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
+++ b/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
@@ -396,6 +396,15 @@ public final class StatementRestrictions
 }
 
 /**
+     * Checks if the restrictions contain any non-primary key restrictions
+     * @return true if the restrictions contain any non-primary key restrictions, false otherwise.
+     */
+    public boolean hasNonPrimaryKeyRestrictions()
+    {
+        return !nonPrimaryKeyRestrictions.isEmpty();
+    }
+
+    /**
  * Returns the partition key components that are not restricted.
  * @return the partition key components that are not restricted.
  */

http://git-wip-us.apache.org/repos/asf/cassandra/blob/6ad87450/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
--
diff --git a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java 
b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
index 51d675b..b4215ac 100644
--- a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
+++ b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
@@ -896,6 +896,10 @@ public class SelectStatement implements CQLStatement
                                                  StatementRestrictions restrictions)
                                                  throws InvalidRequestException
     {
+        checkFalse(restrictions.hasClusteringColumnsRestriction() ||
+                   (restrictions.hasNonPrimaryKeyRestrictions() && !restrictions.nonPKRestrictedColumns(true).stream().allMatch(ColumnDefinition::isStatic)),
+                   "SELECT DISTINCT with WHERE clause only supports restriction by partition key and/or static columns.");
+
         Collection<ColumnDefinition> requestedColumns = selection.getColumns();
         for (ColumnDefinition def : requestedColumns)
             checkFalse(!def.isPartitionKey() && !def.isStatic(),

http://git-wip-us.apache.org/repos/asf/cassandra/blob/6ad87450/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java
--
diff --git 
a/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java 
b/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java
index a7eeeb8..5c19e1b 100644
--- a/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java
+++ b/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java
@@ -1253,6 +1253,78 @@ public class SelectTest extends CQLTester
 Assert.assertEquals(9, rows.length);
 }
 
+@Test
+public void testSelectDistinctWithWhereClause() throws Throwable 

cassandra git commit: Allow only DISTINCT queries with partition keys or static columns restrictions

2016-04-14 Thread blerer
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 0818e1b16 -> 6ad874509


Allow only DISTINCT queries with partition keys or static columns restrictions

patch by Alex Petrov; reviewed by Benjamin Lerer for CASSANDRA-11339


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6ad87450
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6ad87450
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6ad87450

Branch: refs/heads/cassandra-3.0
Commit: 6ad874509d6c7edd53bb3a4b897477d6a2753c19
Parents: 0818e1b
Author: Alex Petrov 
Authored: Thu Apr 14 12:35:07 2016 +0200
Committer: Benjamin Lerer 
Committed: Thu Apr 14 12:35:07 2016 +0200

--
 CHANGES.txt |  1 +
 .../restrictions/StatementRestrictions.java |  9 +++
 .../cql3/statements/SelectStatement.java|  4 ++
 .../cql3/validation/operations/SelectTest.java  | 72 
 4 files changed, 86 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/6ad87450/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index ed4c412..3b4d473 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.6
+ * Allow only DISTINCT queries with partition keys or static columns 
restrictions (CASSANDRA-11339)
  * LogAwareFileLister should only use OLD sstable files in current folder to 
determine disk consistency (CASSANDRA-11470)
  * Notify indexers of expired rows during compaction (CASSANDRA-11329)
  * Properly respond with ProtocolError when a v1/v2 native protocol

http://git-wip-us.apache.org/repos/asf/cassandra/blob/6ad87450/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
--
diff --git 
a/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java 
b/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
index 797b8e4..763a7be 100644
--- a/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
+++ b/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
@@ -396,6 +396,15 @@ public final class StatementRestrictions
 }
 
 /**
+     * Checks if the restrictions contain any non-primary key restrictions
+     * @return true if the restrictions contain any non-primary key restrictions, false otherwise.
+     */
+    public boolean hasNonPrimaryKeyRestrictions()
+    {
+        return !nonPrimaryKeyRestrictions.isEmpty();
+    }
+
+    /**
  * Returns the partition key components that are not restricted.
  * @return the partition key components that are not restricted.
  */

http://git-wip-us.apache.org/repos/asf/cassandra/blob/6ad87450/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
--
diff --git a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java 
b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
index 51d675b..b4215ac 100644
--- a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
+++ b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
@@ -896,6 +896,10 @@ public class SelectStatement implements CQLStatement
                                                  StatementRestrictions restrictions)
                                                  throws InvalidRequestException
     {
+        checkFalse(restrictions.hasClusteringColumnsRestriction() ||
+                   (restrictions.hasNonPrimaryKeyRestrictions() && !restrictions.nonPKRestrictedColumns(true).stream().allMatch(ColumnDefinition::isStatic)),
+                   "SELECT DISTINCT with WHERE clause only supports restriction by partition key and/or static columns.");
+
         Collection<ColumnDefinition> requestedColumns = selection.getColumns();
         for (ColumnDefinition def : requestedColumns)
             checkFalse(!def.isPartitionKey() && !def.isStatic(),

http://git-wip-us.apache.org/repos/asf/cassandra/blob/6ad87450/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java
--
diff --git 
a/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java 
b/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java
index a7eeeb8..5c19e1b 100644
--- a/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java
+++ b/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java
@@ -1253,6 +1253,78 @@ public class SelectTest extends CQLTester
 Assert.assertEquals(9, 

[1/2] cassandra git commit: Allow only DISTINCT queries with partition keys restrictions

2016-04-14 Thread blerer
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 fd24b7c0d -> 0818e1b16


Allow only DISTINCT queries with partition keys restrictions

patch by Alex Petrov; reviewed by Benjamin Lerer for CASSANDRA-11339


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/69edeaa4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/69edeaa4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/69edeaa4

Branch: refs/heads/cassandra-3.0
Commit: 69edeaa46b78bb168f7e9d0b1c991c07b90f41ca
Parents: 19b4b63
Author: Alex Petrov 
Authored: Thu Apr 14 12:26:52 2016 +0200
Committer: Benjamin Lerer 
Committed: Thu Apr 14 12:26:52 2016 +0200

--
 CHANGES.txt |  1 +
 .../restrictions/StatementRestrictions.java |  9 
 .../cql3/statements/SelectStatement.java|  3 ++
 .../cql3/validation/operations/SelectTest.java  | 45 
 4 files changed, 58 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/69edeaa4/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 54013a3..c72b6cb 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.2.6
+ * Allow only DISTINCT queries with partition keys restrictions 
(CASSANDRA-11339)
  * CqlConfigHelper no longer requires both a keystore and truststore to work 
(CASSANDRA-11532)
  * Make deprecated repair methods backward-compatible with previous 
notification service (CASSANDRA-11430)
  * IncomingStreamingConnection version check message wrong (CASSANDRA-11462)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/69edeaa4/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
--
diff --git 
a/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java 
b/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
index e0cf743..3934f33 100644
--- a/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
+++ b/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
@@ -279,6 +279,15 @@ public final class StatementRestrictions
 }
 
 /**
+     * Checks if the restrictions contain any non-primary key restrictions
+     * @return true if the restrictions contain any non-primary key restrictions, false otherwise.
+     */
+    public boolean hasNonPrimaryKeyRestrictions()
+    {
+        return !nonPrimaryKeyRestrictions.isEmpty();
+    }
+
+    /**
  * Returns the partition key components that are not restricted.
  * @return the partition key components that are not restricted.
  */

http://git-wip-us.apache.org/repos/asf/cassandra/blob/69edeaa4/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
--
diff --git a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java 
b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
index 291e3e4..7bba330 100644
--- a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
+++ b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
@@ -885,6 +885,9 @@ public class SelectStatement implements CQLStatement
                                                  StatementRestrictions restrictions)
                                                  throws InvalidRequestException
     {
+        checkFalse(restrictions.hasClusteringColumnsRestriction() || restrictions.hasNonPrimaryKeyRestrictions(),
+                   "SELECT DISTINCT with WHERE clause only supports restriction by partition key.");
+
         Collection<ColumnDefinition> requestedColumns = selection.getColumns();
         for (ColumnDefinition def : requestedColumns)
             checkFalse(!def.isPartitionKey() && !def.isStatic(),

http://git-wip-us.apache.org/repos/asf/cassandra/blob/69edeaa4/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java
--
diff --git 
a/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java 
b/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java
index d8cd3c3..d444fde 100644
--- a/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java
+++ b/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java
@@ -1253,6 +1253,51 @@ public class SelectTest extends CQLTester
 Assert.assertEquals(9, rows.length);
 }
 
+    @Test
+    public void testSelectDistinctWithWhereClause() throws Throwable {
+        createTable("CREATE TABLE %s (k int, a int, b

[2/2] cassandra git commit: Merge branch cassandra-2.2 into cassandra-3.0

2016-04-14 Thread blerer
Merge branch cassandra-2.2 into cassandra-3.0


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0818e1b1
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0818e1b1
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0818e1b1

Branch: refs/heads/cassandra-3.0
Commit: 0818e1b16af36adb2fbbd3dffacdccc2ecf60a9a
Parents: fd24b7c 69edeaa
Author: Benjamin Lerer 
Authored: Thu Apr 14 12:32:56 2016 +0200
Committer: Benjamin Lerer 
Committed: Thu Apr 14 12:33:05 2016 +0200

--

--




[jira] [Commented] (CASSANDRA-11339) WHERE clause in SELECT DISTINCT can be ignored

2016-04-14 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240952#comment-15240952
 ] 

Benjamin Lerer commented on CASSANDRA-11339:


+1
Thanks for the patch.

> WHERE clause in SELECT DISTINCT can be ignored
> --
>
> Key: CASSANDRA-11339
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11339
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Philip Thompson
>Assignee: Alex Petrov
> Fix For: 2.2.x, 3.x
>
> Attachments: 
> 0001-Add-validation-for-distinct-queries-disallowing-quer.patch
>
>
> I've tested this out on 2.1-head. I'm not sure if it's the same behavior on 
> newer versions.
> For a given table t, with {{PRIMARY KEY (id, v)}} the following two queries 
> return the same result:
> {{SELECT DISTINCT id FROM t WHERE v > X ALLOW FILTERING}}
> {{SELECT DISTINCT id FROM t}}
> The WHERE clause in the former is silently ignored, and all id are returned, 
> regardless of the value of v in any row. 
> It seems like this has been a known issue for a while:
> http://stackoverflow.com/questions/26548788/select-distinct-cql-ignores-where-clause
> However, if we don't support filtering on anything but the partition key, we 
> should reject the query, rather than silently dropping the where clause



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


cassandra git commit: Allow only DISTINCT queries with partition keys restrictions

2016-04-14 Thread blerer
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.2 19b4b637a -> 69edeaa46


Allow only DISTINCT queries with partition keys restrictions

patch by Alex Petrov; reviewed by Benjamin Lerer for CASSANDRA-11339


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/69edeaa4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/69edeaa4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/69edeaa4

Branch: refs/heads/cassandra-2.2
Commit: 69edeaa46b78bb168f7e9d0b1c991c07b90f41ca
Parents: 19b4b63
Author: Alex Petrov 
Authored: Thu Apr 14 12:26:52 2016 +0200
Committer: Benjamin Lerer 
Committed: Thu Apr 14 12:26:52 2016 +0200

--
 CHANGES.txt |  1 +
 .../restrictions/StatementRestrictions.java |  9 
 .../cql3/statements/SelectStatement.java|  3 ++
 .../cql3/validation/operations/SelectTest.java  | 45 
 4 files changed, 58 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/69edeaa4/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 54013a3..c72b6cb 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.2.6
+ * Allow only DISTINCT queries with partition keys restrictions (CASSANDRA-11339)
  * CqlConfigHelper no longer requires both a keystore and truststore to work (CASSANDRA-11532)
  * Make deprecated repair methods backward-compatible with previous notification service (CASSANDRA-11430)
  * IncomingStreamingConnection version check message wrong (CASSANDRA-11462)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/69edeaa4/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
--
diff --git a/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java b/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
index e0cf743..3934f33 100644
--- a/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
+++ b/src/java/org/apache/cassandra/cql3/restrictions/StatementRestrictions.java
@@ -279,6 +279,15 @@ public final class StatementRestrictions
 }
 
 /**
+ * Checks if the restrictions contain any non-primary key restrictions.
+ * @return true if the restrictions contain any non-primary key restrictions, false otherwise.
+ */
+    public boolean hasNonPrimaryKeyRestrictions()
+    {
+        return !nonPrimaryKeyRestrictions.isEmpty();
+    }
+
+/**
  * Returns the partition key components that are not restricted.
  * @return the partition key components that are not restricted.
  */

http://git-wip-us.apache.org/repos/asf/cassandra/blob/69edeaa4/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
--
diff --git a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
index 291e3e4..7bba330 100644
--- a/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
+++ b/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java
@@ -885,6 +885,9 @@ public class SelectStatement implements CQLStatement
                                                   StatementRestrictions restrictions)
                                                   throws InvalidRequestException
     {
+        checkFalse(restrictions.hasClusteringColumnsRestriction() || restrictions.hasNonPrimaryKeyRestrictions(),
+                   "SELECT DISTINCT with WHERE clause only supports restriction by partition key.");
+
         Collection<ColumnDefinition> requestedColumns = selection.getColumns();
         for (ColumnDefinition def : requestedColumns)
             checkFalse(!def.isPartitionKey() && !def.isStatic(),

http://git-wip-us.apache.org/repos/asf/cassandra/blob/69edeaa4/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java
--
diff --git a/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java b/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java
index d8cd3c3..d444fde 100644
--- a/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java
+++ b/test/unit/org/apache/cassandra/cql3/validation/operations/SelectTest.java
@@ -1253,6 +1253,51 @@ public class SelectTest extends CQLTester
 Assert.assertEquals(9, rows.length);
 }
 
+    @Test
+    public void testSelectDistinctWithWhereClause() throws Throwable {
+        createTable("CREATE TABLE %s (k int, a int, b
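The guard shown in the SelectStatement diff above can be sketched in isolation. This is a hypothetical, simplified stand-in (the `accepts` wrapper and boolean flags are illustrative, not Cassandra's actual classes); the real check lives in `SelectStatement` and consults `StatementRestrictions`:

```java
// Simplified sketch of the CASSANDRA-11339 validation: SELECT DISTINCT may
// only be restricted by partition key components.
public class DistinctGuardSketch
{
    static class InvalidRequestException extends RuntimeException
    {
        InvalidRequestException(String msg) { super(msg); }
    }

    static void checkFalse(boolean condition, String message)
    {
        if (condition)
            throw new InvalidRequestException(message);
    }

    // Mirrors the added check in validateDistinctSelection
    static void validateDistinctSelection(boolean hasClusteringColumnsRestriction,
                                          boolean hasNonPrimaryKeyRestrictions)
    {
        checkFalse(hasClusteringColumnsRestriction || hasNonPrimaryKeyRestrictions,
                   "SELECT DISTINCT with WHERE clause only supports restriction by partition key.");
    }

    // Convenience wrapper: would this DISTINCT selection be accepted?
    static boolean accepts(boolean clustering, boolean nonPrimaryKey)
    {
        try
        {
            validateDistinctSelection(clustering, nonPrimaryKey);
            return true;
        }
        catch (InvalidRequestException e)
        {
            return false;
        }
    }

    public static void main(String[] args)
    {
        System.out.println(accepts(false, false)); // partition-key-only WHERE: true
        System.out.println(accepts(false, true));  // e.g. WHERE v > X: false
        System.out.println(accepts(true, false));  // clustering restriction: false
    }
}
```

With this guard, a query like `SELECT DISTINCT id FROM t WHERE v > X ALLOW FILTERING` is rejected up front instead of silently dropping the WHERE clause.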

[jira] [Updated] (CASSANDRA-11570) Concurrent execution of prepared statement returns invalid JSON as result

2016-04-14 Thread Alexander Ryabets (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Ryabets updated CASSANDRA-11570:
--
Attachment: valid_output.txt
broken_output.txt
CassandraPreparedStatementsTest.zip

> Concurrent execution of prepared statement returns invalid JSON as result
> -
>
> Key: CASSANDRA-11570
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11570
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.2, C++ or C# driver
>Reporter: Alexander Ryabets
> Attachments: CassandraPreparedStatementsTest.zip, broken_output.txt, 
> test_neptunao.cql, valid_output.txt
>
>
> When I use a prepared statement for async execution of multiple statements I 
> get JSON with broken data. Keys get totally corrupted, while values seem to 
> be normal.
> I first encountered this issue while performing stress testing of our 
> project using a custom script. We are using the DataStax C++ driver and 
> execute statements from different fibers.
> Then I tried to isolate the problem and wrote a simple C# program which 
> starts multiple Tasks in a loop. Each task uses the once-created prepared 
> statement to read data from the database. As you can see, the results are a 
> total mess.
> I've attached an archive with a console C# project (1 cs file) which just 
> prints the resulting JSON to the user.
> Here is the main part of the C# code.
> {noformat}
> static void Main(string[] args)
> {
>   const int task_count = 300;
>   using(var cluster = 
> Cluster.Builder().AddContactPoints("127.0.0.1").Build())
>   {
> using(var session = cluster.Connect())
> {
>   var prepared = session.Prepare("select json * from 
> test_neptunao.ubuntu");
>   var tasks = new Task[task_count];
>   for(int i = 0; i < task_count; i++)
>   {
> tasks[i] = Query(prepared, session);
>   }
>   Task.WaitAll(tasks);
> }
>   }
>   Console.ReadKey();
> }
> private static Task Query(PreparedStatement prepared, ISession session)
> {
>   var stmt = prepared.Bind();
>   stmt.SetConsistencyLevel(ConsistencyLevel.One);
>   return session.ExecuteAsync(stmt).ContinueWith(tr =>
>   {
> foreach(var row in tr.Result)
> {
>   var value = row.GetValue<string>(0);
>   Console.WriteLine(value);
> }
>   });
> }
> {noformat}
> I also attached cql script with test DB schema.
> {noformat}
> CREATE KEYSPACE IF NOT EXISTS test_neptunao
> WITH replication = {
>   'class' : 'SimpleStrategy',
>   'replication_factor' : 3
> };
> use test_neptunao;
> create table if not exists ubuntu (
>   id timeuuid PRIMARY KEY,
>   precise_pangolin text,
>   trusty_tahr text,
>   wily_werewolf text, 
>   vivid_vervet text,
>   saucy_salamander text,
>   lucid_lynx text
> );
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11570) Concurrent execution of prepared statement returns invalid JSON as result

2016-04-14 Thread Alexander Ryabets (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240924#comment-15240924
 ] 

Alexander Ryabets commented on CASSANDRA-11570:
---

I've changed the sample a bit to fetch only one row per query.

I've also attached the broken and expected outputs. You can see an invalid result 
in the row with id `516b00a2-01a7-11e6-8630-c04f49e62c6b`, for example.

> Concurrent execution of prepared statement returns invalid JSON as result
> -
>
> Key: CASSANDRA-11570
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11570
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.2, C++ or C# driver
>Reporter: Alexander Ryabets
> Attachments: test_neptunao.cql
>
>
> When I use a prepared statement for async execution of multiple statements I 
> get JSON with broken data. Keys get totally corrupted, while values seem to 
> be normal.
> I first encountered this issue while performing stress testing of our 
> project using a custom script. We are using the DataStax C++ driver and 
> execute statements from different fibers.
> Then I tried to isolate the problem and wrote a simple C# program which 
> starts multiple Tasks in a loop. Each task uses the once-created prepared 
> statement to read data from the database. As you can see, the results are a 
> total mess.
> I've attached an archive with a console C# project (1 cs file) which just 
> prints the resulting JSON to the user.
> Here is the main part of the C# code.
> {noformat}
> static void Main(string[] args)
> {
>   const int task_count = 300;
>   using(var cluster = 
> Cluster.Builder().AddContactPoints("127.0.0.1").Build())
>   {
> using(var session = cluster.Connect())
> {
>   var prepared = session.Prepare("select json * from 
> test_neptunao.ubuntu");
>   var tasks = new Task[task_count];
>   for(int i = 0; i < task_count; i++)
>   {
> tasks[i] = Query(prepared, session);
>   }
>   Task.WaitAll(tasks);
> }
>   }
>   Console.ReadKey();
> }
> private static Task Query(PreparedStatement prepared, ISession session)
> {
>   var stmt = prepared.Bind();
>   stmt.SetConsistencyLevel(ConsistencyLevel.One);
>   return session.ExecuteAsync(stmt).ContinueWith(tr =>
>   {
> foreach(var row in tr.Result)
> {
>   var value = row.GetValue<string>(0);
>   Console.WriteLine(value);
> }
>   });
> }
> {noformat}
> I also attached cql script with test DB schema.
> {noformat}
> CREATE KEYSPACE IF NOT EXISTS test_neptunao
> WITH replication = {
>   'class' : 'SimpleStrategy',
>   'replication_factor' : 3
> };
> use test_neptunao;
> create table if not exists ubuntu (
>   id timeuuid PRIMARY KEY,
>   precise_pangolin text,
>   trusty_tahr text,
>   wily_werewolf text, 
>   vivid_vervet text,
>   saucy_salamander text,
>   lucid_lynx text
> );
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11570) Concurrent execution of prepared statement returns invalid JSON as result

2016-04-14 Thread Alexander Ryabets (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Ryabets updated CASSANDRA-11570:
--
Attachment: (was: CassandraPreparedStatementsTest.zip)

> Concurrent execution of prepared statement returns invalid JSON as result
> -
>
> Key: CASSANDRA-11570
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11570
> Project: Cassandra
>  Issue Type: Bug
> Environment: Cassandra 3.2, C++ or C# driver
>Reporter: Alexander Ryabets
> Attachments: test_neptunao.cql
>
>
> When I use a prepared statement for async execution of multiple statements I 
> get JSON with broken data. Keys get totally corrupted, while values seem to 
> be normal.
> I first encountered this issue while performing stress testing of our 
> project using a custom script. We are using the DataStax C++ driver and 
> execute statements from different fibers.
> Then I tried to isolate the problem and wrote a simple C# program which 
> starts multiple Tasks in a loop. Each task uses the once-created prepared 
> statement to read data from the database. As you can see, the results are a 
> total mess.
> I've attached an archive with a console C# project (1 cs file) which just 
> prints the resulting JSON to the user.
> Here is the main part of the C# code.
> {noformat}
> static void Main(string[] args)
> {
>   const int task_count = 300;
>   using(var cluster = 
> Cluster.Builder().AddContactPoints("127.0.0.1").Build())
>   {
> using(var session = cluster.Connect())
> {
>   var prepared = session.Prepare("select json * from 
> test_neptunao.ubuntu");
>   var tasks = new Task[task_count];
>   for(int i = 0; i < task_count; i++)
>   {
> tasks[i] = Query(prepared, session);
>   }
>   Task.WaitAll(tasks);
> }
>   }
>   Console.ReadKey();
> }
> private static Task Query(PreparedStatement prepared, ISession session)
> {
>   var stmt = prepared.Bind();
>   stmt.SetConsistencyLevel(ConsistencyLevel.One);
>   return session.ExecuteAsync(stmt).ContinueWith(tr =>
>   {
> foreach(var row in tr.Result)
> {
>   var value = row.GetValue<string>(0);
>   Console.WriteLine(value);
> }
>   });
> }
> {noformat}
> I also attached cql script with test DB schema.
> {noformat}
> CREATE KEYSPACE IF NOT EXISTS test_neptunao
> WITH replication = {
>   'class' : 'SimpleStrategy',
>   'replication_factor' : 3
> };
> use test_neptunao;
> create table if not exists ubuntu (
>   id timeuuid PRIMARY KEY,
>   precise_pangolin text,
>   trusty_tahr text,
>   wily_werewolf text, 
>   vivid_vervet text,
>   saucy_salamander text,
>   lucid_lynx text
> );
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11452) Cache implementation using LIRS eviction for in-process page cache

2016-04-14 Thread Ben Manes (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240902#comment-15240902
 ] 

Ben Manes commented on CASSANDRA-11452:
---

The simple hack of recycling the victim when the hash codes are equal seems to 
work well. This would be done in {{evictFromMain}} at the end, after the 
candidate was evicted. Since a weighted cache might evict multiple entries, we 
have to reset the victim for the next loop.

{code}
// Recycle to guard against hash collision attacks
if (victimKey.hashCode() == candidateKey.hashCode()) {
  Node<K, V> nextVictim = victim.getNextInAccessOrder();
  accessOrderProbationDeque().moveToBack(victim);
  victim = nextVictim;
}
{code}

The LIRS paper's traces (short) indicate that the difference is noise.

{noformat}
multi1: 55.28 -> 55.40
multi2: 48.37 -> 48.42
multi3: 41.78 -> 42.00
gli: 34.15 -> 34.06
ps: 57.15 -> 57.17
sprite: 54.95 -> 55.33
cs: 30.19 -> 29.82
loop: 49.95 -> 49.90
2_pools: 52.02 -> 51.96
{noformat}

Tomorrow I'll check some of the ARC traces, clean up the patch, and convert 
Branimir's test into a unit test. Thoughts?

> Cache implementation using LIRS eviction for in-process page cache
> --
>
> Key: CASSANDRA-11452
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11452
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local Write-Read Paths
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>
> Following up from CASSANDRA-5863, to make best use of caching and to avoid 
> having to explicitly marking compaction accesses as non-cacheable, we need a 
> cache implementation that uses an eviction algorithm that can better handle 
> non-recurring accesses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

