[jira] [Commented] (CASSANDRA-8755) Replace trivial uses of String.replace/replaceAll/split with StringUtils methods

2015-11-14 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15005267#comment-15005267
 ] 

Robert Stupp commented on CASSANDRA-8755:
-

Thanks for your work so far! Some comments on your changes:

* the changes in {{CommitLogReplayer}} and {{StorageService}} should use 
{{StringUtils.countMatches}}
* the changes that remove a single char from a string should use 
{{StringUtils.remove(String,char)}}
* you can omit all the changes in the {{tools}} and {{stress}} packages and in 
the {{Client}}, {{CQLTester}}, and {{Sample}} classes. These are not on a hot 
path: they only affect the initialization of these tools or are used very rarely. 
I’d rather not change something that buys us basically nothing.
* I’d also prefer not to change the Hadoop classes
* the constant {{SLASH}} in {{PropertyFileSnitch}} should be declared at the 
beginning of the class
* the name {{PATTERN_FINAL_DOLLAR}} in {{CassandraMetricsRegistry}} is incorrect

Generally we require unit tests that ensure the changes work as expected. You 
can use the old code in the unit tests to verify the new production code 
against a bunch of input parameters.
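
To illustrate the unit-test suggestion above, here is a minimal sketch that checks a regex-free implementation against old-style regex-based code over a handful of inputs. It is not taken from the patch: the class {{TrivialStringOps}} and its methods are hypothetical, hand-rolled stand-ins so that the example runs without commons-lang on the classpath. In the real change the new-style calls would be {{StringUtils.remove(String,char)}} and {{StringUtils.countMatches}}, and the old-style expressions shown here are only assumptions about what the replaced code looked like.

```java
import java.util.regex.Pattern;

/** Hypothetical stand-ins for the old (regex-based) and new (regex-free) code paths. */
final class TrivialStringOps {
    private TrivialStringOps() {}

    // Old style (assumed): replaceAll compiles and runs a regex on every call.
    static String removeCharRegex(String s, char c) {
        return s.replaceAll(Pattern.quote(String.valueOf(c)), "");
    }

    // New style: one pass over the string, no regex involved.
    static String removeChar(String s, char c) {
        if (s.indexOf(c) < 0) {
            return s; // nothing to remove
        }
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (ch != c) {
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    // Old style (assumed): count occurrences by splitting on a regex.
    static int countMatchesSplit(String s, char c) {
        // limit -1 keeps trailing empty strings, so pieces - 1 == occurrences
        return s.split(Pattern.quote(String.valueOf(c)), -1).length - 1;
    }

    // New style: plain character scan.
    static int countMatches(String s, char c) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == c) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        String[] inputs = {"", "a.b.c", "...", "no-dots", "trailing."};
        for (String in : inputs) {
            if (!removeCharRegex(in, '.').equals(removeChar(in, '.'))) {
                throw new AssertionError("remove mismatch for: " + in);
            }
            if (countMatchesSplit(in, '.') != countMatches(in, '.')) {
                throw new AssertionError("count mismatch for: " + in);
            }
        }
        System.out.println("old and new implementations agree");
    }
}
```

In an actual unit test the same loop would call the production method on one side and the retained old expression on the other, with edge cases such as the empty string and leading or trailing separators included.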

I’ve triggered a CI run against your changes. Unit tests 
([here|http://cassci.datastax.com/job/snazy-8755-testall/]) look good, but some 
dtests failed ([here|http://cassci.datastax.com/job/snazy-8755-dtest/]), more 
than currently fail on trunk, though probably not because of your patch.

I recommend that you create a separate branch for this change off of trunk. You 
can safely squash these 4 commits, together with the changes mentioned above, 
into a single one. Having a separate branch also has the advantage that you can 
rebase and/or base on another branch.


> Replace trivial uses of String.replace/replaceAll/split with StringUtils 
> methods
> 
>
> Key: CASSANDRA-8755
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8755
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jaroslav Kamenik
>Priority: Trivial
>  Labels: lhf
> Attachments: 8755.tar.gz, trunk-8755.patch, trunk-8755.txt
>
>
> There are places in the code where those regex-based methods are used with 
> plain (non-regex) strings, so the StringUtils alternatives should be faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (CASSANDRA-10660) Support user-defined compactions through nodetool

2015-11-14 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa reassigned CASSANDRA-10660:
--

Assignee: Jeff Jirsa

> Support user-defined compactions through nodetool
> -
>
> Key: CASSANDRA-10660
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10660
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Tyler Hobbs
>Assignee: Jeff Jirsa
>Priority: Minor
>  Labels: lhf
>
> For a long time, we've supported running user-defined compactions through 
> JMX.  This comes in handy fairly often, mostly when dealing with low disk 
> space or tombstone purging, so it would be good to add something to nodetool 
> for this.  An extra option for {{nodetool compact}} would probably suffice.





[jira] [Commented] (CASSANDRA-10660) Support user-defined compactions through nodetool

2015-11-14 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15005744#comment-15005744
 ] 

Jeff Jirsa commented on CASSANDRA-10660:


Added as a {{--user-defined}} argument to {{nodetool compact}}, which converts the 
list of arguments to a comma-joined string for {{forceUserDefinedCompaction}}.

2.1: https://github.com/jeffjirsa/cassandra/tree/cassandra-10660-2.1
2.2: https://github.com/jeffjirsa/cassandra/tree/cassandra-10660-2.2
2.2 applies cleanly to trunk, but just in case here's an explicit patch on that 
branch: https://github.com/jeffjirsa/cassandra/tree/cassandra-10660
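
As a rough illustration of the argument handling described above (not the code from the linked branches), the conversion from a list of command-line arguments to a single comma-joined string could look like the sketch below. The class name {{UserDefinedCompactionArgs}}, the method {{toCommaJoined}}, and the sstable file names are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: turns the sstable arguments into one comma-joined string. */
final class UserDefinedCompactionArgs {
    private UserDefinedCompactionArgs() {}

    // Joins the file names with commas, the form the source says
    // forceUserDefinedCompaction receives.
    static String toCommaJoined(List<String> sstableFiles) {
        return String.join(",", sstableFiles);
    }

    public static void main(String[] args) {
        // Made-up file names, for illustration only.
        List<String> files = Arrays.asList("ks-tbl-ka-1-Data.db", "ks-tbl-ka-2-Data.db");
        System.out.println(toCommaJoined(files));
    }
}
```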



> Support user-defined compactions through nodetool
> -
>
> Key: CASSANDRA-10660
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10660
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Tyler Hobbs
>Assignee: Jeff Jirsa
>Priority: Minor
>  Labels: lhf
>
> For a long time, we've supported running user-defined compactions through 
> JMX.  This comes in handy fairly often, mostly when dealing with low disk 
> space or tombstone purging, so it would be good to add something to nodetool 
> for this.  An extra option for {{nodetool compact}} would probably suffice.






[Cassandra Wiki] Update of "ContributorsGroup" by JonathanEllis

2015-11-14 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.

The "ContributorsGroup" page has been changed by JonathanEllis:
https://wiki.apache.org/cassandra/ContributorsGroup?action=diff=49=50

Comment:
add MichaelEdge

   * MarcusEriksson
   * MarkWatson
   * MatthewDennis
+  * MichaelEdge
   * MichaelKjellman
   * NickBailey
   * Nick Neuberger


[jira] [Issue Comment Deleted] (CASSANDRA-10705) A typo fix

2015-11-14 Thread Hobin Yoon (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hobin Yoon updated CASSANDRA-10705:
---
Comment: was deleted

(was: From a64010ac878b35b065586ead2eb92878e73cd249 Mon Sep 17 00:00:00 2001
From: Hobin Yoon 
Date: Sat, 14 Nov 2015 11:59:00 -0500
Subject: [PATCH] fix a typo

---
 conf/cassandra.yaml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/conf/cassandra.yaml b/conf/cassandra.yaml
index 89dda76..fd28447 100644
--- a/conf/cassandra.yaml
+++ b/conf/cassandra.yaml
@@ -378,7 +378,7 @@ concurrent_materialized_view_writes: 32
 # memtable_offheap_space_in_mb: 2048
 
 # Ratio of occupied non-flushing memtable size to total permitted size
-# that will trigger a flush of the largest memtable.  Lager mct will
+# that will trigger a flush of the largest memtable.  Larger mct will
 # mean larger flushes and hence less compaction, but also less concurrent
 # flush activity which can make it difficult to keep your disks fed
 # under heavy write load.)

> A typo fix
> --
>
> Key: CASSANDRA-10705
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10705
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Hobin Yoon
>Priority: Trivial
>  Labels: easyfix
> Fix For: 3.2
>
> Attachments: typo-fix.patch
>
>






[jira] [Created] (CASSANDRA-10705) A typo fix

2015-11-14 Thread Hobin Yoon (JIRA)
Hobin Yoon created CASSANDRA-10705:
--

 Summary: A typo fix
 Key: CASSANDRA-10705
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10705
 Project: Cassandra
  Issue Type: Improvement
  Components: Configuration
Reporter: Hobin Yoon
Priority: Trivial
 Fix For: 3.2
 Attachments: typo-fix.patch







[jira] [Created] (CASSANDRA-10706) just here

2015-11-14 Thread Hans Christian Holm (JIRA)
Hans Christian Holm created CASSANDRA-10706:
---

 Summary: just here
 Key: CASSANDRA-10706
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10706
 Project: Cassandra
  Issue Type: Bug
  Components: Distributed Metadata
Reporter: Hans Christian Holm








[jira] [Assigned] (CASSANDRA-10708) Add forceUserDefinedCleanup to allow more flexible cleanup for operators

2015-11-14 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa reassigned CASSANDRA-10708:
--

Assignee: Jeff Jirsa

> Add forceUserDefinedCleanup to allow more flexible cleanup for operators
> 
>
> Key: CASSANDRA-10708
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10708
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jeff Jirsa
>Assignee: Jeff Jirsa
>Priority: Minor
> Fix For: 3.2, 2.1.x, 2.2.x
>
>
> {{nodetool cleanup}} currently executes in parallel on all sstables in a 
> table.  No source sstables are GCd until the parallel operation completes. In 
> certain scenarios, this is nonideal (it has both memory and disk usage 
> implications for operators who try to run the operation on larger tables). 
> Adding {{forceUserDefinedCleanup}} puts cleanup operations closer to parity 
> with compaction {{forceUserDefinedCompaction}} for the rare cases where 
> operators need to do something slightly different than the traditional 
> cleanup. 





[jira] [Created] (CASSANDRA-10708) Add forceUserDefinedCleanup to allow more flexible cleanup for operators

2015-11-14 Thread Jeff Jirsa (JIRA)
Jeff Jirsa created CASSANDRA-10708:
--

 Summary: Add forceUserDefinedCleanup to allow more flexible 
cleanup for operators
 Key: CASSANDRA-10708
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10708
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jeff Jirsa
Priority: Minor
 Fix For: 3.2, 2.1.x, 2.2.x


{{nodetool cleanup}} currently executes in parallel on all sstables in a table. 
 No source sstables are GCd until the parallel operation completes. In certain 
scenarios, this is nonideal (it has both memory and disk usage implications for 
operators who try to run the operation on larger tables). 

Adding {{forceUserDefinedCleanup}} puts cleanup operations closer to parity 
with compaction {{forceUserDefinedCompaction}} for the rare cases where 
operators need to do something slightly different than the traditional cleanup. 





[jira] [Updated] (CASSANDRA-10708) Add forceUserDefinedCleanup to allow more flexible cleanup for operators

2015-11-14 Thread Jeff Jirsa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Jirsa updated CASSANDRA-10708:
---
Fix Version/s: (was: 3.2)
   (was: 2.2.x)
   (was: 2.1.x)

> Add forceUserDefinedCleanup to allow more flexible cleanup for operators
> 
>
> Key: CASSANDRA-10708
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10708
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jeff Jirsa
>Assignee: Jeff Jirsa
>Priority: Minor
>
> {{nodetool cleanup}} currently executes in parallel on all sstables in a 
> table.  No source sstables are GCd until the parallel operation completes. In 
> certain scenarios, this is nonideal (it has both memory and disk usage 
> implications for operators who try to run the operation on larger tables). 
> Adding {{forceUserDefinedCleanup}} puts cleanup operations closer to parity 
> with compaction {{forceUserDefinedCompaction}} for the rare cases where 
> operators need to do something slightly different than the traditional 
> cleanup. 





[jira] [Updated] (CASSANDRA-7217) Native transport performance (with cassandra-stress) drops precipitously past around 1000 threads

2015-11-14 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-7217:
--
Attachment: FakeQuerySystem.java

To test stress and threading in general I mocked out interactions between 
stress and the client library. There is no performance regression up to 4000 
threads if you remove the server and client library from the picture.

Attached is what I used to fake the queries. It's a thread pulling queries off 
a delay queue, with each delay drawn from a uniform distribution between some 
minimum and maximum latency. I tried 3-9 milliseconds with a server-side 
throughput of 100k. The threads issuing the queries are woken up via 
{{java.util.concurrent.FutureTask}}.

I'll mock out the server in the client library next.
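
For readers without the attachment, the mechanism described above can be sketched roughly as follows. This is not the attached {{FakeQuerySystem.java}}: the class {{FakeQuerySketch}} and all of its members are hypothetical, and the sketch only models the uniform delay and the {{FutureTask}} wake-up, not the throughput limit.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

/** Hypothetical fake query system: completes each query after a random delay. */
final class FakeQuerySketch {

    /** A queued query that becomes available once its delay has elapsed. */
    private static final class PendingQuery implements Delayed {
        final long readyAtNanos;
        final FutureTask<String> task;

        PendingQuery(long delayNanos, FutureTask<String> task) {
            this.readyAtNanos = System.nanoTime() + delayNanos;
            this.task = task;
        }

        @Override
        public long getDelay(TimeUnit unit) {
            return unit.convert(readyAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                                other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    private final DelayQueue<PendingQuery> queue = new DelayQueue<>();
    private final long minMillis;
    private final long maxMillis;

    FakeQuerySketch(long minMillis, long maxMillis) {
        this.minMillis = minMillis;
        this.maxMillis = maxMillis;
        // Single thread that pulls expired queries off the delay queue.
        Thread completer = new Thread(this::drain, "fake-query-completer");
        completer.setDaemon(true);
        completer.start();
    }

    /** Enqueues a fake query; the returned future completes after the delay. */
    Future<String> execute(String query) {
        FutureTask<String> task = new FutureTask<>(() -> "ok:" + query);
        // Uniform delay in [minMillis, maxMillis].
        long delayMillis = ThreadLocalRandom.current().nextLong(minMillis, maxMillis + 1);
        queue.add(new PendingQuery(TimeUnit.MILLISECONDS.toNanos(delayMillis), task));
        return task;
    }

    private void drain() {
        try {
            while (true) {
                // Running the FutureTask sets its result and wakes any thread blocked in get().
                queue.take().task.run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws Exception {
        FakeQuerySketch fake = new FakeQuerySketch(3, 9);
        Future<String> reply = fake.execute("SELECT 1");
        System.out.println(reply.get(1, TimeUnit.SECONDS));
    }
}
```

A stress-style caller would have many threads call {{execute}} and block on {{get()}}, which exercises thread wake-up behaviour without any real server or client library involved.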

> Native transport performance (with cassandra-stress) drops precipitously past 
> around 1000 threads
> -
>
> Key: CASSANDRA-7217
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7217
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benedict
>Assignee: Ariel Weisberg
>  Labels: performance, stress, triaged
> Fix For: 3.1
>
> Attachments: 2000-threads.svg, 500-threads.svg, FakeQuerySystem.java
>
>
> This is obviously bad. Let's figure out why it's happening and put a stop to 
> it.





[jira] [Created] (CASSANDRA-10707) Add support for Group By to Select statement

2015-11-14 Thread Benjamin Lerer (JIRA)
Benjamin Lerer created CASSANDRA-10707:
--

 Summary: Add support for Group By to Select statement
 Key: CASSANDRA-10707
 URL: https://issues.apache.org/jira/browse/CASSANDRA-10707
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benjamin Lerer
Assignee: Benjamin Lerer


Now that Cassandra supports aggregate functions, it makes sense to support 
{{GROUP BY}} in {{SELECT}} statements.

It should be possible to group either at the partition level or at the 
clustering column level.

{code}
SELECT partitionKey, max(value) FROM myTable GROUP BY partitionKey;
SELECT partitionKey, clustering0, clustering1, max(value) FROM myTable GROUP BY partitionKey, clustering0, clustering1;
{code}
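
To make the intended result of the two grouping levels concrete, here is a small in-memory sketch of the semantics. It says nothing about how Cassandra would implement the feature. The class {{GroupBySketch}}, its {{Row}} type, and the use of a single clustering column are simplifying assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory illustration of max(value) per group. */
final class GroupBySketch {
    private GroupBySketch() {}

    /** Simplified row: partition key, one clustering column, one value. */
    static final class Row {
        final String partitionKey;
        final int clustering0;
        final int value;

        Row(String partitionKey, int clustering0, int value) {
            this.partitionKey = partitionKey;
            this.clustering0 = clustering0;
            this.value = value;
        }
    }

    // Mirrors: SELECT partitionKey, max(value) ... GROUP BY partitionKey
    static Map<String, Integer> maxByPartition(List<Row> rows) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Row r : rows) {
            result.merge(r.partitionKey, r.value, Integer::max);
        }
        return result;
    }

    // Mirrors grouping at the clustering column level:
    // ... GROUP BY partitionKey, clustering0
    static Map<List<Object>, Integer> maxByPartitionAndClustering(List<Row> rows) {
        Map<List<Object>, Integer> result = new LinkedHashMap<>();
        for (Row r : rows) {
            List<Object> key = Arrays.<Object>asList(r.partitionKey, r.clustering0);
            result.merge(key, r.value, Integer::max);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row("p1", 0, 5), new Row("p1", 0, 7),
            new Row("p1", 1, 9), new Row("p2", 0, 2));
        System.out.println(maxByPartition(rows));
        System.out.println(maxByPartitionAndClustering(rows));
    }
}
```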





[jira] [Commented] (CASSANDRA-10708) Add forceUserDefinedCleanup to allow more flexible cleanup for operators

2015-11-14 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15005701#comment-15005701
 ] 

Jeff Jirsa commented on CASSANDRA-10708:


Pending CI:

||2.1||2.2||trunk / 3.2 ? ||
| [branch|https://github.com/jeffjirsa/cassandra/tree/user_defined_cleanup-2.1] | [branch|https://github.com/jeffjirsa/cassandra/tree/user_defined_cleanup-2.2] | [branch|https://github.com/jeffjirsa/cassandra/tree/user_defined_cleanup] |
| [testall|http://cassci.datastax.com/job/jeffjirsa-user_defined_cleanup-2.1-testall/lastBuild/] | [testall|http://cassci.datastax.com/job/jeffjirsa-user_defined_cleanup-2.2-testall/lastBuild/] | [testall|http://cassci.datastax.com/job/jeffjirsa-user_defined_cleanup-testall/lastBuild/] |



> Add forceUserDefinedCleanup to allow more flexible cleanup for operators
> 
>
> Key: CASSANDRA-10708
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10708
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jeff Jirsa
>Assignee: Jeff Jirsa
>Priority: Minor
>
> {{nodetool cleanup}} currently executes in parallel on all sstables in a 
> table.  No source sstables are GCd until the parallel operation completes. In 
> certain scenarios, this is nonideal (it has both memory and disk usage 
> implications for operators who try to run the operation on larger tables). 
> Adding {{forceUserDefinedCleanup}} puts cleanup operations closer to parity 
> with compaction {{forceUserDefinedCompaction}} for the rare cases where 
> operators need to do something slightly different than the traditional 
> cleanup. 


