[jira] [Commented] (CASSANDRA-9387) Add snitch supporting Windows Azure

2015-10-22 Thread Matt Kennedy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14970263#comment-14970263
 ] 

Matt Kennedy commented on CASSANDRA-9387:
-

This item is waiting on the fault-domain and update-domain data to be exposed 
via the Instance Metadata Service similar to how the instance event metadata is 
exposed in this article: 
https://azure.microsoft.com/en-us/blog/what-just-happened-to-my-vm-in-vm-metadata-service/

> Add snitch supporting Windows Azure
> ---
>
> Key: CASSANDRA-9387
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9387
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Config
>Reporter: Jonathan Ellis
>Assignee: Matt Kennedy
> Fix For: 2.1.x
>
>
> Looks like regions / fault domains are a pretty close analogue to C* 
> DCs/racks.
> http://blogs.technet.com/b/yungchou/archive/2011/05/16/window-azure-fault-domain-and-update-domain-explained-for-it-pros.aspx
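
The analogy in the description can be sketched as a small mapping. This is only
an illustration, not the snitch itself: the Location type, the azure_location
function, and the "fd<N>" rack label are hypothetical names, and the sketch
assumes the region and fault-domain index are already known to the node.

```python
from typing import NamedTuple


class Location(NamedTuple):
    """Cassandra-style placement: a datacenter name and a rack name."""
    datacenter: str
    rack: str


def azure_location(region: str, fault_domain: int) -> Location:
    """Map Azure placement data onto a DC/rack pair.

    Hypothetical convention: the region becomes the datacenter and the
    fault-domain index becomes the rack label.
    """
    return Location(datacenter=region, rack=f"fd{fault_domain}")
```

Under this convention, two VMs in the same region but in different fault
domains would land in the same datacenter on different racks.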



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9048) Delimited File Bulk Loader

2015-04-01 Thread Matt Kennedy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14391221#comment-14391221
 ] 

Matt Kennedy commented on CASSANDRA-9048:
-

I'd like to advocate for this loader being part of core Cassandra. Bulk loading 
is a fundamental task for any database, and database operators need multiple 
strategies to address it. Not only does this tool meet the need in the most 
efficient way identified so far, it also serves as sample code that users can 
customize to build their own efficient loaders. It isn't practical for end 
users to learn how to do customized bulk loading the right way by examining 
the COPY operation. This tool is at least as useful as any code in the 
examples directory and applies to a broader set of Cassandra users. Since it 
happens to be a fully functioning tool, though, it makes more sense for it to 
live under the tools directory.

 Delimited File Bulk Loader
 --

 Key: CASSANDRA-9048
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9048
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter:  Brian Hess
 Attachments: CASSANDRA-9048.patch


 There is a strong need for bulk loading data from delimited files into 
 Cassandra.  Starting with delimited files means that the data is not 
 currently in the SSTable format, and therefore cannot immediately leverage 
 Cassandra's bulk loading tool, sstableloader, directly.
 Delimited files match the format in which data usually arrives far more 
 often than the SSTable format does, so a tool that loads directly from 
 delimited files is very useful.
 In order for this bulk loader to be more generally useful to customers, it 
 should handle a number of options at a minimum:
 - support specifying an input file or reading the data from stdin (so other 
 command-line programs can pipe into the loader)
 - supply the CQL schema for the input data
 - support all data types other than collections (collections is a stretch 
 goal/need)
 - an option to specify the delimiter
 - an option to specify comma as the decimal delimiter (for international use 
 cases)
 - an option to specify how NULL values are specified in the file (e.g., the 
 empty string or the string NULL)
 - an option to specify how BOOLEAN values are specified in the file (e.g., 
 TRUE/FALSE or 0/1)
 - an option to specify the Date and Time format
 - an option to skip some number of rows at the beginning of the file
 - an option to only read in some number of rows from the file
 - an option to indicate how many parse errors to tolerate
 - an option to specify a file that will contain all the lines that did not 
 parse correctly (up to the maximum number of parse errors)
 - an option to specify the CQL port to connect to (with 9042 as the default).
 Additional options would be useful, but this set of options/features is a 
 start.
 A word on COPY.  COPY comes via CQLSH which requires the client to be the 
 same version as the server (e.g., 2.0 CQLSH does not work with 2.1 Cassandra, 
 etc).  This tool should be able to connect to any version of Cassandra 
 (within reason).  For example, it should be able to handle 2.0.x and 2.1.x.  
 Moreover, CQLSH's COPY command does not support a number of the options 
 above.  Lastly, the performance of COPY in 2.0.x is not high enough to be 
 considered a bulk ingest tool.
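
Several of the parsing options listed above can be sketched in a few lines.
This is a minimal illustration, not the attached patch: the parse_rows
function, its parameter names, and the fixed three-column layout (text,
boolean, decimal) are all hypothetical, and the sketch only parses rows
rather than writing them to Cassandra.

```python
import csv
import io
from decimal import Decimal


def parse_rows(text, delimiter=",", null_string="", true_string="TRUE",
               false_string="FALSE", decimal_comma=False, skip_rows=0,
               max_rows=None, max_errors=0):
    """Parse delimited text into (name, active, price) tuples.

    Returns (rows, bad_lines). Raises ValueError once more than
    max_errors lines have failed to parse.
    """
    def to_bool(value):
        if value == true_string:
            return True
        if value == false_string:
            return False
        raise ValueError(f"not a boolean: {value!r}")

    def to_decimal(value):
        if decimal_comma:
            value = value.replace(",", ".")
        return Decimal(value)  # raises decimal.InvalidOperation on bad input

    rows, bad_lines = [], []
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    for index, fields in enumerate(reader):
        if index < skip_rows:
            continue  # skip leading rows such as a header
        if max_rows is not None and len(rows) >= max_rows:
            break  # only read the requested number of rows
        try:
            name, active, price = fields  # ValueError on a wrong field count
            rows.append((
                None if name == null_string else name,
                None if active == null_string else to_bool(active),
                None if price == null_string else to_decimal(price),
            ))
        except (ValueError, ArithmeticError):
            # Keep the rejected line so it can be written to a "bad file".
            bad_lines.append(delimiter.join(fields))
            if len(bad_lines) > max_errors:
                raise ValueError("too many parse errors") from None
    return rows, bad_lines
```

Reading from stdin would only require passing sys.stdin.read() as the text.
Note that a comma decimal delimiter needs a field delimiter other than a
comma, which is why the two are separate options.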





[jira] [Created] (CASSANDRA-8489) Restrict table visibility across keyspaces

2014-12-15 Thread Matt Kennedy (JIRA)
Matt Kennedy created CASSANDRA-8489:
---

 Summary: Restrict table visibility across keyspaces
 Key: CASSANDRA-8489
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8489
 Project: Cassandra
  Issue Type: Improvement
Reporter: Matt Kennedy
Priority: Minor


This ticket captures a specific fine-grained authorization request: users 
should be restricted to seeing only the tables in specific keyspaces.

For example, given keyspaces K1 and K2 and users U1 and U2, allow U1 access to 
K1, but no access to even see table names in K2. Allow U2 access to K2, but no 
access to see table names in K1.
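
The requested behavior can be sketched as a filter over schema metadata. This
is only an illustration of the request, not Cassandra's authorization code:
the GRANTS and SCHEMA dictionaries, the table names, and the visible_tables
function are hypothetical.

```python
# Hypothetical grants: which keyspaces each user is allowed to see.
GRANTS = {"U1": {"K1"}, "U2": {"K2"}}

# Hypothetical schema: keyspace name -> table names.
SCHEMA = {"K1": ["orders", "customers"], "K2": ["payroll"]}


def visible_tables(user, grants=GRANTS, schema=SCHEMA):
    """Return only the keyspaces (and their tables) the user may see."""
    allowed = grants.get(user, set())
    return {keyspace: list(tables)
            for keyspace, tables in schema.items()
            if keyspace in allowed}
```

With these grants, U1 sees the tables in K1 and nothing about K2, and a user
with no grants sees no table names at all.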





[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-29 Thread Matt Kennedy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077730#comment-14077730
 ] 

Matt Kennedy commented on CASSANDRA-7631:
-

Personally I think the current syntax is a massive improvement over the older 
version. It takes a little bit of time to work out, and a small handful of the 
options remain confusing, but overall it's a fairly clear system with useful 
help messages. If anything, some examples of different invocations 
(incantations?) would be useful, but I don't see a reason to massively change 
it.

 Allow Stress to write directly to SSTables
 --

 Key: CASSANDRA-7631
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7631
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Russell Alexander Spitzer
Assignee: Russell Alexander Spitzer

 One common difficulty with benchmarking machines is the amount of time it 
 takes to initially load data. For machines with a large amount of RAM this 
 becomes especially onerous, because a very large amount of data needs to be 
 placed on the machine before the page cache can be circumvented. 
 To remedy this, I suggest we add a top-level flag to cassandra-stress which 
 would cause the tool to write directly to SSTables rather than actually 
 performing CQL inserts. Internally this would use CQLSSTableWriter to write 
 directly to SSTables while skipping any keys which are not owned by the node 
 stress is running on. The same stress command run on each node in the cluster 
 would then write unique SSTables containing only data which that node is 
 responsible for. After this, no further network IO would be required to 
 distribute data, as it would all already be correctly in place.
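
The "skip keys not owned by this node" idea can be sketched as follows. This
is a simplified illustration under stated assumptions, not the proposed
implementation: it assumes one token per node and a replication factor of 1,
it uses an MD5-derived token as a stand-in for the cluster's real
partitioner, and token_for, owner_of, and keys_for_node are hypothetical
names.

```python
import bisect
import hashlib


def token_for(key: str) -> int:
    """Stand-in partitioner: derive a 32-bit token from the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def owner_of(token: int, ring):
    """Return the node owning a token.

    ring is a list of (token, node) pairs sorted by token. A node owns the
    range ending at its own token (inclusive); tokens past the last node
    wrap around to the first.
    """
    tokens = [t for t, _ in ring]
    index = bisect.bisect_left(tokens, token) % len(ring)
    return ring[index][1]


def keys_for_node(node, keys, ring):
    """Keep only the keys that the given node is responsible for."""
    return [key for key in keys if owner_of(token_for(key), ring) == node]
```

If every node runs the same deterministic key generator and applies this
filter with its own name, each key is written by exactly one node and no
data needs to be streamed afterwards.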



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-28 Thread Matt Kennedy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076459#comment-14076459
 ] 

Matt Kennedy commented on CASSANDRA-7631:
-

Having a mechanism like this is extremely important for testing large scale 
clusters. We don't necessarily want/need to test a large scale ingest each 
time, so the sooner we can go from spinning up 100 nodes, to running a mixed 
workload, the better. If one invocation of stress can tell 100 stressd 
processes to write local SSTables according to the user defined yaml, that 
should be massively more efficient than running a write job.



[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-28 Thread Matt Kennedy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076801#comment-14076801
 ] 

Matt Kennedy commented on CASSANDRA-7631:
-

In many cases, we primarily care about mixed workloads, but those need a 
populated cluster to run on. So yes, writes are important, but mostly in the 
context of concurrent reads also happening. 



[jira] [Commented] (CASSANDRA-7631) Allow Stress to write directly to SSTables

2014-07-28 Thread Matt Kennedy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076854#comment-14076854
 ] 

Matt Kennedy commented on CASSANDRA-7631:
-

Yes, ideally formatted using your new user-defined schema stuff. I don't mean 
to speak for Russ, but we fleshed out this idea jointly.



[jira] [Updated] (CASSANDRA-7468) Add time-based execution to cassandra-stress

2014-07-21 Thread Matt Kennedy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Kennedy updated CASSANDRA-7468:


Attachment: trunk-7468-rebase.patch

 Add time-based execution to cassandra-stress
 

 Key: CASSANDRA-7468
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7468
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Matt Kennedy
Assignee: Matt Kennedy
Priority: Minor
 Fix For: 2.1.1

 Attachments: trunk-7468-rebase.patch, trunk-7468.patch








[jira] [Commented] (CASSANDRA-7468) Add time-based execution to cassandra-stress

2014-07-21 Thread Matt Kennedy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069268#comment-14069268
 ] 

Matt Kennedy commented on CASSANDRA-7468:
-

Rebased to trunk. Changed '-d' parameter to '-duration'. Note, running this 
without the latest DataStax Java driver (2.1-beta2) results in some seemingly 
extraneous stack traces, but they don't seem to affect functionality.



[jira] [Commented] (CASSANDRA-7468) Add time-based execution to cassandra-stress

2014-07-21 Thread Matt Kennedy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069387#comment-14069387
 ] 

Matt Kennedy commented on CASSANDRA-7468:
-

Thanks for the review, units are a welcome addition. I'm also relieved you got 
rid of the countInSeconds boolean to do it. I felt cheap doing it that way :-)

Everything else looks good to me.

 Add time-based execution to cassandra-stress
 

 Key: CASSANDRA-7468
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7468
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Matt Kennedy
Assignee: Matt Kennedy
Priority: Minor
 Fix For: 2.1.1

 Attachments: 7468v2.txt, trunk-7468-rebase.patch, trunk-7468.patch








[jira] [Commented] (CASSANDRA-7468) Add time-based execution to cassandra-stress

2014-07-09 Thread Matt Kennedy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056244#comment-14056244
 ] 

Matt Kennedy commented on CASSANDRA-7468:
-

I've tried using the n>, n< method to get timed execution for the last week or 
so in two different stress testing environments on two cloud providers. 
Unfortunately, the test execution times have run over by a significant amount. 
The independent timing thread in the patch gives much more consistent results 
in terms of running for the specified duration. If you don't have any 
strenuous objections, I would like to see this incorporated.
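
The general idea of an independent timing thread can be sketched as below.
This is not the code from the attached patch (which targets the Java stress
tool); run_for and its parameters are hypothetical names used only to
illustrate the technique.

```python
import threading
import time


def run_for(duration_seconds, operation):
    """Call operation repeatedly until a separate timer thread says stop.

    The timer runs independently of the work loop, so the run ends close
    to the requested duration no matter how many operations complete.
    """
    stop = threading.Event()
    timer = threading.Timer(duration_seconds, stop.set)
    timer.daemon = True  # do not keep the process alive for the timer
    timer.start()
    completed = 0
    while not stop.is_set():
        operation()
        completed += 1
    return completed


if __name__ == "__main__":
    # Simulated workload: each "operation" takes about 10 ms.
    count = run_for(1.0, lambda: time.sleep(0.01))
    print(f"completed {count} operations")
```

The overrun is bounded by the length of a single operation, since the stop
flag is checked between operations rather than derived from an iteration
count.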

 Add time-based execution to cassandra-stress
 

 Key: CASSANDRA-7468
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7468
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Matt Kennedy
Priority: Minor
 Attachments: trunk-7468.patch








[jira] [Commented] (CASSANDRA-7468) Add time-based execution to cassandra-stress

2014-07-09 Thread Matt Kennedy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056654#comment-14056654
 ] 

Matt Kennedy commented on CASSANDRA-7468:
-

Sure.

 Add time-based execution to cassandra-stress
 

 Key: CASSANDRA-7468
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7468
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Matt Kennedy
Priority: Minor
 Fix For: 2.1.1

 Attachments: trunk-7468.patch








[jira] [Commented] (CASSANDRA-7468) Add time-based execution to cassandra-stress

2014-06-30 Thread Matt Kennedy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047906#comment-14047906
 ] 

Matt Kennedy commented on CASSANDRA-7468:
-

Hm, that might be easy to do, but it isn't exactly obvious that it's possible. 
What if the patch were reworked to expose the -d param to explicitly set 
duration, but instead of keeping a distinct timer thread, it internally used 
the same execution path it would have with n>30 n<30?



[jira] [Commented] (CASSANDRA-7468) Add time-based execution to cassandra-stress

2014-06-30 Thread Matt Kennedy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047956#comment-14047956
 ] 

Matt Kennedy commented on CASSANDRA-7468:
-

I picked -d for duration because I think people associate -t with threads. 
s(amples)= could work, but we'd need to explain that there is a connection 
between samples and time (specifically a second) that isn't immediately 
obvious. The current help text for n>, "Run at least this many iterations 
before accepting uncertainty convergence", doesn't make it clear that an 
iteration is a second.

If you just go with a completely separate parameter for duration, then there's 
no need to change the language of the other help text just to make the 
time-based use case more obvious.

But it's a simple feature; as long as I can _do_ time-based runs, I'm not that 
fussed about how they get done.



[jira] [Created] (CASSANDRA-7468) Add time-based execution to cassandra-stress

2014-06-29 Thread Matt Kennedy (JIRA)
Matt Kennedy created CASSANDRA-7468:
---

 Summary: Add time-based execution to cassandra-stress
 Key: CASSANDRA-7468
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7468
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Matt Kennedy
Priority: Minor








[jira] [Updated] (CASSANDRA-7468) Add time-based execution to cassandra-stress

2014-06-29 Thread Matt Kennedy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Kennedy updated CASSANDRA-7468:


Attachment: trunk-7468.patch



[jira] [Created] (CASSANDRA-7416) Allow cassandra-stress to set timestamp for writes

2014-06-18 Thread Matt Kennedy (JIRA)
Matt Kennedy created CASSANDRA-7416:
---

 Summary: Allow cassandra-stress to set timestamp for writes
 Key: CASSANDRA-7416
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7416
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Matt Kennedy
Priority: Trivial


This is just a convenience for testing and bulk loading prior to a mixed 
workload.





[jira] [Updated] (CASSANDRA-7416) Allow cassandra-stress to set timestamp for writes

2014-06-18 Thread Matt Kennedy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Kennedy updated CASSANDRA-7416:


Attachment: trunk-7416.txt

 Allow cassandra-stress to set timestamp for writes
 --

 Key: CASSANDRA-7416
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7416
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Matt Kennedy
Priority: Trivial
 Attachments: trunk-7416.txt


 This is just a convenience for testing and bulk loading prior to a mixed 
 workload.





[jira] [Updated] (CASSANDRA-7417) Allow network configuration on interfaces instead of addresses

2014-06-18 Thread Matt Kennedy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Kennedy updated CASSANDRA-7417:


Attachment: trunk-7417.txt

 Allow network configuration on interfaces instead of addresses
 --

 Key: CASSANDRA-7417
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7417
 Project: Cassandra
  Issue Type: Improvement
  Components: Config, Core
Reporter: Matt Kennedy
Priority: Minor
 Attachments: trunk-7417.txt


 This patch adds two config elements to cassandra.yaml: listen_interface and 
 rpc_interface. 
 These can be used instead of their *_address counterparts to configure bind 
 the addresses C* listens on. This capability can drastically simplify some 
 deployment scenarios, especially in clouds which sometimes have quirky 
 automation capabilities.
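
How a loader might choose between the two styles of setting can be sketched
as below. This is an illustration rather than the patch itself: bind_source
is a hypothetical helper, and the rule that an *_address and its
*_interface counterpart may not both be set is an assumption made for the
sketch.

```python
def bind_source(config, prefix):
    """Decide how to bind for a given prefix ("listen" or "rpc").

    Returns ("interface", name) or ("address", value). The address value
    may be None, meaning neither option was configured.
    """
    address = config.get(f"{prefix}_address")
    interface = config.get(f"{prefix}_interface")
    if address and interface:
        # Assumed rule: the two options are alternatives, not a pair.
        raise ValueError(
            f"set {prefix}_address or {prefix}_interface, not both")
    if interface:
        return ("interface", interface)
    return ("address", address)
```

For example, a parsed cassandra.yaml containing listen_interface: eth0 would
yield ("interface", "eth0"), which deployment automation can template
without knowing each node's IP address in advance.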





[jira] [Updated] (CASSANDRA-7417) Allow network configuration on interfaces instead of addresses

2014-06-18 Thread Matt Kennedy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Kennedy updated CASSANDRA-7417:


Description: 
This patch adds two config elements to cassandra.yaml: listen_interface and 
rpc_interface. 

These can be used instead of their _address counterparts to configure bind the 
addresses Cassandra listens on. This capability can drastically simplify some 
deployment scenarios, especially in clouds which sometimes have quirky 
automation capabilities.

  was:
This patch adds two config elements to cassandra.yaml: listen_interface and 
rpc_interface. 

These can be used instead of their *_address counterparts to configure bind the 
addresses C* listens on. This capability can drastically simplify some 
deployment scenarios, especially in clouds which sometimes have quirky 
automation capabilities.


 Allow network configuration on interfaces instead of addresses
 --

 Key: CASSANDRA-7417
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7417
 Project: Cassandra
  Issue Type: Improvement
  Components: Config, Core
Reporter: Matt Kennedy
Priority: Minor
 Attachments: trunk-7417.txt


 This patch adds two config elements to cassandra.yaml: listen_interface and 
 rpc_interface. 
 These can be used instead of their _address counterparts to configure bind 
 the addresses Cassandra listens on. This capability can drastically simplify 
 some deployment scenarios, especially in clouds which sometimes have quirky 
 automation capabilities.





[jira] [Created] (CASSANDRA-7417) Allow network configuration on interfaces instead of addresses

2014-06-18 Thread Matt Kennedy (JIRA)
Matt Kennedy created CASSANDRA-7417:
---

 Summary: Allow network configuration on interfaces instead of 
addresses
 Key: CASSANDRA-7417
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7417
 Project: Cassandra
  Issue Type: Improvement
  Components: Config, Core
Reporter: Matt Kennedy
Priority: Minor
 Attachments: trunk-7417.txt

This patch adds two config elements to cassandra.yaml: listen_interface and 
rpc_interface. 

These can be used instead of their *_address counterparts to configure bind the 
addresses C* listens on. This capability can drastically simplify some 
deployment scenarios, especially in clouds which sometimes have quirky 
automation capabilities.





[jira] [Updated] (CASSANDRA-7417) Allow network configuration on interfaces instead of addresses

2014-06-18 Thread Matt Kennedy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Kennedy updated CASSANDRA-7417:


Description: 
This patch adds two config elements to cassandra.yaml: listen_interface and 
rpc_interface. 

These can be used instead of their _address counterparts to bind the addresses 
Cassandra listens on. This capability can drastically simplify some deployment 
scenarios, especially in clouds which sometimes have quirky automation 
capabilities.

  was:
This patch adds two config elements to cassandra.yaml: listen_interface and 
rpc_interface. 

These can be used instead of their _address counterparts to configure bind the 
addresses Cassandra listens on. This capability can drastically simplify some 
deployment scenarios, especially in clouds which sometimes have quirky 
automation capabilities.


 Allow network configuration on interfaces instead of addresses
 --

 Key: CASSANDRA-7417
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7417
 Project: Cassandra
  Issue Type: Improvement
  Components: Config, Core
Reporter: Matt Kennedy
Priority: Minor
 Attachments: trunk-7417.txt


 This patch adds two config elements to cassandra.yaml: listen_interface and 
 rpc_interface. 
 These can be used instead of their _address counterparts to bind the 
 addresses Cassandra listens on. This capability can drastically simplify some 
 deployment scenarios, especially in clouds which sometimes have quirky 
 automation capabilities.





[jira] [Commented] (CASSANDRA-7306) Support edge dcs with more flexible gossip

2014-05-27 Thread Matt Kennedy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010231#comment-14010231
 ] 

Matt Kennedy commented on CASSANDRA-7306:
-

It should be noted that we can do some of this today by defining keyspaces that 
only have # of replicas > 0 in some data centers. But gossip still needs to 
function over all the nodes.

Hub & Spoke functionality is useful in situations where the spokes are 
geographically dispersed, potentially in areas with less than ideal network 
connections. Local clients should be able to read/write locally relevant data 
on small scale clusters and make progress even when completely disconnected 
from the mothership without having to worry about replicating back a lot of 
data from unrelated DCs, or having to be networked to DCs halfway across the 
planet just to gossip between nodes that are otherwise unrelated.

 Support edge dcs with more flexible gossip
 

 Key: CASSANDRA-7306
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7306
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Tupshin Harper
  Labels: ponies

 As Cassandra clusters get bigger and bigger, and their topology becomes more 
 complex, there is more and more need for a notion of hub and spoke 
 datacenters.
 One of the big obstacles to supporting hundreds (or thousands) of remote dcs, 
 is the assumption that all dcs need to talk to each other (and be connected 
 all the time).
 This ticket is a vague placeholder with the goals of achieving:
 1) better behavioral support for occasionally disconnected datacenters
 2) explicit support for custom dc to dc routing. A simple approach would be 
 an optional per-dc annotation of which other DCs that DC could gossip with.
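
The per-DC annotation mentioned in goal 2 can be sketched as an allow-list.
The ticket describes itself as a vague placeholder, so this is only one
possible reading: the GOSSIP_PEERS table, the DC names, and the may_gossip
function are hypothetical.

```python
# Hypothetical annotation: for each DC, the other DCs it may gossip with.
GOSSIP_PEERS = {
    "hub": {"spoke_a", "spoke_b"},
    "spoke_a": {"hub"},
    "spoke_b": {"hub"},
}


def may_gossip(local_dc, remote_dc, peers=GOSSIP_PEERS):
    """Return True if nodes in local_dc may gossip with nodes in remote_dc."""
    if local_dc == remote_dc:
        return True  # nodes always gossip within their own DC
    return remote_dc in peers.get(local_dc, set())
```

With this table, each spoke exchanges gossip only with the hub, so two
spokes never need a network path to each other.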





[jira] [Created] (CASSANDRA-2853) cassandra-cli has backwards index status message

2011-07-03 Thread Matt Kennedy (JIRA)
cassandra-cli has backwards index status message


 Key: CASSANDRA-2853
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2853
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Matt Kennedy
Priority: Trivial


When a secondary index is building, the total bytes and processed bytes are 
swapped in the message.  Example:
Currently building index cf1, completed 12052040551 of 18047343 bytes.

The problem is a call to CompactionInfo constructor with swapped parameters.  
Patch to follow.
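
The class of bug described here can be illustrated with a small sketch.
BuildProgress is a hypothetical stand-in, not Cassandra's CompactionInfo, and
the byte counts are the ones from the example message above.

```python
from dataclasses import dataclass


@dataclass
class BuildProgress:
    """Hypothetical progress record with two same-typed numeric fields."""
    name: str
    completed_bytes: int
    total_bytes: int

    def message(self) -> str:
        return (f"Currently building index {self.name}, completed "
                f"{self.completed_bytes} of {self.total_bytes} bytes.")


# Positional arguments passed in the wrong order reproduce the report:
# the total lands in completed_bytes and vice versa.
swapped = BuildProgress("cf1", 12052040551, 18047343)

# Keyword arguments make the intended order explicit.
fixed = BuildProgress("cf1", completed_bytes=18047343,
                      total_bytes=12052040551)
```

Because both parameters have the same type, the compiler or interpreter
cannot catch the swap; only the nonsensical "completed more than total"
output reveals it.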

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2853) cassandra-cli has backwards index status message

2011-07-03 Thread Matt Kennedy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Kennedy updated CASSANDRA-2853:


Affects Version/s: 0.8.1

 cassandra-cli has backwards index status message
 

 Key: CASSANDRA-2853
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2853
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.1
Reporter: Matt Kennedy
Priority: Trivial
 Attachments: fix_idx_msg.patch


 When a secondary index is building, the total bytes and processed bytes are 
 swapped in the message.  Example:
 Currently building index cf1, completed 12052040551 of 18047343 bytes.
 The problem is a call to CompactionInfo constructor with swapped parameters.  
 Patch to follow.





[jira] [Updated] (CASSANDRA-2853) cassandra-cli has backwards index status message

2011-07-03 Thread Matt Kennedy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Kennedy updated CASSANDRA-2853:


Attachment: fix_idx_msg.patch

 cassandra-cli has backwards index status message
 

 Key: CASSANDRA-2853
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2853
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.1
Reporter: Matt Kennedy
Priority: Trivial
 Attachments: fix_idx_msg.patch


 When a secondary index is building, the total bytes and processed bytes are 
 swapped in the message.  Example:
 Currently building index cf1, completed 12052040551 of 18047343 bytes.
 The problem is a call to the CompactionInfo constructor with swapped parameters.  
 Patch to follow.





[jira] Commented: (CASSANDRA-2276) Pig memory issues with default LIMIT and large rows.

2011-03-07 Thread Matt Kennedy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13003657#comment-13003657
 ] 

Matt Kennedy commented on CASSANDRA-2276:
-

D'oh! I wrote it against a checkout of the 0.7.3 tag instead of trunk.  I'll 
port the changes to trunk tonight.  Sorry for the confusion.

 Pig memory issues with default LIMIT and large rows.
 

 Key: CASSANDRA-2276
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2276
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Affects Versions: 0.7.0
Reporter: Matt Kennedy
Priority: Trivial
  Labels: hadoop, pig
 Fix For: 0.7.4

 Attachments: cassandrastorage.diff, cassandrastorage_2.diff

   Original Estimate: 1h
  Remaining Estimate: 1h

 Rows with a lot of columns, especially super-columns with a lot of values, can 
 cause OutOfMemory errors in Cassandra when queried with Pig.



[jira] Updated: (CASSANDRA-2276) Pig memory issues with default LIMIT and large rows.

2011-03-07 Thread Matt Kennedy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Kennedy updated CASSANDRA-2276:


Attachment: cassandrastorage3.diff

OK, third time's the charm: I coded this one against trunk and just successfully 
applied it to a fresh checkout.

 Pig memory issues with default LIMIT and large rows.
 

 Key: CASSANDRA-2276
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2276
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Affects Versions: 0.7.0
Reporter: Matt Kennedy
Priority: Trivial
  Labels: hadoop, pig
 Fix For: 0.7.4

 Attachments: cassandrastorage.diff, cassandrastorage3.diff, 
 cassandrastorage_2.diff

   Original Estimate: 1h
  Remaining Estimate: 1h

 Rows with a lot of columns, especially super-columns with a lot of values, can 
 cause OutOfMemory errors in Cassandra when queried with Pig.



[jira] Commented: (CASSANDRA-2276) Pig memory issues with default LIMIT and large rows.

2011-03-05 Thread Matt Kennedy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13002972#comment-13002972
 ] 

Matt Kennedy commented on CASSANDRA-2276:
-

Only for the purpose of counting the super columns; the subcolumns are not 
accessed.

 Pig memory issues with default LIMIT and large rows.
 

 Key: CASSANDRA-2276
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2276
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Affects Versions: 0.7.0
Reporter: Matt Kennedy
Priority: Trivial
  Labels: hadoop, pig
 Fix For: 0.7.4

 Attachments: cassandrastorage.diff, cassandrastorage_2.diff

   Original Estimate: 1h
  Remaining Estimate: 1h

 Rows with a lot of columns, especially super-columns with a lot of values, can 
 cause OutOfMemory errors in Cassandra when queried with Pig.





[jira] Created: (CASSANDRA-2276) Pig memory issues with default LIMIT and large rows.

2011-03-04 Thread Matt Kennedy (JIRA)
Pig memory issues with default LIMIT and large rows.


 Key: CASSANDRA-2276
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2276
 Project: Cassandra
  Issue Type: Improvement
  Components: Contrib
Affects Versions: 0.7.3
Reporter: Matt Kennedy
Priority: Trivial


Rows with a lot of columns, especially super-columns with a lot of values, can 
cause OutOfMemory errors in Cassandra when queried with Pig.





[jira] Updated: (CASSANDRA-2276) Pig memory issues with default LIMIT and large rows.

2011-03-04 Thread Matt Kennedy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Kennedy updated CASSANDRA-2276:


Attachment: cassandrastorage.diff

 Pig memory issues with default LIMIT and large rows.
 

 Key: CASSANDRA-2276
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2276
 Project: Cassandra
  Issue Type: Improvement
  Components: Contrib
Affects Versions: 0.7.3
Reporter: Matt Kennedy
Priority: Trivial
  Labels: hadoop, pig
 Attachments: cassandrastorage.diff

   Original Estimate: 1h
  Remaining Estimate: 1h

 Rows with a lot of columns, especially super-columns with a lot of values, can 
 cause OutOfMemory errors in Cassandra when queried with Pig.





[jira] Commented: (CASSANDRA-2245) Enable map reduce to use indexes for ColumnFamilyInputFormat

2011-03-04 Thread Matt Kennedy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13002930#comment-13002930
 ] 

Matt Kennedy commented on CASSANDRA-2245:
-

I've taken a crack at coding this up, but I'm not thrilled with the results. I 
agree with Brandon that CASSANDRA-1600 is the best way to deal with this issue. 
The get_indexed_slices method doesn't offer a key_range parameter, which is what 
would make it useful for a MapReduce job.  I'm reviewing that discussion at the 
moment to see if there is a way to get a patch for something like this 
functionality out prior to 0.8 without breaking the Thrift API.
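
To make the key-range point concrete, here is a toy Java sketch, not Thrift code. A sorted in-memory map stands in for a column family, and indexedSliceInRange is a hypothetical method showing what an index lookup restricted to one input split's key range would return. Real clusters order rows by token rather than by raw key, so treat the string ranges as a simplification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: the class and method names are hypothetical and are not
// part of the Cassandra Thrift API.
final class IndexedSplitSketch {
    /**
     * Returns the row keys in [startKey, endKey) whose indexed column equals
     * the given value. Each MapReduce split would call this with its own range
     * so that no matching row is handed to more than one mapper.
     */
    static List<String> indexedSliceInRange(TreeMap<String, String> indexedValueByKey,
                                            String startKey, String endKey, String value) {
        List<String> matches = new ArrayList<>();
        // subMap is inclusive of startKey and exclusive of endKey.
        for (Map.Entry<String, String> row : indexedValueByKey.subMap(startKey, endKey).entrySet()) {
            if (value.equals(row.getValue())) {
                matches.add(row.getKey());
            }
        }
        return matches;
    }
}
```

Without the range restriction, every split would receive the full set of index matches, which is why an index query alone is not enough for ColumnFamilyInputFormat.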

 Enable map reduce to use indexes for ColumnFamilyInputFormat
 

 Key: CASSANDRA-2245
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2245
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Affects Versions: 0.7.2
 Environment: Cassandra 0.7 or later and Hadoop 0.20.1 or later
Reporter: Matt Kennedy
Priority: Minor
  Labels: hadoop
 Fix For: 0.8

   Original Estimate: 72h
  Remaining Estimate: 72h

 Enable the ability to run a MapReduce job that takes a value in an indexed 
 column as a parameter, and use that to select the data that the MapReduce job 
 operates on.  Right now, it looks like this isn't possible because 
 org.apache.cassandra.hadoop.ColumnFamilyRecordReader will only fetch data 
 with get_range_slices, not get_indexed_slices.





[jira] Commented: (CASSANDRA-2276) Pig memory issues with default LIMIT and large rows.

2011-03-04 Thread Matt Kennedy (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13002936#comment-13002936
 ] 

Matt Kennedy commented on CASSANDRA-2276:
-

Yeah, fair point.  It isn't really useful; I was just letting Eclipse write 
code for me.

 Pig memory issues with default LIMIT and large rows.
 

 Key: CASSANDRA-2276
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2276
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Affects Versions: 0.7.0
Reporter: Matt Kennedy
Priority: Trivial
  Labels: hadoop, pig
 Fix For: 0.7.4

 Attachments: cassandrastorage.diff

   Original Estimate: 1h
  Remaining Estimate: 1h

 Rows with a lot of columns, especially super-columns with a lot of values, can 
 cause OutOfMemory errors in Cassandra when queried with Pig.





[jira] Updated: (CASSANDRA-2276) Pig memory issues with default LIMIT and large rows.

2011-03-04 Thread Matt Kennedy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Kennedy updated CASSANDRA-2276:


Attachment: cassandrastorage_2.diff

New patch reflecting Jonathan Ellis' comment.

 Pig memory issues with default LIMIT and large rows.
 

 Key: CASSANDRA-2276
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2276
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Affects Versions: 0.7.0
Reporter: Matt Kennedy
Priority: Trivial
  Labels: hadoop, pig
 Fix For: 0.7.4

 Attachments: cassandrastorage.diff, cassandrastorage_2.diff

   Original Estimate: 1h
  Remaining Estimate: 1h

 Rows with a lot of columns, especially super-columns with a lot of values, can 
 cause OutOfMemory errors in Cassandra when queried with Pig.





[jira] Updated: (CASSANDRA-2276) Pig memory issues with default LIMIT and large rows.

2011-03-04 Thread Matt Kennedy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Kennedy updated CASSANDRA-2276:


Attachment: (was: cassandrastorage_2.diff)

 Pig memory issues with default LIMIT and large rows.
 

 Key: CASSANDRA-2276
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2276
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Affects Versions: 0.7.0
Reporter: Matt Kennedy
Priority: Trivial
  Labels: hadoop, pig
 Fix For: 0.7.4

 Attachments: cassandrastorage.diff

   Original Estimate: 1h
  Remaining Estimate: 1h

 Rows with a lot of columns, especially super-columns with a lot of values, can 
 cause OutOfMemory errors in Cassandra when queried with Pig.





[jira] Updated: (CASSANDRA-2276) Pig memory issues with default LIMIT and large rows.

2011-03-04 Thread Matt Kennedy (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Kennedy updated CASSANDRA-2276:


Attachment: cassandrastorage_2.diff

Corrected patch for final limit.

 Pig memory issues with default LIMIT and large rows.
 

 Key: CASSANDRA-2276
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2276
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Affects Versions: 0.7.0
Reporter: Matt Kennedy
Priority: Trivial
  Labels: hadoop, pig
 Fix For: 0.7.4

 Attachments: cassandrastorage.diff, cassandrastorage_2.diff

   Original Estimate: 1h
  Remaining Estimate: 1h

 Rows with a lot of columns, especially super-columns with a lot of values, can 
 cause OutOfMemory errors in Cassandra when queried with Pig.





[jira] Created: (CASSANDRA-2245) Enable map reduce to use indexes for ColumnFamilyInputFormat

2011-02-24 Thread Matt Kennedy (JIRA)
Enable map reduce to use indexes for ColumnFamilyInputFormat


 Key: CASSANDRA-2245
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2245
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Affects Versions: 0.7.2
 Environment: Cassandra 0.7 or later and Hadoop 0.20.1 or later
Reporter: Matt Kennedy
Priority: Minor
 Fix For: 0.8


Enable the ability to run a MapReduce job that takes a value in an indexed 
column as a parameter, and use that to select the data that the MapReduce job 
operates on.  Right now, it looks like this isn't possible because 
org.apache.cassandra.hadoop.ColumnFamilyRecordReader will only fetch data with 
get_range_slices, not get_indexed_slices.





[jira] Created: (CASSANDRA-2246) Enable Pig to use indexed data as described in CASSANDRA-2245

2011-02-24 Thread Matt Kennedy (JIRA)
Enable Pig to use indexed data as described in CASSANDRA-2245
-

 Key: CASSANDRA-2246
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2246
 Project: Cassandra
  Issue Type: Improvement
  Components: Contrib
Affects Versions: 0.7.2
Reporter: Matt Kennedy
Priority: Minor
 Fix For: 0.8


In contrib/pig, add query parameters to the CassandraStorage keyspace/column 
family string to specify column search predicates.

For example:
rows = LOAD 'cassandra://mykeyspace/mycolumnfamily?country=UK' using 
CassandraStorage();

This depends on CASSANDRA-2245
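
As a rough sketch of the parsing side, assuming the location string keeps the cassandra://keyspace/columnfamily?column=value shape from the example, the predicates could be pulled out with java.net.URI. CassandraLocation is a hypothetical helper written for this ticket, not an existing class in contrib/pig, and it does no URL-decoding or multi-valued handling.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of contrib/pig.
final class CassandraLocation {
    final String keyspace;
    final String columnFamily;
    final Map<String, String> predicates;

    private CassandraLocation(String keyspace, String columnFamily, Map<String, String> predicates) {
        this.keyspace = keyspace;
        this.columnFamily = columnFamily;
        this.predicates = predicates;
    }

    static CassandraLocation parse(String location) {
        // URI.create throws IllegalArgumentException on malformed input.
        URI uri = URI.create(location);
        if (!"cassandra".equals(uri.getScheme())) {
            throw new IllegalArgumentException("Expected cassandra:// scheme: " + location);
        }
        String keyspace = uri.getAuthority();   // e.g. "mykeyspace"
        String path = uri.getPath();            // e.g. "/mycolumnfamily"
        if (keyspace == null || path == null || path.length() < 2) {
            throw new IllegalArgumentException("Expected cassandra://keyspace/columnfamily: " + location);
        }
        // Each column=value pair in the query string becomes one equality predicate.
        Map<String, String> predicates = new LinkedHashMap<>();
        String query = uri.getQuery();
        if (query != null && !query.isEmpty()) {
            for (String pair : query.split("&")) {
                int eq = pair.indexOf('=');
                if (eq <= 0) {
                    throw new IllegalArgumentException("Expected column=value, got: " + pair);
                }
                predicates.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return new CassandraLocation(keyspace, path.substring(1), predicates);
    }
}
```

A location with no query string yields an empty predicate map, so existing LOAD statements would keep their current behavior under this scheme.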
