Re: Effective allocation of multiple disks

2010-03-12 Thread Ted Zlatanov
On Thu, 11 Mar 2010 12:01:27 -0600 Eric Evans eev...@rackspace.com wrote: EE On Wed, 2010-03-10 at 23:20 -0600, Jonathan Ellis wrote: On Wed, Mar 10, 2010 at 9:31 PM, Anthony Molinaro antho...@alumni.caltech.edu wrote: I would almost recommend just keeping things simple and removing

why have ColumnFamilies?

2010-03-03 Thread Ted Zlatanov
I don't understand the advantages of ColumnFamilies over a SuperColumnFamily with just one supercolumn. Why have the former if the latter is functionally equivalent? Thanks Ted

Re: finding Cassandra servers

2010-03-03 Thread Ted Zlatanov
On Wed, 3 Mar 2010 08:41:18 -0600 Gary Dusbabek gdusba...@gmail.com wrote: GD It wouldn't be a lot of work for you to write a mdns service that would GD query the seeds for endpoints and publish it to interested clients. GD It could go in contrib. This requires knowledge of the seeds so I need to

Re: finding Cassandra servers

2010-03-03 Thread Ted Zlatanov
On Wed, 3 Mar 2010 09:32:33 -0600 Gary Dusbabek gdusba...@gmail.com wrote: GD 2010/3/3 Ted Zlatanov t...@lifelogs.com: This requires knowledge of the seeds so I need to at least look in storage-conf.xml to find them.  Are you saying there's no chance of Cassandra nodes (or just seeds

Re: finding Cassandra servers

2010-03-03 Thread Ted Zlatanov
On Wed, 03 Mar 2010 10:43:19 -0600 Eric Evans eev...@rackspace.com wrote: EE It's entirely possible that you've identified a problem that others EE can't see, or haven't yet encountered. I don't see it, but then maybe EE I'm just thick. Getting back to my original question, how do you (and

Re: finding Cassandra servers

2010-03-03 Thread Ted Zlatanov
On Wed, 3 Mar 2010 09:04:37 -0800 Ryan King r...@twitter.com wrote: RK Something like RRDNS is no more complex than managing a list of seed nodes. How do your clients at Twitter find server nodes? Do you just run them local to each node? My concern is that both RRDNS and seed node lists are

Re: finding Cassandra servers

2010-03-03 Thread Ted Zlatanov
On Wed, 3 Mar 2010 12:08:06 -0500 Ian Holsman i...@holsman.net wrote: IH We could create a branch or git fork where you guys could develop it, IH and if it reaches a usable state and others find it interesting it IH could get integrated in then Thanks, Ian. Would it be OK to do it as a patch

Re: finding Cassandra servers

2010-03-03 Thread Ted Zlatanov
randomly pick a set of seeds. CG Seeds can be per datacenter as well. As soon as a machine is CG decommissioned, it no longer gets picked as seed. On Wed, 3 Mar 2010 11:20:07 -0600 Brandon Williams dri...@gmail.com wrote: BW 2010/3/3 Ted Zlatanov t...@lifelogs.com My concern is that both RRDNS

Re: finding Cassandra servers

2010-03-03 Thread Ted Zlatanov
On Wed, 3 Mar 2010 09:35:31 -0800 Ryan King r...@twitter.com wrote: With seed node lists, if I get unlucky I'd be trying to hit a downed node in which case I may as well just use RRDNS and deal with connection failure from the start. RK Why would you not deal with connection failure? I mean
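
The point under discussion here (a client may hit a downed node whether it uses a seed list or RRDNS, so it has to handle connection failure either way) can be illustrated with a minimal sketch. This is not code from the thread or from any Cassandra client library. The function name `connect_to_any`, the `connect` parameter, and the two-second timeout are assumptions made for illustration only.

```python
import random
import socket


def connect_to_any(nodes, connect=socket.create_connection, timeout=2.0, rng=random):
    """Try candidate (host, port) pairs in random order; return the first connection.

    `nodes` may come from a static seed list or from a round-robin DNS lookup;
    the failover logic is the same in both cases (hypothetical helper).
    """
    candidates = list(nodes)
    rng.shuffle(candidates)  # spread load so clients do not all hit the same node
    errors = []
    for host, port in candidates:
        try:
            return connect((host, port), timeout=timeout)
        except OSError as exc:  # refused, timed out, unreachable, ...
            errors.append((host, port, exc))
    raise ConnectionError(
        f"no reachable node among {len(candidates)} candidates: {errors}"
    )
```

A caller would pass whatever node list it has, for example `connect_to_any([("10.0.0.1", 9160), ("10.0.0.2", 9160)])`, where the addresses are placeholders and 9160 is the Thrift port mentioned elsewhere in this archive.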

Re: cassandra freezes

2010-02-25 Thread Ted Zlatanov
On Thu, 25 Feb 2010 08:56:25 -0600 Jonathan Ellis jbel...@gmail.com wrote: JE Are you swapping? JE http://spyced.blogspot.com/2010/01/linux-performance-basics.html JE otherwise there's something wrong w/ your vm (?), disk i/o doesn't JE block incoming writes in cassandra If the user has enough

Re: Row with many columns

2010-02-18 Thread Ted Zlatanov
On Thu, 18 Feb 2010 00:44:17 +0300 ruslan usifov ruslan.usi...@gmail.com wrote: ru I have Table where 10 rows have 10 columns about 200 bytes in each ru column, So if read only this 10 records only nodes that have this rows does ru work, another nodes are idle. This bad, and cassandra

Re: easy interface to Cassandra

2010-02-02 Thread Ted Zlatanov
On Tue, 19 Jan 2010 08:09:13 -0600 Ted Zlatanov t...@lifelogs.com wrote: TZ My proposal is as follows: TZ - provide an IPluggableAPI interface; classes that implement it are TZ essentially standalone Cassandra servers. Maybe this can just TZ parallel Thread and implement Runnable. TZ

Re: How to unit test my code calling Cassandra with Thift

2010-01-25 Thread Ted Zlatanov
On Sun, 24 Jan 2010 13:56:07 +0200 Ran Tavory ran...@gmail.com wrote: RT On Sun, Jan 24, 2010 at 1:16 PM, gabriele renzi rff@gmail.com wrote: On Sun, Jan 24, 2010 at 11:02 AM, Ran Tavory ran...@gmail.com wrote: Here's the code I've just written over the weekend and started using in

Re: Too many open files

2010-01-22 Thread Ted Zlatanov
On Fri, 22 Jan 2010 11:27:09 +0100 Dr. Martin Grabmüller martin.grabmuel...@eleven.de wrote: MG The obvious call to lsof did not give me any insight (with 2271 being MG my Cassandra instance's pid): MG (env)cassan...@archive1:~$ lsof -p 2271|wc -l MG 101 MG Maybe the file limit is

Re: 'large' node configuration question

2010-01-21 Thread Ted Zlatanov
On Wed, 20 Jan 2010 21:14:27 -0600 Jonathan Ellis jbel...@gmail.com wrote: JE On Wed, Jan 20, 2010 at 9:10 PM, Phillip Michalak JE phil.micha...@digitalreasoning.com wrote: Does anyone have a recommendation for configuring cassandra on a cluster with 'large' nodes? i.e. multiple nodes, each

Re: 'large' node configuration question

2010-01-21 Thread Ted Zlatanov
On Thu, 21 Jan 2010 11:04:58 -0600 Jonathan Ellis jbel...@gmail.com wrote: JE 2010/1/21 Ted Zlatanov t...@lifelogs.com: Based on that, it seems like a good idea to enable the parallel or concurrent garbage collectors with large heaps.  We're looking at this at our site as well so I'm curious

Re: 'large' node configuration question

2010-01-21 Thread Ted Zlatanov
On Thu, 21 Jan 2010 14:47:41 -0600 Brandon Williams dri...@gmail.com wrote: BW 2010/1/21 Ted Zlatanov t...@lifelogs.com Also, maybe these options: -ea \ -Xdebug \ -XX:+HeapDumpOnOutOfMemoryError \ -Xrunjdwp:transport=dt_socket,server=y,address=,suspend=n \ should go in a debugging

Re: easy interface to Cassandra

2010-01-15 Thread Ted Zlatanov
On Thu, 14 Jan 2010 14:34:58 -0800 Tatu Saloranta tsalora...@gmail.com wrote: TS No specific proposal, or immediate need. But I do know that such TS short-hand notations / languages are popular for accessing structured TS data (xpath/xquery, oql, even sql). Sure. The idea is to make Cassandra

Re: easy interface to Cassandra

2010-01-14 Thread Ted Zlatanov
On Wed, 13 Jan 2010 13:22:02 -0800 Tatu Saloranta tsalora...@gmail.com wrote: TS I think there are 2 separate questions: TS (a) Would a path language make sense, and TS (b) How would that be exposed TS So I think more developers would be opposed to part (b) of exposing TS path queries using

Re: easy interface to Cassandra

2010-01-13 Thread Ted Zlatanov
On Wed, 13 Jan 2010 08:05:45 +1300 Michael Koziarski mich...@koziarski.com wrote: I see no value in pushing for ports of a Perl library to other languages instead of allowing each to grow its own idiomatic one. MK That's definitely the way to go, the Easy.pm magic strings look a MK little

Re: easy interface to Cassandra

2010-01-12 Thread Ted Zlatanov
On Sun, 10 Jan 2010 11:16:20 + Mark Robson mar...@gmail.com wrote: MR I can't see any reason to make an easy Cassandra interface, as the Thrift MR interface isn't really very difficult. Compare this (this is what the easy interface would look like in Java, wrapped in try/catch of course):

easy interface to Cassandra (was: EasyCassandra.pm Perl interface alpha 0.01)

2010-01-09 Thread Ted Zlatanov
I was wondering if it would make sense to add the pseudo-language EasyCassandra.pm uses right into Cassandra and expose it over Thrift. Here's a summary of the requests supported by this language: # read and remove requests: # X/[Y][A,B]: supercolumn family X, super column Y (not a

latest auth patch available in https://issues.apache.org/jira/browse/CASSANDRA-547

2009-12-29 Thread Ted Zlatanov
This patch adds auth support as previously discussed, while also patching Cassandra to support Thrift's new constructors. Only the AllowAll backend is currently provided. Note a newer libthrift, also uploaded there, is required. I have not tested backwards compatibility of old Thrift clients

EasyCassandra.pm Perl interface alpha 0.01

2009-12-28 Thread Ted Zlatanov
Attached is the first alpha (0.01) version of my EasyCassandra.pm Perl interface to Cassandra. I am also attaching a demo script that will show the intended usage, but basically the idea is that the user can express gets, puts, and removals in shorthand like Subscribed/-1[] to mean the latest

Re: Partition data - advantage and disadvantage

2009-12-28 Thread Ted Zlatanov
On Mon, 28 Dec 2009 08:07:18 -0700 Joe Stump j...@joestump.net wrote: JS The advantage of the random partitioner is that it randomly JS distributes your keys across the cluster. This (theoretically) JS avoids key clustering on nodes. The big disadvantage is that you JS can't do key range

Re: Partition data - advantage and disadvantage

2009-12-28 Thread Ted Zlatanov
On Mon, 28 Dec 2009 09:53:56 -0700 Joe Stump j...@joestump.net wrote: JS On Dec 28, 2009, at 9:51 AM, Ted Zlatanov wrote: If each node does a key enumeration, can the results be aggregated somehow? It seems useful to get a list of all the keys across the cluster even if it's not 100

Re: Partition data - advantage and disadvantage

2009-12-28 Thread Ted Zlatanov
On Mon, 28 Dec 2009 11:02:30 -0700 Joe Stump j...@joestump.net wrote: JS On Dec 28, 2009, at 11:00 AM, Ted Zlatanov wrote: Is this worth a JIRA feature request? Or is it something Cassandra will never support fully? From the user's perspective it's very useful. JS I don't know why it'd

Re: Partition data - advantage and disadvantage

2009-12-28 Thread Ted Zlatanov
On Mon, 28 Dec 2009 11:44:27 -0700 Joe Stump j...@joestump.net wrote: JS On Dec 28, 2009, at 11:40 AM, Ted Zlatanov wrote: I can see that's a problem. In my case, row keys represent switches in production so I don't expect more than a few hundred. An application can't find out how many

Re: Cassandra access control

2009-12-02 Thread Ted Zlatanov
On Tue, 01 Dec 2009 16:58:50 -0600 Eric Evans eev...@rackspace.com wrote: EE On Tue, 2009-12-01 at 15:38 -0600, Ted Zlatanov wrote: I disagree, why would you want to forbid switching the keyspace? That's turning off a currently working feature. Also, connections are not free, especially

Re: Cassandra access control

2009-12-02 Thread Ted Zlatanov
On Wed, 2 Dec 2009 20:54:13 + Mark Robson mar...@gmail.com wrote: MR How about we make authentication optional, and have the protocol being MR stateful only if you want to authenticate? MR That way we don't break backwards compatibility or introduce extra MR complexity for people who don't

Re: Cassandra access control

2009-12-02 Thread Ted Zlatanov
On Wed, 02 Dec 2009 14:35:09 -0600 Eric Evans eev...@rackspace.com wrote: EE On Wed, 2009-12-02 at 14:27 -0600, Ted Zlatanov wrote: On Wed, 02 Dec 2009 14:14:53 -0600 Eric Evans eev...@rackspace.com wrote: EE Did you maybe mean...? AuthenticationRequest required for the EE method (has

Re: Cassandra access control

2009-12-02 Thread Ted Zlatanov
On Wed, 2 Dec 2009 15:23:23 -0600 Jonathan Ellis jbel...@gmail.com wrote: JE It's really premature to be holding a vote based on JE first-impression opinions. Somehow we have to make a decision on whether the API will be stateful or stateless. This affects more than just the auth code so I

Re: Cassandra access control

2009-12-02 Thread Ted Zlatanov
On Wed, 2 Dec 2009 15:32:35 -0600 Jonathan Ellis jbel...@gmail.com wrote: JE 2009/12/2 Ted Zlatanov t...@lifelogs.com: I'd still rather pass something back.  As I said, it allows backends to maintain state when it makes sense to do so and can alleviate the problem of redundant auth queries

Re: Cassandra access control

2009-12-01 Thread Ted Zlatanov
On Tue, 01 Dec 2009 14:23:47 -0600 Eric Evans eev...@rackspace.com wrote: EE I'm personally not a big fan of the setKeyspace()/getKeyspace() idea. EE Getting rid of the keyspace argument makes sense because the keyspace is EE the highest (or lowest), level of the data-model so its implicit that

Re: Cassandra users survey

2009-11-24 Thread Ted Zlatanov
On Mon, 23 Nov 2009 23:30:51 -0500 Matt Revelle mreve...@gmail.com wrote: MR Are you both using timestamps as row keys? Would be great to hear MR more details. I'm using super column keys in a super column. So let's say your resource is routerA. Your data will be: Row routerA SuperColumn

Re: cassandra over hbase

2009-11-24 Thread Ted Zlatanov
On Mon, 23 Nov 2009 11:58:08 -0800 Jun Rao jun...@almaden.ibm.com wrote: JR After chatting with some Facebook guys, we realized that one potential JR benefit from using HDFS is that the recovery from losing partial data in a JR node is more efficient. Suppose that one lost a single disk at a

Re: Wish list [from users survey thread]

2009-11-24 Thread Ted Zlatanov
On Mon, 23 Nov 2009 13:45:09 -0600 Jonathan Ellis jbel...@gmail.com wrote: JE 1. Increment/decrement: atomic is a dirty word in a system JE emphasizing availability, but incr/decr can be provided in an JE eventually consistent manner with vector clocks. There are other JE possible approaches

Re: Cassandra access control

2009-11-24 Thread Ted Zlatanov
Looks like I could use: PAM auth: http://jpam.sourceforge.net/ LDAP/AD auth: http://www.openldap.org/jldap/ The first is definitely OK (Apache license), but I'm not sure about the second one (OpenLDAP public license). Looks BSDish to me. It claims to support Windows auth and is officially

Re: Cassandra users survey

2009-11-23 Thread Ted Zlatanov
On Fri, 20 Nov 2009 17:38:39 -0800 Dan Di Spaltro dan.dispal...@gmail.com wrote: DDS At Cloudkick we are using Cassandra to store monitoring statistics and DDS running analytics over the data. I would love to share some ideas DDS about how we set up our data-model, if anyone is interested.

Re: Cassandra access control

2009-11-23 Thread Ted Zlatanov
On Mon, 23 Nov 2009 12:22:37 -0600 Jonathan Ellis jbel...@gmail.com wrote: JE sysauth says it is GPL v2 (also not compatible) Hmm. I guess I have to reimplement SysAuth. At least the code is not terribly complicated, but it's a shame to reinvent the cart and the wheel. Ted

Re: Cassandra access control

2009-11-20 Thread Ted Zlatanov
On Thu, 12 Nov 2009 12:09:19 -0600 Ted Zlatanov t...@lifelogs.com wrote: TZ I created an issue: TZ https://issues.apache.org/jira/browse/CASSANDRA-547 TZ and will post updates there as needed. This is stage 1, meaning this is TZ the 0.5 work that will keep the old API. Stage 2 will remove

out of memory error on malformed Thrift protocol

2009-11-13 Thread Ted Zlatanov
The sequence to trigger the bug: 1) telnet to port 9160 2) type: s s (two letter+RET combinations) This reliably generates the error below. I did not debug further or try to fix it because it seems to be a Thrift issue. ERROR - Fatal exception in thread Thread[pool-1-thread-1,5,main]
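
The manual telnet steps above could be scripted roughly as follows. This is a sketch under assumptions, not something posted in the thread: the host `localhost`, the CRLF line ending (what telnet typically sends for RET), and the helper names `build_payload` and `send_malformed` are all illustrative. It only sends the bytes; whether the server then logs the error depends on the Cassandra and Thrift versions in use.

```python
import socket


def build_payload(letters=("s", "s"), newline="\r\n"):
    """Return the bytes a telnet session would send for letter+RET pairs."""
    return "".join(letter + newline for letter in letters).encode("ascii")


def send_malformed(host="localhost", port=9160, timeout=5.0):
    """Send the two letter+RET combinations to the Thrift port, then disconnect.

    Host and timeout are assumptions; 9160 is the port named in the report.
    Returns the number of bytes sent.
    """
    payload = build_payload()
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
    return len(payload)
```

Run `send_malformed()` only against a disposable test node, since the report says the sequence reliably triggers a fatal exception.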

Cassandra access control (was: bandwidth limiting Cassandra's replication and access control)

2009-11-12 Thread Ted Zlatanov
On Wed, 11 Nov 2009 16:14:09 -0800 Anthony Molinaro antho...@alumni.caltech.edu wrote: AM How will authentication work with non-java clients? I don't think thrift AM itself has authentication built in, and it sounds like a java library is AM being proposed for the guts. Will it still be

Re: bandwidth limiting Cassandra's replication and access control

2009-11-12 Thread Ted Zlatanov
On Thu, 12 Nov 2009 12:40:05 +1100 Ian Holsman i...@holsman.net wrote: IH most places i've seen don't use DB auth anywhere. there is a common IH login, stored in a property file, sometimes stored in a internally- IH world-readable SVN repo. In my current industry (financials) this is not

Re: bandwidth limiting Cassandra's replication and access control

2009-11-12 Thread Ted Zlatanov
On Wed, 11 Nov 2009 10:14:58 -0600 Eric Evans eev...@rackspace.com wrote: EE On Tue, 2009-11-10 at 16:25 -0600, Ted Zlatanov wrote: (BTW, I use Eclipse for Java development, is there a way to run the Ant tasks automatically to rebuild the generated source if necessary? It works fine otherwise

Cassandra access control (was: bandwidth limiting Cassandra's replication and access control)

2009-11-12 Thread Ted Zlatanov
On Wed, 11 Nov 2009 14:59:04 -0800 Coe, Robin robin@bluecoat.com wrote: CR Java's policy manager controls access to environment variables and CR code execution. All a JAAS service provides is a hook to pass a CR user's principal to the security manager. So, the only CR authorization you

Re: Cassandra access control

2009-11-12 Thread Ted Zlatanov
On Thu, 12 Nov 2009 10:23:21 -0600 Jonathan Mischo jmis...@quagility.com wrote: JM The problem I see with this is that you can't have a single connection JM accessing multiple keyspaces at once. I can think of some cases where JM having a single connection access and differentiate between two

Re: Cassandra access control

2009-11-12 Thread Ted Zlatanov
On Thu, 12 Nov 2009 10:49:59 -0600 Jonathan Ellis jbel...@gmail.com wrote: JE On Thu, Nov 12, 2009 at 10:42 AM, Jonathan Mischo jmis...@quagility.com wrote: Let's keep it simple.  Forcing multiple connections from a purely hypothetical use case is a no-brainer tradeoff.  Connections are not

Re: Cassandra access control

2009-11-12 Thread Ted Zlatanov
On Thu, 12 Nov 2009 10:59:52 -0600 Ted Zlatanov t...@lifelogs.com wrote: TZ On Thu, 12 Nov 2009 10:49:59 -0600 Jonathan Ellis jbel...@gmail.com wrote: JE On Thu, Nov 12, 2009 at 10:42 AM, Jonathan Mischo jmis...@quagility.com wrote: Let's keep it simple.  Forcing multiple connections from

Re: bandwidth limiting Cassandra's replication and access control

2009-11-11 Thread Ted Zlatanov
On Tue, 10 Nov 2009 17:09:44 -0600 Jonathan Ellis jbel...@gmail.com wrote: JE 2009/11/10 Ted Zlatanov t...@lifelogs.com: I see all the methods implementing the server interface in org.apache.cassandra.service.CassandraServer.  Is that where the authentication should happen?  Should I use JAAS

Re: [ANNOUNCE] Cassandra.gem 0.5

2009-11-11 Thread Ted Zlatanov
On Thu, 20 Aug 2009 14:37:55 -0700 Evan Weaver ewea...@gmail.com wrote: EW The Cassandra client gem 0.5 for Ruby is released! EW Highlights since the last ANNOUNCE: EW - gem name changed from cassandra_client to cassandra EW - bin/cassandra_helper script, to build and start the server for

Re: bandwidth limiting Cassandra's replication and access control

2009-11-11 Thread Ted Zlatanov
On Wed, 11 Nov 2009 07:40:00 -0800 Coe, Robin robin@bluecoat.com wrote: CR Just going to chime in here, because I have experience writing apps CR that use JAAS and JNDI to authenticate against LDAP and JDBC CR services. However, I only just started looking at Cassandra this CR week, so I'm

bandwidth limiting Cassandra's replication and access control

2009-11-10 Thread Ted Zlatanov
I'm evaluating Cassandra as a storage mechanism for monitoring data: machine and process status reports, inventory, etc. One of my concerns is bandwidth usage; I don't want Cassandra replication traffic swamping more important traffic. I want to know if there's a way to limit the bandwidth usage

Re: bandwidth limiting Cassandra's replication and access control

2009-11-10 Thread Ted Zlatanov
On Tue, 10 Nov 2009 13:53:58 -0600 Jonathan Ellis jbel...@gmail.com wrote: JE 2009/11/10 Ted Zlatanov t...@lifelogs.com: I also would really like a way to limit access to the Thrift interface with at least some rudimentary username/password combination.  I don't see a way to do that currently