Changing snitch from PropertyFile to Gossip

2016-04-24 Thread AJ
Is it possible to do this without down time i.e. run in mixed mode while doing a rolling upgrade?

Thrift row cache in Cassandra 2.1

2016-03-30 Thread AJ
clarify it, would be appreciated. Thanks, AJ

Re: Anyone using Facebook's flashcache?

2011-07-18 Thread AJ
). If your write rate is low it might work for you. Interesting. So, there is no segregation between read and write cache space? A compaction or flush can evict blocks in the read cache if it needs the space for write buffering? aj

Re: Anyone using Facebook's flashcache?

2011-07-18 Thread AJ
On 7/18/2011 12:08 PM, Héctor Izquierdo Seliva wrote: Interesting. So, there is no segregation between read and write cache space? A compaction or flush can evict blocks in the read cache if it needs the space for write buffering? There are two versions, the -wt (write through) that will

Re: Anyone using Facebook's flashcache?

2011-07-18 Thread AJ
performance boost worth the expense? Thanks, aj

Re: Anyone using Facebook's flashcache?

2011-07-17 Thread AJ
. Right now there's no way to avoid that. If you want, I could dig the numbers for a before/after comparison. Hector, some before/after numbers would be great if you can find them. Thanks! What happens when your cache gets trashed? Do compactions and flushes go slower? aj

Re: Strong Consistency with ONE read/writes

2011-07-12 Thread AJ
to any other replica node. But, then again, you also say that the leader does not forward replicas in your idea; so it's not real clear. I'm still trying to figure out how to make this work with normal Cass operation. aj On 7/11/2011 3:48 PM, Yang wrote: I'm not proposing any changes

Anyone using Facebook's flashcache?

2011-07-12 Thread AJ
developers have any thoughts on this and whether or not it would be helpful considering Cass' architecture and operation? Links: http://www.facebook.com/note.php?note_id=388112370932 https://github.com/facebook/flashcache/wiki aj

Re: Anyone using Facebook's flashcache?

2011-07-12 Thread AJ
On 7/12/2011 10:19 AM, Peter Schuller wrote: Do any Cass developers have any thoughts on this and whether or not it would be helpful considering Cass' architecture and operation? A well-functioning L2 cache should definitely be very useful with Cassandra for read-intensive workloads where the

Re: Anyone using Facebook's flashcache?

2011-07-12 Thread AJ
in (and discarding) the top 25% (250/1000GB) of the usual hot data. aj

Re: Strong Consistency with ONE read/writes

2011-07-12 Thread AJ
how to make this work with normal Cass operation. aj On 7/11/2011 3:48 PM, Yang wrote: I'm not proposing any changes to be done, but this looks like a very interesting topic for thought/hack/learning, so the following are only for thought exercises HBase enforces a single write/read entry

Feature Request: Multi-key Mapping

2011-07-10 Thread AJ
and better suited for range queries. Of course, some indirection would be needed to avoid the naive solution of simply duplicating values. Maybe Unix inodes are the best analogy here. aj

Re: Command Request: rename a column

2011-07-08 Thread AJ
On 7/8/2011 2:18 AM, Sylvain Lebresne wrote: On Fri, Jul 8, 2011 at 9:22 AM, AJ a...@dude.podzone.net wrote: I think it would be really cool to be able to rename a column, or, more generally, a move command to move data from one column to another in the same CF without the client having to read

Re: Strong Consistency with ONE read/writes

2011-07-03 Thread AJ
value would probably make this work. But, I'll think more on this later. aj On 7/2/2011 10:55 AM, Yang wrote: there is a JIRA completed in 0.7.x that Prefers a certain node in snitch, so this does roughly what you want MOST of the time but the problem is that it does not GUARANTEE that the same

Re: Strong Consistency with ONE read/writes

2011-07-03 Thread AJ
:20 PM, AJ a...@dude.podzone.net wrote: Yang, How would you deal with the problem when the 1st node responds success but then crashes before completely forwarding any replicas? Then, after switching to the next primary, a read would return stale data. Here's

Re: Strong Consistency with ONE read/writes

2011-07-03 Thread AJ
any different than a TWO write. I'm trying to save a hop (+ 1 data xfer) by ack'ing immediately after the primary successfully writes, i.e., ONE write. On Jul 3, 2011 11:20 AM, AJ a...@dude.podzone.net wrote: Yang, How would you deal with the problem when the 1st

Re: Strong Consistency with ONE read/writes

2011-07-03 Thread AJ
. The point of the placeholders is to handle the crash case that I talked about... like a WAL does. But, C* will propagate the value to N-1 eventually anyways, 'cause that's just what it does anyways :-) will On Sun, Jul 3, 2011 at 7:47 PM, AJ a...@dude.podzone.net wrote

Re: Strong Consistency with ONE read/writes

2011-07-03 Thread AJ
We seem to be having a fundamental misunderstanding. Thanks for your comments. aj On 7/3/2011 8:28 PM, William Oberman wrote: I'm using cassandra as a tool, like a black box with a certain contract to the world. Without modifying the core, C* will send the updates to all replicas, so your

Re: Strong Consistency with ONE read/writes

2011-07-02 Thread AJ
take-over as master, so you still have high availability. I'm not saying it's easy and I'm only coming at this from a customer request point of view. The question is, would this be useful if it could be added to Cass's bag of tricks? Cass is already a hybrid. aj On 7/2/2011 1:57 PM, Yang

Re: Strong Consistency with ONE read/writes

2011-07-02 Thread AJ
On 7/2/2011 6:03 AM, William Oberman wrote: Ok, I see the you happen to choose the 'right' node idea, but it sounds like you want to solve C* problems in the client, and they already wrote that complicated code to make clients simple. You're talking about reimplementing key-node mappings,

Re: Cassandra ACID

2011-07-01 Thread AJ
in cassandra.yaml. For some performance improvement with some cost in durability you can specify commitlog_sync: periodic. See discussion below for more details. Refs: Plenty + this thread. *From:* AJ [mailto:a

Strong Consistency with ONE read/writes

2011-07-01 Thread AJ
Is this possible? All reads and writes for a given key will always go to the same node from a client. It seems the only thing needed is to allow the clients to compute which node is the closest replica for the given key using the same algorithm C* uses. When the first replica receives the
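The client-side lookup described in this message can be sketched roughly as follows. This is only an illustration, not Cassandra's own code: it assumes RandomPartitioner-style tokens (the absolute value of the key's MD5 digest read as a signed big-endian integer) and the usual ring rule that a node owns the range ending at its token. The function names and the `ring` mapping (token to node name) are hypothetical.

```python
import hashlib
from bisect import bisect_left
from typing import Dict


def random_partitioner_token(key: bytes) -> int:
    # Assumption: token = abs(MD5 digest interpreted as a signed big-endian integer).
    return abs(int.from_bytes(hashlib.md5(key).digest(), "big", signed=True))


def owner_for_token(token: int, ring: Dict[int, str]) -> str:
    # A node owns the range (previous token, its own token], so the owner is
    # the first node whose token is >= the key's token, wrapping around the ring.
    tokens = sorted(ring)
    idx = bisect_left(tokens, token)
    return ring[tokens[idx % len(tokens)]]


def first_replica(key: bytes, ring: Dict[int, str]) -> str:
    # Hypothetical helper: the node a client would contact first for this key.
    return owner_for_token(random_partitioner_token(key), ring)
```

A client would call `first_replica(b"some-key", ring)` with a ring it learned from the cluster; how that ring is obtained and kept current is outside this sketch.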

Re: Strong Consistency with ONE read/writes

2011-07-01 Thread AJ
I'm saying I will make my clients forward the C* requests to the first replica instead of forwarding to a random node. -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. Will Oberman ober...@civicscience.com wrote: Sent from my iPhone On Jul 1, 2011, at 9:53 PM, AJ

Re: Meaning of 'nodetool repair has to run within GCGraceSeconds'

2011-06-30 Thread AJ
It would be helpful if this was automated somehow.

Re: Sharing Cassandra with Solandra

2011-06-29 Thread AJ
On 6/27/2011 3:39 PM, David Strauss wrote: On Mon, 2011-06-27 at 15:06 -0600, AJ wrote: Would anyone care to talk about their experiences with using Solandra along side another application that uses Cassandra (also on the same node)? I'm curious about any resource contention issues

Re: No Transactions: An Example

2011-06-29 Thread AJ
have to make sure the clean-up process processes the updates in order and only 1 time. If you can't guarantee these, then you'll have to make sure your updates are idempotent and commutative. Oh yeah, and you must use QUORUM read/writes, of course. Any critiques? aj
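The point about idempotent and commutative updates can be illustrated with a small sketch. It is not taken from the thread: the update identifiers and the set-union state are hypothetical, chosen only because set union gives the same result no matter how often or in what order an update is applied.

```python
from itertools import permutations


def apply_update(state: frozenset, update: str) -> frozenset:
    # Set union is idempotent (re-applying changes nothing) and
    # commutative (order of application does not matter).
    return state | {update}


def replay(updates) -> frozenset:
    # Apply a sequence of updates to an empty state.
    state = frozenset()
    for update in updates:
        state = apply_update(state, update)
    return state


# Hypothetical update identifiers, for illustration only.
updates = ["debit:42", "credit:42", "debit:7"]
results = {replay(order) for order in permutations(updates)}
duplicated = replay(updates + updates)
```

Here `results` holds a single state even though every ordering was replayed, and `duplicated` equals that same state, which is the property a clean-up process would need if it cannot guarantee ordered, exactly-once processing.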

Re: Clock skew

2011-06-28 Thread AJ
the sleep is required even in a non-virtualized environment? Is it only needed when implementing some kind of lock? Does the type of lock make a difference? Thanks! aj (the other one) On 6/28/2011 11:31 AM, Dominic Williams wrote: Hi, yes you are correct, and this is a potential problem

Sharing Cassandra with Solandra

2011-06-27 Thread AJ
say that you have to run Solandra on every C* node in the ring. I'm not sure if I interpreted that correctly. Also, what's the index size to data size ratio to expect (ballpark)? How does it perform? Any caveats? Thanks! aj

Re: Auto compaction to be staggered ?

2011-06-27 Thread AJ
the same in my *Ideas for Big Data Support* thread, 5.) Postponed Major Compactions: The option to postpone auto-triggered major compactions until a pre-defined time of day or week or until staff can do it manually. aj

Cassandra ACID

2011-06-24 Thread AJ
Can any Cassandra contributors/gurus confirm my understanding of Cassandra's degree of support for the ACID properties? I provide official references when known. Please let me know if I missed some good official documentation. *Atomicity* All individual writes are atomic at the row level.

Re: Concurrency: Does C* support a Happened-Before relation between processes' writes?

2011-06-24 Thread AJ
On 6/24/2011 2:27 PM, Jonathan Ellis wrote: Might be able to do it with http://en.wikipedia.org/wiki/Lamport%27s_bakery_algorithm. It is remarkable that this algorithm is not built on top of some lower level atomic operation, e.g. compare-and-swap. I've been meaning to get back to reading

Re: Cassandra ACID

2011-06-24 Thread AJ
you can specify commitlog_sync: periodic. See discussion below for more details. Refs: Plenty + this thread. On 6/24/2011 1:46 PM, Jim Newsham wrote: On 6/23/2011 8:55 PM, AJ wrote: Can any Cassandra contributors/gurus confirm my understanding of Cassandra's degree of support for the ACID

Re: Concurrency: Does C* support a Happened-Before relation between processes' writes?

2011-06-24 Thread AJ
On 6/24/2011 2:27 PM, Jonathan Ellis wrote: Might be able to do it with http://en.wikipedia.org/wiki/Lamport%27s_bakery_algorithm. It is remarkable that this algorithm is not built on top of some lower level atomic operation, e.g. compare-and-swap. This looks like it may work. Jonathan, have

Re: No Transactions: An Example

2011-06-23 Thread AJ
On 6/23/2011 7:37 AM, Trevor Smith wrote: AJ, Thanks for your input. I don't fully follow though how this would work with a bank scenario. Could you explain in more detail? Thanks. Trevor I don't know yet. I'll be researching that. My working procedure is to figure out a way to handle

Re: Storing files in blob into Cassandra

2011-06-23 Thread AJ
On 6/22/2011 11:43 PM, Sasha Dolgy wrote: maybe you want to spend a few minutes reading about Haystack over at facebook to give you some ideas... https://www.facebook.com/note.php?note_id=76191543919 Not saying what they've done is the right way... just sayin' Thanks for the tip Sasha; will

Re: Atomicity Strategies

2011-06-22 Thread AJ
On 4/9/2011 7:52 PM, aaron morton wrote: My understanding of what they did with locking (based on the examples) was to achieve a level of transaction isolation http://en.wikipedia.org/wiki/Isolation_(database_systems) I think the

Re: No Transactions: An Example

2011-06-22 Thread AJ
I think Sasha's idea is worth studying more. Here is a supporting read referenced in the O'Reilly Cassandra book that talks about alternatives to 2-phase commit and synchronous transactions: http://www.eaipatterns.com/ramblings/18_starbucks.html If it can be done without locks and the

NTS Replication Strategy - only replicating to a subset of data centers

2011-06-22 Thread AJ
I'm just double-checking, but when using NTS, is it required to specify ALL the data centers in the strategy_options attribute? IOW, I do NOT want replication to ALL data centers; only two of the three. So, if my property file snitch describes all of the existing data centers and nodes as:

Re: Atomicity Strategies

2011-06-22 Thread AJ
Thanks Aaron! On 6/22/2011 5:25 PM, aaron morton wrote: Atomic on a single machine yes. - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 23 Jun 2011, at 09:42, AJ wrote: On 4/9/2011 7:52 PM, aaron morton wrote: My understanding

Is LOCAL_QUORUM as strong as QUORUM?

2011-06-22 Thread AJ
Quorum read/writes guarantees consistency. But, when a keyspace spans multiple data centers, does local quorum read/writes also guarantee consistency? I'm thinking maybe not if two data centers get partitioned. Thanks!

Re: Is LOCAL_QUORUM as strong as QUORUM?

2011-06-22 Thread AJ
On 6/22/2011 5:56 PM, mcasandra wrote: LOCAL_QUORUM guarantees consistency in the local data center only. Other replica nodes in the same DC and other DC not part of the QUORUM will be eventually consistent. If you want to ensure consistency across DCs you can use EACH_QUORUM but keep in mind

Re: Is LOCAL_QUORUM as strong as QUORUM?

2011-06-22 Thread AJ
On 6/22/2011 6:50 PM, AJ wrote: On 6/22/2011 5:56 PM, mcasandra wrote: LOCAL_QUORUM guarantees consistency in the local data center only. Other replica nodes in the same DC and other DC not part of the QUORUM will be eventually consistent. If you want to ensure consistency across DCs you can

Re: Is LOCAL_QUORUM as strong as QUORUM?

2011-06-22 Thread AJ
On 6/22/2011 8:20 PM, mcasandra wrote: Well it depends on the requirements. If you use any combination of CL with EACH_QUORUM it means you are accepting the fact that you are ok if one of the DC is down. And in your scenario you care more about DCs being consistent even if writes were to fail.

Re: Storing files in blob into Cassandra

2011-06-22 Thread AJ
On 6/22/2011 1:07 AM, Damien Picard wrote: Hi, I have to store some files (Images, documents, etc.) for my users in a webapp. I use Cassandra for all of my data and I would like to know if this is a good idea to store these files into blob on a Cassandra CF ? Is there some contraindications,

Storing Accounting Data

2011-06-21 Thread AJ
Is C* suitable for storing customer account (financial) data, as well as billing, payroll, etc? This is a new company so migration is not an issue... starting from scratch. Thanks!

Re: Storing Accounting Data

2011-06-21 Thread AJ
the number of databases - Stephen --- Sent from my Android phone, so random spelling mistakes, random nonsense words and other nonsense are a direct result of using swype to type on the screen On 21 Jun 2011 18:30, AJ a...@dude.podzone.net wrote:

Re: Storing Accounting Data

2011-06-21 Thread AJ
, 2011 at 2:03 PM, AJ a...@dude.podzone.net wrote: On 6/21/2011 2:50 PM, Stephen Connolly wrote: how important are things like transactional consistency for you? would you have issues if only one side of a transfer was recorded? Right. Both

Re: Storing Accounting Data

2011-06-21 Thread AJ
And I was thinking of using JTA for transaction processing. I have no experience with it but on the surface it looks like it should work. On 6/21/2011 3:31 PM, AJ wrote: What's the best accepted way to handle that 100% in the client? Retries? On 6/21/2011 3:14 PM, Anand Somani wrote

Re: Storing Accounting Data

2011-06-21 Thread AJ
On 6/21/2011 3:14 PM, Anand Somani wrote: Not sure if it is that simple, a quorum can fail with writes happening on some nodes (there is no rollback). Also there is no concept of atomic compare-and-swap. Good points. I suppose what I need is for the client to implement the part of ACID

Re: Storing Accounting Data

2011-06-21 Thread AJ
. But, I just rediscovered Dominic's Cages http://code.google.com/p/cages/. Has anyone tried it? --- Sent from my Android phone, so random spelling mistakes, random nonsense words and other nonsense are a direct result of using swype to type on the screen On 21 Jun 2011 22:04, AJ

Re: Docs: Token Selection

2011-06-17 Thread AJ
Thanks Jonathan. I assumed since each data center owned the full key space that the first replica would be stored in the dc of the coordinating node, the 2nd in another dc, and the 3rd+ back in the 1st dc. But, are you saying that the first endpoint is selected regardless of the location of

Re: Docs: Token Selection

2011-06-17 Thread AJ
On 6/17/2011 7:26 AM, William Oberman wrote: I haven't done it yet, but when I researched how to make geo-diverse/failover DCs, I figured I'd have to do something like RF=6, strategy = {DC1=3, DC2=3}, and LOCAL_QUORUM for reads/writes. This gives you an ack after 2 local nodes do the
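The arithmetic behind "an ack after 2 local nodes" can be checked with a short sketch. It assumes the usual majority rule for a quorum, floor(n / 2) + 1; the `strategy` dictionary simply mirrors the {DC1=3, DC2=3} example in the message and is not a real configuration object.

```python
def quorum(replicas: int) -> int:
    # Majority of the given replica count: floor(n / 2) + 1.
    return replicas // 2 + 1


# Mirrors the example in the message: three replicas in each data center.
strategy = {"DC1": 3, "DC2": 3}

total_rf = sum(strategy.values())                               # replicas overall
local_quorum = {dc: quorum(rf) for dc, rf in strategy.items()}  # per-DC majority
cluster_quorum = quorum(total_rf)                               # plain QUORUM over all replicas
```

With these numbers a LOCAL_QUORUM operation waits for 2 replicas in the coordinator's data center, while a plain QUORUM over all 6 replicas would need 4, so it would have to involve the remote data center.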

Re: Docs: Token Selection

2011-06-17 Thread AJ
+1 Yes, that is what I'm talking about Eric. Maybe I could write my own strategy, I dunno. I'll have to understand more first. On 6/17/2011 10:37 AM, Sasha Dolgy wrote: +1 for this if it is possible... On Fri, Jun 17, 2011 at 6:31 PM, Eric tammeeta...@gmail.com wrote: What I don't like

Re: Docs: Token Selection

2011-06-17 Thread AJ
Hi Jeremiah, can you give more details? Thanks On 6/17/2011 10:49 AM, Jeremiah Jordan wrote: Run two Cassandra clusters... -Original Message- From: Eric tamme [mailto:eta...@gmail.com] Sent: Friday, June 17, 2011 11:31 AM To: user@cassandra.apache.org Subject: Re: Docs: Token

Re: Docs: Token Selection

2011-06-17 Thread AJ
On 6/17/2011 12:33 PM, Eric tamme wrote: As i said previously, trying to make cassandra treat things differently based on some kind of persistent locality set it maintains in memory .. or whatever .. sounds like you will be absolutely undermining the core principles of how cassandra

Re: Docs: Token Selection

2011-06-17 Thread AJ
On 6/17/2011 12:32 PM, Jeremiah Jordan wrote: Run two clusters, one which has {DC1:2, DC2:1} and one which is {DC1:1,DC2:2}. You can't have both in the same cluster, otherwise it isn't possible to tell where the data got written when you want to read it. For a given key XYZ you must be

Re: Docs: Token Selection

2011-06-17 Thread AJ
On 6/17/2011 1:27 PM, Sasha Dolgy wrote: Replication factor is defined per keyspace if i'm not mistaken. Can't remember if NTS is per keyspace or per cluster ... if it's per keyspace, that would be a way around it ... without having to maintain multiple clusters just have multiple

Re: Docs: Token Selection

2011-06-16 Thread AJ
LOL, I feel Eric's pain. This double-ring thing can throw you for a loop since, like I said, there is only one place it is documented and it is only *implied*, so one is not sure he is interpreting it correctly. Even the source for NTS doesn't mention this. Thanks for everyone's help on

Re: Docs: Token Selection

2011-06-16 Thread AJ
wrote: AJ, sorry I seemed to miss the original email on this thread. As Aaron said, when computing tokens for multiple data centers, you should compute them independently for each data center - as if it were its own Cassandra cluster. You can have overlapping token ranges between multiple data

Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread AJ
Good morning all. Hypothetical Setup: 1 data center RF = 3 Total nodes 3 Problem: Suppose I need maximum consistency for one critical operation; thus I specify CL = ALL for reads. However, this will fail if only 1 replica endpoint is down. I don't see why this fail is necessary all of the

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread AJ
On 6/16/2011 10:05 AM, Ryan King wrote: I don't think this buys you anything that you can't get with quorum reads and writes. -ryan QUORUM <= ALL_AVAIL <= ALL == RF

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread AJ
On 6/16/2011 10:58 AM, Dan Hendry wrote: I think this would add a lot of complexity behind the scenes and be conceptually confusing, particularly for new users. I'm not so sure about this. Cass is already somewhat sophisticated and I don't see how this could trip-up anyone who can already

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread AJ
On 6/16/2011 2:37 PM, Ryan King wrote: On Thu, Jun 16, 2011 at 1:05 PM, AJ a...@dude.podzone.net wrote: snip The Cassandra consistency model is pretty elegant and this type of approach breaks that elegance in many ways. It would also only really be useful when the value has a high

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread AJ
UPDATE to my suggestion is below. On 6/16/2011 5:50 PM, Ryan King wrote: On Thu, Jun 16, 2011 at 2:12 PM, AJ a...@dude.podzone.net wrote: On 6/16/2011 2:37 PM, Ryan King wrote: On Thu, Jun 16, 2011 at 1:05 PM, AJ a...@dude.podzone.net wrote: snip The Cassandra consistency model is pretty

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread AJ
On 6/16/2011 7:56 PM, Dan Hendry wrote: How would your solution deal with complete network partitions? A node being 'down' does not actually mean it is dead, just that it is unreachable from whatever is making the decision to mark it 'down'. Following from Ryan's example, consider nodes A, B,

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread AJ
On 6/16/2011 9:36 PM, Dan Hendry wrote: Help me out here. I'm trying to visualize a situation where the clients can access all the C* nodes but the nodes can't access each other. I don't see how that can happen on a regular ethernet subnet in one data center. Well, I'm sure there is a case

Re: Docs: Token Selection

2011-06-16 Thread AJ
On 6/16/2011 9:45 PM, aaron morton wrote: But, I'm thinking about using OldNetworkTopStrat. NetworkTopologyStrategy is where it's at. Oh yeah? It didn't look like it would serve my requirements. I want 2 full production geo-diverse data centers with each serving as a failover for the

Re: New web client future API

2011-06-15 Thread AJ
Nice interface... and someone has good taste in music. BTW, I'm new to web programming, what did you use for the web components? JSF, JavaScript, something else? On 6/14/2011 7:42 AM, Markus Wiesenbacher | Codefreun.de wrote: Hi, what is the future API for Cassandra? Thrift, Avro, CQL? I

Re: Where is my data?

2011-06-15 Thread AJ
Thanks On 6/15/2011 3:20 AM, Sylvain Lebresne wrote: You can use the thrift call describe_ring(). It returns a map that associates each range of the ring with its replicas. Once all of a range's endpoints are unavailable, that range of the data is unavailable. -- Sylvain
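The availability rule in this reply can be sketched as a small check. This does not call Thrift: the `ring` dictionary below is a hypothetical stand-in for whatever describe_ring() returns, mapping a (start, end) token range to the list of endpoints that replicate it, and the addresses are made up.

```python
from typing import Dict, Iterable, List, Tuple

TokenRange = Tuple[str, str]


def unavailable_ranges(ring: Dict[TokenRange, List[str]],
                       down: Iterable[str]) -> List[TokenRange]:
    # A range is unavailable once every endpoint that replicates it is down.
    down_set = set(down)
    return [rng for rng, endpoints in ring.items()
            if all(endpoint in down_set for endpoint in endpoints)]


# Hypothetical three-node ring with two replicas per range.
ring = {
    ("0", "100"): ["10.0.0.1", "10.0.0.2"],
    ("100", "200"): ["10.0.0.2", "10.0.0.3"],
    ("200", "0"): ["10.0.0.3", "10.0.0.1"],
}

one_down = unavailable_ranges(ring, ["10.0.0.1"])
two_down = unavailable_ranges(ring, ["10.0.0.1", "10.0.0.2"])
```

In this made-up ring, losing one node leaves every range with a live replica, while losing 10.0.0.1 and 10.0.0.2 together makes the first range unavailable.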

Re: cascading failures due to memory

2011-06-15 Thread AJ
Sasha, Did you ever nail down the cause of this problem? On 5/31/2011 4:01 AM, Sasha Dolgy wrote: hi everyone, the current nodes i have deployed (4) have all been working fine, with not a lot of data ... more reads than writes at the moment. as i had monitoring disabled, when one node's OS

Re: Forcing Cassandra to free up some space

2011-06-15 Thread AJ
In regards to cleaning-up old sstable files, I posed this question before as I noticed after taking a snapshot, the older files (pre-compaction) shared no links with the snapshots. Therefore, (if the Cass snapshot functionality is working correctly) those older files can be manually deleted.

Re: Docs: Token Selection

2011-06-15 Thread AJ
DC1 Node 1 : token 0 DC1 Node 2 : token 8.. DC2 Node 1 : token 4.. DC2 Node 1 : token 12.. or DC1 Node 1 : token 0 DC1 Node 2 : token 1.. DC2 Node 1 : token 8.. DC2 Node 1 : token 7.. Regards, /VJ On Wed, Jun 15, 2011 at 12:28 PM, AJ a...@dude.podzone.net

Re: Docs: Token Selection

2011-06-15 Thread AJ
On Wed, Jun 15, 2011 at 2:34 PM, AJ a...@dude.podzone.net wrote: Vijay, thank you for your thoughtful reply. Will Cass complain if I don't setup my tokens like in the examples? On 6/15/2011 2:41 PM, Vijay wrote: All you heard is right... You

Re: Is this the proper use of OPP?

2011-06-14 Thread AJ
Thanks. I found that article later. I was definitely off-base with respect to OPP. Random partitioning is pretty much the way to go and datastax has a good article on geographic distribution: http://www.datastax.com/docs/0.8/operations/datacenter Sorry for the long pointless post

Where is my data?

2011-06-14 Thread AJ
Is there an official deterministic formula to compute the various subsets of a given cluster that comprises a complete set of data (redundant rows ok)? IOW, if multiple nodes become unavailable one at a time, at what point can I say 100% of my data is available? Obviously, the method would

Docs: Token Selection

2011-06-14 Thread AJ
This http://wiki.apache.org/cassandra/Operations#Token_selection says: With NetworkTopologyStrategy, you should calculate the tokens for the nodes in each DC independently. and gives the example: DC1 node 1 = 0 node 2 = 85070591730234615865843651857942052864 DC2 node 3 = 1 node 4 =
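The per-data-center token calculation quoted here can be sketched as follows. It assumes a RandomPartitioner token space of 0 to 2**127 and evenly spaced tokens within each data center, with a small per-DC offset so that no two nodes share a token; `dc_tokens` and its `offset` parameter are hypothetical names. The value for node 4 is truncated in the message above and is not restored here; the sketch just applies the same offset to every node in the second data center.

```python
def dc_tokens(node_count: int, offset: int = 0) -> list:
    # Evenly spaced tokens over the 2**127 space, computed as if this data
    # center were its own ring, then shifted by a per-DC offset to avoid
    # collisions with tokens in other data centers.
    return [i * (2 ** 127) // node_count + offset for i in range(node_count)]


dc1 = dc_tokens(2)             # first data center, no offset
dc2 = dc_tokens(2, offset=1)   # second data center, shifted by one
```

With two nodes per data center, `dc1` reproduces the two DC1 values quoted above, and `dc2` starts at 1 as in the quoted example.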

Re: Docs: Why do deleted keys show up during range scans?

2011-06-14 Thread AJ
the columns, not the whole row. A row key will not be forgotten/deleted until there are no columns or tombstones which reference it. Until there are no references to that row key in any SSTables you can still get that key back from the API. -Jeremiah -Original Message- From: AJ [mailto:a

Re: Docs: Token Selection

2011-06-14 Thread AJ
Owns 50% (Ranges 0 - 1 1 - 8..4) DC2 Node1 Owns 50% (Ranges 8..5 - 0 0 - 1) Node2 Owns 50% (Ranges 1 - 8..4 8..4 - 8..5) Regards, /VJ On Tue, Jun 14, 2011 at 3:47 PM, AJ a...@dude.podzone.net wrote: This http://wiki.apache.org/cassandra

Re: SSL Streaming

2011-06-13 Thread AJ
Performance-wise, I think it would be better to just let the client encrypt sensitive data before storing it, versus encrypting all traffic all the time. If individual values are encrypted, then they don't have to be encrypted/decrypted during transit between nodes during the initial updates

Docs: Why do deleted keys show up during range scans?

2011-06-13 Thread AJ
http://wiki.apache.org/cassandra/FAQ#range_ghosts So to special case leaving out result entries for deletions, we would have to check the entire rest of the row to make sure there is no undeleted data anywhere else either (in which case leaving the key out would be an error). The above

Re: Docs: Why do deleted keys show up during range scans?

2011-06-13 Thread AJ
On 6/13/2011 7:03 AM, Stephen Connolly wrote: It returns the set of columns for the set of rows... how do you determine the difference between a completely empty row and a row that just does not have any of the matching columns? I would expect it to not return anything (no row at all) for both

Re: Docs: Why do deleted keys show up during range scans?

2011-06-13 Thread AJ
On 6/13/2011 9:25 AM, Stephen Connolly wrote: On 13 June 2011 16:14, AJ a...@dude.podzone.net wrote: On 6/13/2011 7:03 AM, Stephen Connolly wrote: It returns the set of columns for the set of rows... how do you determine the difference between a completely empty row and a row that just does

Re: Docs: Why do deleted keys show up during range scans?

2011-06-13 Thread AJ
On 6/13/2011 10:14 AM, Stephen Connolly wrote: store the query inverted. that way empty - deleted I don't know what that means... get the other columns? Can you elaborate? Is there docs for this or is this a hack/workaround? the tombstones are stored for each column that had data

Consistency Levels and Replication with Down Nodes

2011-06-10 Thread AJ
The O'Reilly book on Cass says this about READ consistency level ALL: Query all nodes. Wait for all nodes to respond, and return to the client the record with the most recent timestamp. Then, if necessary, perform a read repair in the background. If any nodes fail to respond, fail the read

Where is the Overview Documentation on Counters?

2011-06-10 Thread AJ
I can't find any that gives an overview of their purpose/benefits/etc, only how to code them. I can only guess that they are more efficient for some reason but don't know exactly why or exactly what conditions I would choose to use them over a regular column. Thanks!

Ideas for Big Data Support

2011-06-09 Thread AJ
[Please feel free to correct me on anything or suggest other workarounds that could be employed now to help.] Hello, This is purely theoretical, as I don't have a big working cluster yet and am still in the planning stages, but from what I understand, while Cass scales well horizontally,

Re: Ideas for Big Data Support

2011-06-09 Thread AJ
On 6/9/2011 8:40 AM, Edward Capriolo wrote: Some of these things are challenges, and a few are being worked on in one way or another. 1) Dynamic snitch was implemented to determine slow acting nodes and re-balance load. 2) You can budget bootstrap with rsync, as long as you know what

Misc Performance Questions

2011-06-08 Thread AJ
Is there a performance hit when dropping a CF? What if it contains .5 TB of data? If not, is there a quick and painless way to drop a large amount of data w/minimal perf hit? Is there a performance hit running multiple keyspaces on a cluster versus only one keyspace given a constant total

Re: Misc Performance Questions

2011-06-08 Thread AJ
Thank you Richard! On 6/8/2011 2:57 AM, Richard Low wrote: snip There is however a difference in running multiple column families versus putting everything in the same column family and separating them with e.g. a key prefix. E.g. if you have a large data set and a small one, it will be

Re: CLI set command returns null, ver 0.8.0

2011-06-08 Thread AJ
On 6/8/2011 4:37 PM, aaron morton wrote: Can you provide the cli script to create the schema and info on how many nodes you have. Thanks - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 8 Jun 2011, at 16:12, AJ wrote: Can anyone

Re: Backups, Snapshots, SSTable Data Files, Compaction

2011-06-07 Thread AJ
On 6/6/2011 11:25 PM, Benjamin Coverston wrote: Currently, my data dir has about 16 sets. I thought that compaction (with nodetool) would clean-up these files, but it doesn't. Neither does cleanup or repair. You're not even talking about snapshots using nodetool snapshot yet. Also nodetool

Re: Backups, Snapshots, SSTable Data Files, Compaction

2011-06-07 Thread AJ
Thanks to everyone who responded thus far. On 6/7/2011 10:16 AM, Benjamin Coverston wrote: snip Not to say that there aren't workloads where having many TB/Node doesn't work, but if you're planning to read from the data you're writing you do want to ensure that your working set is stored in

Re: CLI set command returns null

2011-06-07 Thread AJ
The log only shows INFO level messages about flushes, etc.. The debug mode of the CLI shows an exception after the set: [al@mars ~]$ cassandra-cli -h 192.168.1.101 --debug Connected to: Test Cluster on 192.168.1.101/9160 Welcome to the Cassandra CLI. Type 'help;' or '?' for help. Type 'quit;'

Re: CLI set command returns null, ver 0.8.0

2011-06-07 Thread AJ
(CliMain.java:345) [default@Keyspace1] Granted, there are no rows in the CF yet (see probs below), but this exception seems to be during the parsing stage. I've checked everything else, AFAIK, so I'm at a loss. Much obliged. On 6/7/2011 12:44 PM, AJ wrote: The log only shows INFO level messages

Backups, Snapshots, SSTable Data Files, Compaction

2011-06-06 Thread AJ
Hi, I am working on a backup strategy and am trying to understand what is going on in the data directory. I notice that after a write to a CF and then flush, a new set of data files are created with an index number incremented in their names, such as: Initially: Users-e-1-Filter.db

Re: Occasional 10s Timeouts on Read

2010-06-19 Thread AJ Slater
be better. AJ On Fri, Jun 18, 2010 at 8:16 PM, Jonathan Ellis jbel...@gmail.com wrote: set log level to TRACE and see if the OutboundTcpConnection is going bad. that would explain the message never arriving. On Fri, Jun 18, 2010 at 10:39 AM, AJ Slater a...@zuno.com wrote: To summarize

Re: Occasional 10s Timeouts on Read

2010-06-19 Thread AJ Slater
querying its peers when it should, or timing out trying to do so. When it finally realizes there's an error, it resets something? And then we're back in business? I'm going to be offline for 48 hours. AJ On Sat, Jun 19, 2010 at 8:09 PM, AJ Slater a...@zuno.com wrote: Agreed. But those connection

Re: Occasional 10s Timeouts on Read

2010-06-18 Thread AJ Slater
and I'd see that. AJ On Thu, Jun 17, 2010 at 2:26 PM, AJ Slater a...@zuno.com wrote: These are physical machines. storage-conf.xml.fs03 is here: http://pastebin.com/weL41NB1 Diffs from that for the other two storage-confs are inline here: a...@worm:../Z3/cassandra/conf/dev$ diff storage

Occasional 10s Timeouts on Read

2010-06-17 Thread AJ Slater
useful and I'll be on #cassandra as ajslater. Much thanks for taking a look and any suggestions. We fear we'll have to abandon Cassandra if this bug cannot be resolved. AJ

Re: Occasional 10s Timeouts on Read

2010-06-17 Thread AJ Slater
Cassandra 0.6.2 from the apache debian source. Ubuntu Jaunty. Sun Java6 jvm. All nodes in separate racks at 365 main. On Thu, Jun 17, 2010 at 10:12 AM, AJ Slater a...@zuno.com wrote: I'm seeing 10s timeouts on reads a few times a day. It's hard to reproduce consistently but seems to happen most
