Re: about key sorting and token partitioning

2010-11-10 Thread Peter Schuller
om partitioner, while having reasonable efficiency on range slices by making each row contain a pretty large range such that any additional overhead in jumping across nodes is negligible in comparison to the other work done. -- / Peter Schuller

Re: SSD vs. HDD

2010-11-04 Thread Peter Schuller
CPU bound rather than being close to saturate disk. In either case, please do report back as it's interesting to figure out what kind of performance issues people are seeing. -- / Peter Schuller

Re: using jna.jar "Unknown mlockall error 0"

2010-10-28 Thread Peter Schuller
e future. -- / Peter Schuller

Re: Config Maximum heap size for Cassandra

2010-10-28 Thread Peter Schuller
n operation. How that translates to "used" memory from the perspective of the operating system will largely depend on JVM settings. -- / Peter Schuller

Re: What happens if there is a collision?

2010-10-26 Thread Peter Schuller
value based conflict resolution if you are expecting a set of column updates to either apply or not apply as a group (with respect to some other group of updates). That's a bit subtle. Also on the topic of granularity, entire super columns and entire rows may be deleted without individually referring to all columns. In those cases, deletes span entire rows or supercolumns rather than individual columns. -- / Peter Schuller

Re: What happens if there is a collision?

2010-10-25 Thread Peter Schuller
el clients?) But of course, there is no such guarantee in the distributed sense either way. -- / Peter Schuller

Re: What happens if there is a collision?

2010-10-25 Thread Peter Schuller
ink this should be documented, because engineers will hit that 'local' > undeterministic issue for sure if two instances of their applications > perform 'completed writes' in the same column family. Completed does not > mean successful, even with quorum (or ALL). They ought to know it. I think it does. I believe the results you are describing as unexpected are fully expected fundamentally, and there is no real difference implied in receiving a timestamp ACK flag back. I'm totally open to being wrong or having misunderstood something (or both), but right now I don't see it. If on the other hand I'm not wrong then perhaps we can figure out how to document or present the functionality of Cassandra better :) -- / Peter Schuller

Re: Benchmarking & Testing

2010-10-25 Thread Peter Schuller
output etc. But how to interpret them will be very dependent on what you're doing. How's that for a non-answer? :) -- / Peter Schuller

Re: What happens if there is a collision?

2010-10-21 Thread Peter Schuller
anyway just after you did your read). Perhaps I'm misunderstanding what you're trying to do? -- / Peter Schuller

Re: What happens if there is a collision?

2010-10-21 Thread Peter Schuller
set, or it was given the right a bit later in the background, or it happened during anti-entropy etc. -- / Peter Schuller

Re: Preventing an update of a CF row

2010-10-20 Thread Peter Schuller
r need to remove a value and then re-write it, you have to up the timestamp on your inserts in order for it to be re-inserted. Same thing if you sometimes need to actually overwrite said value in some different context than the concurrent writer case you're trying to solve. -- / Peter Schuller
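
To make the timestamp point concrete, here is a toy model of per-column reconciliation where the write with the higher timestamp wins. It is only a sketch: the `Column` class and its methods are made up for illustration and are not Cassandra's API, and writes with a timestamp equal to the current one are simply ignored here, which is not how real tie-breaking works.

```python
class Column:
    """Toy single-column store: the write with the highest timestamp wins.

    Hypothetical illustration only. A delete is kept as a marker with its
    own timestamp, so older inserts cannot bring the value back.
    """

    def __init__(self):
        self.value = None
        self.timestamp = None
        self.deleted = False

    def _accepts(self, timestamp):
        # Simplification: only strictly newer timestamps are applied.
        return self.timestamp is None or timestamp > self.timestamp

    def insert(self, value, timestamp):
        if self._accepts(timestamp):
            self.value, self.timestamp, self.deleted = value, timestamp, False

    def delete(self, timestamp):
        if self._accepts(timestamp):
            self.value, self.timestamp, self.deleted = None, timestamp, True

    def read(self):
        return None if self.deleted else self.value


col = Column()
col.insert("first", timestamp=1)
col.delete(timestamp=2)
col.insert("again", timestamp=1)   # too old: the delete still wins
col.insert("again", timestamp=3)   # newer than the delete: re-inserted
```

In this model the re-insert only becomes visible once its timestamp is higher than that of the delete, which is the situation described above.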

Re: How fast does compaction run?

2010-10-20 Thread Peter Schuller
e more disk bound. Unless you have a problem with compaction speed not staying caught up with write speeds, this is not necessarily bad as it limits the impact of compaction on disk I/O (i.e., less negative effects for real traffic assuming real traffic goes down to disk and isn't entirely in-memory). -- / Peter Schuller

Re: KeysCached - clarification and recommendations on settings

2010-10-18 Thread Peter Schuller
cumstances, the key cache should likely not be huge in comparison to available memory. -- / Peter Schuller

Re: Preventing an update of a CF row

2010-10-16 Thread Peter Schuller
hat it is trying to do that you want this behavior for? -- / Peter Schuller

Re: Does Cassandra fit my requirements?

2010-10-16 Thread Peter Schuller
 And I > apologize in advance if I've missed something obvious, I have all of about > 1.5 days experience and I'm trying to learn fast ;-) Do both A and B agree about the ring state (i.e., they both know about both nodes) prior to you shutting one of them down? -- / Peter Schuller

Re: Nodes getting slowed down after a few days of smooth operation

2010-10-11 Thread Peter Schuller
means that XXX out of YYY bytes in the heap is live). (I am making *some* simplifications here because the concurrent nature of CMS means that you never get a snapshot view; but for practical purposes you can consider the above to be true with Cassandra.) Similar lines for "ParNew" you can mostly ignore for the purpose of monitoring heap usage unless you specifically know what you're looking for in those. The ConcurrentMarkSweep ones are what tell you what the actual amount of live data in the heap is. -- / Peter Schuller

Re: Nodes getting slowed down after a few days of smooth operation

2010-10-11 Thread Peter Schuller
o make the assumption that actively used data fits in RAM will severely affect the hardware requirements for serving your load. -- / Peter Schuller

Re: Nodes getting slowed down after a few days of smooth operation

2010-10-11 Thread Peter Schuller
disable the memtable flush part of startup? While I don't argue against such a change, one argument for *having* the flush is that subsequent restarts become faster and it might help if you're in some kind of situation where nodes continually die shortly after having been started. (But I total

Re: Dazed and confused with Cassandra on EC2 ...

2010-10-09 Thread Peter Schuller
full GC for the purpose of shrinkage. -- / Peter Schuller

Re: Dazed and confused with Cassandra on EC2 ...

2010-10-09 Thread Peter Schuller
ficient with larger heap sizes and in order to avoid bad policy decisions of the GC causing excessive GC activity rather than just growing the heap which you're fine with anyway. -- / Peter Schuller

Re: using jna.jar "Unknown mlockall error 0"

2010-10-09 Thread Peter Schuller
L_FUTURE (Cassandra does the former). -- / Peter Schuller

Re: Newbie Question about restarting Cassandra

2010-10-08 Thread Peter Schuller
omments). Can anyone confirm/deny that this is an intended guarantee? -- / Peter Schuller

Re: Dazed and confused with Cassandra on EC2 ...

2010-10-07 Thread Peter Schuller
at much less tweaking of heap sizes was necessary, but this is not the case.) -- / Peter Schuller

Re: Tuning cassandra to use less memory

2010-10-07 Thread Peter Schuller
ntly useful (i.e., lots of truly unused stuff that is good to keep swapped out). -- / Peter Schuller

Re: Heap Settings suggestions

2010-10-07 Thread Peter Schuller
here are "Y" keyspaces, Is > it recommended to allocate "XY" as the max  Heap size ?  Please let me know. Yes. Each column family will have a memtable subject to the configured memory constraints; whether or not they are in different keyspaces does not matter. -- / Peter Schuller

Re: Advice on settings

2010-10-07 Thread Peter Schuller
of the data to inspect when doing read repair. > Will this result in better read performance? Sorry, I did the impolite thing and began responding before having read your entire E-Mail ;) So yes, a low RF would increase read performance, but assuming you care about data redundancy the better way to achieve that effect is probably to decrease or disable read repair. -- / Peter Schuller

Re: Retaining commit logs

2010-10-06 Thread Peter Schuller
> PS. Are other ppl interested in this functionality ? > I could file it to JIRA as well... I was about to post that such a thing was useful for point-in-time recovery before reading your post, so yes :) -- / Peter Schuller

Re: Migrating from Mysql to Cassandra

2010-10-02 Thread Peter Schuller
erts. (That said, deletes are a bit special in other ways due to GCGraceSeconds, but that is another story.) -- / Peter Schuller

Re: Dazed and confused with Cassandra on EC2 ...

2010-10-02 Thread Peter Schuller
I forget and I didn't find it by brief sifting through thread history; were you running on small EC2 instances or larger ones? -- / Peter Schuller

Re: UnavailableException when data grows

2010-10-02 Thread Peter Schuller
the circumstances of compaction problems that people do have. -- / Peter Schuller

Re: Schema Questions?

2010-09-28 Thread Peter Schuller
assandra/LiveSchemaUpdates > How could I change, for example, the rows_cached or name > fields below on the fly without losing data?  Is this possible? Check out the system_* methods at the bottom of interface/cassandra.thrift. I believe these are in working order for 0.7. -- / Peter Schuller

Re: Is there a debian 0.6.1 install package archived anywhere?

2010-09-27 Thread Peter Schuller
s that the .deb you're running (and now don't have) was built without other modification from the 0.6.1 tag. IIRC debuild is in 'build-essential' in Debian. -- / Peter Schuller

Re: UnavailableException when data grows

2010-09-27 Thread Peter Schuller
ta; generally the larger the individual values the more disk bound you'd tend to be.) Just trying to zero in on what the likely root cause is in this case. -- / Peter Schuller

Re: 0.7 memory usage problem

2010-09-27 Thread Peter Schuller
tion to spreading writes across all nodes in the cluster, you are submitting writes with sufficient concurrency to allow Cassandra to scale to use available CPU across all cores. -- / Peter Schuller

Re: 0.7 memory usage problem

2010-09-27 Thread Peter Schuller
fic does not make much of a difference. But as a general recommendation, distributing the load across machines is a good default choice of behavior. (Of course not doing so should not cause the stack overflow you're seeing, so this is a separate issue.) -- / Peter Schuller

Re: Best strategy for adding new nodes to the cluster

2010-09-27 Thread Peter Schuller
ut this could be entirely wrong; don't take my word for it - anyone else?). On the other hand, if the only concern is I/O bandwidth rather than CPU use maybe the situation is different. (Does anyone have numbers on variation in available disk bandwidth over time for EC2 instances?) -- / Peter Schuller

Re: 0.7 memory usage problem

2010-09-26 Thread Peter Schuller
real fix (on the premise that there is no legitimate significant stack depth). I'm not sure what the best course of action is here short of going down the path of trying to investigate the problem at the JVM level. I'm hoping someone will come along and point to a simple explanation that we're missing :) -- / Peter Schuller

Re: 0.7 memory usage problem

2010-09-25 Thread Peter Schuller
he default stack size on Windows is so small as to make this an expected outcome given the stack depth in Cassandra. I wonder if there is memory corruption going on that causes the overflow. Or am I missing something simple? -- / Peter Schuller

Re: 0.7 memory usage problem

2010-09-21 Thread Peter Schuller
and of itself, no. That's more 1214 (above). -- / Peter Schuller

Re: Dazed and confused with Cassandra on EC2 ...

2010-09-20 Thread Peter Schuller
e. Do you have a URL / bug id or anything that one can read up on about this? -- / Peter Schuller

Re: what are ways to keep the SSTable Count down low

2010-09-20 Thread Peter Schuller
er some circumstances, contribute to decreasing the impact of compaction once it happens, but that would be highly dependent on I/O scheduling and other circumstances. I wouldn't touch it unless I knew specifically there was a reason to. -- / Peter Schuller

Re: what are ways to keep the SSTable Count down low

2010-09-20 Thread Peter Schuller
e due to regular traffic when compaction is not happening? -- / Peter Schuller

Re: Dazed and confused with Cassandra on EC2 ...

2010-09-20 Thread Peter Schuller
is possible that EC2 instances are over-committed on memory such that swapping is happening behind the scenes on the host... but I have always assumed memory was not over-committed on EC2. Can you run with -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps? -- / Peter Schuller

Re: Dazed and confused with Cassandra on EC2 ...

2010-09-20 Thread Peter Schuller
> Can anyone help shed any light on why this might be happening? We've tried a > variety of JVM settings to alleviate this; currently with no luck. Extremely long ParNew (young generation) pause times are almost always due to swapping. Are you swapping? -- / Peter Schuller

Re: commit log question

2010-09-20 Thread Peter Schuller
ower outages affecting multiple writes or even entire data centers. -- / Peter Schuller

Re: 0.7 memory usage problem

2010-09-18 Thread Peter Schuller
ble [0x]
"Surrogate Locker Thread (CMS)" daemon prio=5 tid=0x000801cec800 nid=0x801c3d500 waiting on condition [0x]
"Finalizer" daemon prio=8 tid=0x000801ced800 nid=0x801c3ddc0 in Object.wait() [0x7f3f6000]
"Reference Handler" daemon prio=10 tid=0x000801cef000 nid=0x801c3e680 in Object.wait() [0x7f4f7000]
"main" prio=5 tid=0x000801cef800 nid=0x800c0ae40 runnable [0x7fbfe000]
"VM Thread" prio=9 tid=0x000801d22000 nid=0x801c3ef40 runnable
"Gang worker#0 (Parallel GC Threads)" prio=9 tid=0x000801d26000 nid=0x801c41cc0 runnable
"Gang worker#1 (Parallel GC Threads)" prio=9 tid=0x000801d25000 nid=0x801c41400 runnable
"Gang worker#2 (Parallel GC Threads)" prio=9 tid=0x000801d24800 nid=0x801c40b40 runnable
"Gang worker#3 (Parallel GC Threads)" prio=9 tid=0x000801d24000 nid=0x801c40280 runnable
"Concurrent Mark-Sweep GC Thread" prio=9 tid=0x000801d23800 nid=0x801c3f800 runnable
"VM Periodic Task Thread" prio=10 tid=0x000801d20800 nid=0x908d139c0 waiting on condition
-- / Peter Schuller

Re: 0.7 memory usage problem

2010-09-18 Thread Peter Schuller
o draw conclusions about performance. And you don't want those exceptions in your logs. -- / Peter Schuller

Re: Cassandra performance

2010-09-18 Thread Peter Schuller
or the real production use case. This is not to complain that you are not providing enough information, but to make the point that the performance of these things can be subtly affected by many things and it is difficult to predict expected performance to a high degree of precision - particularly when multiple different forms of caching are involved in the comparison. -- / Peter Schuller

Re: Cassandra performance

2010-09-18 Thread Peter Schuller
am I missing something? -- / Peter Schuller

Re: Cassandra performance

2010-09-17 Thread Peter Schuller
Cassandra is not reflective of reality. This is especially true for large data sets (and by that I mean "amount of data on a single machine"). -- / Peter Schuller

Re: Network latency on Cassandra 0.7 (TFramedTransport)

2010-09-17 Thread Peter Schuller
problem. I'm not sure whether or not this is a reasonable hypothesis with C# + Thrift on Windows. Sorry, I don't know what the easy fix is either but perhaps it might help dig further. [1] http://en.wikipedia.org/wiki/Nagle's_algorithm -- / Peter Schuller
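
If Nagle's algorithm does turn out to be involved, the usual knob is the TCP_NODELAY socket option. The original question was about C# and Thrift on Windows, so the following Python snippet is only an illustration of the option itself, not a fix for that client; `make_nodelay_socket` is a name made up for this sketch.

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with Nagle's algorithm disabled (TCP_NODELAY)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # With TCP_NODELAY set, small writes are sent immediately instead of
    # being held back while earlier data is still unacknowledged.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

Whether a given Thrift client library exposes this option, and under what name, would have to be checked for that library.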

Re: Cassandra performance

2010-09-15 Thread Peter Schuller
ad for each row is probably the least flattering case for Cassandra when disk bound. Without knowing more details it's probably difficult to offer specific explanations for this particular case. -- / Peter Schuller

Re: Follow-up post on cassandra configuration with some experiments on GC tuning

2010-08-30 Thread Peter Schuller
nce in question will be collected - *ever* - since G1 always picks the "best" regions first (best in terms of "bang for the buck" - the most memory reclaimed at the lowest cost). -- / Peter Schuller

Re: Follow-up post on cassandra configuration with some experiments on GC tuning

2010-08-28 Thread Peter Schuller
how incremental mode is implemented, but I doubt they've avoided this), is that the total time needed for the mark/sweep once it does run is higher, such that you retain more floating garbage that might otherwise have been collected. -- / Peter Schuller

Re: Follow-up post on cassandra configuration with some experiments on GC tuning

2010-08-27 Thread Peter Schuller
default it is simply a matter of tweaking. So, I guess I should re-phrase: In terms of just turning on incremental mode without at least application specific tweaking (if not deployment specific testing), I would suggest caution. -- / Peter Schuller

Re: Follow-up post on cassandra configuration with some experiments on GC tuning

2010-08-27 Thread Peter Schuller
on't know much about how the incremental duty cycles are scheduled and it may be the case that Cassandra is not even remotely close to having a problem with incremental mode. But I would suggest caution and testing prior to making it the default. -- / Peter Schuller

Re: Cassandra Nodes Freeze/Down for ConcurrentMarkSweep GC?

2010-08-22 Thread Peter Schuller
omplete a CMS sweep in time and you hit the maximum heap size, but unless that happens, CMS will run concurrently (there are stop-the-world pauses involved, but they are typically very short; the mark/sweep phase itself is concurrent). As jbellis pointed out, you're almost certainly swapping.

Re: Node OOM Problems

2010-08-22 Thread Peter Schuller
ld not be there. Have other people done high-throughput writes (to the point of CPU saturation) over extended periods of time while consistently seeing low latencies (consistently meaning never exceeding hundreds of ms over several days)? -- / Peter Schuller

Re: Node OOM Problems

2010-08-19 Thread Peter Schuller
on and thus by definition not eligible for garbage collection). -- / Peter Schuller

Re: Node OOM Problems

2010-08-19 Thread Peter Schuller
apping with the following single error in > the cassandra.log > Error: Exception thrown by the agent : java.lang.NullPointerException > > I assume this is an unrelated problem? Do you have a full stack trace? -- / Peter Schuller

Re: Errors with Cassandra 0.7

2010-08-19 Thread Peter Schuller
> I am trying to run Cassandra 0.7 and I am getting different errors: First it > was while calling client.insert and now while calling set_keyspace (see > below). Are you perhaps not using a framed transport with thrift? Cassandra 0.7 uses framed by default; 0.6 did not. -- / Peter Schuller
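
For background on why a framing mismatch produces confusing errors: a framed transport sends each message with a 4-byte big-endian length in front, so a server expecting frames will read the first bytes of an unframed message as a length. The sketch below shows only that wire-level idea using the standard library; `frame` and `unframe` are hypothetical helper names, not part of the Thrift bindings (in the Python bindings the real wrapper is, as far as I know, `TTransport.TFramedTransport`).

```python
import struct


def frame(payload):
    """Prefix a message with its length as a 4-byte big-endian integer."""
    return struct.pack("!i", len(payload)) + payload


def unframe(buffer):
    """Split one complete frame off the front of buffer.

    Returns (payload, rest), or (None, buffer) if a full frame
    has not arrived yet.
    """
    if len(buffer) < 4:
        return None, buffer
    (length,) = struct.unpack("!i", buffer[:4])
    if len(buffer) < 4 + length:
        return None, buffer
    return buffer[4:4 + length], buffer[4 + length:]


wire = frame(b"hello")
payload, rest = unframe(wire)
```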

Re: Node OOM Problems

2010-08-19 Thread Peter Schuller
n production environments with very large heap sizes. -- / Peter Schuller

Re: Node OOM Problems

2010-08-19 Thread Peter Schuller
ly large individual values will be relevant to transient high memory use when they are read/written. In general, lacking large row caches and such things, you should be able to have hundreds of millions of entries on an 8 gb heap, assuming reasonably sized keys. -- / Peter Schuller

Re: Cassandra disk space utilization WAY higher than I would expect

2010-08-18 Thread Peter Schuller
correlate well in time with each other and/or something else? -- / Peter Schuller

Re: data deleted came back after 9 days.

2010-08-17 Thread Peter Schuller
>> What is possible causing this? What is your GC grace seconds set to? Is it lower than 8-9 days, and is it possible one or more nodes were disconnected from the remainder of the cluster for a period longer than the GC grace seconds? See: http://wiki.apache.org/cassandra/DistributedDeletes -- / Peter Schuller

Re: Filesystem for Cassandra

2010-08-12 Thread Peter Schuller
ue on my desktop which runs freebsd/zfs) - but I am hoping that is specific to the FreeBSD port -- / Peter Schuller

Re: Using a separate commit log drive was 4x slower

2010-08-10 Thread Peter Schuller
939 8.46099853516 8.44287872314 8.43000411987 8.455991745
-- / Peter Schuller

#!/usr/bin/env python
# ugly hack, reader beware
import os
import sys
import time

COUNT = 25
BLOCK = ''.join('x' for _ in xrange(8*1024))

f = file(sys.argv[1], 'w')
for _ in xrange(COUNT)
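
The script above is cut off in the archive, so what exactly it measured is not recoverable from here. As a separate illustration, the following Python 3 sketch times small synced writes to a file, which is one plausible way to probe a commit log device. Everything in it (the function name `time_synced_writes`, the use of fsync after every block, reporting milliseconds) is an assumption made for this sketch and not a reconstruction of the original.

```python
import os
import sys
import time

COUNT = 25
BLOCK = b"x" * (8 * 1024)


def time_synced_writes(path, count=COUNT, block=BLOCK):
    """Write `count` blocks to `path`, fsyncing after each one.

    Returns a list with the time taken by each write+fsync, in milliseconds.
    """
    timings = []
    with open(path, "wb") as f:
        for _ in range(count):
            start = time.perf_counter()
            f.write(block)
            f.flush()                 # push Python's buffer to the OS
            os.fsync(f.fileno())      # ask the OS to push it to the device
            timings.append((time.perf_counter() - start) * 1000.0)
    return timings


if __name__ == "__main__" and len(sys.argv) > 1:
    for ms in time_synced_writes(sys.argv[1]):
        print(ms)
```

Run against a file on the device in question, the per-write numbers give a rough idea of the sync latency; they say nothing about throughput under concurrent load.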

Re: explanation of generated files and ops

2010-08-10 Thread Peter Schuller
" files? Those are bloom filters: http://wiki.apache.org/cassandra/ArchitectureOverview http://spyced.blogspot.com/2009/01/all-you-ever-wanted-to-know-about.html -- / Peter Schuller

Re: Using a separate commit log drive was 4x slower

2010-08-10 Thread Peter Schuller
"roughly consistent" is not very precise, and the original performance on the RAID:ed device is probably also roughly consistent with this ;) -- / Peter Schuller

Re: batch_mutate atomicity

2010-08-09 Thread Peter Schuller
reads or distinct get_slice() calls; I am not expecting ordering guarantees w.r.t. visibility within a single get_slice(). (Additionally I am assuming QUORUM or RF=1, else it would not be useful to rely on anyway.) -- / Peter Schuller

Re: error using get_range_slice with random partitioner

2010-08-07 Thread Peter Schuller
API) are sorted. Maybe there was a client in between that did not preserve the sorting (I forget which thread that was). (I'm pretty sure my unit tests would have blown up by now if they're not, but you never know...) -- / Peter Schuller

Re: error using get_range_slice with random partitioner

2010-08-07 Thread Peter Schuller
lexicographically previous to b? Is it aaa? Is it aa? Whatever you pick there will always be a column with one additional a. In a less general case where you impose a length limit on the column name, you're fine. But not in the general case. -- / Peter Schuller
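
The argument can be checked mechanically with byte strings. In lexicographic byte order the immediate successor of any name is that name with a zero byte appended, so for any candidate below the target one can test whether something still fits in between. This is a small sketch; `name_between` is a hypothetical helper written for this illustration.

```python
def name_between(candidate, target):
    """Return a name strictly between candidate and target, if one exists.

    Both arguments are bytes with candidate < target. Returns None only
    when candidate really is the immediate predecessor of target.
    """
    assert candidate < target
    longer = candidate + b"\x00"   # immediate successor of candidate
    return longer if longer < target else None


# However many a's are used, there is always a name closer to b"b".
for n in (1, 2, 3, 50):
    guess = b"a" * n
    closer = name_between(guess, b"b")
    assert closer is not None and guess < closer < b"b"
```

The one case where a predecessor does exist is a target ending in a zero byte, for example `b"a\x00"`, whose predecessor is `b"a"`; that does not help for arbitrary column names.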

Re: Cassandra disk space utilization WAY higher than I would expect

2010-08-06 Thread Peter Schuller
en compacted, regardless of GC and whether they still remain on disk. -- / Peter Schuller

Re: error using get_range_slice with random partitioner

2010-08-06 Thread Peter Schuller
of byte strings, because there is no defined maximum possible length so the lexicographically "previous" column name might be infinitely long). -- / Peter Schuller

Re: one question about cassandra write

2010-08-06 Thread Peter Schuller
a 'nodetool compact' was not clearing up diskspace after bulk deletion even with GCGraceSeconds set to 0.) -- / Peter Schuller

Re: Cassandra disk space utilization WAY higher than I would expect

2010-08-06 Thread Peter Schuller
, my perhaps misuse of the term "obsolete" only refers to sstables that have been successfully compacted together with others into a new sstable which replaces the old ones (hence making them "obsolete"). I do not mean to imply that they contain obsolete columns. -- / Peter Schuller

Re: Cassandra disk space utilization WAY higher than I would expect

2010-08-06 Thread Peter Schuller
think, represent on-disk size. So nevermind on this bit. -- / Peter Schuller

Re: Cassandra disk space utilization WAY higher than I would expect

2010-08-05 Thread Peter Schuller
're reaching the 5-7 GB per node level after a cleanup has completed fully and all obsolete sstables have been removed, does not necessarily help you since each future cleanup/compaction will typically double your diskspace anyway (even if temporarily). -- / Peter Schuller

Re: Cassandra disk space utilization WAY higher than I would expect

2010-08-05 Thread Peter Schuller
eep, thus triggering the deletion of old sstables, presumably the cleanup itself would produce new sstables that would then have to wait anyway. Unless there is some code path to avoid doing that if nothing at all changes in the sstables as a result of the cleanup... I don't know.) -- / Peter Schuller

Re: java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down

2010-08-05 Thread Peter Schuller
ch is in order that would log the incident with an explanation rather than cause exception leakage that gives the impression of something being wrong. Thoughts? -- / Peter Schuller

Re: unable to start cassandra

2010-08-05 Thread Peter Schuller
ed the data directory in between but not the commit logs? -- / Peter Schuller

Re: stress.py

2010-08-04 Thread Peter Schuller
re how that happened since stress.py has a proper shebang, including in cassandra 0.6.3. Are you running it with 'python'? Is this an old cassandra that maybe didn't have a shebang in its py_stress (I don't know, grasping at straws)? -- / Peter Schuller

Re: stress.py

2010-08-04 Thread Peter Schuller
d for your platform if you're on *nix. Debian is distinctly lacking though if you're running stable, IIRC. -- / Peter Schuller

Re: stress.py

2010-08-04 Thread Peter Schuller
> Can somebody please give steps to run cassandra stess.py  program.
Assuming you have the thrift compiler installed, something like (from memory, not tested):
  cd contrib/py_stress
  thrift --gen py:new ../../interface/cassandra.thrift
  export PYTHONPATH=$(pwd)/gen-py
  python stress.py
-- / Pe

Re: what is the expected result of changing this in storage.conf

2010-08-04 Thread Peter Schuller
n several things specific to your situation, including CPU setup, storage system, data size, locality of access, working set size, etc. I suggest benchmarking your particular use-case. (Not that I wouldn't find it useful to have benchmarks for certain archetypical cases from which you can then better infer what to expect in a particular situation.) -- / Peter Schuller

Re: Please need help with Munin: Cassandra Munin plugin problem

2010-08-04 Thread Peter Schuller
JVM mismatch seems likely. -- / Peter Schuller

Re: Two questions : Server crash during compaction and UnavailableException

2010-08-02 Thread Peter Schuller
confirm > it yet) Are you running out of memory (java heap)? If you're running cassandra with default options, it will be running with -XX:+HeapDumpOnOutOfMemoryError. Have you checked the cassandra system.log for garbage collection messages? What is in the last minute or two of logs? -- / Peter Schuller

Re: What's using my memory?

2010-08-01 Thread Peter Schuller
oking at how much memory is used after a concurrent mark/sweep has just finished is a good way of doing this) and then set -Xmx accordingly instead of at 12 GB. -- / Peter Schuller

Re: what causes MESSAGE-DESERIALIZER-POOL to spike

2010-07-30 Thread Peter Schuller
I have about a 25-40% failure rate when the hang occurs. Are the 1% failures you normally experience (pre-freeze) timeouts? -- / Peter Schuller

Re: NullPointerException and Java client hang

2010-07-30 Thread Peter Schuller
art - did it start right after a schema change (keyspace addition)? (I'm just grasping at straws based on a cursory examination; I may be barking up the wrong tree completely.) -- / Peter Schuller Index: src/java/org/apache/cassan

Re: what causes MESSAGE-DESERIALIZER-POOL to spike

2010-07-30 Thread Peter Schuller
blocking on, before so much time has passed that you have thousands of threads waiting. If it *is* a matter of parts of files being unreadable it may be easier to spot using standard I/O rather than mmap():ed I/O since you should clearly see it blocking on a read in that case. -- / Peter Schuller

Re: Cassandra benchmarking on Rackspace Cloud

2010-07-29 Thread Peter Schuller
to control (at least weakly) the independence of resources. This includes at least machine instances and things like EBS volumes. This would be useful not only for scaling purposes but also redundancy purposes. I wonder which cloud provider will be the first (or is there one already?) to provide something like that. -- / Peter Schuller

Re: Cassandra disk space utilization WAY higher than I would expect

2010-07-27 Thread Peter Schuller
speed of writes. -- / Peter Schuller

Re: Cassandra behaviour

2010-07-27 Thread Peter Schuller
rse that depends on the nature of the writes and how expensive they are relative to RTT and/or RPC. FWIW, whenever I have needed a hard "maximum of X per second" rate limit I have implemented or re-used a rate limiter (e.g. a token bucket) for the language in question and used it in my client code. -- / Peter Schuller
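
For reference, a client-side token bucket along those lines can be quite small. The following is a minimal sketch, not the code referred to above: the class name `TokenBucket`, its parameters and the injectable `clock` are all choices made for this illustration.

```python
import time


class TokenBucket:
    """Allow `rate` operations per second on average, with bursts of up to
    `capacity` operations. Hypothetical illustration, not thread-safe."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last call, up to capacity.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self, n=1):
        """Take n tokens if available; return True on success."""
        self._refill()
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False

    def acquire(self, n=1):
        """Block until n tokens are available (n must not exceed capacity)."""
        while not self.try_acquire(n):
            time.sleep(max((n - self.tokens) / self.rate, 0.0))
```

A client would call `bucket.acquire()` before each write. Note that a capacity above 1 permits short bursts above the nominal rate; for a strict limit with no bursts, a capacity of 1 is the conservative choice.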

Re: Cassandra disk space utilization WAY higher than I would expect

2010-07-27 Thread Peter Schuller
tten values. If you're overwriting with larger values, it will no longer be a "doubling" relative to the actual live data set. Julie, did you do over-writes or was your disk space measurements based on the state of the cluster after an initial set of writes of unique values? -- / Peter Schuller

Re: non blocking Cassandra with Tornado

2010-07-27 Thread Peter Schuller
y briefly. Until, for example, a TCP connection stalls and your entire event loop hangs due to a blocking read. Apologies if I'm misunderstanding what you're trying to do. -- / Peter Schuller

Re: what causes MESSAGE-DESERIALIZER-POOL to spike

2010-07-27 Thread Peter Schuller
ng in cache and not having to go down to disk). If this is the case, increasing read concurrency should at least make the actual problem more obvious (i.e., achieving CPU saturation), though it probably won't increase throughput much unless Cassandra is very friendly to hyperthreading.... -- / Peter Schuller

Re: what causes MESSAGE-DESERIALIZER-POOL to spike

2010-07-27 Thread Peter Schuller
sed the fact that ROW-READ-STAGE had 4096 pending and 8 active... oh well.) -- / Peter Schuller

Re: Key Caching

2010-07-27 Thread Peter Schuller
> @Todd, I noticed some new ops in your cassandra.in.sh. Is there any > documentation on what these ops are, and what they do? > > For instance AggressiveOpts, etc. A fairly complete list is here: http://java.sun.com/javase/technologies/hotspot/vmoptions.jsp -- / Peter Schuller

Re: Key Caching

2010-07-26 Thread Peter Schuller
ministic in when a particular object may be collected. Has anyone deployed Cassandra with G1 on very large heaps under real load?) -- / Peter Schuller
