Re: Write speed roughly 1/10 of expected.

2011-03-11 Thread Peter Schuller
you're not bottlenecking there? (I have no idea how fast phpcassa is.) -- / Peter Schuller

Re: Nodes frozen in GC

2011-03-10 Thread Peter Schuller
not be causing the problems you are seeing, in general. Especially not with a 5 gb heap size. I think it is highly likely that there is some little detail/mistake going on here rather than a fundamental issue. But regardless, it would be nice to discover what. -- / Peter Schuller

Re: problem with bootstrap

2011-03-10 Thread Peter Schuller
a future start-up to never trigger bootstrapping.) -- / Peter Schuller

Re: problem with bootstrap

2011-03-10 Thread Peter Schuller
up a cluster, you would not normally want to have a node join the ring without auto_bootstrap enabled since it will serve requests yet be void of any data. -- / Peter Schuller

Re: cassandra and G1 gc

2011-03-09 Thread Peter Schuller
would probably be required under real workloads of different kinds to determine whether G1 seems suitable in its current condition. -- / Peter Schuller

Re: cassandra and G1 gc

2011-03-09 Thread Peter Schuller
a large key cache and row cache I fully expect there to be issues. A large LRU is exactly the type of workload where I seem to consistently break it... -- / Peter Schuller

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
not realistic. -- / Peter Schuller

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
Also: * What is the frequency of the pauses? Are we talking every few seconds, minutes, hours, days? * If you, say, decrease the load down to 25%, are you seeing the same effect but at 1/4th the frequency, or does it remain unchanged, or does the problem go away completely? -- / Peter Schuller

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
. -- / Peter Schuller

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
Cassandra nodes will always freeze for 30 seconds every now and then is also helping no one, other than not being true.) -- / Peter Schuller

Re: Nodes frozen in GC

2011-03-08 Thread Peter Schuller
memory to slabs for different object sizes, when the object size distribution changes. -- / Peter Schuller

Re: recommended way to grow a cluster?

2011-03-08 Thread Peter Schuller
by jumping through some extra hoops (essentially using an 'extra' node so you can do insertions followed by removals instead of moves), but it's more of a hassle. -- / Peter Schuller

Re: Error when bringing up nodes during failure testing

2011-03-08 Thread Peter Schuller
cause data to be replicated. But until that's done, the node should be returning inconsistent results. So, turning off auto_bootstrap probably just hid/changed the symptom of the problem you're seeing rather than fix it. -- / Peter Schuller

Re: Error when bringing up nodes during failure testing

2011-03-08 Thread Peter Schuller
because it's suddenly unclear to me how this is supposed to work with respect to nodes being down (supposing it's truly down, forever, and needs to be replaced). Anyone? -- / Peter Schuller

Re: Nodes frozen in GC

2011-03-06 Thread Peter Schuller
Do you have row cache enabled? Disable it. If it fixes it and you want it, re-enable but consider row sizes and the cap on the cache size. -- / Peter Schuller

Re: Defrag

2011-03-03 Thread Peter Schuller
who knows... But yes, it's certainly a very valid point that concurrent streaming I/O can be an issue. But as you say, given that there is a reasonably small amount of concurrent writes that (except for the commit log) write in large chunks, it should not be a major concern. -- / Peter Schuller

Re: Storing photos, images, docs etc.

2011-03-02 Thread Peter Schuller
is the right approach to the problem since that's more about using the Cassandra data model appropriately. -- / Peter Schuller

Re: Defrag

2011-03-02 Thread Peter Schuller
expect fragmentation is if you run Cassandra on a file system that also does something else that fragments the hell out of it. -- / Peter Schuller

Re: Question about insert performance in multiple node cluster

2011-02-28 Thread Peter Schuller
of bandwidth. -- / Peter Schuller

Re: Question about insert performance in multiple node cluster

2011-02-28 Thread Peter Schuller
. -- / Peter Schuller

Re: Cassandra as write-behind, Cassandra as Cache

2011-02-21 Thread Peter Schuller
correct me if I'm painting a too gloomy picture here.) -- / Peter Schuller

Re: frequent client exceptions on 0.7.0

2011-02-21 Thread Peter Schuller
there. Sorry.) -- / Peter Schuller

Re: Understand eventually consistent

2011-02-21 Thread Peter Schuller
), this essentially implies co-ordination on every read, at all times.) -- / Peter Schuller

Re: memory consuption

2011-02-18 Thread Peter Schuller
() has a tendency to cause swapping out of the JVM heap. But this is not because the process actually uses more memory as such. I didn't read it now but scrolling through it seems the wikipedia article is a pretty good intro: http://en.wikipedia.org/wiki/Virtual_memory -- / Peter Schuller

Re: memory consuption

2011-02-18 Thread Peter Schuller
management technique used by the JVM whether this is practical. (It just occurred to me that Azul should get this for 'free' in their GC. Wonder if that's true.) -- / Peter Schuller

Re: frequent client exceptions on 0.7.0

2011-02-17 Thread Peter Schuller
memtables thresholds far too aggressively for your heap size, resulting in an out-of-memory error which, given Cassandra's default JVM options, results in the node dying. Bottom line: Check /var/log/cassandra/system.log to begin with and see if it's reporting anything or being restarted. -- / Peter

Re: Limit on amount of CFs

2011-02-13 Thread Peter Schuller
limits :) -- / Peter Schuller

Re: Partioning and Sorting is it CF Key or Column Key?

2011-02-13 Thread Peter Schuller
it will not be sorted on key (it will be sorted on ring token). -- / Peter Schuller

Re: Limit on amount of CFs

2011-02-13 Thread Peter Schuller
.) -- / Peter Schuller

Re: Data ends up in wrong Columnfamily

2011-02-11 Thread Peter Schuller
So far so good, but it regularly happens, that data from one application ends up in columnfamilies reserved for the other application as well as the intended columnfamily. Maybe https://issues.apache.org/jira/browse/CASSANDRA-1992 -- / Peter Schuller

Re: Cassandra documentation

2011-02-10 Thread Peter Schuller
complete single handbook style documentation is: http://www.datastax.com/docs/0.7/index Other than that, the wiki is probably the most structured place. -- / Peter Schuller

Re: Out of control memory consumption

2011-02-09 Thread Peter Schuller
with default settings). If you've got a 3 gig heap size and the other nodes stay at 500 mb, the question is why *don't* they increase in heap usage. Unless your 500 mb is the report of the actual live data set as evidenced by post-CMS heap usage. -- / Peter Schuller

Re: Out of control memory consumption

2011-02-09 Thread Peter Schuller
(If you're looking at e.g. jconsole graphs a screenshot of the graph would not hurt.) -- / Peter Schuller

Re: Best way to detect/fix bitrot today?

2011-02-08 Thread Peter Schuller
repair nor read repair is primarily intended to address arbitrary data corruption but rather to reach eventual consistency in the cluster (after writes were dropped, a node went down, etc). -- / Peter Schuller

Re: Best way to detect/fix bitrot today?

2011-02-08 Thread Peter Schuller
arbitrary corruption (anyone?). -- / Peter Schuller

Re: Does variation in no of columns in rows over the column family has any performance impact ?

2011-02-07 Thread Peter Schuller
is a very strong and blanket requirement for making the decision to mix small rows with larger ones. -- / Peter Schuller

Re: Best way to detect/fix bitrot today?

2011-02-07 Thread Peter Schuller
/tentative opinion is that the clean fix is for Cassandra to support strong checksumming at the sstable level. Deploying on e.g. ZFS would help a lot with this, but that's a problem for deployment on Linux (which is the recommended platform for Cassandra). -- / Peter Schuller

Re: Best way to detect/fix bitrot today?

2011-02-07 Thread Peter Schuller
. Checksumming at the sstable level would allow detection of corruption, and regular repair and read-repair would provide self-healing. -- / Peter Schuller

Re: Deleted columns still coming back; CASSANDRA-{1748,1837} alive in 0.6.x?

2011-02-07 Thread Peter Schuller
. -- / Peter Schuller

Re: performance degradation in cluster

2011-02-03 Thread Peter Schuller
against remote machines, if your test is a sequential workload. Adding machines increases aggregate throughput across multiple clients; it won't make individual requests faster (except indirectly of course by avoiding overloaded conditions). -- / Peter Schuller

Re: Tracking down read latency

2011-02-03 Thread Peter Schuller
time columns are massively useful/important; without them one is much more blind as to whether or not storage devices are actually saturated. -- / Peter Schuller

Re: rolling window of data

2011-02-03 Thread Peter Schuller
completed). -- / Peter Schuller

Re: Counters in 0.8 -- conditional?

2011-02-02 Thread Peter Schuller
). It is unrelated to counting rows or columns, unless the application happens to use them for that. -- / Peter Schuller

Re: Counters in 0.8 -- conditional?

2011-02-02 Thread Peter Schuller
with that.) -- / Peter Schuller

Re: Cassandra memory needs

2011-02-02 Thread Peter Schuller
. -- / Peter Schuller

Re: Does Cassandra support range queries on keys ?

2011-01-24 Thread Peter Schuller
to say too much here. Anyone else? (anti-entropy granularity, compaction in-memory thresholds and GC tweaking, etc) -- / Peter Schuller

Re: Forcing GC w/o jconsole

2011-01-24 Thread Peter Schuller
What port number do I actually need? 8080 (well, up until just very recently in trunk). Look at the VM options Cassandra is running with and you'll see the JMX agent port: -Dcom.sun.management.jmxremote.port=8080 -- / Peter Schuller

Re: Does Major Compaction work on dropped CFs? Doesn't seem so.

2011-01-21 Thread Peter Schuller
. GCGraceSeconds with respect to tombstones is different, and refers to garbage collecting tombstones by removing them during compaction. -- / Peter Schuller

Re: the problem of elasticity

2011-01-21 Thread Peter Schuller
you still have concerns that the data has not propagated correctly). -- / Peter Schuller

Re: Question re: the use of multiple ColumnFamilies

2011-01-21 Thread Peter Schuller
to be working around https://issues.apache.org/jira/browse/CASSANDRA-1955 -- / Peter Schuller

Re: How does Bootstrapping work in 0.7 ??

2011-01-20 Thread Peter Schuller
to acquire a new one. Any initial token in the configuration is ignored; it is only the *initial* token, quite literally. Changing the token would require a 'nodetool move' command. -- / Peter Schuller

Re: Multi-tenancy, and authentication and authorization

2011-01-19 Thread Peter Schuller
better overall control over how flushing happens and when writes start blocking, rather than necessarily implying that there can't be more than one memtable (the ticket Stu posted seems to address one such means of control). -- / Peter Schuller

Re: Basic question on distributed delete

2011-01-19 Thread Peter Schuller
you're doing, the critical point is that you should run 'nodetool repair' often enough relative to GCGraceSeconds: http://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair -- / Peter Schuller
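As a rough illustration of that rule (a sketch, not part of the original message): the check below assumes that the time between completed repairs on a node has to stay below GCGraceSeconds, so that tombstones are propagated before they become eligible for removal. The function name and the 10-day value are assumptions made for the example; use whatever GCGraceSeconds is actually configured.

```python
def repair_interval_is_safe(repair_interval_seconds, gc_grace_seconds):
    """Return True if repair runs more often than tombstones can expire."""
    if repair_interval_seconds <= 0 or gc_grace_seconds <= 0:
        raise ValueError("intervals must be positive")
    return repair_interval_seconds < gc_grace_seconds


DAY = 24 * 60 * 60
GC_GRACE = 10 * DAY  # assumed GCGraceSeconds (864000 s); check your own setting

print(repair_interval_is_safe(7 * DAY, GC_GRACE))   # weekly repair
print(repair_interval_is_safe(14 * DAY, GC_GRACE))  # repair every two weeks
```

In practice the interval should leave margin for the time a repair takes to finish, which this sketch does not model.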

Re: Question re: the use of multiple ColumnFamilies

2011-01-18 Thread Peter Schuller
stated it (some cf:s that got written at a slow pace and just happened to flush at the same time), that should be enough. (If you have some CF:s being written to faster than they are flushed, there would still be potential for one CF to hog the flush writers unfairly.) -- / Peter Schuller

Re: balancing load

2011-01-17 Thread Peter Schuller
@Peter Isn't clean up a special case of compaction? IE it works as a major compaction + removes data not belonging to the node? Yes, sorry. Brain lapse. Ignore me. -- / Peter Schuller

Re: Cassandra GC Settings

2011-01-17 Thread Peter Schuller
sure you're not swapping a bit? Also, what do you mean by node instability - does it *completely* stop responding during these periods or does it flap in and out of the cluster but is still responding? Are your nodes disk bound or CPU bound during compaction? -- / Peter Schuller

Re: quorum calculation seems to depend on previous selected nodes

2011-01-17 Thread Peter Schuller
Adding CL.TWO would be easy enough. :) True, but the obvious generalization is to be able to select an arbitrary replica count and that seemed like a bigger change to the API. But if CL.TWO would be considered clean enough... I may submit a jira/patch. -- / Peter Schuller

Re: Cassandra GC Settings

2011-01-17 Thread Peter Schuller
, decreasing: in_memory_compaction_limit_in_mb: 64 from the default of 64 might help here I suppose. -- / Peter Schuller

Re: How can I correct this Cassandra load imbalance?

2011-01-16 Thread Peter Schuller
all)? Given sufficient amounts of overwrites/deletions, variations in compacting timing could account for differences. -- / Peter Schuller

Re: balancing load

2011-01-16 Thread Peter Schuller
add them. If you're looking to add e.g. 25% more nodes that requires moving nodes around. Be aware that moving a node implies decommission + bootstrap, so the node will temporarily not be contributing to cluster capacity during the move. -- / Peter Schuller
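To see why a 25% increase forces moves, here is a small sketch. It assumes RandomPartitioner with a token space of 2**127 and a ring kept balanced by spacing tokens evenly; `balanced_tokens` is a hypothetical helper, not a Cassandra API. Growing from 4 to 5 nodes changes every balanced token except 0, whereas doubling to 8 keeps all existing tokens in place.

```python
RING_SIZE = 2 ** 127  # assumption: RandomPartitioner token space


def balanced_tokens(node_count):
    """Evenly spaced initial tokens for a ring of node_count nodes."""
    if node_count < 1:
        raise ValueError("need at least one node")
    return [i * RING_SIZE // node_count for i in range(node_count)]


old = set(balanced_tokens(4))
grow_25_percent = set(balanced_tokens(5))
doubled = set(balanced_tokens(8))

# Tokens that existing nodes could keep without being moved.
print(len(old & grow_25_percent))  # only token 0 survives
print(len(old & doubled))          # all four existing tokens survive
```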

Re: Problem starting Cassandra on Ubuntu

2011-01-15 Thread Peter Schuller
to the version that was installed by the package; there is probably a mistake in there somewhere. (Yes, the error message could of course be more informative...) -- / Peter Schuller

Re: Storing big objects into columns

2011-01-14 Thread Peter Schuller
this method ? Not in production, but I've done testing with values on the order of a few megs. Expect compaction to be entirely disk bound rather than CPU bound. Make sure latency is acceptable even when data sizes grow beyond memory size. -- / Peter Schuller

Re: about the data directory

2011-01-13 Thread Peter Schuller
. -- / Peter Schuller

Re: about the data directory

2011-01-13 Thread Peter Schuller
line, to answer that question for a particular key. -- / Peter Schuller

Re: about the data directory

2011-01-13 Thread Peter Schuller
the data? If *all* nodes in the replica set for a particular row are down, then you won't be able to read that row, no. -- / Peter Schuller

Re: Node Inconsistency

2011-01-12 Thread Peter Schuller
for a column, that is older than the tombstone that indicates its removal. Hence, the expiry of the tombstones is safe. -- / Peter Schuller

Re: Advice wanted on modeling

2011-01-12 Thread Peter Schuller
expecting write load to be high such that performance of writes (and compaction) is a concern, or is it mostly about slowly building up huge amounts of data that you want to be compact on disk? -- / Peter Schuller

Re: Node Inconsistency

2011-01-11 Thread Peter Schuller
because the constraints of the cluster were violated (i.e., expired tombstones prior to nodetool repair having been run). I hope that clarifies. -- / Peter Schuller

Re: Confused about CASSANDRA-1417; saving row cache

2011-01-11 Thread Peter Schuller
, you do have to consider the expected start-up time when sizing your row cache. -- / Peter Schuller

Re: upgrading to 0.7 from 0.6.x

2011-01-11 Thread Peter Schuller
for the schema that you actually had in 0.6) the data will be available again. I.e., the moment the schema is created it is essentially created pre-populated with your sstables instead of empty. -- / Peter Schuller

Re: Confused about CASSANDRA-1417; saving row cache

2011-01-11 Thread Peter Schuller
? -- / Peter Schuller

Re: Confused about CASSANDRA-1417; saving row cache

2011-01-11 Thread Peter Schuller
the write it officially ACK:ed. Cassandra would now be violating the consistency guarantees it pretended to have. (That is ignoring any potential issues directly resulting from a node having an internally inconsistent state w.r.t. what's actually stored on the node.) -- / Peter Schuller

Re: Node Inconsistency

2011-01-10 Thread Peter Schuller
, and if all your discrepancies are in the form of forgotten deletes, then GCGraceSeconds/repair seems like a likely candidate cause. -- / Peter Schuller

Re: Node Inconsistency

2011-01-10 Thread Peter Schuller
/browse/CASSANDRA-1316. Subtle, indeed. I've attempted to document this here: http://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair http://wiki.apache.org/cassandra/Operations#Dealing_with_the_consequences_of_nodetool_repair_not_running_within_GCGraceSeconds -- / Peter

Re: Question re: the use of multiple ColumnFamilies

2011-01-08 Thread Peter Schuller
do non-rate-limited writes to the cluster, results are probably negative. I'll file a bug about this to (1) elicit feedback if I'm wrong, and (2) to fix it. -- / Peter Schuller

Re: Question re: the use of multiple ColumnFamilies

2011-01-08 Thread Peter Schuller
Filed: https://issues.apache.org/jira/browse/CASSANDRA-1955 -- / Peter Schuller

Re: How can I correct this Cassandra load imbalance?

2011-01-06 Thread Peter Schuller
in the size of rows (or very few rows). I was hoping someone else would spot something but the thread seems dead still :) -- / Peter Schuller

Re: How can I correct this Cassandra load imbalance?

2011-01-06 Thread Peter Schuller
of very large rows that are screwing up the statistics? -- / Peter Schuller

Re: The CLI sometimes gets 100 results even though there are more, and sometimes gets more than 100

2011-01-05 Thread Peter Schuller
, with multiple queries. -- / Peter Schuller

Re: Cassandra 0.7 - Query on network topology

2011-01-05 Thread Peter Schuller
will take lots of iops. -- / Peter Schuller

Re: The CLI sometimes gets 100 results even though there are more, and sometimes gets more than 100

2011-01-05 Thread Peter Schuller
I know that there's a limit, and I just assumed that the CLI set it to 100, until I saw more than 100 results. Ooh, sorry. Didn't read carefully enough. Not sure why you see that behavior. Sounds strange; should not be supported at the thrift level AFAIK. -- / Peter Schuller

Re: Reclaim deleted rows space

2011-01-04 Thread Peter Schuller
is a background operation. It is never the case that reads or writes somehow wait for compaction to complete. -- / Peter Schuller

Re: Cassandra disk usage and failure recovery

2011-01-04 Thread Peter Schuller
, that should be a minor issue because it should be easy to avoid the possibility of getting to the point of not fitting a single (size limited) sstable. -- / Peter Schuller

Re: Reclaim deleted rows space

2011-01-04 Thread Peter Schuller
-existing silver bullet in a current release; you probably have to live with the need for greater-than-theoretically-optimal memory requirements to keep the working set in memory. -- / Peter Schuller

Re: meaning of eventual consistency in Cassandra ?

2011-01-03 Thread Peter Schuller
mutation, concurrent readers may see a partially applied batch mutation for a given row even though it is only being written to a single node. -- / Peter Schuller

Re: Unable to integrate Cassandra with Hadoop

2010-12-30 Thread Peter Schuller
instead of byte[]. This would affect the thrift API and some internal stuff in Cassandra. Likely whatever code is passing a byte[] needs to be updated. (I haven't used the hadoop support though so I'm not sure what the culprit is specifically. Perhaps someone can fill in.) -- / Peter Schuller

Re: java.io.IOException: No space left on device

2010-12-24 Thread Peter Schuller
), such a feature could be turned off. But it would make it less easy to accidentally write yourself into a corner. -- / Peter Schuller

Re: Cassandra Node Routinely Goes Down - 0.7 RC2

2010-12-22 Thread Peter Schuller
will probably be 256 mb. And with out-of-the-box cassandra.yaml settings I would not be surprised if you're dying with an OutOfMemory error and possibly with high amounts of GC activity before actually dying. -- / Peter Schuller

Re: java.io.IOException: No space left on device

2010-12-22 Thread Peter Schuller
is being sent (anyone?). In any case: Monitoring disk-space is very very important. BTW what precisely does the Owns column mean? The percentage of the token space owned by the node. -- / Peter Schuller
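As a hedged sketch of what that percentage means: assuming RandomPartitioner (token space of 2**127) and that each node owns the range from the previous node's token, exclusive, up to its own token, inclusive, the share per node can be computed as below. The `ownership` helper is hypothetical and only mirrors the definition given above; it is not how nodetool itself is implemented.

```python
from fractions import Fraction

RING_SIZE = 2 ** 127  # assumption: RandomPartitioner token space


def ownership(tokens):
    """Map each (distinct) token to the fraction of the ring its node owns."""
    ordered = sorted(tokens)
    shares = {}
    for i, token in enumerate(ordered):
        previous = ordered[i - 1]  # index -1 wraps around to the last token
        span = (token - previous) % RING_SIZE
        if span == 0:  # a single node owns the whole ring
            span = RING_SIZE
        shares[token] = Fraction(span, RING_SIZE)
    return shares


shares = ownership([0, 2 ** 125])
for token, share in sorted(shares.items()):
    print(token, f"{float(share) * 100:.2f}%")
```

With these two tokens the ring is unbalanced: the node at 2**125 owns a quarter of the token space and the node at 0 owns the remaining three quarters.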

Re: Cassandra Node Routinely Goes Down - 0.7 RC2

2010-12-22 Thread Peter Schuller
system to offer specific claims as to what problem you triggered. -- / Peter Schuller

Re: Cassandra Monitoring

2010-12-19 Thread Peter Schuller
munin/zabbix/etc. It would be pretty nice to have that out of the box with Cassandra, though I expect that would be considered bloat. :) -- / Peter Schuller

Re: Read Latency Degradation

2010-12-18 Thread Peter Schuller
. This means that while latency is a great indicator to look at to judge what the current user perceived behavior is, it is *not* a good thing to look at to extrapolate resource demands or figure out how far you are from saturation / need for more hardware. -- / Peter Schuller

Re: Read Latency Degradation

2010-12-18 Thread Peter Schuller
for extended periods once larger multi-hundreds-of-gig sstables are being compacted. However, that said, if you are just continually increasing your sstable count (rather than there just being spikes) that indicates compaction is not keeping up with write traffic. -- / Peter Schuller

Re: Read Latency Degradation

2010-12-18 Thread Peter Schuller
.) We should be careful not to mislead people. Talking about 16TB XFS setup, or 100TB/node without any difficulties, seems very very far from the common use case. I completely agree. I didn't mean to imply that and I hope no one was misled. -- / Peter Schuller

Re: Read Latency Degradation

2010-12-18 Thread Peter Schuller
to object if any information is misleading. -- / Peter Schuller

Re: Read Latency Degradation

2010-12-18 Thread Peter Schuller
is going to be whether caching is effective at all, and how much additional caching would help. In any case, it would be interesting to know whether you are seeing more disk seeks per read than you should. -- / Peter Schuller

Re: Read Latency Degradation

2010-12-18 Thread Peter Schuller
really does seem indicative of a problem, unless the bottleneck is legitimately reads from disk from multiple sstables resulting from rows being spread over said sstables. -- / Peter Schuller

Re: Memory leak with Sun Java 1.6 ?

2010-12-16 Thread Peter Schuller
never worked properly for me in my browsers... too broken-ajaxy.) -- / Peter Schuller

Re: [SOLVED] Very high memory utilization (not caused by mmap on sstables)

2010-12-16 Thread Peter Schuller
Sorry for spam again. :-) No, thanks a lot for tracking that down and reporting details! Presumably a significant number of users are on that version of Ubuntu running with openjdk. -- / Peter Schuller

Re: Unstable cassandra - , Cannot allocate memory

2010-12-15 Thread Peter Schuller
for the ln fix.) -- / Peter Schuller
