On Mon, Mar 7, 2011 at 11:32 AM, John Lewis lewili...@gmail.com wrote:
When you say decent latency and throughput, what numbers do you consider
decent? I know throughput would be highly dependent on the quantity of KB
shoved through the pipe, so I would expect throughput needs would be highly
Hi,
Can we define the consistency level in the yaml file (or at the time of designing
the Cassandra data model)? My question may sound stupid since I'm still in the
process of understanding Cassandra :)...
Thanks and regards
sagar
Yeah, this makes sense as far as I can tell.
Bye,
Norman
2011/3/8 Aditya Narayan ady...@gmail.com
My application displays a list of several blogs' overview data (like
blogTitle / nameOfBlogger / shortDescription for each blog) on the 1st page (in
a very similar manner to Digg's newsfeed) and
It looks like the node is sending out its application state and waiting the
required time, after which it expects to know about all the other nodes in the
cluster.
INFO [main] 2011-03-07 17:04:06,660 StorageService.java (line 399) Joining:
sleeping 3 ms for pending range setup
For some reason
Hello,
According to the Wiki/StorageConfiguration page, auto_bootstrap is
described as below:
auto_bootstrap
Set to 'true' to make new [non-seed] nodes automatically migrate the
right data to themselves. (If no InitialToken is specified, they will
pick one such that they will get half the
Just checking the version of Thrift: you said 0.7.2, but the latest stable is 0.6.
Unfortunately, for Cassandra 0.6 you need to match a specific SVN release of
Thrift, see http://wiki.apache.org/cassandra/InstallThrift For Cassandra 0.6.12
it's r917130
Is there a reason you are using cassandra
Consistency is set by the client for each read or write request. You define
the Replication Factor when creating the Keyspace, either in cassandra.yaml or
as part of the create keyspace statement using the cassandra-cli.
For background...
Check the docs if any for the high level client you
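The interaction between per-request consistency levels and the Replication Factor can be sketched in a few lines of Python. This is a toy illustration of the well-known overlap rule (reads see the latest write whenever R + W > RF), not Cassandra code:

```python
def quorum(rf):
    """Quorum for a replication factor: a majority of replicas."""
    return rf // 2 + 1

def is_strongly_consistent(rf, read_replicas, write_replicas):
    """Reads are guaranteed to see the latest write when the read and
    write replica sets must overlap, i.e. R + W > RF."""
    return read_replicas + write_replicas > rf

rf = 3
assert quorum(rf) == 2
# QUORUM reads + QUORUM writes always overlap: strong consistency
assert is_strongly_consistent(rf, quorum(rf), quorum(rf))
# CL=ONE for both reads and writes does not overlap at RF=3
assert not is_strongly_consistent(rf, 1, 1)
```

The same arithmetic explains why ONE/ALL or ALL/ONE combinations also give strong consistency, at the cost of availability on one side.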
You could duplicate the data from CF1 in CF2 as well (use a batch_mutation
through whatever client you have). So when serving the second page you only
need to read one row from CF2.
Aaron
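The denormalization Aaron describes (one batched write keeping CF2 in sync with CF1, so the second page is a single row read) can be sketched with a toy in-memory model. The dict-based `store` and the CF1/CF2 row keys are purely illustrative, not a real client API:

```python
# Toy model of two column families kept in sync by one batched write.
store = {"CF1": {}, "CF2": {}}

def batch_mutate(mutations):
    """Apply all column writes across column families in one logical batch."""
    for cf, rows in mutations.items():
        for row_key, columns in rows.items():
            store[cf].setdefault(row_key, {}).update(columns)

blog = {"blogTitle": "My Blog", "nameOfBlogger": "ady", "shortDescription": "..."}
batch_mutate({
    "CF1": {"blog:42": blog},                                     # canonical copy
    "CF2": {"overview-page-2": {"blog:42": blog["blogTitle"]}},   # denormalized
})

# Serving the second page is now a single row read from CF2:
overview = store["CF2"]["overview-page-2"]
```

The trade-off is the usual one: writes touch two places, but the read path for the overview page no longer fans out across many CF1 rows.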
On 8/03/2011, at 8:13 PM, Norman Maurer wrote:
Yeah, this makes sense as far as I can tell.
Bye,
AFAIK yes. The node marks itself as bootstrapped whenever it starts, and will
not re-bootstrap once that is set.
More info here
http://wiki.apache.org/cassandra/Operations#Bootstrap
Hope that helps.
Aaron
On 8/03/2011, at 9:35 PM, Maki Watanabe wrote:
Hello,
According to the
One of the issues with EC2 is that after a reboot the internal IP
changes. This caused a big problem for me yesterday.
On Mar 8, 2011 2:29 AM, aaron morton aa...@thelastpickle.com wrote:
Not sure this fits your problem, but if you pass
-Dcassandra.load_ring_state=false as a JVM option it will stop
Yes Aaron, I thought about that, but that doesn't seem to be just a small
amount of data either (it contains text). But yes, we can consider doing so
later as we find the need for it..
Thank you both!
On Tue, Mar 8, 2011 at 2:25 PM, aaron morton aa...@thelastpickle.com wrote:
You could duplicate the
On Tue, Mar 8, 2011 at 1:53 AM, Jeffrey Wang jw...@palantir.com wrote:
Hi all,
When I drop a column family, it creates a snapshot. When does the snapshot
go away and free up the disk space? I was able to run nodetool clearsnapshot
to get rid of them, but will they go away themselves?
On 03/07/2011 10:08 PM, Aaron Morton wrote:
You can fill your boots.
So long as your boots have a capacity of 2 billion.
Background ...
http://wiki.apache.org/cassandra/LargeDataSetConsiderations
http://wiki.apache.org/cassandra/CassandraLimitations
I do not know of any articles I could send your way, and others may have some
tales from running production systems. But here are a few thoughts, others
please correct me if I am wrong:
- the replication factor is not intended to be changed on a running system. It
can be, but it will be a
Thx!
2011/3/8 aaron morton aa...@thelastpickle.com:
AFAIK yes. The node marks itself as bootstrapped whenever it starts, and
will not re-bootstrap once that is set.
More info here
http://wiki.apache.org/cassandra/Operations#Bootstrap
Hope that helps.
Aaron
On 8/03/2011, at 9:35 PM, Maki
2011/3/8 Chris Goffinet c...@chrisgoffinet.com
How large are your SSTables on disk? My thought was that because you have so
many on disk, we have to store the bloom filter + every 128th key from the index
in memory.
0.5GB
But as I understand it, storing in memory happens only when a read happens, and
I do only
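The per-SSTable memory cost mentioned above can be sketched roughly: for each SSTable, the bloom filter plus a sample of every 128th index entry stays resident. The key counts below are made up for illustration; only the 128-entry interval follows the mail:

```python
# Sketch of index sampling: keep every 128th key of a sorted SSTable index
# in memory, so lookups seek from the nearest sampled entry.
INDEX_INTERVAL = 128

def index_sample(sorted_keys, interval=INDEX_INTERVAL):
    """Return the subset of index entries kept resident in memory."""
    return sorted_keys[::interval]

keys = ["key%06d" % i for i in range(1000)]
sample = index_sample(keys)
# 1000 keys sampled at interval 128 -> 8 resident entries per SSTable;
# many small SSTables multiply this (and their bloom filters) accordingly.
assert len(sample) == 8
```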
Hi all,
This month's London user group will be on the topic of Hadoop integration.
If anyone is interested in sharing knowledge about how they use Hadoop with
Cassandra then please get in touch, there are some speaker slots available.
If you'd like to learn more then please come along!
If RF=2 and CL= QUORUM, you're getting no benefit from replication. When a
node is in GC it stops everything. Set RF=3, so when one node is busy the
cluster will still work.
On Tue, Mar 8, 2011 at 11:46 AM, ruslan usifov ruslan.usi...@gmail.com wrote:
2011/3/8 Chris Goffinet
(1) I cannot stress this one enough: Run with -XX:+PrintGC
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps and collect the output.
Actually, I wonder if it's worth someone getting this enabled by
default, with the obvious problems associated with getting the log
output placed appropriately and
Sagar,
Consistency level defines how your reads and writes should work. You can
vary it according to your needs; it defines what your expectations are
when you are reading/writing data. Hence, it is not static in the
Keyspace/CF metadata.
With regards,
Mayank
On 08-03-2011 13:15, Sagar Kohli
Also:
* What is the frequency of the pauses? Are we talking every few
seconds, minutes, hours, days?
* If you, say, decrease the load down to 25%, are you seeing the same
effect but at 1/4th the frequency, or does it remain unchanged, or
does the problem go away completely?
--
/ Peter Schuller
I've run into this issue as well when running a test instance on my laptop.
In the office (where I set it up) I have no issues, go outside the office
on a different network, different story. I'll try your suggestion, Aaron.
On Tue, Mar 8, 2011 at 12:43 AM, Sasha Dolgy sdo...@gmail.com wrote:
2011/3/8 Peter Schuller peter.schul...@infidyne.com
(1) I cannot stress this one enough: Run with -XX:+PrintGC
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps and collect the output.
(2) Attach to your process with jconsole or some similar tool.
(3) Observe the behavior of the heap over time.
JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"
Add:
JVM_OPTS="$JVM_OPTS -XX:+PrintGC"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCTimeStamps"
And you will see significantly more detail in the GC log.
--
/
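Once those options are in place, the stop-the-world pauses can be pulled out of the GC log with a short script. A sketch; the line format assumed here is the usual HotSpot output for -XX:+PrintGCApplicationStoppedTime, and the sample log lines are made up:

```python
import re

# Extract stop-the-world pause durations from a HotSpot GC log.
PAUSE_RE = re.compile(
    r"Total time for which application threads were stopped: "
    r"([0-9.]+) seconds")

def pauses(lines):
    """Return pause durations (seconds) found in the given log lines."""
    result = []
    for line in lines:
        m = PAUSE_RE.search(line)
        if m:
            result.append(float(m.group(1)))
    return result

log = [
    "Total time for which application threads were stopped: 0.0042310 seconds",
    "1.234: [GC 1.234: [ParNew: ...]",
    "Total time for which application threads were stopped: 2.5031200 seconds",
]
# A multi-second pause like the second one is what correlates with
# client-side timeouts during GC.
assert max(pauses(log)) > 2.0
```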
$client->batch_mutate($mutations,
cassandra_ConsistencyLevel::QUORUM);
Btw, what are the mutations? Are you doing something like inserting
both very small values and very large ones?
In any case: My main reason to butt back into this thread is that
under normal circumstances you
Also, why is there so much garbage collection to begin with? Memcache
uses a slab allocator to reuse blocks to prevent allocation/deallocation
of blocks from consuming all the cpu time. Are there any plans to reuse
blocks so the garbage collector doesn't have to work so hard?
And to address
Hi,
I've a small test cluster, 2 servers, both successfully running
Cassandra 0.7.3. I've three keyspaces, two with RF=1, one with RF=3. Now
when I try to bootstrap a 3rd server (empty initial_token,
auto_bootstrap: true), I get this exception on the new server.
INFO 23:13:43,229 Joining: getting
I never saw this before upgrading to 0.7.3, but now when I do nodetool repair
it sits there for hours. Previously it took about 20 minutes per
node (about 10GB of data per node).
I had some OOM crashes, but haven't seen them since I increased the heap
size and decreased the key cache.
In
I just saw repair hang here too, it's actually very easy to reproduce. I'm
looking at it right now.
--
Sylvain
On Tue, Mar 8, 2011 at 4:30 PM, Karl Hiramoto k...@hiramoto.org wrote:
I never saw this before upgrading to 0.7.3, but now when I do nodetool repair
it sits there for hours.
On 08/03/2011 16:34, Sylvain Lebresne wrote:
I just saw repair hang here too, it's actually very easy to reproduce.
I'm looking at it right now.
--
Thanks. Should I bump GCGraceSeconds since I can no longer repair?
I tried repair on 3 nodes of a 6-node cluster and they all hang.
- When adding nodes to a cluster, it's more efficient if you can change the
ranges of existing nodes to be a subset of what they were responsible for
previously. That way a node only has to stream out data, rather than stream out
and stream in data. Say you have this contrived example (where values
Is he trying to bootstrap? What does that have to do with failure
recovery? Doesn't make sense to me.
On Tue, Mar 8, 2011 at 2:33 AM, aaron morton aa...@thelastpickle.com wrote:
It looks like the node is sending out its application state and waiting the
required time, after which it expects to
On 08/03/2011 17:09, Jonathan Ellis wrote:
No.
What is the history of your cluster?
It started out as 0.7.0-RC3, and I've upgraded to 0.7.0, 0.7.1, 0.7.2, and
0.7.3 within a few days after each was released.
I have 6 nodes with about 10GB of data each, RF=2. Only one CF; every
row/column has a
Trying out stress.py in an AWS EC2 environment (4 Large instances, each
with 2 cores and 7.5GB RAM, all in the same region/zone).
python stress.py -o insert -d
10.253.203.224,10.220.203.48,10.220.17.84,10.124.89.81 -l 2 -e ALL -t
10 -n 500 -S 100 -k
(I want to try with column size of about 1MB.
On Tue, Mar 8, 2011 at 1:25 AM, Sylvain Lebresne sylv...@datastax.com wrote:
And it's far easier for you to know what to do with the snapshot
(whether that is deleting it or archiving it somewhere) than for the
application.
Snapshots also have the neat property of not being the full size of
Thanks for the reply, I realize my question was rather nebulous as I consider
this proposed deployment to be rather nebulous as well. Any bit of information
and a direction on which sections of documentation are relevant helps this
challenge become less nebulous over time. I will do some
I turned auto_bootstrap off and it worked fine. I don't think it's a
connectivity issue or network issue at all. I am very confused about what's
going on here. Can you please let me know if this is a bug that I am facing?
Also, what are the disadvantages of turning off auto bootstrap? Do I need to
do anything after the fact?
Inserting a new node into a ring without auto_bootstrap implies that
it will join the ring, but will not contain any data for which it is
supposedly responsible. A 'nodetool repair' should
2) When I brought 2 nodes down (out of 3), I was able to start one node
(with 66 % load below) even though auto_bootstrap is set to true. Shouldn't
it have failed for the same reason?
This is a good point/question. As far as I can tell, a node being
bootstrapped would need to receive data from
Thanks for the reply!
Not really:
- range scans do not perform read repair
OK, I obviously overlooked that RangeSliceResponseResolver does not repair rows
on nodes that never saw a write for a given key at all. But that's not a big
problem for us, since we are mainly interested in fixing
2011/3/8 A J s5a...@gmail.com
Trying out stress.py on AWS EC2 environment (4 Large instances. Each
of 2-cores and 7.5GB RAM. All in the same region/zone.)
python stress.py -o insert -d
10.253.203.224,10.220.203.48,10.220.17.84,10.124.89.81 -l 2 -e ALL -t
10 -n 500 -S 100 -k
(I want
Hi Ruslan,
Is it possible for you to tell us the details of what you have done
that measurably helped your situation, so we can start a best-practices
doc on growing Cassandra systems?
So far, I see that under load, Cassandra is rarely ready to take heavy
load in its default configuration, and
I am as clear as mud with what is happening here :)
But with some suggestions I can try to start my test from scratch and post
results in that order.
2011/3/8 Paul Pak p...@yellowseo.com
Hi Ruslan,
Is it possible for you to tell us the details on what you have done which
measurably helped your situation, so we can start a best practices doc on
growing cassandra systems?
So far, I see that under load, cassandra is rarely ready to take
Anything in Dallas?
From: Jake Luciani [mailto:jak...@gmail.com]
Sent: Tuesday, March 08, 2011 12:53 PM
To: user@cassandra.apache.org
Subject: Re: Cassandra Meetup in Austin, TX
There is also a newly formed NYC area Cassandra User Group
http://www.meetup.com/NYC-Cassandra-User-Group
On Tue,
Hello,
that was the way I was thinking about; actually it's written up at
https://gist.github.com/744761
But any hint on how to get that data from the HTTP server into Zabbix?
Thanks
2011/3/8 ruslan usifov ruslan.usi...@gmail.com
You can simply write your own Java agent (this doesn't require change of
I suspect you are in the case of
https://issues.apache.org/jira/browse/CASSANDRA-2290.
That is, some neighbor node died or was unable to perform its part of the
repair. You can always retry, making sure all nodes are and stay alive, to see
if it is the former. But seeing the other exception in
Is this a client-side timeout or a server-side one? What does the error stack
look like?
Also check the server-side logs for errors. The thrift API will raise a timeout
when fewer than CL nodes return within rpc_timeout.
Good luck
Aaron
On 9/03/2011, at 7:37 AM, ruslan usifov wrote:
Did you run scrub as soon as you updated to 0.7.3?
And did you have problems/exceptions before running scrub?
If yes, did you have problems with only 0.7.3 or also with 0.7.2?
If the problems started with running scrub, since it takes a snapshot
before running, can you try restarting a test
I had similar errors in late 0.7.3 releases, related to testing I did for the
mails with the subject "Argh: Data Corruption (LOST DATA) (0.7.0)".
I do not see these corruptions or the above error anymore with the 0.7.3 release,
as long as the dataset is created from scratch. The patch (2104) mentioned
in the
I think this is not the right functionality, and it is really odd that you can't
successfully bring it online without turning off bootstrap, BUT you can bring
it online by turning auto_bootstrap off and then running nodetool repair
afterwards.
Also, if that's the case, then when one node goes down, say out
On 03/08/11 21:45, Sylvain Lebresne wrote:
Did you run scrub as soon as you updated to 0.7.3?
Yes, within a few minutes of starting up 0.7.3 on the node.
And did you have problems/exceptions before running scrub?
Not sure.
If yes, did you have problems with only 0.7.3 or also with 0.7.2?
Client side (it is just a 5th instance in the same EC2 zone, having
stress.py installed on it) gives the following error:
Process Inserter-4:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/multiprocessing/process.py", line 232, in _bootstrap
    self.run()
  File "stress.py", line
And there are three people here in Zurich if anyone else is lurking ... we
haven't organized beer + discussion yet.
On Tue, Mar 8, 2011 at 7:52 PM, Jake Luciani jak...@gmail.com wrote:
There is also a newly formed NYC area Cassandra User Group
http://www.meetup.com/NYC-Cassandra-User-Group
On Tue,
Cool, so it's server side, because:
- in the client-side stack the thrift code is raising the error
- the server-side log has this: DEBUG 22:29:10,318 ... timed out
The TimedOutException is raised when the number of replicas required by your CL
have not returned inside the timespan specified by
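Clients typically handle such server-side timeouts by retrying with a short backoff, since the write may well succeed once the slow replica recovers. A minimal sketch; the `TimedOutError` class below merely stands in for the Thrift TimedOutException your client library raises, and nothing here is a real client API:

```python
import time

class TimedOutError(Exception):
    """Stand-in for a server-side timeout: fewer than CL replicas
    responded within rpc_timeout."""

def with_retries(request, attempts=3, backoff=0.05):
    """Call request(); on timeout, retry with linear backoff,
    re-raising after the last attempt."""
    for attempt in range(attempts):
        try:
            return request()
        except TimedOutError:
            if attempt == attempts - 1:
                raise
            time.sleep(backoff * (attempt + 1))

calls = []
def flaky():
    """Simulated request that times out twice, then succeeds."""
    calls.append(1)
    if len(calls) < 3:
        raise TimedOutError("fewer than CL replicas responded")
    return "ok"

assert with_retries(flaky) == "ok"
assert len(calls) == 3
```

Note that retrying a write is safe here because Cassandra writes are idempotent for a given (key, column, timestamp).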
alienth on irc is reporting the same error. His path was 0.6.8 to
0.7.1 to 0.7.3.
It's probably a bug in scrub. If we can get an sstable exhibiting the
problem posted here or on Jira that would help troubleshoot.
On Tue, Mar 8, 2011 at 10:31 AM, Karl Hiramoto k...@hiramoto.org wrote:
On
On Tue, Mar 8, 2011 at 1:56 PM, Sanchez, Carlos carlos.sanc...@msci.com wrote:
Anything in Dallas?
Funny you should ask, on March 22nd there's:
http://dbdmh.eventbrite.com
Informal get-together more than a real event, but Cassandra has
come up as a topic and I suspect it would be a good
Turn on debug logging and see if the output looks like what I posted
to https://issues.apache.org/jira/browse/CASSANDRA-2296
It *may* be harmless depending on where those zero-length rows are
coming from. I've added asserts to 0.7 branch that fire if we attempt
to write a zero-length row, so if
Looks like it is harmless -- Scrub would write a zero-length row when
tombstones expire and there is nothing left, instead of writing no row
at all. Fix attached to the jira ticket.
On Tue, Mar 8, 2011 at 8:58 PM, Jonathan Ellis jbel...@gmail.com wrote:
It *may* be harmless depending on where
Do the overwrites of newly written columns (that are present in the
memtable) *replace the old column*, or is it just a simple append?
I am trying to understand whether, if I update these columns very
frequently (while they are in the memtable), the read performance of
these columns gets affected,
Multiple writes for the same key and column will result in overwriting of the
column in a memtable. Basically, multiple updates for the same (key, column) are
reconciled based on the column's timestamp. This happens per memtable. So if
a memtable is flushed to an sstable, this rule will be valid for the next
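That reconciliation rule can be sketched as a toy memtable (illustrative only; real tie-breaking compares values, which is simplified here to "later write wins"):

```python
# Toy memtable: one entry per (key, column), reconciled by timestamp,
# so only the winning version of a column is resident at any time.
memtable = {}

def write(key, column, value, timestamp):
    """Overwrite the existing (key, column) entry only if this write's
    timestamp is newer (ties resolved in favor of the later write here,
    a simplification of Cassandra's value-based tie-break)."""
    current = memtable.get((key, column))
    if current is None or timestamp >= current[1]:
        memtable[(key, column)] = (value, timestamp)

write("user1", "name", "old", timestamp=100)
write("user1", "name", "new", timestamp=200)
write("user1", "name", "stale", timestamp=150)  # loses: older timestamp

assert memtable[("user1", "name")] == ("new", 200)
assert len(memtable) == 1  # only one version resides in the memtable
```

This is why heavy in-memtable overwrites don't degrade reads: the read still sees a single column, not a chain of appended versions.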
So this means that in a memtable only the most recent version of a
column will reside!? For this implementation, while writing to the
memtable Cassandra will see if there are other versions and will
overwrite them (reconciliation while writing)!?
I know that different SSTables may have different