Sorry to intrude on this thread, but my intention is to get clarity on
read_repair_chance.
Our reads don't need near-real-time data, so all our reads use CL.ONE. In
this case, how does read repair happen on the replicas? What should be the
ideal value of read_repair_chance in this case?
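Not part of the original question, but for illustration: a minimal pycassa sketch of reading at CL.ONE (keyspace, CF and host names are made up). read_repair_chance is a per-column-family property, so when it fires the other replicas are checked and repaired in the background regardless of the consistency level the client asked for.

import pycassa
from pycassa.cassandra.ttypes import ConsistencyLevel

pool = pycassa.ConnectionPool('my_keyspace', server_list=['10.0.0.1:9160'])

# All reads at CL.ONE: the coordinator only waits for a single replica.
# When read_repair_chance fires, the remaining replicas are checked and
# repaired in the background after the response has been returned.
cf = pycassa.ColumnFamily(pool, 'my_cf',
                          read_consistency_level=ConsistencyLevel.ONE)

row = cf.get('some_row_key')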
how often
So what I want is for Cassandra to provide some information to the client to
indicate that A was stored before B, e.g. a globally unique timestamp, or row order.
The row order is determined by 1) the comparator you use for the column
family and 2) the column names you, the client, choose for A and B. So what
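For example (not from the thread): if the comparator is TimeUUIDType and the client uses time-based UUIDs as the column names, the columns sort in creation order, so you can tell A was written before B. A hedged pycassa sketch, with made-up keyspace/CF/row names:

import uuid
import pycassa

pool = pycassa.ConnectionPool('my_keyspace', server_list=['10.0.0.1:9160'])
# Assumes a column family whose comparator is TimeUUIDType.
events = pycassa.ColumnFamily(pool, 'events_by_time')

events.insert('some_row', {uuid.uuid1(): 'A'})
events.insert('some_row', {uuid.uuid1(): 'B'})

# Columns come back in TimeUUID (creation-time) order, i.e. A before B.
print(events.get('some_row'))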
@Bryan,
To keep data size as low as possible with TTL columns we still use STCS and
nightly major compactions.
Our experience with LCS was not successful in our case: data size stayed too high,
along with the number of compactions.
IMO, before 1.2, LCS was only good for CFs without TTLs or a high delete rate.
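As a hedged illustration of the "nightly major compaction" part (not from the thread; keyspace/CF names are made up), a cron-driven job could be as simple as:

import subprocess

KEYSPACE = 'my_keyspace'
COLUMN_FAMILY = 'ttl_data'

# Major compaction merges all SSTables of the CF into one, which with STCS
# gives expired TTL columns and tombstones a chance to be purged.
subprocess.check_call(['nodetool', 'compact', KEYSPACE, COLUMN_FAMILY])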
Hi
I have a cluster of 3 nodes running Cassandra v1.2 with num_tokens set to 256.
It's running on EC2. When I installed the cluster, I brought up one node with the seed
set to its own IP. The next two had the first one as seed. A 'nodetool status'
shows all 3 nodes up and running. Replication factor is
Cool feature, didn't know it existed. It turned out, however, that everything
works fine! There was a configuration error that duplicated an AWS SNS-SQS
subscription, so we got twice the amount of data delivered to our
application. Semi-lame post to this mailing list I guess :( I should have
checked
Now, one of the nodes dies, and when I bring it back up, it doesn't join
the cluster again, but becomes its own node/cluster. I can't get it to join
the cluster again, even after doing 'removenode' and clearing all data.
That obviously should not have happened. That being said we have a few
On Jan 17, 2013, at 11:54 AM, Sylvain Lebresne sylv...@datastax.com wrote:
Now, one of the nodes dies, and when I bring it back up, it doesn't join the
cluster again, but becomes its own node/cluster. I can't get it to join the
cluster again, even after doing 'removenode' and clearing all
What do you mean? It's not needed by Pig or Hive to access Cassandra data.
Regards
On Jan 16, 2013, at 11:14 PM, Brandon Williams
dri...@gmail.com wrote:
You won't get CFS,
but it's not a hard requirement, either.
CFS is Cassandra File System:
http://www.datastax.com/dev/blog/cassandra-file-system-design
But you don't need CFS to connect from Pig to Cassandra. The latest
versions of the Cassandra source ship with examples of connecting from Pig to
Cassandra:
apache-cassandra-1.2.0-src/examples/pig --
Jared, how do you guys handle data backups for your ephemeral based cluster?
I'm trying to move to ephemeral drives myself, and that was my last
sticking point; asking how others in the community deal with backup in case
the VM explodes.
On Wed, Jan 16, 2013 at 1:21 PM, Jared Biel
I have a peer EBS disk alongside the ephemeral disk. Then I do nodetool
snapshot, rsync from ephemeral to EBS, and take a snapshot of the EBS volume.
Syncing the nodetool snapshot directly to S3 would involve fewer steps and be cheaper
(EBS costs more than S3), but I do post-processing on the snapshot for EMR,
and it
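For reference, a rough sketch of that flow (not the poster's actual script; paths and names are made up), with the EBS snapshot step left to whatever AWS tooling you use:

import subprocess

KEYSPACE = 'my_keyspace'
DATA_DIR = '/mnt/ephemeral/cassandra/data'   # ephemeral disk
BACKUP_DIR = '/mnt/ebs/cassandra-backups/'   # peer EBS volume

# 1) Flush and snapshot the keyspace on the local node.
subprocess.check_call(['nodetool', 'snapshot', KEYSPACE])

# 2) Copy the data (including the new snapshot directories) to the EBS volume.
subprocess.check_call(['rsync', '-a', DATA_DIR + '/', BACKUP_DIR])

# 3) Take an EBS snapshot of the backup volume (omitted here; the original
#    poster also post-processes the snapshot for EMR before that step).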
Jimmy,
I understand that CFS can replace HDFS for those who use Hadoop. I just want to
use Pig and Hive on Cassandra. I know that Pig samples are provided and now work
with Cassandra natively (they are part of the core). However, does it mean
that the process will be spread over nodes with
This really depends on how you design your Hadoop cluster. In the testing I
have done, Hadoop and Cassandra nodes were collocated on the same hosts.
Remember that Pig code runs inside your Hadoop cluster and connects to
Cassandra as the database engine.
I have not done any testing with Hive, so
http://goo.gl/CkXv3
On Wed, Jan 16, 2013 at 12:39 PM, Leonid Ilyevsky lilyev...@mooncapital.com
wrote:
Leonid Ilyevsky
Moon Capital Management, LP
499 Park Avenue
New York, NY 10022
P: (212) 652-4586
F: (212) 652-4501
E:
OK, I understand that I need to manage both Cassandra and Hadoop components, and
that Pig will use the Hadoop components to launch its tasks, which will use
Cassandra as the storage engine.
Thanks
--
Cyril SCETBON
On Jan 17, 2013, at 4:03 PM, James Schappet
We use a replication factor such that if any one instance dies, the
cluster remains alive. If a node dies, we simply replace it and
move on. As far as disaster recovery, it's easy to store snapshots in
S3, although Glacier is looking interesting.
Jared Biel
System Administrator
Bolder Thinking
I'd recommend Priam.
http://techblog.netflix.com/2012/02/announcing-priam.html
Andrey
On Thu, Jan 17, 2013 at 5:44 AM, Adam Venturella aventure...@gmail.com wrote:
Jared, how do you guys handle data backups for your ephemeral based
cluster?
I'm trying to move to ephemeral drives myself,
We are using LCS, and the particular row I've referenced has been involved
in several compactions after all of its columns' TTLs expired. The most recent
one was again this morning and the row is still there -- the TTLs expired
several days ago now, with gc_grace=0 and several compactions since ...
If you have 40ms NTP drift something is VERY VERY wrong. You should have a
local NTP server on the same subnet, do not try to use one on the moon.
On Thu, Jan 17, 2013 at 4:42 AM, Sylvain Lebresne sylv...@datastax.com wrote:
So what I want is for Cassandra to provide some information to the client, to
Silly question -- but do Hive/Pig/Hadoop etc. work with Cassandra
1.1.8? Or only with 1.2? We are using the Astyanax library, which seems
to fail horribly on 1.2, so we're still on 1.1.8. But we're just
starting out with this and I'm still debating between Cassandra and
HBase. So I just want to
https://issues.apache.org/jira/browse/CASSANDRA-4813
Fixed in 1.2.0
Best,
michael
From: chandra Varahala
hadoopandcassan...@gmail.com
Reply-To: user@cassandra.apache.org
I am not using reducers, just a map-only job; still the same kind of issue?
thanks
chandra
On Thu, Jan 17, 2013 at 1:50 PM, Michael Kjellman
mkjell...@barracuda.com wrote:
https://issues.apache.org/jira/browse/CASSANDRA-4813
Fixed in 1.2.0
Best,
michael
From: chandra Varahala
Oracle Java + the Cassandra binary should work fine
2013/1/17 Sloot, Hans-Peter hans-peter.sl...@atos.net
Well, I tried to use the Oracle stuff, but the Cassandra RPMs seem to
depend on the OpenJDK packages
From: Michael Kjellman [mailto:mkjell...@barracuda.com]
Sent: Tuesday, January 15, 2013
Hi,
I am trying to maximize the number of read queries executed per second.
Here is my cluster configuration.
Replication - Default
12 Data Nodes.
16 Client Nodes - used for querying.
Each client node runs 32 threads, and each thread executes 76,896 read
queries using the cassandra-cli tool.
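For scale, that works out to 16 clients x 32 threads = 512 concurrent threads, and 512 x 76,896 = 39,370,752 read queries in total (roughly 39.4 million).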
It was primarily a streaming issue, not a Hadoop component issue. It seems too
similar not to be related, IMHO.
On Jan 17, 2013, at 10:59 AM, chandra Varahala
hadoopandcassan...@gmail.com wrote:
I am not using reducers, just a map-only job; still the same kind of issue?
thanks
and thrift operation code :-
Your life will be a lot easier if you use one of the many fine Java Cassandra
clients such as https://github.com/Netflix/astyanax or
https://github.com/hector-client/hector.
They know how to talk to C*
Cheers
-
Aaron Morton
Freelance Cassandra
In this case, how does read repair happen on the replicas?
By default, 90% of the reads will only read from 1 replica, and 10% will read
from all. However, the client request will *only* wait for one replica to return
a value. And it has to be the replica that was asked to return the full data,
not
Semi-lame post to this mailing list I guess :( I should have checked that
earlier
No problems.
Cheers
-
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com
On 17/01/2013, at 11:50 PM, Reik Schatz reik.sch...@gmail.com wrote:
Wow, you managed to do a load test through the cassandra-cli. There should
be a merit badge for that.
You should use the built-in stress tool or YCSB.
The CLI has to do much more string conversion than a normal client would,
and it is not built for performance. You will definitely get better
Hi all,
I am using composite columns and want to fetch just some specific composite
column names. The column family is defined as follows:
create column family video_event
with comparator = 'CompositeType(UTF8Type,UTF8Type)'
and key_validation_class = 'UTF8Type'
and default_validation_class = 'UTF8Type';
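Not from the thread, but for illustration, here is how such composite column names might be written and read from a Python client (pycassa); the row key, column names and values are made up:

import pycassa

pool = pycassa.ConnectionPool('my_keyspace', server_list=['10.0.0.1:9160'])
video_event = pycassa.ColumnFamily(pool, 'video_event')

# With a CompositeType(UTF8Type, UTF8Type) comparator, each column name is a
# (component1, component2) tuple.
video_event.insert('user42', {
    ('video1', 'play'):  '2013-01-17T10:00:00',
    ('video1', 'pause'): '2013-01-17T10:05:00',
})

# Fetch specific composite column names by passing the tuples explicitly.
cols = video_event.get('user42', columns=[('video1', 'play')])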
I'm able to reproduce this behavior on my laptop using 1.1.5, 1.1.7, 1.1.8,
a trivial schema, and a simple script that just inserts rows. If the TTL
is small enough that all the LCS data fits in generation 0, then the rows
seem to be removed when the TTLs expire, as desired. However, if the insertion
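The repro script itself isn't shown in the thread; a minimal sketch of that kind of insert-only load (pycassa; keyspace, CF, TTL and row count are made up) might look like:

import pycassa

pool = pycassa.ConnectionPool('test_ks', server_list=['localhost:9160'])
cf = pycassa.ColumnFamily(pool, 'ttl_test')

TTL_SECONDS = 300  # arbitrary short TTL for the test

# Insert-only load; with gc_grace=0 the rows should disappear entirely once
# the TTL has passed and the data has been compacted.
for i in range(100000):
    cf.insert('row-%d' % i, {'col': 'value-%d' % i}, ttl=TTL_SECONDS)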
Hi,
Thanks. I would like to benchmark Cassandra with our application so
that we understand the details of how the actual benchmarking is done.
Not sure how easy it would be to integrate YCSB with our application.
So, I am trying different client interfaces to Cassandra.
I found
for 12 Data
When you ran this test, is that the exact schema you used? I'm not seeing
where you are setting gc_grace to 0 (although I could just be blind, it
happens).
On Thu, Jan 17, 2013 at 5:01 PM, Bryan Talbot btal...@aeriagames.com wrote:
I'm able to reproduce this behavior on my laptop using 1.1.5,
Thanks Tyler.
I just moved the pool and cf objects, which store the connection pool and CF
information, to global scope.
I increased the server_list values from 1 to 4 (I think I can increase
them to a max of 12 since I have 12 data nodes).
When I created 8 threads using the Python threading package, I see
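Not the poster's actual code, but a minimal sketch of that setup (pycassa; keyspace, CF and host names are made up). A pool_size of at least the thread count keeps threads from queuing for connections:

import threading
import pycassa

# Module-level (global) pool and CF shared by all threads.
POOL = pycassa.ConnectionPool(
    'my_keyspace',
    server_list=['node1:9160', 'node2:9160', 'node3:9160', 'node4:9160'],
    pool_size=8,
)
CF = pycassa.ColumnFamily(POOL, 'my_cf')

def worker(keys):
    # Assumes the keys exist; a real benchmark would time these calls.
    for key in keys:
        CF.get(key)

threads = [threading.Thread(target=worker, args=(['key-%d' % i],))
           for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()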
Bleh, I rushed out the email before some meetings and I messed something
up. Working on reproducing now with better notes this time.
-Bryan
On Thu, Jan 17, 2013 at 4:45 PM, Derek Williams de...@fyrie.net wrote:
When you ran this test, is that the exact schema you used? I'm not seeing
where
Everyone, thanks a lot for the answers; they helped me a lot.
2013/1/17 Andrey Ilinykh ailin...@gmail.com
I'd recommend Priam.
http://techblog.netflix.com/2012/02/announcing-priam.html
Andrey
On Thu, Jan 17, 2013 at 5:44 AM, Adam Venturella aventure...@gmail.com wrote:
Jared, how do you