Re: Adding a node to cluster keeping 100% data replicated on all nodes

2014-02-10 Thread _ _
 Hi,
 
 Our environment will consist of clusters of no more than 2 to 4 nodes each
 (all located in the same DC). We want to ensure that every node in the
 cluster owns 100% of the data. The node addition (and removal) procedure
 will be automated, so we want to make sure we're taking the right steps.
 Let's say we have node 'A' up and running and want to add another node 'B'
 to make a cluster. Node A's configuration will be:
 seed: IP of A
 listen_address: IP of A
 num_tokens: 256
 rpc_address: 0.0.0.0
 The keyspace uses SimpleStrategy with RF: 1.
 
 To add node 'B' to the cluster we do the following:
 1. Stop cassandra on B.
 2. Update cassandra.yaml - change seed to point to the IP of A.
 3. Update cassandra-topology.properties - add node A's IP to it and make it
 the default.
 4. rm -rf /var/lib/cassandra/*
 5. Start cassandra on B.
 6. Wait until nodetool status reports that node B is up.
 7. Update the RF of the keyspace to 2.
 8. Run nodetool repair on B and wait for it to finish (a sketch of steps 5-8
 follows below).
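 A rough sketch of what steps 5-8 might look like when scripted (the keyspace
 name 'myks' and the address placeholder A_IP are hypothetical; service names
 and paths vary by install):
 
   # on node B
   sudo service cassandra start
   # wait until B shows as UN (Up/Normal) in the output
   nodetool status
   # step 7: raise the replication factor
   echo "ALTER KEYSPACE myks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};" | cqlsh A_IP
   # step 8: stream the existing data to B
   nodetool repair myks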
 
 Can we update the RF on A before starting Cassandra on B in order to skip
 steps 7 and 8?
 
 
 Now that the data is in sync on both nodes we want to make node B a seed node.
 9. Update the seed property on A and B to include the IP of node B.
 10. Restart cassandra on both nodes.
 
 When adding more nodes to the cluster the steps will be the same, except that
 the seeds property will contain all existing nodes in the cluster (see the
 cassandra.yaml snippet below).
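 For reference, the seeds list lives in cassandra.yaml under seed_provider;
 with hypothetical addresses for A and B it would look something like:
 
   seed_provider:
       - class_name: org.apache.cassandra.locator.SimpleSeedProvider
         parameters:
             - seeds: "10.0.0.1,10.0.0.2"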
 
 So are these steps everything we need to do? 
 Is there anything more we need to do?
 Is there an easier way to do what we want, or are all the steps above
 mandatory?

Good day,

That's something I'm looking for too. Unfortunately, so far I haven't found 
the right way to achieve it. Nodetool repair takes a long time to execute. 



Re: One of my nodes is in the wrong datacenter - help!

2014-02-10 Thread Sholes, Joshua
In case anyone was following this issue, it ended up being something that 
looked an awful lot like CASSANDRA-6053 — when the node was removed, it wasn't 
successfully removed from the peers table on all nodes, and thus several of 
them kept trying to contact it despite it being down.
--
Josh Sholes

From: Sholes, Joshua joshua_sho...@cable.comcast.com
Date: Thursday, February 6, 2014 at 1:41 PM
To: user@cassandra.apache.org
Subject: Re: One of my nodes is in the wrong datacenter - help!

Thanks for the advice.   I did use “removenode” as I was aware of the 
replace_token problems.
I haven’t run into the issue in CASSANDRA-6615 yet, and I don’t believe I’m at 
risk for it.

I’m actually running into a different problem.   Having done a remove node on 
the node with the incorrect datacenter name, I am still getting “one or more 
nodes were unavailable” messages when doing queries with consistency=all.   I’m 
doing a full repair pass on the column family in question just to be safe 
(which is taking forever!) before I do anything else.   So to reiterate:  my 
cluster now shows 7 nodes up when looking with gossipinfo or status, but will 
still not do consistency=all queries.   Are there any best practices for 
finding out other issues with the cluster, or should I anticipate the repair 
pass will fix the problem?
--
Josh Sholes

From: Robert Coli rc...@eventbrite.com
Reply-To: user@cassandra.apache.org
Date: Monday, February 3, 2014 at 7:30 PM
To: user@cassandra.apache.org
Subject: Re: One of my nodes is in the wrong datacenter - help!

On Sun, Feb 2, 2014 at 10:48 AM, Sholes, Joshua 
joshua_sho...@cable.comcast.com wrote:
I had a node in my 8-node production 1.2.8 cluster have a serious problem and 
need to be removed and rebuilt.   However, after doing nodetool removenode and 
then bootstrapping a new node on the same IP address, the new node somehow 
ended up with a different datacenter name (the rest of the nodes are in dc 
$NAME, and the new one is in dc $NAME6934724 — as in, a string of seemingly 
random numbers appended to the correct name).   How can I force it to change DC 
names back to what it should be?

You could change the entry in the system.local columnfamily on the affected 
node...

cqlsh> UPDATE system.local SET data_center = '$NAME' WHERE key = 'local';

... but that is Not Supported and may have side effects of which I am not aware.
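
To sanity-check it, the current value can be read back from the same table (a 
quick example, assuming 1.2's system.local schema):

cqlsh> SELECT data_center FROM system.local WHERE key = 'local';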

I’m working with 500+GB per node here so bootstrapping it again is not a huge 
issue, but I’d prefer to avoid it anyway.  I am NOT able to change the node’s 
IP address at this time so I’m stuck with bootstrapping a new node in the same 
place, which my gut feeling tells me might be part of the problem.

Note that replace_node/replace_token are broken in 1.2.8; did you attempt to 
use either of these? I presume not, because you said you did removenode...

 If I were you, I would probably removenode and re-bootstrap, as the safest 
alternative.

As an aside, while trying to deal with this issue you should be aware of this 
ticket, so you do not do the sequence of actions it describes.

https://issues.apache.org/jira/browse/CASSANDRA-6615

=Rob


CQL3 Custom Functions

2014-02-10 Thread Drew Kutcharian
Hey Guys,

How can I define custom CQL3 functions (similar to dateOf, now, etc)?

Cheers,

Drew


RE: Hector Could not flush transport error

2014-02-10 Thread Senthil, Athinanthny X. -ND
Version is C* 1.2.6. We use DSE 3.1.3.

From: Robert Coli [mailto:rc...@eventbrite.com]
Sent: Friday, February 07, 2014 4:17 PM
To: user@cassandra.apache.org
Subject: Re: Hector Could not flush transport error

On Fri, Feb 7, 2014 at 4:05 PM, Senthil, Athinanthny X. -ND 
athinanthny.x.senthil@disney.com wrote:
We get the below error in the app logs when it's trying to hit a DC which 
doesn't get traffic in a multi-DC cluster.

What version of Cassandra?

=Rob



Re: One of my nodes is in the wrong datacenter - help!

2014-02-10 Thread Edward Capriolo
Maybe that node was just trying to tell you that it really wanted to work
in a different data center :)





problems loading cassandra data from pig

2014-02-10 Thread Irooniam
Hello,

I posted this issue to the Pig mailing list, but I'm thinking the issue I'm
having is more related to Cassandra.

When I run Pig scripts against Hadoop it works as advertised; however, when
I try to have Pig get data from Cassandra it fails every time.

Cassandra: [cqlsh 4.1.0 | Cassandra 2.0.4 | CQL spec 3.1.1 | Thrift
protocol 19.39.0]

Hadoop (Cloudera): 2.0.0+1518

Map Reduce: v2 (Yarn)

Pig: Apache Pig version 0.11.0-cdh4.5.0


The test schema is very simple:

cqlsh:main> create table a (id int, name varchar, primary key (id));

cqlsh:main> insert into a (id, name) values (1, 'blah');

cqlsh:main> select * from a;

 id | name
----+------
  1 | blah

(1 rows)


bash-4.2$ ./apache-cassandra-2.0.4-src/examples/pig/bin/pig_cassandra -x
local

Using /home/hdfs/pig-0.12.0-src/pig-withouthadoop.jar.

2014-02-07 17:09:18,948 [main] INFO org.apache.pig.Main - Apache Pig
version 0.10.0 (r1328203) compiled Apr 20 2012, 00:33:25

2014-02-07 17:09:18,949 [main] INFO org.apache.pig.Main - Logging error
messages to: /home/hdfs/pig_1391810958945.log

2014-02-07 17:09:19,373 [main] INFO
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting
to hadoop file system at: file:///

2014-02-07 17:09:19,377 [main] WARN org.apache.hadoop.conf.Configuration -
mapred.used.genericoptionsparser is deprecated. Instead, use
mapreduce.client.genericoptionsparser.used

2014-02-07 17:09:19,394 [main] WARN
org.apache.hadoop.conf.Configuration -fs.default.name is deprecated.
Instead, use fs.defaultFS

2014-02-07 17:09:19,395 [main] WARN org.apache.hadoop.conf.Configuration -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in
[jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]


SLF4J: Found binding in
[jar:file:/home/hdfs/apache-cassandra-2.0.4-src/lib/slf4j-log4j12-1.7.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]


SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
explanation.

2014-02-07 17:09:20,026 [main] WARN org.apache.hadoop.conf.Configuration -
io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum

2014-02-07 17:09:20,030 [main] WARN
org.apache.hadoop.conf.Configuration -fs.default.name is deprecated.
Instead, use fs.defaultFS

2014-02-07 17:09:20,030 [main] WARN org.apache.hadoop.conf.Configuration -
mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address

grunt> rows = LOAD 'cql://main/a' USING CqlStorage();

grunt> describe rows;

rows: {id: int,name: chararray}


If I'm reading this correctly, pig can get the columns from the table in
question - but that's where things go awry.

grunt> data = foreach rows generate $1;

grunt> dump data;

2014-02-07 17:09:47,347 [main] INFO
org.apache.pig.tools.pigstats.ScriptState - Pig features used in the
script: UNKNOWN

2014-02-07 17:09:47,416 [main] INFO
org.apache.pig.newplan.logical.rules.ColumnPruneVisitor - Columns pruned
for rows: $0

2014-02-07 17:09:47,548 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler -
File concatenation threshold: 100 optimistic? false

2014-02-07 17:09:47,589 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
- MR plan size before optimization: 1

2014-02-07 17:09:47,589 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
- MR plan size after optimization: 1

2014-02-07 17:09:47,960 [main] WARN
org.apache.hadoop.conf.Configuration -session.id is deprecated.
Instead, use dfs.metrics.session-id

2014-02-07 17:09:47,968 [main] INFO
org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with
processName=JobTracker, sessionId=

2014-02-07 17:09:48,055 [main] INFO
org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added
to the job

2014-02-07 17:09:48,075 [main] WARN org.apache.hadoop.conf.Configuration -
mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use
mapreduce.reduce.markreset.buffer.percent

2014-02-07 17:09:48,075 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3

2014-02-07 17:09:48,075 [main] WARN org.apache.hadoop.conf.Configuration -
mapred.output.compress is deprecated. Instead, use
mapreduce.output.fileoutputformat.compress

2014-02-07 17:09:48,206 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- Setting up single store job

2014-02-07 17:09:48,330 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 2998: Unhandled internal error.
org.apache.hadoop.mapred.jobcontrol.JobControl.addJob(Lorg/apache/hadoop/mapred/jobcontrol/Job;)Ljava/lang/String;


Details at logfile: /home/hdfs/pig_1391810958945.log


 The log says:

Pig Stack Trace

---

ERROR 2998: Unhandled internal error.

Re: impact of update operation to read operation

2014-02-10 Thread Benedict Elliott Smith
Also, a major compaction doesn't flush the memtable. If the memtable is
still full, reads may take slightly longer as they may have to be merged
with any on-disk data before being served.
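
For what it's worth, if the goal is for compaction to cover that data as well,
flushing first folds the memtable into SSTables before compacting; a sketch,
assuming hypothetical keyspace/column family names 'myks'/'mycf':

nodetool flush myks mycf
nodetool compact myks mycf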


On 10 February 2014 21:18, Tupshin Harper tups...@tupshin.com wrote:

 You don't mention disks and RAM, but I would assume that the additional
 data means you can now cache a lower percentage and have to seek on disk
 more often.

 -Tupshin
 On Feb 10, 2014 4:14 PM, Jiaan Zeng l.alle...@gmail.com wrote:

 Hi All,

 I am using Cassandra 1.2.4. I wonder if update operation has
 *permanent* impacts on read operation. Below is the scenario.

 Previously, a read-only workload ran against one column family and
 achieved 4000 qps. Later, a read-update mixed workload ran against the
 same column family. After that, the read-only workload ran again but
 could not reach 4000 qps, only 3500 qps. After a manual major
 compaction was issued through the command line, the read-only workload got
 3600 qps, which seems to suggest that major compaction does not help much.

 Did anyone have similar experiences? Any idea why this is happening?
 Thanks.

 --
 Regards,
 Jiaan




RE: Recommended OS

2014-02-10 Thread Keith Wright
Is this your first cluster?  Have you run older versions of Cassandra?  Any 
specific resource tuning?

Thanks all.  We are unable to bootstrap nodes and are considering creating a 
fresh cluster in hopes this is somehow data related.

On Feb 10, 2014 5:33 PM, Brust, Corwin [Hollander] 
corwin.br...@hollanderparts.com wrote:
We're running C* 2.0.5 under CentOS 6.5 and have not noticed anything like what 
you describe.  We have just a couple of pre-production rings (Dev and Test), 
meaning nothing we have has received particularly intense utilization.

Corwin

From: Keith Wright [mailto:kwri...@nanigans.com]
Sent: Monday, February 10, 2014 2:09 PM
To: user@cassandra.apache.org
Cc: Don Jackson; Dave Carroll
Subject: Re: Recommended OS

We are running on CentOS 6.4, but an upgrade to 6.5 caused packets to back up on 
the network queue, causing HUGE load spikes and cluster meltdown.  Ultimately we 
reverted.  Have others seen this?  Are others running CentOS 6.4/6.5?

Thanks

From: Sholes, Joshua joshua_sho...@cable.comcast.com
Reply-To: user@cassandra.apache.org
Date: Monday, February 10, 2014 at 1:56 PM
To: user@cassandra.apache.org
Cc: Don Jackson djack...@nanigans.com, Dave Carroll dcarr...@nanigans.com
Subject: Re: Recommended OS

What issues are you running into with CentOS 6.4/5?  I’m running 1.2.8 on 
CentOS 6.3 and Java 1.7.0-25, and about to test with 1.7.latest.
--
Josh Sholes

From: Keith Wright kwri...@nanigans.com
Reply-To: user@cassandra.apache.org
Date: Monday, February 10, 2014 at 1:50 PM
To: user@cassandra.apache.org
Cc: Don Jackson djack...@nanigans.com, Dave Carroll dcarr...@nanigans.com
Subject: Recommended OS

Hi all,

I was wondering what operating systems and versions people are running with 
success in production environments. We are using C* 1.2.13 and have had issues 
with CentOS 6.4/6.5. Are others using that OS? What would people recommend? 
What about Java 6 vs 7 (specific versions?)?

Thanks!!!













Re: ring describe returns only public ips

2014-02-10 Thread Chris Burroughs
More generally, a Thrift API or other mechanism for Astyanax to get the 
INTERNAL_IP seems necessary to use ConnectionPoolType.TOKEN_AWARE + 
NodeDiscoveryType.TOKEN_AWARE in a multi-DC setup.  Absent one, I'm 
confused as to how that combination is possible.


On 02/06/2014 03:17 PM, Ted Pearson wrote:

We are using Cassandra 1.2.13 in a multi-datacenter setup. We are using 
Astyanax as the client, and we’d like to enable its token aware connection pool 
type and ring describe node discovery type. Unfortunately, I’ve found that both 
thrift’s describe_ring and `nodetool ring` only report the public IPs of the 
cassandra nodes. This means that Astyanax tries to reconnect to the public IPs 
of each node, which doesn’t work and just results in no hosts being available 
for queries according to Astyanax.

I know from `nodetool gossipinfo` (and the fact that the clusters work) that 
it's sharing the LOCAL_IP via gossip, but have no idea how or if it’s possible 
to get describe_ring to return local IPs, or if there is some alternative.
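
For context, the two addresses in play are configured per node in 
cassandra.yaml; a hypothetical dual-homed node might have:

listen_address: 10.0.0.5        # internal IP, used for intra-cluster traffic
broadcast_address: 203.0.113.5  # public IP, advertised to other nodes via gossip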

Thanks,

-Ted





Using IN with the Datastax driver (2.0-??)

2014-02-10 Thread Jacob Rhoden
Hi Guys,

I'm experimenting with using IN to reduce the number of queries I have to 
execute. The following works in CQL:

i.e. select log_entry from log_index where keyword in ('keyword1', 
'keyword2', 'keyword3', etc...);

So I now want to work out how to convert this:

PreparedStatement p = session.prepare("select log_entry from log_index 
where keyword=?");
session.execute(p.bind(keyword.toLowerCase()));

To take a variable number of inputs, something like this???

PreparedStatement p = session.prepare("select log_entry from log_index 
where keyword in (...)");
session.execute(p.bind(...));

Thanks,
Jacob



Re: Using IN with the Datastax driver (2.0-??)

2014-02-10 Thread DuyHai Doan
Hello Jacob,

 You can try the bind marker for variadic param (new feature):

PreparedStatement p = session.prepare("select log_entry from log_index
where keyword IN ?");
session.execute(p.bind(Arrays.asList("keyword1", "keyword2", ...)));
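
For anyone wanting to try this end to end, here is a minimal self-contained
sketch (the contact point, the keyspace name "logs", and the log_index schema
are all hypothetical; assumes the DataStax Java driver 2.0 against Cassandra
2.0):

import java.util.Arrays;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class InQueryExample {
    public static void main(String[] args) {
        // Connect to a (hypothetical) local node and keyspace.
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("logs");

        // A single ? stands in for the whole IN list and is bound
        // with a java.util.List in one call.
        PreparedStatement p = session.prepare(
                "select log_entry from log_index where keyword in ?");
        ResultSet rs = session.execute(
                p.bind(Arrays.asList("keyword1", "keyword2", "keyword3")));

        for (Row row : rs) {
            System.out.println(row.getString("log_entry"));
        }
        cluster.close();
    }
}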

Regards

 Duy Hai DOAN


On Mon, Feb 10, 2014 at 11:50 PM, Jacob Rhoden jacob.rho...@me.com wrote:

 Hi Guys,

 I'm experimenting with using IN to reduce the number of queries I have to
 execute. The following works in CQL:

 i.e. select log_entry from log_index where keyword in ('keyword1',
 'keyword2', 'keyword3', etc...);

 So I now want to work out how to convert this:

 PreparedStatement p = session.prepare("select log_entry from log_index
 where keyword=?");
 session.execute(p.bind(keyword.toLowerCase()));

 To take a variable number of inputs, something like this???

 PreparedStatement p = session.prepare("select log_entry from log_index
 where keyword in (...)");
 session.execute(p.bind(...));

 Thanks,
 Jacob



Re: Using IN with the Datastax driver (2.0-??)

2014-02-10 Thread Jacob Rhoden
Perfect, thanks! I wonder if this is documented anywhere? Certainly I have no 
idea how to search Google using the keyword "in" :D

String[] words = TagsToArray.tagsToArray(keyword.toLowerCase());
PreparedStatement p = api.getCassandraSession().prepare("select log_entry 
from log_index where keyword in ?");
session.execute(p.bind(Arrays.asList(words)));

Thanks,
Jacob

On 11 Feb 2014, at 9:55 am, DuyHai Doan doanduy...@gmail.com wrote:

 Hello Jacob,
 
  You can try the bind marker for variadic param (new feature):
 
 PreparedStatement p = session.prepare("select log_entry from log_index where 
 keyword IN ?");
   session.execute(p.bind(Arrays.asList("keyword1", "keyword2", ...)));
 
 Regards
 
  Duy Hai DOAN
 





Re: Using IN with the Datastax driver (2.0-??)

2014-02-10 Thread DuyHai Doan
I don't know if it's documented somewhere. Personally, I got the info by
following the Cassandra dev blog and reading each release's notes.

 In each Cassandra release's notes, you have a list of bug fixes but also new
features. Just read the corresponding JIRA tickets to get the details.

 Regards

 Duy Hai DOAN


On Tue, Feb 11, 2014 at 12:18 AM, Jacob Rhoden jacob.rho...@me.com wrote:

 Perfect, thanks! I wonder if this is documented anywhere? Certainly I have
 no idea how to search Google using the keyword "in" :D

 String[] words = TagsToArray.tagsToArray(keyword.toLowerCase());
 PreparedStatement p = api.getCassandraSession().prepare("select
 log_entry from log_index where keyword in ?");
 session.execute(p.bind(Arrays.asList(words)));

 Thanks,
 Jacob

 On 11 Feb 2014, at 9:55 am, DuyHai Doan doanduy...@gmail.com wrote:

 Hello Jacob,

  You can try the bind marker for variadic param (new feature):

 PreparedStatement p = session.prepare("select log_entry from log_index
 where keyword IN ?");
  session.execute(p.bind(Arrays.asList("keyword1", "keyword2", ...)));

 Regards

  Duy Hai DOAN





Clarification on how multi-DC replication works

2014-02-10 Thread Sameer Farooqui
Hi,

I was hoping someone could clarify a point about multi-DC replication.

Let's say I have 2 data centers configured with replication factor = 3 in
each DC.
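
For concreteness, a keyspace set up that way would typically be defined with
NetworkTopologyStrategy (the keyspace and DC names here are hypothetical):

CREATE KEYSPACE myks WITH replication =
  {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3};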

My client app is sitting in DC 1 and is able to intelligently pick a
coordinator that will also be a replica partner.

So the client app sends a write with consistency for DC1 = Q and
consistency for DC2 = Q (i.e., EACH_QUORUM) to a coordinator node in DC1.

That coordinator in DC1 forwards the write to 2 other nodes in DC1 and a
coordinator in DC2.

Is it correct that all 3 nodes in DC2 will respond back to the original
coordinator in DC1? Or will the DC2 nodes respond back to the DC2
coordinator?

Let's say one of the replica nodes in DC2 is down. Who will hold the hint
for that node? The original coordinator in DC1 or the coordinator in DC2?