Re: Query regarding spark on cassandra

2016-04-27 Thread Siddharth Verma
Hi, in case this information helps:
we are using two DCs:
dc1 - 3 nodes
dc2 - 1 node
However, dc2 has been down for 3-4 weeks and we haven't removed it yet.

The Spark slaves run on the same machines as the Cassandra nodes;
each node runs two slave instances.

The Spark master runs on a separate machine.

If anyone could provide insight into the problem, it would be helpful.

Thanks

On Wed, Apr 27, 2016 at 11:11 PM, Siddharth Verma <
verma.siddha...@snapdeal.com> wrote:

> Hi,
> I don't know if someone has faced this problem or not.
> I am running a job where some data is loaded from a Cassandra table. From
> that data, I make some insert and delete statements
> and execute them (using forEach).
>
> Code snippet:
> boolean deleteStatus = connector.openSession().execute(delete).wasApplied();
> boolean insertStatus = connector.openSession().execute(insert).wasApplied();
> System.out.println(delete + ":" + deleteStatus);
> System.out.println(insert + ":" + insertStatus);
>
> When I run it locally, I see the respective results in the table.
>
> However, when I run it on a cluster, sometimes the result shows up and
> sometimes the changes don't take place.
> I checked the stdout from the Spark web UI, and the query, along with "true",
> was printed for both queries.
>
> I can't understand what the issue could be.
>
> Any help would be appreciated.
>
> Thanks,
> Siddharth Verma
>
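
For illustration, a minimal sketch of one way the per-record statements could be
run with a single session per Spark partition and an explicit consistency level.
The RDD of pre-built CQL strings and the class/method names are assumptions made
for this sketch, not code from the thread. Note also that wasApplied() is only
meaningful for conditional (IF ...) statements; for plain inserts and deletes it
always returns true, so the printed "true" alone does not distinguish the local
run from the cluster run.

import java.util.Iterator;

import org.apache.spark.api.java.JavaRDD;

import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.spark.connector.cql.CassandraConnector;

public final class StatementWriter {
    // Hypothetical helper: cqlStatements is assumed to be an RDD of already
    // built CQL delete/insert strings derived from the loaded table data.
    static void execute(JavaRDD<String> cqlStatements, CassandraConnector connector) {
        cqlStatements.foreachPartition((Iterator<String> it) -> {
            // One session per partition instead of one per statement.
            try (Session session = connector.openSession()) {
                while (it.hasNext()) {
                    String cql = it.next();
                    SimpleStatement stmt = new SimpleStatement(cql);
                    // Pin the consistency level so the cluster run behaves
                    // the same way as the local run.
                    stmt.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
                    // Always true here, since these are not conditional writes.
                    boolean applied = session.execute(stmt).wasApplied();
                    System.out.println(cql + ":" + applied);
                }
            }
        });
    }
}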


RE: Inconsistent Reads after Restoring Snapshot

2016-04-27 Thread Anuj Wadehra
No. We are not saving them. I have never read that in the DataStax documentation.
Thanks
Anuj

Sent from Yahoo Mail on Android

On Thu, 28 Apr 2016 at 12:45 AM, sean_r_dur...@homedepot.com wrote:

What about the commitlogs? Are you saving those off anywhere in between the
snapshot and the crash?

Sean Durity

From: Anuj Wadehra [mailto:anujw_2...@yahoo.co.in]
Sent: Monday, April 25, 2016 10:26 PM
To: User
Subject: Inconsistent Reads after Restoring Snapshot

Hi,

We are on 2.0.14. We use RF=3 and read/write at QUORUM. Moreover, we don't use
incremental backups. As per the documentation at
https://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_backup_snapshot_restore_t.html,
if I need to restore a snapshot on a SINGLE node in a cluster, I would run
repair at the end. But while the repair is going on, reads may be inconsistent.

Consider the following scenario:
10 AM: Daily snapshot taken of node A and moved to the backup location.
11 AM: A record is inserted such that nodes A and B insert the record but there
is a mutation drop on node C.
1 PM: Node A crashes and data is restored from the latest 10 AM snapshot. Now,
only node B has the record.

Now, my question is:

Until the repair is completed on node A, a read at QUORUM may return an
inconsistent result depending on the nodes from which data is read. If data is
read from nodes A and C, nothing is returned, and if data is read from nodes A
and B, the record is returned. This is a vital point which is not highlighted
anywhere.

Please confirm my understanding. If my understanding is right, how do I make
sure that my reads are not inconsistent while a node is being repaired after
restoring a snapshot?

I think autobootstrapping the node without joining the ring until the repair is
completed is an alternative option. But snapshots save a lot of streaming as
compared to bootstrap.

Will incremental backups guarantee that

Thanks
Anuj

Sent from Yahoo Mail on Android



RE: Inconsistent Reads after Restoring Snapshot

2016-04-27 Thread SEAN_R_DURITY
What about the commitlogs? Are you saving those off anywhere in between the 
snapshot and the crash?


Sean Durity

From: Anuj Wadehra [mailto:anujw_2...@yahoo.co.in]
Sent: Monday, April 25, 2016 10:26 PM
To: User
Subject: Inconsistent Reads after Restoring Snapshot

Hi,

We are on 2.0.14. We use RF=3 and read/write at QUORUM. Moreover, we don't use
incremental backups. As per the documentation at
https://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_backup_snapshot_restore_t.html,
if I need to restore a snapshot on a SINGLE node in a cluster, I would run
repair at the end. But while the repair is going on, reads may be inconsistent.


Consider the following scenario:
10 AM: Daily snapshot taken of node A and moved to the backup location.
11 AM: A record is inserted such that nodes A and B insert the record but there
is a mutation drop on node C.
1 PM: Node A crashes and data is restored from the latest 10 AM snapshot. Now,
only node B has the record.

Now, my question is:

Until the repair is completed on node A, a read at QUORUM may return an
inconsistent result depending on the nodes from which data is read. If data is
read from nodes A and C, nothing is returned, and if data is read from nodes A
and B, the record is returned. This is a vital point which is not highlighted
anywhere.


Please confirm my understanding. If my understanding is right, how do I make
sure that my reads are not inconsistent while a node is being repaired after
restoring a snapshot?

I think autobootstrapping the node without joining the ring until the repair is
completed is an alternative option. But snapshots save a lot of streaming as
compared to bootstrap.
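
A rough sketch of that restore-then-repair-then-join sequence (a sketch only:
exact flags, packaging, and whether the node fully hibernates with
join_ring=false depend on the Cassandra version and install):

# Restore the snapshot SSTables into the data directories first, then start
# the node without joining the ring so it does not serve reads yet:
cassandra -Dcassandra.join_ring=false
# Repair the restored node so it catches up with its replicas:
nodetool repair
# Once repair completes, let the node join the ring:
nodetool join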

Will incremental backups guarantee that

Thanks
Anuj


Sent from Yahoo Mail on Android





Query regarding spark on cassandra

2016-04-27 Thread Siddharth Verma
Hi,
I don't know if someone has faced this problem or not.
I am running a job where some data is loaded from a Cassandra table. From
that data, I make some insert and delete statements
and execute them (using forEach).

Code snippet:
boolean deleteStatus = connector.openSession().execute(delete).wasApplied();
boolean insertStatus = connector.openSession().execute(insert).wasApplied();
System.out.println(delete + ":" + deleteStatus);
System.out.println(insert + ":" + insertStatus);

When I run it locally, I see the respective results in the table.

However, when I run it on a cluster, sometimes the result shows up and
sometimes the changes don't take place.
I checked the stdout from the Spark web UI, and the query, along with "true",
was printed for both queries.

I can't understand what the issue could be.

Any help would be appreciated.

Thanks,
Siddharth Verma


RE: how expensive is light weight transaction: if not exists

2016-04-27 Thread Jacques-Henri Berthemet
Hi,

You can’t batch LWTs if they don’t operate on the same partition. So in your
queries below, all the values of “id” must be the same.
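
For illustration, a sketch of what a single-partition conditional batch might
look like with the DataStax Java driver. The keyspace, table, and column names
below are made up for the example; every statement carries the same partition
key value, and the serial consistency level governs the Paxos round that the
conditions trigger.

import com.datastax.driver.core.BatchStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public final class SamePartitionLwtBatch {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("my_ks")) {
            BatchStatement batch = new BatchStatement();
            // All statements target the same partition key (id = 'A'), so the
            // conditional batch stays within a single partition as required.
            batch.add(new SimpleStatement(
                "INSERT INTO items (id, item, x) VALUES ('A', 1, 'y') IF NOT EXISTS"));
            batch.add(new SimpleStatement(
                "INSERT INTO items (id, item, x) VALUES ('A', 2, 'y') IF NOT EXISTS"));
            batch.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
            // The Paxos phase of the lightweight transaction uses this level.
            batch.setSerialConsistencyLevel(ConsistencyLevel.LOCAL_SERIAL);
            ResultSet rs = session.execute(batch);
            System.out.println("applied: " + rs.wasApplied());
        }
    }
}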

--
Jacques-Henri Berthemet

From: y2k...@gmail.com [mailto:y2k...@gmail.com] On Behalf Of Jimmy Lin
Sent: Wednesday, 27 April 2016 18:14
To: user@cassandra.apache.org
Subject: how expensive is light weight transaction: if not exists

hi all,
we'd like to consider using lightweight transactions like the following:
begin batch:
update table set x=y where id=A if not exists;
update table set x=y where id=B if not exists;
update table set x=y where id=C if not exists;
update table set x=y where id=D if not exists;
apply batch
(using LOCAL_QUORUM)
I know there is a lot going on behind Cassandra's lightweight transactions;
just how much overhead is there when using "if not exists"?


Re: MX4J support broken in cassandra 3.0.5?

2016-04-27 Thread Fabien Rousseau
Hi Robert,

This could be related to:
https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-9242
(Maybe you can try commenting out that option and trying again.)
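
For reference, a sketch of the mx4j settings as they usually appear in a stock
cassandra-env.sh (an assumption about the template; the exact comments and
default address differ between versions):

# MX4J HTTP interface; mx4j-tools.jar must be on the classpath (e.g. in lib/).
#MX4J_ADDRESS="-Dmx4jaddress=127.0.0.1"
#MX4J_PORT="-Dmx4jport=8081"
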
On 27 Apr 2016 15:21, "Robert Sicoie"  wrote:

> Hi guys,
>
> I'm upgrading from Cassandra 2.1 to Cassandra 3.0.5 and mx4j support seems
> to be broken. An empty HTML page is shown:
>
> > GET / HTTP/1.1
> > Host: localhost:8081
> > User-Agent: curl/7.43.0
> > Accept: */*
> >
> * HTTP 1.0, assume close after body
> < HTTP/1.0 200 OK
> < expires: now
> < Server: MX4J-HTTPD/1.0
> < Cache-Control: no-cache
> < pragma: no-cache
> < Content-Type: text/html
>
> This is what I have in cassandra-env.sh
> ...
> MX4J_PORT="-Dmx4jport=8081"
> ...
> And the mx4j-tools.jar is in place.
>
> It worked fine with cassandra 2.1. Is there a new configuration needed in
> 3.0.5?
>
> Any advice?
>
> Thanks,
> Robert
>


how expensive is light weight transaction: if not exists

2016-04-27 Thread Jimmy Lin
hi all,
we'd like to consider using lightweight transactions like the following:
begin batch:
update table set x=y where id=A if not exists;
update table set x=y where id=B if not exists;
update table set x=y where id=C if not exists;
update table set x=y where id=D if not exists;
apply batch
(using LOCAL_QUORUM)
I know there is a lot going on behind Cassandra's lightweight transactions;
just how much overhead is there when using "if not exists"?


MX4J support broken in cassandra 3.0.5?

2016-04-27 Thread Robert Sicoie

Hi guys,

I'm upgrading from Cassandra 2.1 to Cassandra 3.0.5 and mx4j support seems
to be broken. An empty HTML page is shown:


> GET / HTTP/1.1
> Host: localhost:8081
> User-Agent: curl/7.43.0
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< expires: now
< Server: MX4J-HTTPD/1.0
< Cache-Control: no-cache
< pragma: no-cache
< Content-Type: text/html

This is what I have in cassandra-env.sh
...
MX4J_PORT="-Dmx4jport=8081"
...
And the mx4j-tools.jar is in place.

It worked fine with Cassandra 2.1. Is there a new configuration needed
in 3.0.5?


Any advice?

Thanks,
Robert
