Missing data

2015-06-15 Thread Jean Tremblay
Hi,

I have reloaded the data in my cluster of 3 nodes RF: 2.
I have loaded about 2 billion rows in one table.
I use LeveledCompactionStrategy on my table.
I use version 2.1.6.
I use the default cassandra.yaml; only the IP address for the seeds and the 
throughput have been changed.

I loaded my data with simple insert statements. This took a bit more than one 
day to load the data… and one more day to compact the data on all nodes.
For me this is quite acceptable since I should not be doing this again.
I have done this with previous versions like 2.1.3 and others and I basically 
had absolutely no problems.

Now, when I read the log files on the client side, I see no warnings and no 
errors.
On the node side I see many WARNINGs, all related to tombstones, but 
there are no ERRORs.

My problem is that I see *many missing records* in the DB, and I have 
never observed this with previous versions.

1) Is this a known problem?
2) Do you have any idea how I could track down this problem?
3) What is the meaning of this WARNING (the only type of ERROR | WARN  I could 
find)?

WARN  [SharedPool-Worker-2] 2015-06-15 10:12:00,866 SliceQueryFilter.java:319 - 
Read 2990 live and 16016 tombstone cells in gttdata.alltrades_co_rep_pcode for 
key: D:07 (see tombstone_warn_threshold). 5000 columns were requested, 
slices=[388:201001-388:201412:!]


4) Is it possible to have Tombstone when we make no DELETE statements?

I’m lost…

Thanks for your help.


RE: PrepareStatement problem

2015-06-15 Thread Peer, Oded
This only applies to “select *” queries where you don’t specify the column 
names.
There is a reported bug for this, which was fixed in 2.1.3. See 
https://issues.apache.org/jira/browse/CASSANDRA-7910
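
For illustration, a minimal sketch of that workaround with the DataStax Java driver: cache the PreparedStatement per query string and list the columns explicitly instead of using "select *", so a later ALTER TABLE ... ADD does not change the shape of the rows the cached metadata describes. The table and column names (users, id, name, email) are made up for this example.

// Hedged sketch, not from the original thread. Assumes a hypothetical "users" table.
import com.datastax.driver.core.*;
import java.util.concurrent.ConcurrentHashMap;

public class StatementCache {
    private final Session session;
    private final ConcurrentHashMap<String, PreparedStatement> cache = new ConcurrentHashMap<>();

    public StatementCache(Session session) {
        this.session = session;
    }

    public PreparedStatement get(String cql) {
        // prepare once per distinct CQL string, avoiding the "re-preparing" warning
        return cache.computeIfAbsent(cql, session::prepare);
    }

    public Row findUser(long id) {
        // explicit column list: columns added later are simply never selected,
        // so the cached prepared metadata keeps matching the returned data
        PreparedStatement ps = get("SELECT id, name, email FROM users WHERE id = ?");
        return session.execute(ps.bind(id)).one();
    }
}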

From: joseph gao [mailto:gaojf.bok...@gmail.com]
Sent: Monday, June 15, 2015 10:52 AM
To: user@cassandra.apache.org
Subject: PrepareStatement problem

hi, all
  I'm using PreparedStatements. If I prepare a statement every time I use it, Cassandra 
will give me a warning telling me NOT to PREPARE EVERY TIME. So I cache the 
PreparedStatement locally. But when another client changes the table's schema, 
e.g. adds a new column, and I still use the formerly cached PreparedStatement, the 
metadata will mismatch the data: the metadata says n columns, and the data 
has n+1 columns. So what should I do to avoid this problem?

--
--
Joseph Gao
PhoneNum:15210513582
QQ: 409343351


Catastrophe Recovery.

2015-06-15 Thread Jean Tremblay

Hi,

I have a cluster of 3 nodes RF: 2.
There are about 2 billion rows in one table.
I use LeveledCompactionStrategy on my table.
I use version 2.1.6.
I use the default cassandra.yaml; only the IP address for the seeds and the 
throughput have been changed.

I have tested a scenario where one node crashes and loses all its data.
I have deleted all data on this node after having stopped Cassandra.
At this point I noticed that the cluster was still giving proper results, which is 
what I was expecting from a clustered DB.

I then restarted that node and I observed that the node was joining the cluster.
After an hour or so the old “defective” node was up and normal.
I noticed that its hard disk held much less data than its neighbours’.

When I was querying the DB, the cluster was giving me different results for 
successive identical queries.
I guess the old “defective” node was giving me fewer rows than it should have.

1) From what I understand, if you have a fixed node with no data, it will 
automatically bootstrap and recover all its old data from its neighbours while 
doing the joining phase. Is this correct?
2) After such a catastrophe, and after the joining phase is done, should the 
cluster not be ready to always deliver consistent data if there were no inserts 
or deletes during the catastrophe?
3) After the bootstrap of a broken node is finished, i.e. after the joining 
phase, is there not simply a repair to be done on that node using “nodetool repair”?


Thanks for your comments.

Kind regards

Jean



Re: Missing data

2015-06-15 Thread Carlos Rolo
Hi Jean,

The problem of that Warning is that you are reading too many tombstones per
request.

If you do have tombstones without doing DELETEs, it is because you probably
TTL'ed the data when inserting (by mistake? Or did you set
default_time_to_live on your table?). You can use nodetool cfstats to see
how many tombstones per read slice you have. This is, probably, also the
cause of your missing data. Data was tombstoned, so it is not available.
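
As an illustration of how to check for an accidental TTL from the client side, a minimal sketch with the DataStax Java driver. The keyspace and table come from the warning above; the regular column name "some_column" and the partition key column name "key" are assumptions.

// Hedged sketch: TTL(col) returns null when the cell has no expiration,
// i.e. it will never turn into a tombstone by itself.
import com.datastax.driver.core.*;

public class TtlCheck {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build()) {
            Session session = cluster.connect("gttdata");
            Row row = session.execute(
                "SELECT TTL(some_column) FROM alltrades_co_rep_pcode WHERE key = 'D:07' LIMIT 1").one();
            if (row != null && !row.isNull(0)) {
                System.out.println("Cell expires in " + row.getInt(0) + " seconds -> data was TTL'ed");
            } else {
                System.out.println("No TTL on this cell");
            }
        }
    }
}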



Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo | LinkedIn: linkedin.com/in/carlosjuzarterolo
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
www.pythian.com

On Mon, Jun 15, 2015 at 10:54 AM, Jean Tremblay 
jean.tremb...@zen-innovations.com wrote:

  Hi,

  I have reloaded the data in my cluster of 3 nodes RF: 2.
 I have loaded about 2 billion rows in one table.
 I use LeveledCompactionStrategy on my table.
 I use version 2.1.6.
 I use the default cassandra.yaml; only the IP address for the seeds and the
 throughput have been changed.

  I loaded my data with simple insert statements. This took a bit more
 than one day to load the data… and one more day to compact the data on all
 nodes.
 For me this is quite acceptable since I should not be doing this again.
 I have done this with previous versions like 2.1.3 and others and I
 basically had absolutely no problems.

  Now, when I read the log files on the client side, I see no warnings and
 no errors.
 On the node side I see many WARNINGs, all related to tombstones,
 but there are no ERRORs.

  My problem is that I see *many missing records* in the DB, and I
 have never observed this with previous versions.

  1) Is this a known problem?
 2) Do you have any idea how I could track down this problem?
 3) What is the meaning of this WARNING (the only type of ERROR | WARN  I
 could find)?

  WARN  [SharedPool-Worker-2] 2015-06-15 10:12:00,866
 SliceQueryFilter.java:319 - Read 2990 live and 16016 tombstone cells in
 gttdata.alltrades_co_rep_pcode for key: D:07 (see
 tombstone_warn_threshold). 5000 columns were requested,
 slices=[388:201001-388:201412:!]


  4) Is it possible to have Tombstone when we make no DELETE statements?

  I’m lost…

  Thanks for your help.


-- 


--





Re: Catastrophe Recovery.

2015-06-15 Thread Jean Tremblay
That is really wonderful. Thank you very much Alain. You gave me a lot of 
trails to investigate. Thanks again for your help.

On 15 Jun 2015, at 17:49 , Alain RODRIGUEZ arodr...@gmail.com wrote:

Hi, it looks like you're starting to use Cassandra.

Welcome.

I invite you to read from here as much as you can 
http://docs.datastax.com/en/cassandra/2.1/cassandra/gettingStartedCassandraIntro.html.

When a node loses some data you have various anti-entropy mechanisms:

Hinted Handoff -- For writes that occurred while the node was down and known as 
such by the other nodes (exclusively).
Read repair -- On each read, you can set a chance to check other nodes for 
auto correction.
Repair (called either manual / anti-entropy / full / ...) -- This takes care 
of giving back a node its missing data, either only for the range this node handles 
(-pr) or for all its data (its range plus its replicas). This is something you 
generally want to perform on all nodes on a regular basis (at an interval lower than 
the lowest gc_grace_period set on any of your tables).

Also, you are getting wrong values because you probably have a Consistency Level 
(CL) that is too low. If you want this to never happen you have to set your Read (R) / 
Write (W) consistency levels such that R + W > RF (Replication Factor); if not, you 
can see what you are currently seeing. I advise you to set your consistency to 
local_quorum or quorum in a single-DC environment. Also, with 3 nodes, you 
should set RF to 3; if not, you won't be able to reach strong consistency due 
to the formula I just gave you.

There is a lot more to know, you should read about this all. Using Cassandra 
without knowing about its internals would lead you to very poor and unexpected 
results.

To answer your questions:

From what I understand, if you have a fixed node with no data it will 
automatically bootstrap and recover all its old data from its neighbours while 
doing the joining phase. Is this correct?

-- Not at all, unless it joins the ring for the first time, which is not your 
case. Though it will (by default) slowly recover while you read.

After such a catastrophe, and after the joining phase is done, should the cluster 
not be ready to always deliver consistent data if there were no inserts or 
deletes during the catastrophe?

No, we can't ensure that, except by dropping the node and bootstrapping a new 
one. What we can make sure of is that there are enough replicas remaining to 
serve consistent data (search for RF and CL).

After the bootstrap of a broken node is finished, i.e. after the joining phase, 
is there not simply a repair to be done on that node using “nodetool repair”?

This sentence is false: the bootstrap / joining phase is not the same as a broken node 
coming back. You are right about repair: if a broken node (or one down for too long - 
default 3 hours) comes back, you have to repair it. But repair is slow; make sure you 
can afford it (see my previous answer).

Testing is a really good idea but you also have to read a lot imho.

Good luck,

C*heers,

Alain


2015-06-15 11:13 GMT+02:00 Jean Tremblay jean.tremb...@zen-innovations.com:

Hi,

I have a cluster of 3 nodes RF: 2.
There are about 2 billion rows in one table.
I use LeveledCompactionStrategy on my table.
I use version 2.1.6.
I use the default cassandra.yaml; only the IP address for the seeds and the 
throughput have been changed.

I have tested a scenario where one node crashes and loses all its data.
I have deleted all data on this node after having stopped Cassandra.
At this point I noticed that the cluster was still giving proper results, which is 
what I was expecting from a clustered DB.

I then restarted that node and I observed that the node was joining the cluster.
After an hour or so the old “defective” node was up and normal.
I noticed that its hard disk held much less data than its neighbours’.

When I was querying the DB, the cluster was giving me different results for 
successive identical queries.
I guess the old “defective” node was giving me fewer rows than it should have.

1) From what I understand, if you have a fixed node with no data, it will 
automatically bootstrap and recover all its old data from its neighbours while 
doing the joining phase. Is this correct?
2) After such a catastrophe, and after the joining phase is done, should the 
cluster not be ready to always deliver consistent data if there were no inserts 
or deletes during the catastrophe?
3) After the bootstrap of a broken node is finished, i.e. after the joining 
phase, is there not simply a repair to be done on that node using “nodetool repair”?


Thanks for your comments.

Kind regards

Jean





Re: Catastrophe Recovery.

2015-06-15 Thread Alain RODRIGUEZ
Hi, it looks like you're starting to use Cassandra.

Welcome.

I invite you to read from here as much as you can
http://docs.datastax.com/en/cassandra/2.1/cassandra/gettingStartedCassandraIntro.html
.

When a node loses some data you have various anti-entropy mechanisms:

Hinted Handoff -- For writes that occurred while the node was down and known
as such by the other nodes (exclusively).
Read repair -- On each read, you can set a chance to check other nodes for
auto correction.
Repair (called either manual / anti-entropy / full / ...) -- This takes
care of giving back a node its missing data, either only for the range this node
handles (-pr) or for all its data (its range plus its replicas). This is
something you generally want to perform on all nodes on a regular basis
(at an interval lower than the lowest gc_grace_period set on any of your tables).
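
For illustration, a minimal sketch of how the last two mechanisms are driven in practice, assuming a hypothetical keyspace myks and table mytable: read_repair_chance is an ordinary table property you can change from the driver, while full repair is run with nodetool outside the driver.

// Hedged sketch; keyspace/table names are assumptions.
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class ReadRepairSetting {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build()) {
            Session session = cluster.connect("myks");
            // 10% of reads will also check the other replicas and repair stale ones in the background
            session.execute("ALTER TABLE mytable WITH read_repair_chance = 0.1");
            // Full anti-entropy repair is external to the driver, e.g. on each node, regularly:
            //   nodetool repair -pr myks
        }
    }
}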

Also, you are getting wrong values because you probably have a Consistency
Level (CL) that is too low. If you want this to never happen you have to set your
Read (R) / Write (W) consistency levels such that R + W > RF (Replication
Factor); if not, you can see what you are currently seeing. I advise you to
set your consistency to local_quorum or quorum in a single-DC
environment. Also, with 3 nodes, you should set RF to 3; if not, you won't
be able to reach strong consistency due to the formula I just gave you.
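
A minimal sketch of reading and writing at QUORUM with the DataStax Java driver; the keyspace, table and column names here are assumptions, not from the thread. With RF=3 and QUORUM on both reads and writes, R + W > RF holds, so a node that came back with stale data cannot be the only replica consulted.

// Hedged sketch, illustrative only.
import com.datastax.driver.core.*;

public class QuorumExample {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")
                // make QUORUM the default consistency for every statement from this cluster
                .withQueryOptions(new QueryOptions().setConsistencyLevel(ConsistencyLevel.QUORUM))
                .build();
        Session session = cluster.connect("myks");

        // or override per statement
        Statement read = new SimpleStatement("SELECT * FROM mytable WHERE id = 42")
                .setConsistencyLevel(ConsistencyLevel.QUORUM);
        System.out.println(session.execute(read).one());
        cluster.close();
    }
}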

There is a lot more to know, you should read about this all. Using
Cassandra without knowing about its internals would lead you to very poor
and unexpected results.

To answer your questions:

From what I understand, if you have a fixed node with no data it will
automatically bootstrap and recover all its old data from its neighbours
while doing the joining phase. Is this correct?

-- Not at all, unless it joins the ring for the first time, which is not
your case. Though it will (by default) slowly recover while you read.

After such a catastrophe, and after the joining phase is done, should the
cluster not be ready to always deliver consistent data if there were no
inserts or deletes during the catastrophe?

No, we can't ensure that, except by dropping the node and bootstrapping a
new one. What we can make sure of is that there are enough replicas remaining
to serve consistent data (search for RF and CL).

After the bootstrap of a broken node is finished, i.e. after the joining
phase, is there not simply a repair to be done on that node using “nodetool
repair”?

This sentence is false: the bootstrap / joining phase is not the same as a broken
node coming back. You are right about repair: if a broken node (or one down for
too long - default 3 hours) comes back, you have to repair it. But repair is slow;
make sure you can afford it (see my previous answer).

Testing is a really good idea but you also have to read a lot imho.

Good luck,

C*heers,

Alain


2015-06-15 11:13 GMT+02:00 Jean Tremblay jean.tremb...@zen-innovations.com
:


 Hi,

 I have a cluster of 3 nodes RF: 2.
 There are about 2 billion rows in one table.
 I use LeveledCompactionStrategy on my table.
 I use version 2.1.6.
 I use the default cassandra.yaml; only the IP address for the seeds and
  the throughput have been changed.

   I have tested a scenario where one node crashes and loses all its
  data.
  I have deleted all data on this node after having stopped Cassandra.
  At this point I noticed that the cluster was still giving proper results, which is
  what I was expecting from a clustered DB.

   I then restarted that node and I observed that the node was joining the
  cluster.
  After an hour or so the old “defective” node was up and normal.
  I noticed that its hard disk held much less data than its
  neighbours’.

   When I was querying the DB, the cluster was giving me different results
  for successive identical queries.
  I guess the old “defective” node was giving me fewer rows than it should have.

   1) From what I understand, if you have a fixed node with no data, it will
  automatically bootstrap and recover all its old data from its neighbours
  while doing the joining phase. Is this correct?
  2) After such a catastrophe, and after the joining phase is done, should the
  cluster not be ready to always deliver consistent data if there were no
  inserts or deletes during the catastrophe?
  3) After the bootstrap of a broken node is finished, i.e. after the joining
  phase, is there not simply a repair to be done on that node using “nodetool
  repair”?


  Thanks for your comments.

  Kind regards

  Jean




RE: Lucene index plugin for Apache Cassandra

2015-06-15 Thread Matthew Johnson
Hi Andres,



This looks awesome, many thanks for your work on this. Just out of
curiosity, how does this compare to the DSE Cassandra with embedded Solr?
Do they provide very similar functionality? Is there a list of obvious pros
and cons of one versus the other?



Thanks!

Matthew





*From:* Andres de la Peña [mailto:adelap...@stratio.com]
*Sent:* 13 June 2015 13:20
*To:* user@cassandra.apache.org
*Subject:* Re: Lucene index plugin for Apache Cassandra



Thanks for showing interest.



Faceting is not yet supported, but it is in our roadmap. Our goal is to add
to Cassandra as many Lucene features as possible.



2015-06-12 18:21 GMT+02:00 Mohammed Guller moham...@glassbeam.com:

The plugin looks cool. Thank you for open sourcing it.



Does it support faceting and other Solr functionality?



Mohammed



*From:* Andres de la Peña [mailto:adelap...@stratio.com]
*Sent:* Friday, June 12, 2015 3:43 AM
*To:* user@cassandra.apache.org
*Subject:* Re: Lucene index plugin for Apache Cassandra



I really appreciate your interest



Well, the first recommendation is to not use it unless you need it, because
a properly denormalized Cassandra model is almost always preferable to
indexing. Lucene indexing is a good option when there is no viable
denormalization alternative. This is the case for range queries over
multiple dimensions, full-text search, or maybe complex boolean predicates.
It's also appropriate for Spark/Hadoop jobs mapping a small fraction of the
total amount of rows in a certain table, if you can pay the cost of
indexing.



Lucene indexes run inside C*, so users should closely monitor the amount of
memory used. It's also a good idea to put the Lucene directory files on a
separate disk from those used by C* itself. Additionally, you should consider
that an indexed table's write throughput will be appreciably reduced, maybe to
a few thousand rows per second.



It's really hard to estimate the amount of resources needed by the index
due to the great variety of indexing and querying ways that Lucene offers,
so the only thing we can suggest is to empirically find the optimal setup
for your use case.



2015-06-12 12:00 GMT+02:00 Carlos Rolo r...@pythian.com:

Seems like an interesting tool!

What operational recommendations would you make to users of this tool
(Extra hardware capacity, extra metrics to monitor, etc)?


Regards,



Carlos Juzarte Rolo

Cassandra Consultant



Pythian - Love your data



rolo@pythian | Twitter: cjrolo | LinkedIn: linkedin.com/in/carlosjuzarterolo

Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649

www.pythian.com



On Fri, Jun 12, 2015 at 11:07 AM, Andres de la Peña adelap...@stratio.com
wrote:

Unfortunately, we haven't published any benchmarks yet, but we have
plans to do so as soon as possible. However, you can expect behavior similar
to that of Elasticsearch or Solr, with some overhead due to the
need for indexing both Cassandra's row key and the partition's token.
You can also take a look at this presentation
http://planetcassandra.org/video-presentations/vp/cassandra-summit-europe-2014/vd/stratio-advanced-search-and-top-k-queries-in-cassandra/
to see how cluster distribution is done.



2015-06-12 0:45 GMT+02:00 Ben Bromhead b...@instaclustr.com:

Looks awesome, do you have any examples/benchmarks of using these indexes
for various cluster sizes e.g. 20 nodes, 60 nodes, 100s+?



On 10 June 2015 at 09:08, Andres de la Peña adelap...@stratio.com wrote:

Hi all,



With the release of Cassandra 2.1.6, Stratio is glad to present its open
source Lucene-based implementation of C* secondary indexes
https://github.com/Stratio/cassandra-lucene-index as a plugin that can be
attached to Apache Cassandra. Before the above changes, Lucene index was
distributed inside a fork of Apache Cassandra, with all the difficulties
implied. As of now, the fork is discontinued and new users should use the
recently created plugin, which maintains all the features of Stratio
Cassandra https://github.com/Stratio/stratio-cassandra.



Stratio's Lucene index extends Cassandra’s functionality to provide near
real-time distributed search engine capabilities such as those of Elasticsearch
or Solr, including full-text search, free multivariable
search, relevance queries and field-based sorting. Each node indexes its
own data, so high availability and scalability are guaranteed.



We hope this will be useful to the Apache Cassandra community.



Regards,



-- 


Andrés de la Peña



http://www.stratio.com/
Avenida de Europa, 26. Ática 5. 3ª Planta

28224 Pozuelo de Alarcón, Madrid

Tel: +34 91 352 59 42 // *@stratiobd https://twitter.com/StratioBD*





-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
http://twitter.com/instaclustr | (650) 284 9692





-- 


Andrés de la Peña



http://www.stratio.com/
Avenida de Europa, 26. Ática 5. 3ª Planta

28224 Pozuelo de Alarcón, Madrid

Tel: +34 91 352 59 42 // *@stratiobd 

Re: Nodetool ring and Replicas after 1.2 upgrade

2015-06-15 Thread Jason Wee
Maybe check the system.log to see if there are any exceptions and/or errors?
Check as well whether the nodes have a consistent schema for the keyspace.

hth

jason

On Tue, Jun 16, 2015 at 7:17 AM, Michael Theroux mthero...@yahoo.com
wrote:

 Hello,

 We (finally) have just upgraded from Cassandra 1.1 to Cassandra 1.2.19.
 Everything appears to be up and running normally, however, we have noticed
 unusual output from nodetool ring.  There is a new (to us) field Replicas
 in the nodetool output, and this field, seemingly at random, is changing
 from 2 to 3 and back to 2.

 We are using the byte ordered partitioner (we hash our own keys), and have
 a replication factor of 3.  We are also on AWS and utilize the Ec2snitch on
 a single Datacenter.

 Other calls appear to be normal.  nodetool getEndpoints returns the
 proper endpoints when querying various keys, nodetool ring and status
 return that all nodes appear healthy.

 Anyone have any hints on what may be happening, or whether this is a problem we
 should be concerned about?

 Thanks,
 -Mike




Re: Missing data

2015-06-15 Thread Robert Wille
You can get tombstones from inserting null values. Not sure if that’s the 
problem, but it is another way of getting tombstones in your data.
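
A minimal illustration of this with the DataStax Java driver, using a made-up "trades" table: binding null for a column writes a tombstone for that column even though no DELETE was ever issued, and the usual workaround on this driver/protocol version is to prepare a statement that simply omits the columns you have no value for.

// Hedged sketch; table and column names are assumptions.
import com.datastax.driver.core.*;

public class NullInsertExample {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("myks");

        PreparedStatement ps = session.prepare(
            "INSERT INTO trades (id, price, comment) VALUES (?, ?, ?)");

        // This creates a tombstone for "comment" because the bound value is null:
        session.execute(ps.bind(1L, 42.0, null));

        // Workaround: only include the columns you actually have a value for.
        PreparedStatement psNoComment = session.prepare(
            "INSERT INTO trades (id, price) VALUES (?, ?)");
        session.execute(psNoComment.bind(2L, 43.5));

        cluster.close();
    }
}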

On Jun 15, 2015, at 10:50 AM, Jean Tremblay jean.tremb...@zen-innovations.com wrote:

Dear all,

I identified a bit more closely the root cause of my missing data.

The problem is occurring when I use

dependency
groupIdcom.datastax.cassandra/groupId
artifactIdcassandra-driver-core/artifactId
version2.1.6/version
/dependency

on my client against Cassandra 2.1.6.

I did not have the problem when I was using the driver 2.1.4 with C* 2.1.4.
Interestingly enough I don’t have the problem with the driver 2.1.4 with C* 
2.1.6.  !!

So as far as I can locate the problem, I would say that version 2.1.6 of 
the driver is not working properly and is losing some of my records!!!

——

As far as my tombstones are concerned, I don’t understand their origin.
I removed all locations in my code where I delete items, and I do not use TTL 
anywhere (I don’t need this feature in my project).

And yet I have many tombstones building up.

Is there another origin for tombstones besides TTL and deleting items? Could the 
compaction of LeveledCompactionStrategy be the origin of them?

@Carlos thanks for your guidance.

Kind regards

Jean



On 15 Jun 2015, at 11:17 , Carlos Rolo r...@pythian.com wrote:

Hi Jean,

The problem of that Warning is that you are reading too many tombstones per 
request.

If you do have tombstones without doing DELETEs, it is because you probably TTL'ed 
the data when inserting (by mistake? Or did you set default_time_to_live in 
your table?). You can use nodetool cfstats to see how many tombstones per read 
slice you have. This is, probably, also the cause of your missing data. Data 
was tombstoned, so it is not available.



Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo | LinkedIn: linkedin.com/in/carlosjuzarterolo
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
www.pythian.com

On Mon, Jun 15, 2015 at 10:54 AM, Jean Tremblay jean.tremb...@zen-innovations.com wrote:
Hi,

I have reloaded the data in my cluster of 3 nodes RF: 2.
I have loaded about 2 billion rows in one table.
I use LeveledCompactionStrategy on my table.
I use version 2.1.6.
I use the default cassandra.yaml; only the IP address for the seeds and the 
throughput have been changed.

I loaded my data with simple insert statements. This took a bit more than one 
day to load the data… and one more day to compact the data on all nodes.
For me this is quite acceptable since I should not be doing this again.
I have done this with previous versions like 2.1.3 and others and I basically 
had absolutely no problems.

Now, when I read the log files on the client side, I see no warnings and no 
errors.
On the node side I see many WARNINGs, all related to tombstones, but 
there are no ERRORs.

My problem is that I see *many missing records* in the DB, and I have 
never observed this with previous versions.

1) Is this a known problem?
2) Do you have any idea how I could track down this problem?
3) What is the meaning of this WARNING (the only type of ERROR | WARN  I could 
find)?

WARN  [SharedPool-Worker-2] 2015-06-15 10:12:00,866 SliceQueryFilter.java:319 - 
Read 2990 live and 16016 tombstone cells in gttdata.alltrades_co_rep_pcode for 
key: D:07 (see tombstone_warn_threshold). 5000 columns were requested, 
slices=[388:201001-388:201412:!]


4) Is it possible to have Tombstone when we make no DELETE statements?

I’m lost…

Thanks for your help.



--







Re: Seed Node OOM

2015-06-15 Thread Robert Coli
On Sat, Jun 13, 2015 at 4:39 AM, Oleksandr Petrov 
oleksandr.pet...@gmail.com wrote:

 We're using Cassandra, recently migrated to 2.1.6, and we're experiencing
 constant OOMs in one of our clusters.


Maybe this memory leak?

https://issues.apache.org/jira/browse/CASSANDRA-9549

=Rob


Re: Catastrophe Recovery.

2015-06-15 Thread Saladi Naidu
Alain, great write-up on the recovery procedure. You covered both the replication 
factor and consistency levels. As mentioned, two anti-entropy mechanisms, hinted 
handoffs and read repair, work for temporary node outages and incremental recovery. 
In case of disaster/catastrophic recovery, nodetool repair is the best way to 
recover.
Would the procedure below have ensured the node was added properly to the cluster?
Adding nodes to an existing cluster | DataStax Cassandra 2.0 Documentation (steps to add nodes when using virtual nodes; see docs.datastax.com)

   Naidu Saladi 

  From: Jean Tremblay jean.tremb...@zen-innovations.com
 To: user@cassandra.apache.org 
 Sent: Monday, June 15, 2015 10:58 AM 
 Subject: Re: Catastrophe Recovery.
   
That is really wonderful. Thank you very much Alain. You gave me a lot of 
trails to investigate. Thanks again for your help.




Re: Missing data

2015-06-15 Thread Jean Tremblay
Thanks Robert; I don’t insert NULL values, but thanks anyway.

On 15 Jun 2015, at 19:16 , Robert Wille rwi...@fold3.com wrote:

You can get tombstones from inserting null values. Not sure if that’s the 
problem, but it is another way of getting tombstones in your data.

On Jun 15, 2015, at 10:50 AM, Jean Tremblay jean.tremb...@zen-innovations.com wrote:

Dear all,

I identified a bit more closely the root cause of my missing data.

The problem is occurring when I use

dependency
groupIdcom.datastax.cassandra/groupId
artifactIdcassandra-driver-core/artifactId
version2.1.6/version
/dependency

on my client against Cassandra 2.1.6.

I did not have the problem when I was using the driver 2.1.4 with C* 2.1.4.
Interestingly enough I don’t have the problem with the driver 2.1.4 with C* 
2.1.6.  !!

So as far as I can locate the problem, I would say that version 2.1.6 of 
the driver is not working properly and is losing some of my records!!!

——

As far as my tombstones are concerned, I don’t understand their origin.
I removed all locations in my code where I delete items, and I do not use TTL 
anywhere (I don’t need this feature in my project).

And yet I have many tombstones building up.

Is there another origin for tombstones besides TTL and deleting items? Could the 
compaction of LeveledCompactionStrategy be the origin of them?

@Carlos thanks for your guidance.

Kind regards

Jean



On 15 Jun 2015, at 11:17 , Carlos Rolo r...@pythian.com wrote:

Hi Jean,

The problem of that Warning is that you are reading too many tombstones per 
request.

If you do have tombstones without doing DELETEs, it is because you probably TTL'ed 
the data when inserting (by mistake? Or did you set default_time_to_live in 
your table?). You can use nodetool cfstats to see how many tombstones per read 
slice you have. This is, probably, also the cause of your missing data. Data 
was tombstoned, so it is not available.



Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo | LinkedIn: linkedin.com/in/carlosjuzarterolo
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
www.pythian.com

On Mon, Jun 15, 2015 at 10:54 AM, Jean Tremblay jean.tremb...@zen-innovations.com wrote:
Hi,

I have reloaded the data in my cluster of 3 nodes RF: 2.
I have loaded about 2 billion rows in one table.
I use LeveledCompactionStrategy on my table.
I use version 2.1.6.
I use the default cassandra.yaml; only the IP address for the seeds and the 
throughput have been changed.

I loaded my data with simple insert statements. This took a bit more than one 
day to load the data… and one more day to compact the data on all nodes.
For me this is quite acceptable since I should not be doing this again.
I have done this with previous versions like 2.1.3 and others and I basically 
had absolutely no problems.

Now, when I read the log files on the client side, I see no warnings and no 
errors.
On the node side I see many WARNINGs, all related to tombstones, but 
there are no ERRORs.

My problem is that I see *many missing records* in the DB, and I have 
never observed this with previous versions.

1) Is this a known problem?
2) Do you have any idea how I could track down this problem?
3) What is the meaning of this WARNING (the only type of ERROR | WARN  I could 
find)?

WARN  [SharedPool-Worker-2] 2015-06-15 10:12:00,866 SliceQueryFilter.java:319 - 
Read 2990 live and 16016 tombstone cells in gttdata.alltrades_co_rep_pcode for 
key: D:07 (see tombstone_warn_threshold). 5000 columns were requested, 
slices=[388:201001-388:201412:!]


4) Is it possible to have Tombstone when we make no DELETE statements?

I’m lost…

Thanks for your help.



--








Re: Missing data

2015-06-15 Thread Jean Tremblay
Dear all,

I identified a bit more closely the root cause of my missing data.

The problem is occurring when I use

dependency
groupIdcom.datastax.cassandra/groupId
artifactIdcassandra-driver-core/artifactId
version2.1.6/version
/dependency

on my client against Cassandra 2.1.6.

I did not have the problem when I was using the driver 2.1.4 with C* 2.1.4.
Interestingly enough I don’t have the problem with the driver 2.1.4 with C* 
2.1.6.  !!

So as far as I can locate the problem, I would say that version 2.1.6 of 
the driver is not working properly and is losing some of my records!!!

——

As far as my tombstones are concerned, I don’t understand their origin.
I removed all locations in my code where I delete items, and I do not use TTL 
anywhere (I don’t need this feature in my project).

And yet I have many tombstones building up.

Is there another origin for tombstones besides TTL and deleting items? Could the 
compaction of LeveledCompactionStrategy be the origin of them?

@Carlos thanks for your guidance.

Kind regards

Jean



On 15 Jun 2015, at 11:17 , Carlos Rolo r...@pythian.com wrote:

Hi Jean,

The problem of that Warning is that you are reading too many tombstones per 
request.

If you do have tombstones without doing DELETEs, it is because you probably TTL'ed 
the data when inserting (by mistake? Or did you set default_time_to_live in 
your table?). You can use nodetool cfstats to see how many tombstones per read 
slice you have. This is, probably, also the cause of your missing data. Data 
was tombstoned, so it is not available.



Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo | LinkedIn: linkedin.com/in/carlosjuzarterolo
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
www.pythian.com

On Mon, Jun 15, 2015 at 10:54 AM, Jean Tremblay jean.tremb...@zen-innovations.com wrote:
Hi,

I have reloaded the data in my cluster of 3 nodes RF: 2.
I have loaded about 2 billion rows in one table.
I use LeveledCompactionStrategy on my table.
I use version 2.1.6.
I use the default cassandra.yaml; only the IP address for the seeds and the 
throughput have been changed.

I loaded my data with simple insert statements. This took a bit more than one 
day to load the data… and one more day to compact the data on all nodes.
For me this is quite acceptable since I should not be doing this again.
I have done this with previous versions like 2.1.3 and others and I basically 
had absolutely no problems.

Now, when I read the log files on the client side, I see no warnings and no 
errors.
On the node side I see many WARNINGs, all related to tombstones, but 
there are no ERRORs.

My problem is that I see *many missing records* in the DB, and I have 
never observed this with previous versions.

1) Is this a known problem?
2) Do you have any idea how I could track down this problem?
3) What is the meaning of this WARNING (the only type of ERROR | WARN  I could 
find)?

WARN  [SharedPool-Worker-2] 2015-06-15 10:12:00,866 SliceQueryFilter.java:319 - 
Read 2990 live and 16016 tombstone cells in gttdata.alltrades_co_rep_pcode for 
key: D:07 (see tombstone_warn_threshold). 5000 columns were requested, 
slices=[388:201001-388:201412:!]


4) Is it possible to have Tombstone when we make no DELETE statements?

I’m lost…

Thanks for your help.



--






Re: Missing data

2015-06-15 Thread Bryan Holladay
There's your problem: you're using the DataStax Java driver :) I just ran
into this issue in the last week and it was incredibly frustrating. If you
are doing a simple loop on a "select *" query, then the DataStax Java
driver will only process 2^31 rows (i.e. the Java Integer max,
2,147,483,647) before it stops without any error or output in the logs. The
fact that you said you only had about 2 billion rows but you are seeing
missing data is a red flag.

I found the only way around this is to do your "select *" in chunks based
on the token range (see this gist for an example:
https://gist.github.com/baholladay/21eb4c61ea8905302195 )
Just loop for every 100 million rows and make a new query "select * from
TABLE where token(key) > lastToken".
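
A minimal sketch of that loop with the DataStax Java driver, along the lines of the linked gist. Assumptions: a table named mytable with partition key column key, and the default Murmur3Partitioner (so token() is a signed 64-bit number starting at -2^63).

// Hedged sketch; table/column names are assumptions, not from the thread.
import com.datastax.driver.core.*;

public class TokenRangeScan {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("myks");

        long lastToken = Long.MIN_VALUE;   // Murmur3 tokens start at -2^63
        final int chunk = 100_000_000;     // rows per chunk, as suggested above
        long total = 0;
        while (true) {
            ResultSet rs = session.execute(
                "SELECT key, token(key) FROM mytable WHERE token(key) > " + lastToken + " LIMIT " + chunk);
            long seen = 0;
            for (Row row : rs) {
                lastToken = row.getLong(1); // remember where this chunk ended
                seen++;
                total++;
            }
            if (seen == 0) break;           // no more rows past lastToken
        }
        // note: in real use make sure a chunk boundary does not split a partition
        System.out.println("Scanned " + total + " rows");
        cluster.close();
    }
}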

Thanks,
Bryan




On Mon, Jun 15, 2015 at 12:50 PM, Jean Tremblay 
jean.tremb...@zen-innovations.com wrote:

  Dear all,

  I identified a bit more closely the root cause of my missing data.

  The problem is occurring when I use

   dependency
 groupIdcom.datastax.cassandra/groupId
 artifactIdcassandra-driver-core/artifactId
  version2.1.6/version
  /dependency

  on my client against Cassandra 2.1.6.

  I did not have the problem when I was using the driver 2.1.4 with C*
 2.1.4.
 Interestingly enough I don’t have the problem with the driver 2.1.4 with
 C* 2.1.6.  !!

  So as far as I can locate the problem, I would say that version
 2.1.6 of the driver is not working properly and is losing some of my
 records!!!

  ——

  As far as my tombstones are concerned, I don’t understand their origin.
 I removed all locations in my code where I delete items, and I do not use
 TTL anywhere (I don’t need this feature in my project).

  And yet I have many tombstones building up.

  Is there another origin for tombstones besides TTL and deleting items?
 Could the compaction of LeveledCompactionStrategy be the origin of them?

  @Carlos thanks for your guidance.

  Kind regards

  Jean



  On 15 Jun 2015, at 11:17 , Carlos Rolo r...@pythian.com wrote:

  Hi Jean,

  The problem of that Warning is that you are reading too many tombstones
 per request.

  If you do have tombstones without doing DELETEs, it is because you probably
 TTL'ed the data when inserting (by mistake? Or did you set
 default_time_to_live in your table?). You can use nodetool cfstats to see
 how many tombstones per read slice you have. This is, probably, also the
 cause of your missing data. Data was tombstoned, so it is not available.



Regards,

  Carlos Juzarte Rolo
 Cassandra Consultant

 Pythian - Love your data

  rolo@pythian | Twitter: cjrolo | LinkedIn: linkedin.com/in/carlosjuzarterolo
 Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
 www.pythian.com

 On Mon, Jun 15, 2015 at 10:54 AM, Jean Tremblay 
 jean.tremb...@zen-innovations.com wrote:

 Hi,

  I have reloaded the data in my cluster of 3 nodes RF: 2.
 I have loaded about 2 billion rows in one table.
 I use LeveledCompactionStrategy on my table.
 I use version 2.1.6.
 I use the default cassandra.yaml; only the IP address for the seeds and the
 throughput have been changed.

  I loaded my data with simple insert statements. This took a bit more
 than one day to load the data… and one more day to compact the data on all
 nodes.
 For me this is quite acceptable since I should not be doing this again.
 I have done this with previous versions like 2.1.3 and others and I
 basically had absolutely no problems.

  Now, when I read the log files on the client side, I see no warnings and
 no errors.
 On the node side I see many WARNINGs, all related to tombstones,
 but there are no ERRORs.

  My problem is that I see *many missing records* in the DB, and I
 have never observed this with previous versions.

  1) Is this a known problem?
 2) Do you have any idea how I could track down this problem?
 3) What is the meaning of this WARNING (the only type of ERROR | WARN  I
 could find)?

  WARN  [SharedPool-Worker-2] 2015-06-15 10:12:00,866
 SliceQueryFilter.java:319 - Read 2990 live and 16016 tombstone cells in
 gttdata.alltrades_co_rep_pcode for key: D:07 (see
 tombstone_warn_threshold). 5000 columns were requested,
 slices=[388:201001-388:201412:!]


  4) Is it possible to have Tombstone when we make no DELETE statements?

  I’m lost…

  Thanks for your help.



 --








Re: Missing data

2015-06-15 Thread Jean Tremblay
Thanks Bryan.
I believe I have a different problem with the Datastax 2.1.6 driver.
My problem is not that I make huge selects.
My problem seems to occur on some inserts. I insert MANY rows, and with 
version 2.1.6 of the driver I seem to be losing some records.

But thanks anyway; I will remember your mail when I bump into the select problem.

Cheers

Jean


On 15 Jun 2015, at 19:13 , Bryan Holladay holla...@longsight.com wrote:

There's your problem: you're using the DataStax Java driver :) I just ran into 
this issue in the last week and it was incredibly frustrating. If you are doing 
a simple loop on a "select *" query, then the DataStax Java driver will only 
process 2^31 rows (i.e. the Java Integer max, 2,147,483,647) before it stops 
without any error or output in the logs. The fact that you said you only had about 
2 billion rows but you are seeing missing data is a red flag.

I found the only way around this is to do your "select *" in chunks based on 
the token range (see this gist for an example: 
https://gist.github.com/baholladay/21eb4c61ea8905302195 )
Just loop for every 100 million rows and make a new query "select * from TABLE 
where token(key) > lastToken".

Thanks,
Bryan




On Mon, Jun 15, 2015 at 12:50 PM, Jean Tremblay jean.tremb...@zen-innovations.com wrote:
Dear all,

I identified a bit more closely the root cause of my missing data.

The problem is occurring when I use

dependency
groupIdcom.datastax.cassandra/groupId
artifactIdcassandra-driver-core/artifactId
version2.1.6/version
/dependency

on my client against Cassandra 2.1.6.

I did not have the problem when I was using the driver 2.1.4 with C* 2.1.4.
Interestingly enough I don’t have the problem with the driver 2.1.4 with C* 
2.1.6.  !!

So as far as I can locate the problem, I would say that version 2.1.6 of 
the driver is not working properly and is losing some of my records!!!

——

As far as my tombstones are concerned, I don’t understand their origin.
I removed all locations in my code where I delete items, and I do not use TTL 
anywhere (I don’t need this feature in my project).

And yet I have many tombstones building up.

Is there another origin for tombstones besides TTL and deleting items? Could the 
compaction of LeveledCompactionStrategy be the origin of them?

@Carlos thanks for your guidance.

Kind regards

Jean



On 15 Jun 2015, at 11:17 , Carlos Rolo r...@pythian.com wrote:

Hi Jean,

The problem of that Warning is that you are reading too many tombstones per 
request.

If you do have tombstones without doing DELETEs, it is because you probably TTL'ed 
the data when inserting (by mistake? Or did you set default_time_to_live in 
your table?). You can use nodetool cfstats to see how many tombstones per read 
slice you have. This is, probably, also the cause of your missing data. Data 
was tombstoned, so it is not available.



Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo | LinkedIn: linkedin.com/in/carlosjuzarterolo
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
www.pythian.com

On Mon, Jun 15, 2015 at 10:54 AM, Jean Tremblay jean.tremb...@zen-innovations.com wrote:
Hi,

I have reloaded the data in my cluster of 3 nodes RF: 2.
I have loaded about 2 billion rows in one table.
I use LeveledCompactionStrategy on my table.
I use version 2.1.6.
I use the default cassandra.yaml; only the IP address for the seeds and the 
throughput have been changed.

I loaded my data with simple insert statements. This took a bit more than one 
day to load the data… and one more day to compact the data on all nodes.
For me this is quite acceptable since I should not be doing this again.
I have done this with previous versions like 2.1.3 and others and I basically 
had absolutely no problems.

Now, when I read the log files on the client side, I see no warnings and no 
errors.
On the node side I see many WARNINGs, all related to tombstones, but 
there are no ERRORs.

My problem is that I see *many missing records* in the DB, and I have 
never observed this with previous versions.

1) Is this a known problem?
2) Do you have any idea how I could track down this problem?
3) What is the meaning of this WARNING (the only type of ERROR | WARN  I could 
find)?

WARN  [SharedPool-Worker-2] 2015-06-15 10:12:00,866 SliceQueryFilter.java:319 - 
Read 2990 live and 16016 tombstone cells in gttdata.alltrades_co_rep_pcode for 
key: D:07 (see tombstone_warn_threshold). 5000 columns were requested, 
slices=[388:201001-388:201412:!]


4) Is it possible to have Tombstone when we make no DELETE statements?

I’m lost…

Thanks for your help.



--








counters still inconsistent after repair

2015-06-15 Thread Dan Kinder
Currently on 2.1.6 I'm seeing behavior like the following:

cqlsh:walker select * from counter_table where field = 'test';
 field | value
---+---
 test  |30
(1 rows)
cqlsh:walker select * from counter_table where field = 'test';
 field | value
---+---
 test  |90
(1 rows)
cqlsh:walker select * from counter_table where field = 'test';
 field | value
---+---
 test  |30
(1 rows)

Using tracing I can see that one node has wrong data. However running
repair on this table does not seem to have done anything, I still see the
wrong value returned from this same node.

Potentially relevant facts:
- Recently upgraded to 2.1.6 from 2.0.14
- This table has ~million rows, low contention, and fairly high increment
rate

Mainly wondering:
- Is this known or expected? I know Cassandra counters have had issues but
thought by now it should be able to keep a consistent counter or at least
repair it...
- Any way to reset this counter?
- Any other stuff I can check?


Nodetool ring and Replicas after 1.2 upgrade

2015-06-15 Thread Michael Theroux
Hello,
We (finally) have just upgraded from Cassandra 1.1 to Cassandra 1.2.19.  
Everything appears to be up and running normally, however, we have noticed 
unusual output from nodetool ring.  There is a new (to us) field Replicas in 
the nodetool output, and this field, seemingly at random, is changing from 2 to 
3 and back to 2.
We are using the byte ordered partitioner (we hash our own keys), and have a 
replication factor of 3.  We are also on AWS and utilize the Ec2snitch on a 
single Datacenter.  
Other calls appear to be normal.  nodetool getEndpoints returns the proper 
endpoints when querying various keys, nodetool ring and status return that all 
nodes appear healthy.  
Anyone have any hints on what may be happening, or whether this is a problem we 
should be concerned about?
Thanks,
-Mike


Re: counters still inconsistent after repair

2015-06-15 Thread Robert Coli
On Mon, Jun 15, 2015 at 2:52 PM, Dan Kinder dkin...@turnitin.com wrote:

 Potentially relevant facts:
 - Recently upgraded to 2.1.6 from 2.0.14
 - This table has ~million rows, low contention, and fairly high increment
 rate

Can you repro on a counter that was created after the upgrade?

 Mainly wondering:

 - Is this known or expected? I know Cassandra counters have had issues but
 thought by now it should be able to keep a consistent counter or at least
 repair it...

All counters which haven't been written to since the 2.1 "new counters" change are
still on disk as "old" counters and will remain that way until UPDATEd and
then compacted together with all old shards. Old counters can exhibit
this behavior.

 - Any way to reset this counter?

Per Aleksey (in IRC) you can turn a replica for an old counter into a new
counter by UPDATEing it once.

In order to do that without modifying the count, you can [1] :

UPDATE tablename SET countercolumn = countercolumn +0 where id = 1;

The important caveat is that this must be done at least once per shard, with
one shard per replica (RF). The only way one can be sure that all shards have been
UPDATEd is by contacting each replica node and doing the UPDATE + 0 there,
because local writes are preferred.

To summarize, the optimal process to upgrade your pre-existing counters to
2.1-era new counters :

1) get a list of all counter keys
2) get a list of replicas per counter key
3) connect to each replica for each counter key and issue an UPDATE + 0 for
that counter key
4) run a major compaction
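
A rough sketch of step 3 with the DataStax Java driver, simplified to issue the "+ 0" update against every node in the cluster (a superset of the per-key replicas) by restricting each connection with a WhiteListPolicy. The keyspace, table, column and key come from the cqlsh example above; the native port 9042 is an assumption, and a real tool would iterate over all counter keys rather than one.

// Hedged, illustrative sketch only, not a production upgrade tool.
import com.datastax.driver.core.*;
import com.datastax.driver.core.policies.RoundRobinPolicy;
import com.datastax.driver.core.policies.WhiteListPolicy;
import java.net.InetSocketAddress;
import java.util.Collections;

public class CounterShardUpgrade {
    public static void main(String[] args) {
        Cluster discovery = Cluster.builder().addContactPoint("127.0.0.1").build();
        try {
            for (Host host : discovery.getMetadata().getAllHosts()) {
                InetSocketAddress addr = new InetSocketAddress(host.getAddress(), 9042);
                // Restrict the connection to this single node so the write is coordinated there
                Cluster oneNode = Cluster.builder()
                        .addContactPoints(host.getAddress())
                        .withLoadBalancingPolicy(new WhiteListPolicy(new RoundRobinPolicy(),
                                Collections.singletonList(addr)))
                        .build();
                try {
                    Session session = oneNode.connect("walker");
                    // "+ 0" rewrites the local shard in the new format without changing the count
                    session.execute("UPDATE counter_table SET value = value + 0 WHERE field = 'test'");
                } finally {
                    oneNode.close();
                }
            }
        } finally {
            discovery.close();
        }
    }
}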

As an aside, Aleksey suggests that the above process is so heavyweight that
it may not be worth it. If you just leave them be, all counters you
actually use will become progressively more accurate over time.

=Rob
[1] Special thanks to Jeff Jirsa for verifying that this syntax works.


PrepareStatement problem

2015-06-15 Thread joseph gao
hi, all
  I'm using PreparedStatements. If I prepare a statement every time I use it,
Cassandra will give me a warning telling me NOT to PREPARE EVERY TIME. So I cache
the PreparedStatement locally. But when another client changes the table's
schema, e.g. adds a new column, and I still use the formerly cached
PreparedStatement, the metadata will mismatch the data: the metadata says n
columns, and the data has n+1 columns. So what should I do to avoid this
problem?

-- 
--
Joseph Gao
PhoneNum:15210513582
QQ: 409343351