Re: Missing data

Jean Tremblay Mon, 15 Jun 2015 10:22:32 -0700

Thanks Robert, but I don’t insert NULL values. Thanks anyway.

On 15 Jun 2015, at 19:16, Robert Wille <rwi...@fold3.com> wrote:


You can get tombstones from inserting null values. Not sure if that’s the 
problem, but it is another way of getting tombstones in your data.
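
To illustrate the point about null values: a minimal sketch, in plain Java, of one way to keep nulls out of an INSERT by naming only the columns that actually have a value. The class `NullFreeInsert`, the table `trades` and the columns `id`, `price` and `qty` are hypothetical names chosen for the example; this is not part of the driver API, only a helper that produces statement text and the values to bind.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper: builds an INSERT that names only non-null columns. */
final class NullFreeInsert {
    final String cql;          // statement text with one '?' per kept column
    final List<Object> values; // values to bind, in the same order as the columns

    private NullFreeInsert(String cql, List<Object> values) {
        this.cql = cql;
        this.values = values;
    }

    static NullFreeInsert build(String table, Map<String, Object> row) {
        StringJoiner columns = new StringJoiner(", ");
        StringJoiner markers = new StringJoiner(", ");
        List<Object> values = new ArrayList<>();
        for (Map.Entry<String, Object> entry : row.entrySet()) {
            if (entry.getValue() == null) {
                continue; // leave the column out so no null is written for it
            }
            columns.add(entry.getKey());
            markers.add("?");
            values.add(entry.getValue());
        }
        String cql = "INSERT INTO " + table + " (" + columns + ") VALUES (" + markers + ")";
        return new NullFreeInsert(cql, values);
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the column order stable (hypothetical row).
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 42);
        row.put("price", null); // would become a tombstone if inserted as null
        row.put("qty", 7);
        NullFreeInsert stmt = build("trades", row);
        System.out.println(stmt.cql);
        System.out.println(stmt.values);
    }
}
```

The statement text and the value list can then be handed to whatever call the client already uses to execute parameterised statements.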

On Jun 15, 2015, at 10:50 AM, Jean Tremblay 
<jean.tremb...@zen-innovations.com> wrote:

Dear all,

I identified a bit more closely the root cause of my missing data.

The problem is occurring when I use

<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-core</artifactId>
<version>2.1.6</version>
</dependency>

on my client against Cassandra 2.1.6.

I did not have the problem when I was using the driver 2.1.4 with C* 2.1.4.
Interestingly enough, I don’t have the problem with the driver 2.1.4 against 
C* 2.1.6 either!

So as far as I can locate the problem, I would say that version 2.1.6 of the 
driver is not working properly and is losing some of my records!

——————

As far as my tombstones are concerned, I don’t understand their origin.
I removed all locations in my code where I delete items, and I do not use TTL 
anywhere (I don’t need this feature in my project).

And yet I have many tombstones building up.

Is there another origin for tombstones besides TTL and deleting items? Could 
the compaction of LeveledCompactionStrategy be the origin of them?

@Carlos thanks for your guidance.

Kind regards

Jean



On 15 Jun 2015, at 11:17, Carlos Rolo <r...@pythian.com> wrote:

Hi Jean,

That warning means you are reading too many tombstones per request.

If you have tombstones without doing DELETE, it is probably because you TTL'ed 
the data when inserting (By mistake? Or did you set default_time_to_live in 
your table?). You can use nodetool cfstats to see how many tombstones per read 
slice you have. This is probably also the cause of your missing data: the data 
was tombstoned, so it is not available.



Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
Mobile: +31 6 159 61 814 | Tel: +1 613 565 8696 x1649
www.pythian.com

On Mon, Jun 15, 2015 at 10:54 AM, Jean Tremblay 
<jean.tremb...@zen-innovations.com> wrote:
Hi,

I have reloaded the data in my cluster of 3 nodes RF: 2.
I have loaded about 2 billion rows in one table.
I use LeveledCompactionStrategy on my table.
I use version 2.1.6.
I use the default cassandra.yaml; only the IP address for seeds and the 
throughput have been changed.

I loaded my data with simple insert statements. This took a bit more than one 
day to load the data… and one more day to compact the data on all nodes.
For me this is quite acceptable since I should not be doing this again.
I have done this with previous versions like 2.1.3 and others and I basically 
had absolutely no problems.

Now when I read the log files on the client side, I see no warnings and no 
errors.
On the node side I see many WARNINGs, all related to tombstones, but there are 
no ERRORs.

My problem is that I see *many missing records* in the DB, and I have 
never observed this with previous versions.

1) Is this a known problem?
2) Do you have any idea how I could track down this problem?
3) What is the meaning of this WARNING (the only type of ERROR | WARN  I could 
find)?

WARN  [SharedPool-Worker-2] 2015-06-15 10:12:00,866 SliceQueryFilter.java:319 - 
Read 2990 live and 16016 tombstone cells in gttdata.alltrades_co_rep_pcode for 
key: D:07 (see tombstone_warn_threshold). 5000 columns were requested, 
slices=[388:201001-388:201412:!]


4) Is it possible to have tombstones when we make no DELETE statements?
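
For reading the WARN line quoted above, a small sketch of how the live and tombstone counts could be pulled out of such a message in plain Java, to see what fraction of the cells read were tombstones. The class `TombstoneWarning` and its method names are hypothetical, and the pattern only assumes the wording shown in that one log line.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for the "Read N live and M tombstone cells in ..." warning. */
final class TombstoneWarning {
    // \s+ between words so the pattern also matches when the line has been wrapped.
    private static final Pattern COUNTS = Pattern.compile(
            "Read\\s+(\\d+)\\s+live\\s+and\\s+(\\d+)\\s+tombstone\\s+cells\\s+in\\s+(\\S+)");

    final long live;
    final long tombstones;
    final String table;

    private TombstoneWarning(long live, long tombstones, String table) {
        this.live = live;
        this.tombstones = tombstones;
        this.table = table;
    }

    /** Returns the parsed counts, or empty if the text is not such a warning. */
    static Optional<TombstoneWarning> parse(String logText) {
        Matcher m = COUNTS.matcher(logText);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new TombstoneWarning(
                Long.parseLong(m.group(1)), Long.parseLong(m.group(2)), m.group(3)));
    }

    /** Fraction of the cells read that were tombstones (0.0 if nothing was read). */
    double tombstoneFraction() {
        long total = live + tombstones;
        return total == 0 ? 0.0 : (double) tombstones / total;
    }

    public static void main(String[] args) {
        String line = "WARN  [SharedPool-Worker-2] 2015-06-15 10:12:00,866 "
                + "SliceQueryFilter.java:319 - Read 2990 live and 16016 tombstone cells in "
                + "gttdata.alltrades_co_rep_pcode for key: D:07 (see tombstone_warn_threshold).";
        parse(line).ifPresent(w -> System.out.printf(
                "%s: %d live, %d tombstones (%.0f%% of cells read)%n",
                w.table, w.live, w.tombstones, 100 * w.tombstoneFraction()));
    }
}
```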

I’m lost…

Thanks for your help.


