[jira] [Comment Edited] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

2023-01-13 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17676583#comment-17676583
 ] 

Stefan Miklosovic edited comment on CASSANDRA-14013 at 1/13/23 11:10 AM:
-

Looks great! +1. Let's ship this!


was (Author: smiklosovic):
look great! +1. Lets ship this!

> Data loss in snapshots keyspace after service restart
> -
>
> Key: CASSANDRA-14013
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14013
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core, Local/Snapshots
>Reporter: Gregor Uhlenheuer
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 4.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> I am posting this bug in the hope of discovering the stupid mistake I am 
> making, because I can't imagine a reasonable explanation for the behavior I 
> see right now :-)
> In short, I observe data loss in a keyspace called *snapshots* after 
> restarting the Cassandra service. Say I have 1000 records in a table 
> called *snapshots.test_idx*; after a restart the table has fewer entries or 
> is even empty.
> My somewhat "mysterious" observation is that it happens only in a keyspace 
> called *snapshots*...
> h3. Steps to reproduce
> These steps to reproduce show the described behavior in "most" attempts (not 
> every single time though).
> {code}
> # create keyspace
> CREATE KEYSPACE snapshots WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> # create table
> CREATE TABLE snapshots.test_idx (key text, seqno bigint, primary key(key));
> # insert some test data
> INSERT INTO snapshots.test_idx (key,seqno) values ('key1', 1);
> ...
> INSERT INTO snapshots.test_idx (key,seqno) values ('key1000', 1000);
> # count entries
> SELECT count(*) FROM snapshots.test_idx;
> 1000
> # restart service
> kill <pid>
> cassandra -f
> # count entries
> SELECT count(*) FROM snapshots.test_idx;
> 0
> {code}
> I hope someone can point me to the obvious mistake I am doing :-)
> This happened to me using both Cassandra 3.9 and 3.11.0



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

2023-01-09 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17656028#comment-17656028
 ] 

Stefan Miklosovic edited comment on CASSANDRA-14013 at 1/9/23 10:47 AM:


The problem is that when we try to get a descriptor for a legacy sstable, the 
test finds the first file in the directory, and it might happen that it 
returns a ".txt" file. But Descriptor.LEGACY_SSTABLE_DIR_PATTERN only matches 
names ending in ".db". We should just do this: 
[https://github.com/pauloricardomg/cassandra/pull/2]

I am running the build for trunk with that PR included here 
[https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2166/]
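A toy illustration of the mismatch (Python for brevity; the actual pattern 
lives in Cassandra's Java Descriptor class, and this regex is an illustrative 
stand-in, not the real LEGACY_SSTABLE_DIR_PATTERN):

```python
import re

# Illustrative stand-in for a legacy-sstable pattern that anchors on ".db";
# the real Descriptor.LEGACY_SSTABLE_DIR_PATTERN is more elaborate.
legacy_db_only = re.compile(r".*/(?P<ks>\w+)/(?P<table>\w+)/[^/]+\.db$")

# A Data component matches the pattern...
assert legacy_db_only.match("/data/ks1/table1/ks1-table1-ka-1-Data.db")

# ...but grabbing an arbitrary "first file" from the directory can yield a
# non-.db component, which the pattern rejects, so descriptor parsing fails.
assert not legacy_db_only.match("/data/ks1/table1/ks1-table1-ka-1-TOC.txt")
```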


was (Author: smiklosovic):
The problem is that when we are trying to get a descriptor for a legacy 
sstable, the test is going to find a first file in the dir and it might happen 
that it will return ".txt" file. But Descriptor.LEGACY_SSTABLE_DIR_PATTERN is 
ending on ".db". We should just do this 
[https://github.com/pauloricardomg/cassandra/pull/2]

I am running the build for trunk with that PR included here 
[https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2166/]




[jira] [Comment Edited] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

2022-12-20 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649434#comment-17649434
 ] 

Paulo Motta edited comment on CASSANDRA-14013 at 12/20/22 7:16 PM:
---

{quote} In that case, could you add a test in SSTableLoaderTest as it was, that 
it is loading it just fine without uuid as well?
{quote}
Done 
[here|https://github.com/pauloricardomg/cassandra/commit/9cc0f63171c60e927af18eb3256eb63a29916a43].

During a [CI 
run|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2114/testReport/]
 of the trunk patch, I realized the original regex accepted only ".db" 
sstable files, so it failed to parse other extensions (such as .txt or 
.crc32) correctly. I updated the regex to accept any extension in [this 
commit|https://github.com/pauloricardomg/cassandra/commit/345222a3e2504a84ef91eb25e35ae23762c34178].
 We could make the regex more prescriptive, allowing only supported 
extensions, but I don't think that is needed for now.
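The before/after behavior can be sketched like this (Python; both regexes are 
illustrative simplifications, not the literal patterns from the commits):

```python
import re

# Before the fix: only ".db" file names are accepted (illustrative).
db_only = re.compile(r"[^/]+\.db$")

# After the fix: any extension is accepted (illustrative), so components
# such as TOC.txt or Digest.crc32 parse as well.
any_ext = re.compile(r"[^/]+\.[A-Za-z0-9]+$")

for name in ["nb-1-big-Data.db", "nb-1-big-TOC.txt", "nb-1-big-Digest.crc32"]:
    assert any_ext.match(name)

assert db_only.match("nb-1-big-Data.db")
assert not db_only.match("nb-1-big-TOC.txt")
```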

I prepared 4.0/4.1 patches with the less disruptive fix, and the trunk patch 
with the improved regex-based fix:
|branch||CI||
|[CASSANDRA-14013-4.0|https://github.com/pauloricardomg/cassandra/tree/CASSANDRA-14013-4.0]|[#2115|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2115/]
 (finished)|
|[CASSANDRA-14013-4.1|https://github.com/pauloricardomg/cassandra/tree/CASSANDRA-14013-4.1]|[#2121|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2121/]
 (finished)|
|[CASSANDRA-14013-trunk|https://github.com/pauloricardomg/cassandra/tree/CASSANDRA-14013-trunk]|[#2125|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2125/]
 (running)|

(will update state when CI is finished)


Are you OK with the improved regex fix on trunk, [~blerer], while keeping the 
simpler fix on 4.x to reduce risk on released versions?


was (Author: paulo):
{quote} In that case, could you add a test in SSTableLoaderTest as it was, that 
it is loading it just fine without uuid as well?
{quote}
done 
[here|https://github.com/pauloricardomg/cassandra/commit/9cc0f63171c60e927af18eb3256eb63a29916a43].

During a [CI 
run|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2114/testReport/]
 of the trunk patch, I realized the original regex was only accepting ".db" 
sstable files, so it was failing to correctly parse other extensions (such as 
.txt or .crc32). So I updated the regex to accept any extension on [this 
commit|https://github.com/pauloricardomg/cassandra/commit/345222a3e2504a84ef91eb25e35ae23762c34178].
 We could make the regex more prescriptive with only supported extensions, but 
I don't think this is needed for now.

I prepared 4.0/4.1 patches with the less disruptive fix, and the trunk patch 
with the improved regex-based fix:
|branch||CI||
|[CASSANDRA-14013-4.0|https://github.com/pauloricardomg/cassandra/tree/CASSANDRA-14013-4.0]|[#2115|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2115/]
 (finished)|
|[CASSANDRA-14013-4.1|https://github.com/pauloricardomg/cassandra/tree/CASSANDRA-14013-4.1]|[#2121|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2122/]
 (running)|
|[CASSANDRA-14013-trunk|https://github.com/pauloricardomg/cassandra/tree/CASSANDRA-14013-trunk]|[#2122|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/2122/]
 (running)|

(will update state when CI is finished)


Are you ok with the improved regex fix to trunk [~blerer], while having the 
simpler fix on 4.x to reduce risk on released versions?


[jira] [Comment Edited] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

2022-12-15 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17648276#comment-17648276
 ] 

Stefan Miklosovic edited comment on CASSANDRA-14013 at 12/15/22 10:38 PM:
--

_We did not have a test on DescriptorTest#testKeyspaceTableParsing to pick up 
the scenario of a legacy "backups" table directory and keyspace, when the table 
id suffix is missing so I added it here._

So we _do_ still support that, right? In that case, could you add a test in 
SSTableLoaderTest, as it was before, verifying that it loads just fine 
without the uuid as well? Just the same thing as before.

When it comes to branches, the more the better :D I have made my peace with 
having it in 4.0+; you get bonus points for anything older. However, losing 
data over this kind of thing is rather embarrassing in 2022 (almost 2023!)



was (Author: smiklosovic):
_On this commit, I added the ./* prefix to the regex which made it pick up the 
case of a "backups" table in the legacy directory format without the table 
uuid. I also updated SSTableLoaderTest to use the new table directory format._

Just to be sure, if you changed that test to support new table format, does 
that mean that a user in 4.2 / 5.0 will not be able to import sstables in table 
dir called "backups"? That is basically regression when it comes to 
CASSANDRA-16235.

But I think I am wrong because here in your you write:

_We did not have a test on DescriptorTest#testKeyspaceTableParsing to pick up 
the scenario of a legacy "backups" table directory and keyspace, when the table 
id suffix is missing so I added it here._

So, we _do_ support that, still, right? In that case, could you add a test in 
SSTableLoaderTest as it was, that it is loading it just fine without uuid as 
well? Just same thing as it was before.

When it comes to branches, more branches better it is :D I made the peace with 
having it in 4.0+, you will have bonus points for anything older. However, 
having data being lost on this kind of stuff is rather embarrassing in 2022 
(almost 2023!)





[jira] [Comment Edited] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

2020-11-06 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17227347#comment-17227347
 ] 

Benjamin Lerer edited comment on CASSANDRA-14013 at 11/6/20, 11:29 AM:
---

Trying to summarize the problem:
# SSTables used within the C* data directories should be within the data 
directories returned by {{DatabaseDescriptor.getAllDataFileLocations()}}, and 
the table directories should be in the form {{<table_name>-<table_id>}}. In 
this case the problem comes mainly from a keyspace being named {{backups}} or 
{{snapshots}}.
# Files coming from SSTableLoader should be outside of the data directories, 
and the table directory name should not include the TableID. In this case, 
keyspaces and tables named {{backups}} or {{snapshots}} will have issues.

To be honest, the documentation I found on the SSTableLoader is pretty 
confusing, and I imagine that some people might try to use it directly on the 
C* data directories, in which case the table directory will contain the 
TableID. That case is essentially the same as {{1.}} above.
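The two layouts can be sketched as follows (a Python sketch under assumed 
directory formats; the real parsing lives in Cassandra's Java Descriptor 
code, and the helper name here is hypothetical):

```python
import re

# Assumed layouts:
#   1. inside the data directories: <keyspace>/<table_name>-<table_id>/
#   2. SSTableLoader input:         <keyspace>/<table_name>/   (no table id)
TABLE_DIR_WITH_ID = re.compile(r"^(?P<table>\w+)-(?P<id>[0-9a-f]{32})$")

def split_table_dir(dirname):
    """Return (table_name, table_id), where table_id may be None."""
    m = TABLE_DIR_WITH_ID.match(dirname)
    if m:
        return m.group("table"), m.group("id")
    return dirname, None  # loader-style layout without an id suffix

# A data-directory style name parses into name + id...
assert split_table_dir("test_idx-5a1c1b9018dd11ee8ff1a7e662b78c29") == (
    "test_idx", "5a1c1b9018dd11ee8ff1a7e662b78c29")

# ...while a bare "snapshots" directory name carries no id and is
# indistinguishable from the reserved snapshots folder, which is where
# the ambiguity described above comes from.
assert split_table_dir("snapshots") == ("snapshots", None)
```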

[~stefan.miklosovic] As you pointed out, there are several scenarios that we 
never tested: {{nodetool snapshot}} with a {{snapshots}} or {{backups}} tag 
name, and SSTableLoader for a {{snapshots}} table (the {{backups}} name was 
tested by CASSANDRA-16235). The patch should add some tests for those 
scenarios. We should also probably test {{nodetool refresh}} with a 
{{snapshots}} or {{backups}} keyspace.

Pinging [~e.dimitrova] as she was involved in CASSANDRA-16235.


was (Author: blerer):
Trying to summarize the problem:
# SSTables used within the C* data directories should be within the data 
directories returned by {{DatabaseDescriptor.getAllDataFileLocations()}} and 
the table directories should be in the form {{-}}. In this 
case the problem come mainly from keyspace being named {{backups}} or 
{{snapshots}}.
# Files coming from SSTableLoader should be outside of the data directories and 
the table name should be without the TableID. In this case, keyspaces and 
tables with a 
{{backups}} or {{snapshots}} name will be having issues.

To be honest, the documentation I found on the SSTableloader is pretty 
confusing and I imagine that some people might try to use it directly on the C* 
data directories in which case the table directory will contains the TableID. 
This case is somehow the same than the {{1.}} above.

[~stefan.miklosovic] As you pointed out there are several scenario that we 
never tested {{nodetool snapshot}} with a {{snapshots}} or {{backups}} tag 
name. SSTableLoader for a {{snapshots}} table (the {{backups}} name was tested 
by CASSANDRA-16235. The patch should add some tests for those scenarios.
We should also probably test a {{nodetool refresh}} with a {{snapshots}} or 
{{backups}} keyspace.

Pinging [~e.dimitrova] as she was involved in CASSANDRA-16235.


[jira] [Comment Edited] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

2020-11-06 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17227347#comment-17227347
 ] 

Benjamin Lerer edited comment on CASSANDRA-14013 at 11/6/20, 11:30 AM:
---

Trying to summarize the problem:
# SSTables used within the C* data directories should be within the data 
directories returned by {{DatabaseDescriptor.getAllDataFileLocations()}}, and 
the table directories should be in the form {{<table_name>-<table_id>}}. In 
this case the problem comes mainly from a keyspace being named {{backups}} or 
{{snapshots}}.
# Files coming from SSTableLoader should be outside of the data directories, 
and the table directory name should not include the TableID. In this case, 
keyspaces and tables named {{backups}} or {{snapshots}} will have issues.

To be honest, the documentation I found on the SSTableLoader is pretty 
confusing, and I imagine that some people might try to use it directly on the 
C* data directories, in which case the table directory will contain the 
TableID. That case is essentially the same as {{1.}} above.

[~stefan.miklosovic] As you pointed out, there are several scenarios that we 
never tested: {{nodetool snapshot}} with a {{snapshots}} or {{backups}} tag 
name, and SSTableLoader for a {{snapshots}} table (the {{backups}} name was 
tested by CASSANDRA-16235). The patch should add some tests for those 
scenarios. We should also probably test {{nodetool refresh}} with a 
{{snapshots}} or {{backups}} keyspace.

Pinging [~e.dimitrova] as she was involved in CASSANDRA-16235.


was (Author: blerer):
Trying to summarize the problem:
# SSTables used within the C* data directories should be within the data 
directories returned by {{DatabaseDescriptor.getAllDataFileLocations()}} and 
the table directories should be in the form {{-}}. In this 
case the problem come mainly from keyspace being named {{backups}} or 
{{snapshots}}.
# Files coming from SSTableLoader should be outside of the data directories and 
the table name should be without the TableID. In this case, keyspaces and 
tables with a 
{{backups}} or {{snapshots}} name will be having issues.

To be honest, the documentation I found on the SSTableloader is pretty 
confusing and I imagine that some people might try to use it directly on the C* 
data directories in which case the table directory will contains the TableID. 
This case is somehow the same than the {{1.}} above.

[~stefan.miklosovic] As you pointed out there are several scenario that we 
never tested. {{nodetool snapshot}} with a {{snapshots}} or {{backups}} tag 
name. SSTableLoader for a {{snapshots}} table (the {{backups}} name was tested 
by CASSANDRA-16235. The patch should add some tests for those scenarios.
We should also probably test a {{nodetool refresh}} with a {{snapshots}} or 
{{backups}} keyspace.

Pinging [~e.dimitrova] as she was involved in CASSANDRA-16235.


[jira] [Comment Edited] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

2020-11-06 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17227347#comment-17227347
 ] 

Benjamin Lerer edited comment on CASSANDRA-14013 at 11/6/20, 11:29 AM:
---

Trying to summarize the problem:
# SSTables used within the C* data directories should be within the data 
directories returned by {{DatabaseDescriptor.getAllDataFileLocations()}}, and 
the table directories should be in the form {{<table_name>-<table_id>}}. In 
this case the problem comes mainly from a keyspace being named {{backups}} or 
{{snapshots}}.
# Files coming from SSTableLoader should be outside of the data directories, 
and the table directory name should not include the TableID. In this case, 
keyspaces and tables named {{backups}} or {{snapshots}} will have issues.

To be honest, the documentation I found on the SSTableLoader is pretty 
confusing, and I imagine that some people might try to use it directly on the 
C* data directories, in which case the table directory will contain the 
TableID. That case is essentially the same as {{1.}} above.

[~stefan.miklosovic] As you pointed out, there are several scenarios that we 
never tested: {{nodetool snapshot}} with a {{snapshots}} or {{backups}} tag 
name, and SSTableLoader for a {{snapshots}} table (the {{backups}} name was 
tested by CASSANDRA-16235). The patch should add some tests for those 
scenarios. We should also probably test {{nodetool refresh}} with a 
{{snapshots}} or {{backups}} keyspace.

Pinging [~e.dimitrova] as she was involved in CASSANDRA-16235.


was (Author: blerer):
Trying to summarize the problem:
# SSTables used within the C* data directories should be within the data 
directories returned by {{DatabaseDescriptor.getAllDataFileLocations()}} and 
the table directories should be in the form {{-}}. In this 
case the problem come mainly from keyspace being named {{backups}} or 
{{snapshots}}.
# Files coming from SSTableLoader should outside of the data directories and 
the table name should be without the TableID. In this case, keyspace and table 
with a 
{{backups}} or {{snapshots}} name will be having issues.

To be honest, the documentation I found on the SSTableloader is pretty 
confusing and I imagine that some people might try to use it directly on the C* 
data directories in which case the table directory will contains the TableID. 
This case is somehow the same than the {{1.}} above.

[~stefan.miklosovic] As you pointed out there are several scenario that we 
never tested {{nodetool snapshot}} with a {{snapshots}} or {{backups}} tag 
name. SSTableLoader for a {{snapshots}} table (the {{backups}} name was tested 
by CASSANDRA-16235. The patch should add some tests for those scenarios.
We should also probably test a {{nodetool refresh}} with a {{snapshots}} or 
{{backups}} keyspace.

Pinging [~e.dimitrova] as she was involved in CASSANDRA-16235.


[jira] [Comment Edited] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

2020-11-06 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17227268#comment-17227268
 ] 

Benjamin Lerer edited comment on CASSANDRA-14013 at 11/6/20, 9:23 AM:
--

{quote}That is not true{quote}

You are right, I should open my eyes properly ;-)

Then, unless I am mistaken (again ;-)), we cannot rely on 
{{DatabaseDescriptor.getAllDataFileLocations()}}, as those directories will 
not be the same as the one containing the input directory for the 
SSTableLoader.




was (Author: blerer):
{quote}That is not true{quote}

You are right, I should open my eyes properly ;-)

Then unless I am mistaken (again ;-)), you cannot rely on 
{{DatabaseDescriptor.getAllDataFileLocations()}} as those directories will not 
be the same as the one in which is stored the input directory for the 
SSTableLoader.






[jira] [Comment Edited] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

2020-11-05 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226641#comment-17226641
 ] 

Benjamin Lerer edited comment on CASSANDRA-14013 at 11/5/20, 10:56 AM:
---

[~stefan.miklosovic]
{quote}for example, for sstableloader, people might put sstables into 
/tmp/some/path/mykeyspace/mytable/(data files), and that "mytable" will not 
have any "id" on it ...{quote} 

{{SSTableLoader}} does not rely on the code of 
{{Descriptor::fromFilenameWithComponent}} for creating the {{Descriptor}} 
instances; it has its own mechanism, which assumes that there will be no 
{{TableID}}.

{quote}And you can also have a snapshot taken which is called "snapshots". That 
complicates things even further.{quote}

It does not. I did a quick proof of concept and used your PR test to validate 
it 
[here|https://github.com/apache/cassandra/compare/trunk...blerer:CASSANDRA-14013-review].
 Of course, the check for whether a directory is a table directory might be 
done better.

We know that, outside of this {{snapshots}} keyspace problem, this code works 
and has been battle-tested. Consequently, being pragmatic and pretty paranoid 
;-), it feels safer to me not to reimplement the whole thing, especially if we 
take into account that we have to backport the fix to 3.11 and 3.0 (not sure 
about 2.2).
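
As a rough illustration of the kind of check described here (a Python sketch, not the code from the linked proof-of-concept branch): a table directory can be recognized by its `<table name>-<32 hex digit table id>` naming pattern.

```python
import re

# A table directory is named <table_name>-<table_id>, where the table id is a
# UUID rendered as 32 hex characters. Simplified, illustrative check only; the
# real validation in the proof-of-concept branch may differ.
TABLE_DIR_PATTERN = re.compile(r"^(?P<name>.+)-(?P<id>[0-9a-f]{32})$")

def looks_like_table_dir(dir_name: str) -> bool:
    return TABLE_DIR_PATTERN.match(dir_name) is not None

print(looks_like_table_dir("test_idx-5a1c1b40414f11eb9d4e1b1f4e1f4e1f"))  # True
print(looks_like_table_dir("snapshots"))                                  # False
```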



was (Author: blerer):
{quote}for example, for sstableloader, people might put sstables into 
/tmp/some/path/mykeyspace/mytable/(data files), and that "mytable" will not 
have any "id" on it ...{quote} 

{{SSTableLoader}} does not rely on the code of 
{{Descriptor::fromFilenameWithComponent}} for creating the {{Descriptor}} 
instances; it has its own mechanism, which assumes that there will be no 
{{TableID}}.

{quote}And you can also have a snapshot taken which is called "snapshots". That 
complicates things even further.{quote}

It does not. I did a quick proof of concept and used your PR test to validate 
it 
[here|https://github.com/apache/cassandra/compare/trunk...blerer:CASSANDRA-14013-review].
 Of course, the check for whether a directory is a table directory might be 
done better.

We know that, outside of this {{snapshots}} keyspace problem, this code works 
and has been battle-tested. Consequently, being pragmatic and pretty paranoid 
;-), it feels safer to me not to reimplement the whole thing, especially if we 
take into account that we have to backport the fix to 3.11 and 3.0 (not sure 
about 2.2).





[jira] [Comment Edited] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

2020-11-04 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226335#comment-17226335
 ] 

Stefan Miklosovic edited comment on CASSANDRA-14013 at 11/4/20, 7:05 PM:
-

[~blerer] Further to that, I have just added that "tableId" into the test; that 
is just a minor detail, the implementation already copes with it.

The devil is in the details: for example, for sstableloader, people might put 
sstables into /tmp/some/path/mykeyspace/mytable/(data files), and that 
"mytable" will not have any "id" on it ... the solution works with both 
scenarios. Plus, this "path" might be arbitrary, different from the actual data 
locations specified in cassandra.yaml, etc. 

What I do not like in particular is that the whole solution feels rather 
"spaghetti-like" (I do not want to offend anybody here). I based my solution on 
regular expressions.

bq. It seems to me that when we hit a directory named snapshots, it can either 
be the snapshots directory or the keyspace directory.

And you can also have a snapshot taken which is called "snapshots" :) That 
complicates things even further. Now you have a "snapshots" dir where snapshots 
are stored, and in there you might have a "snapshots" dir which represents the 
snapshot taken, etc.
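
The worst case described above can be made concrete with a small, hypothetical sketch (Python; the table id below is invented for the example): even when the keyspace, the table, and a snapshot are all named "snapshots", only the table directory carries the 32-hex-character id suffix, which gives a fixed anchor in the path.

```python
import re

TABLE_DIR_PATTERN = re.compile(r"^.+-[0-9a-f]{32}$")

# Hypothetical layout: keyspace "snapshots", table "snapshots", and a snapshot
# that is itself named "snapshots" (table id is made up for the example):
parts = [
    "data",
    "snapshots",                                   # keyspace directory
    "snapshots-5a1c1b40414f11eb9d4e1b1f4e1f4e1f",  # table directory
    "snapshots",                                   # snapshots subdirectory
    "snapshots",                                   # a snapshot named "snapshots"
]

# Exactly one component matches the table-directory pattern; every other
# component can then be interpreted relative to that position.
table_positions = [i for i, p in enumerate(parts) if TABLE_DIR_PATTERN.match(p)]
print(table_positions)  # [2]
```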


was (Author: stefan.miklosovic):
[~blerer] Further to that, I have just added that "tableId" into the test; that 
is just a minor detail, the implementation already copes with it.

The devil is in the details: for example, for sstableloader, people might put 
sstables into /tmp/some/path/mykeyspace/mytable/(data files), and that 
"mytable" will not have any "id" on it ... the solution works with both 
scenarios. Plus, this "path" might be arbitrary, different from the actual data 
locations specified in cassandra.yaml, etc. 

What I do not like in particular is that the whole solution feels rather 
"spaghetti-like" (I do not want to offend anybody here). I based my solution on 
regular expressions.




[jira] [Comment Edited] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

2020-11-04 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226291#comment-17226291
 ] 

Benjamin Lerer edited comment on CASSANDRA-14013 at 11/4/20, 5:17 PM:
--

It seems to me that when we hit a directory named {{snapshots}}, it can either 
be the {{snapshots}} directory or the keyspace directory.
If the directory is the {{snapshots}} directory, then we know that its parent 
will be the table directory and will have a name with the pattern 
{{<table name>-<table id>}}. Consequently, determining whether the name is the 
{{snapshots}} directory or the keyspace directory should be relatively easy.
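
A minimal sketch of that disambiguation (Python, illustrative only; the table id is invented): when a path component is named "snapshots", look at its parent — if the parent matches the `<table name>-<table id>` pattern, the component is a table's snapshots subdirectory; otherwise it can be treated as the keyspace directory.

```python
import re

TABLE_DIR_PATTERN = re.compile(r"^.+-[0-9a-f]{32}$")

def is_snapshots_subdir(parts: list[str], i: int) -> bool:
    # parts[i] is assumed to be a component named "snapshots"; it is the
    # snapshots subdirectory of a table iff its parent is a table directory.
    return i > 0 and TABLE_DIR_PATTERN.match(parts[i - 1]) is not None

tbl = "test_idx-5a1c1b40414f11eb9d4e1b1f4e1f4e1f"
print(is_snapshots_subdir(["data", "ks1", tbl, "snapshots"], 3))  # True
print(is_snapshots_subdir(["data", "snapshots", tbl], 1))         # False
```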


was (Author: blerer):
It seems to me that when we hit a directory named {{snapshots}}, it can either 
be the {{snapshots}} directory or the keyspace directory.
If the directory is the {{snapshots}} directory, then we know that its parent 
will be the table directory and will have a name with the pattern 
{{<table name>-<table id>}}.  




[jira] [Comment Edited] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

2020-11-04 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226269#comment-17226269
 ] 

Benjamin Lerer edited comment on CASSANDRA-14013 at 11/4/20, 5:10 PM:
--

{quote}The current solution does not account for the fact that, for example, 
there might also be a snapshot / table called "snapshots", as well as 
"backups" - same for indexes, and indexes in backups and snapshots. There might 
also be a snapshot called "snapshot" for a keyspace which is called "snapshots" 
and a table which is called "snapshots" too, and so on ... {quote}

[~stefan.miklosovic] I do not believe that tables or indexes named 
{{snapshots}} or {{backups}} are truly a problem, because their corresponding 
directories will have different names.
The directory name for a table named {{snapshots}} is 
{{snapshots-<table id>}}, and the directory name for an index named 
{{snapshots}} is {{.snapshots}}.

The {{testKeyspaceTableParsing}} test is incorrect because it assumes that a 
table named {{snapshots}} will result in a directory called {{snapshots}}.  
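
The naming rules above can be sketched as a tiny classifier (Python, hypothetical helper; the table id is invented for the example):

```python
import re

TABLE_DIR_PATTERN = re.compile(r"^(.+)-([0-9a-f]{32})$")

def classify_dir(name: str) -> str:
    # Per the rules quoted above: secondary-index directories start with '.',
    # table directories end with the 32-hex-character table id.
    if name.startswith("."):
        return "index"
    if TABLE_DIR_PATTERN.match(name):
        return "table"
    return "other"  # e.g. a keyspace directory or a snapshots subdirectory

print(classify_dir("snapshots-5a1c1b40414f11eb9d4e1b1f4e1f4e1f"))  # table
print(classify_dir(".snapshots"))                                  # index
print(classify_dir("snapshots"))                                   # other
```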



was (Author: blerer):
{quote}The current solution does not account for the fact that, for example, 
there might also be a snapshot / table called "snapshot", as well as "backups" 
for example - same for indexes, and indexes in backups and snapshots. There 
might also be a snapshot called "snapshot" for a keyspace which is called 
"snapshot" and a table which is called "snapshot" too, and so on ... {quote}

[~stefan.miklosovic] I do not believe that tables or indexes named {{snapshot}} 
or {{backups}} are truly a problem, because their corresponding directories 
will have different names.
The directory name for a table named {{snapshot}} is {{snapshot-<table id>}}, 
and the directory name for an index named {{snapshot}} is {{.snapshot}}.

The {{testKeyspaceTableParsing}} test is incorrect because it assumes that a 
table named {{snapshot}} will result in a directory called {{snapshot}}.  





[jira] [Comment Edited] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

2020-11-04 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226269#comment-17226269
 ] 

Benjamin Lerer edited comment on CASSANDRA-14013 at 11/4/20, 4:52 PM:
--

{quote}The current solution does not account for the fact that, for example, 
there might also be a snapshot / table called "snapshot", as well as "backups" 
for example - same for indexes, and indexes in backups and snapshots. There 
might also be a snapshot called "snapshot" for a keyspace which is called 
"snapshot" and a table which is called "snapshot" too, and so on ... {quote}

[~stefan.miklosovic] I do not believe that tables or indexes named {{snapshot}} 
or {{backups}} are truly a problem, because their corresponding directories 
will have different names.
The directory name for a table named {{snapshot}} is {{snapshot-<table id>}}, 
and the directory name for an index named {{snapshot}} is {{.snapshot}}.

The {{testKeyspaceTableParsing}} test is incorrect because it assumes that a 
table named {{snapshot}} will result in a directory called {{snapshot}}.  



was (Author: blerer):
{quote}The current solution does not account for the fact that, for example, 
there might also be a snapshot / table called "snapshot", as well as "backups" 
for example - same for indexes, and indexes in backups and snapshots. There 
might also be a snapshot called "snapshot" for a keyspace which is called 
"snapshot" and a table which is called "snapshot" too, and so on ... {quote}

[~stefan.miklosovic] I do not believe that tables or indexes named 
{{snapshot}} or {{backups}} are truly a problem, because their corresponding 
directories will have different names.
The directory name for a table named {{snapshot}} is {{snapshot-<table id>}}, 
and the directory name for an index named {{snapshot}} is {{.snapshot}}.

The {{testKeyspaceTableParsing}} test is incorrect because it assumes that a 
table named {{snapshot}} will result in a directory called {{snapshot}}.  





[jira] [Comment Edited] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

2020-10-29 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223066#comment-17223066
 ] 

Stefan Miklosovic edited comment on CASSANDRA-14013 at 10/29/20, 5:27 PM:
--

[~mck]

I wanted to approach this more robustly and I think I am getting there, but for 
some strange reason a bunch of tests are failing. I am not sure what is wrong 
here, as they do not seem to fail locally.

The current solution does not account for the fact that, for example, there 
might also be a snapshot / table called "snapshot", as well as "backups" - 
same for indexes, and indexes in backups and snapshots. There might also be a 
snapshot called "snapshot" for a keyspace which is called "snapshot" and a 
table which is called "snapshot" too, and so on ...

That code solves all these issues.

The original tests were failing because they do not use data locations from 
the descriptor; the sstables are somewhere in /tmp/, which complicates things 
further.

PR: [https://github.com/apache/cassandra/pull/798] (there is a link to the 
build too).

 

EDIT: some of them do fail locally, I am on it.


was (Author: stefan.miklosovic):
[~mck]

I wanted to approach this more robustly and I think I am getting there but for 
some strange reason the bunch of tests are failing. I am not sure what is wrong 
here as they do not seem to fail locally.

The current solution does not count on the fact that for example there might be 
also a snapshot / table called "snapshot" as well as "backups" for example - 
same for indexes and indexes in backups and snapshots. There might be also a 
snapshot called "snapshot" for a keyspace which is called "snapshot" and table 
which is called "snapshot" too and so on ...

That code solves all these issues.

The fact that tests were failing was that tests do not use data locations from 
descriptor but sstables are somewhere in /tmp/ which furtherly complicates 
things.

PR: [https://github.com/apache/cassandra/pull/798] (there is link to build too).

 

EDIT: some of them do fail locally, I am on it.




[jira] [Comment Edited] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

2020-10-29 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223066#comment-17223066
 ] 

Stefan Miklosovic edited comment on CASSANDRA-14013 at 10/29/20, 5:18 PM:
--

[~mck]

I wanted to approach this more robustly and I think I am getting there, but for 
some strange reason a bunch of tests are failing. I am not sure what is wrong 
here, as they do not seem to fail locally.

The current solution does not account for the fact that, for example, there 
might also be a snapshot / table called "snapshot", as well as "backups" - 
same for indexes, and indexes in backups and snapshots. There might also be a 
snapshot called "snapshot" for a keyspace which is called "snapshot" and a 
table which is called "snapshot" too, and so on ...

That code solves all these issues.

The tests were failing because they do not use data locations from the 
descriptor; the sstables are somewhere in /tmp/, which complicates things 
further.

PR: [https://github.com/apache/cassandra/pull/798] (there is a link to the 
build too).

 

EDIT: some of them do fail locally, I am on it.


was (Author: stefan.miklosovic):
[~mck]

I wanted to approach this more robustly and I think I am getting there but for 
some strange reason the bunch of tests are failing. I am not sure what is wrong 
here as they do not seem to fail locally.

The current solution does not count on the fact that for example there might be 
also a snapshot / table called "snapshot" as well as "backups" for example - 
same for indexes and indexes in backups and snapshots. There might be also a 
snapshot called "snapshot" for a keyspace which is called "snapshot" and table 
which is called "snapshot" too and so on ...

That code solves all these issues.

The fact that tests were failing was that tests do not use data locations from 
descriptor but sstables are somewhere in /tmp/ which furtherly complicates 
things.

 

PR: [https://github.com/apache/cassandra/pull/798] (there is link to build too).




[jira] [Comment Edited] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

2019-10-20 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955186#comment-16955186
 ] 

Michael Semb Wever edited comment on CASSANDRA-14013 at 10/20/19 7:39 PM:
--

||branch||circleci||asf jenkins testall||asf jenkins dtests||
|[cassandra-3.0_14013|https://github.com/apache/cassandra/compare/trunk...vincewhite:14013-30]|[circleci|https://circleci.com/gh/vincewhite/workflows/cassandra/tree/14013-30]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/59//badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/59/]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/694//badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/694]|
|[cassandra-3.11_14013|https://github.com/apache/cassandra/compare/trunk...vincewhite:14013-test]|[circleci|https://circleci.com/gh/vincewhite/workflows/cassandra/tree/14013-test]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/60//badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/60/]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/695//badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/695]|
|[trunk_14013|https://github.com/apache/cassandra/compare/trunk...vincewhite:14013-trunk]|[circleci|https://circleci.com/gh/vincewhite/workflows/cassandra/tree/14013-trunk]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/61//badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/61/]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/699//badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/699]|



was (Author: michaelsembwever):

||branch||circleci||asf jenkins testall||asf jenkins dtests||
|[cassandra-3.0_14013|https://github.com/apache/cassandra/compare/trunk...vincewhite:14013-30]|[circleci|https://circleci.com/gh/vincewhite/workflows/cassandra/tree/14013-30]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/59//badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/59/]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/694//badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/694]|
|[cassandra-3.11_14013|https://github.com/apache/cassandra/compare/trunk...vincewhite:14013-test]|[circleci|https://circleci.com/gh/vincewhite/workflows/cassandra/tree/14013-test]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/60//badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/60/]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/695//badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/695]|
|[trunk_14013|https://github.com/apache/cassandra/compare/trunk...vincewhite:14013-trunk]|[circleci|https://circleci.com/gh/vincewhite/workflows/cassandra/tree/14013-trunk]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/61//badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-testall/61/]|[!https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/696//badge/icon!|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/696]|


> Data loss in snapshots keyspace after service restart
> -
>
> Key: CASSANDRA-14013
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14013
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Gregor Uhlenheuer
>Assignee: Vincent White
>Priority: Normal
>
> I am posting this bug in the hope of discovering the mistake I am making, 
> because I cannot imagine a reasonable explanation for the behavior I see 
> right now :-)
> In short, I observe data loss in a keyspace called *snapshots* after 
> restarting the Cassandra service. Say I have 1000 records in a table 
> called *snapshots.test_idx*; after a restart the table has fewer entries 
> or is even empty.
> My somewhat "mysterious" observation is that it happens only in a keyspace 
> called *snapshots*...
> h3. Steps to reproduce
> These steps reproduce the described behavior in most attempts (though not 
> every single time).
> {code}
> # create keyspace
> CREATE KEYSPACE snapshots WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 1};
> # create table
> CREATE TABLE snapshots.test_idx (key text, seqno bigint, primary key(key));
> # insert some test data
> INSERT INTO snapshots.test_idx (key,seqno) values ('key1', 1);
> ...
> INSERT INTO snapshots.test_idx (key,seqno) values ('key1000', 1000);
> # count entries
> SELECT count(*) FROM snapshots.test_idx;
> 1000
> # restart service
> kill 
> cassandra -f
> # count entries
> SELECT count(*) FROM snapshots.test_idx;
> 0
> {code}
> I hope someone can point me to the obvious mistake I am doing :-)
> This happened to me using both Cassandra 3.9 and 3.11.0

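The reported behavior is consistent with a path-filtering collision: startup SSTable discovery skips directories named "snapshots"/"backups" (which normally live inside a table directory), so a keyspace literally named "snapshots" can be caught by the same filter. Below is a minimal, hypothetical Python sketch of that collision; the function names, directory depths, and file names are illustrative assumptions, not the actual Cassandra source:

```python
# Hypothetical sketch of the suspected bug (NOT the real Cassandra code):
# a filter that skips any path component named "snapshots" also skips
# live data belonging to a keyspace named "snapshots".
from pathlib import PurePosixPath

SKIP_DIRS = {"snapshots", "backups"}

def naive_is_skipped(sstable_path: str) -> bool:
    # Buggy: treats ANY "snapshots" component as a snapshot directory,
    # including the keyspace directory itself.
    return any(part in SKIP_DIRS for part in PurePosixPath(sstable_path).parts)

def depth_aware_is_skipped(sstable_path: str) -> bool:
    # Safer: only skip when "snapshots"/"backups" sits inside a table
    # directory, i.e. data/<keyspace>/<table>/snapshots/...
    parts = PurePosixPath(sstable_path).parts
    return len(parts) > 3 and parts[3] in SKIP_DIRS

# Assumed layout: data/<keyspace>/<table-dir>/<sstable files>
live = "data/snapshots/test_idx-ab01/md-1-big-Data.db"
snap = "data/ks1/tbl-cd02/snapshots/s1/md-1-big-Data.db"

print(naive_is_skipped(live))        # True  -> live data wrongly ignored
print(depth_aware_is_skipped(live))  # False -> live data loaded
print(depth_aware_is_skipped(snap))  # True  -> real snapshot still skipped
```

Under these assumptions, the naive filter drops the keyspace's live SSTables on restart, which would produce exactly the "fewer or zero entries after restart" symptom described above.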
[jira] [Comment Edited] (CASSANDRA-14013) Data loss in snapshots keyspace after service restart

2017-11-28 Thread Jason Brown (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268963#comment-16268963 ]

Jason Brown edited comment on CASSANDRA-14013 at 11/28/17 4:46 PM:
---

[~VincentWhite] Thanks for looking at this. Did you have a chance to clean 
up/add a test? I can review.

UPDATE: I reread your comment and saw that you are looking for feedback; I 
will provide it later today.


was (Author: jasobrown):
[~VincentWhite] Thanks for looking at this. Did you have a chance to clean 
up/add a test? I can review.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org