[jira] [Commented] (CASSANDRA-1983) Make sstable filenames contain a UUID instead of increasing integer

2014-02-18 Thread Daniel Shelepov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903877#comment-13903877
 ] 

Daniel Shelepov commented on CASSANDRA-1983:


Notes so far:

- sstable filenames are controlled by the io/sstable/Descriptor class, which 
encapsulates a few parameters including generation -- the increasing integer 
in question.
- dropping generation in favor of a uuid seems questionable, given that 
generation is used by a wide variety of clients in the codebase.  So the most 
likely approach is uuid + generation side by side.
- using the host id as the uuid is easy conceptually, but will violate 
layering, because code in io will start to depend on db and/or service.  Plus 
there is a potential bootstrapping problem: system sstables need to be 
initialized early on during boot, and it's not clear whether the unique host id 
is available early enough to feed into system sstable descriptors.
- random uuids are also tricky, because sstable names will no longer be 
discoverable without directory lookups.  Some code (particularly in unit tests) 
leans on the ability to synthesize sstable names without touching the 
filesystem.  It's possible to persist these uuids in one of the system tables, 
but it will have to be a local table, and, regardless, changing system schema 
can make this a breaking change.
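
To make the "uuid + generation side by side" option concrete, here is a minimal sketch of a name parser that accepts both the existing form and a form carrying a UUID. It is not the real io/sstable/Descriptor class: the class name SstableNameSketch, its fields, and the assumed layout `<cf>-[<uuid>-]<generation>-<component>.db` (modelled on the CFName-1569-Index.db example from the issue description) are all hypothetical.

```java
import java.util.Optional;
import java.util.UUID;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; NOT Cassandra's actual Descriptor implementation.
final class SstableNameSketch {
    // Assumed layout: <cf>-[<uuid>-]<generation>-<component>.db
    private static final Pattern NAME = Pattern.compile(
        "(\\w+)-(?:([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})-)?(\\d+)-(\\w+)\\.db");

    final String cfName;
    final Optional<UUID> uuid;   // empty for legacy names without a UUID
    final int generation;        // the increasing integer is kept as-is
    final String component;

    private SstableNameSketch(String cfName, Optional<UUID> uuid, int generation, String component) {
        this.cfName = cfName;
        this.uuid = uuid;
        this.generation = generation;
        this.component = component;
    }

    // Parses either "CFName-1569-Index.db" or "CFName-<uuid>-1569-Index.db".
    static SstableNameSketch parse(String filename) {
        Matcher m = NAME.matcher(filename);
        if (!m.matches()) {
            throw new IllegalArgumentException("not an sstable name: " + filename);
        }
        Optional<UUID> uuid = Optional.ofNullable(m.group(2)).map(UUID::fromString);
        return new SstableNameSketch(m.group(1), uuid, Integer.parseInt(m.group(3)), m.group(4));
    }

    // Rebuilds the file name; the UUID segment is emitted only when present.
    String filename() {
        String id = uuid.map(u -> u + "-").orElse("");
        return cfName + "-" + id + generation + "-" + component + ".db";
    }
}
```

Because the UUID segment is optional, code that synthesizes names from a generation alone would keep working under this assumed layout, while names that carry a UUID still expose the same generation.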

I haven't yet found a cost-effective fix that would involve actually modifying 
the existing naming scheme.

The latest idea I have is to create a directory that will hold symlinks to real 
sstables (symlinks are available in Java 7).  Symlink names will contain the 
UUIDs.  The only extra piece of code would be creating and tearing down 
symlinks when real sstables are created and deleted.  End users could then 
access sstables through this symlink directory whenever doing related 
maintenance. The last piece would be making sure that appropriate clients, such 
as the compactor, can consume sstables with and without UUIDs.
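
As a rough illustration of the symlink idea, the sketch below creates and removes a UUID-named link with the Java 7 java.nio.file API (Files.createSymbolicLink, Files.deleteIfExists). The class SstableLinkSketch, its method names, and the `<uuid>-<real file name>` link naming are assumptions made for the example, not anything that exists in the codebase.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

// Hypothetical helper; names and link layout are assumptions for illustration.
final class SstableLinkSketch {
    // Creates <linkDir>/<uuid>-<real file name> as a symlink to the real sstable file.
    static Path link(Path linkDir, Path sstable, UUID id) {
        try {
            Files.createDirectories(linkDir);
            Path link = linkDir.resolve(id + "-" + sstable.getFileName());
            return Files.createSymbolicLink(link, sstable.toAbsolutePath());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Removes only the symlink; the sstable it points at is left untouched.
    static void unlink(Path link) {
        try {
            Files.deleteIfExists(link);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch link() would be called when a real sstable is created and unlink() when it is deleted. Creating symlinks can fail on platforms or filesystems that restrict them, which a real implementation would have to handle.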

I'll work on this some more tomorrow, but it'll probably spill over into next 
week (or later).

Comments welcome.

 Make sstable filenames contain a UUID instead of increasing integer
 ---

 Key: CASSANDRA-1983
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1983
 Project: Cassandra
  Issue Type: Improvement
Reporter: David King
Priority: Minor

 sstable filenames look like CFName-1569-Index.db, containing an integer for 
 uniqueness. This makes it possible (however unlikely) that the integer could 
 overflow, which could be a problem. It also makes it difficult to collapse 
 multiple nodes into a single one with rsync. I do this occasionally for 
 testing: I'll copy our 20 node cluster into only 3 nodes by copying all of 
 the data files and running cleanup; at present this requires a manual step of 
 uniqifying the overlapping sstable names. It would be handy if, instead of an 
 incrementing integer, these contained a UUID or somesuch that 
 guarantees uniqueness across the cluster.
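
The manual uniquifying step described above could look roughly like the following sketch, which shifts every generation from one source node by a fixed offset so that names copied from different nodes stop colliding. The class UniquifySketch, the shift method, and the per-node offset scheme are hypothetical; the name pattern is taken from the CFName-1569-Index.db example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a manual renaming step; not part of Cassandra.
final class UniquifySketch {
    // Assumed layout: <cf>-<generation>-<component>.db
    private static final Pattern NAME = Pattern.compile("(\\w+)-(\\d+)-(\\w+\\.db)");

    // Returns the names with every generation increased by `offset`.
    // All components of one sstable share a generation, so they stay grouped.
    static List<String> shift(List<String> names, int offset) {
        List<String> out = new ArrayList<>();
        for (String name : names) {
            Matcher m = NAME.matcher(name);
            if (!m.matches()) {
                throw new IllegalArgumentException("not an sstable name: " + name);
            }
            // addExact throws ArithmeticException instead of silently overflowing.
            int generation = Math.addExact(Integer.parseInt(m.group(2)), offset);
            out.add(m.group(1) + "-" + generation + "-" + m.group(3));
        }
        return out;
    }
}
```

Using a distinct offset per source node keeps the renamed sets disjoint only as long as each node's generations stay below the gap between offsets, which is part of why a UUID is the more robust option.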



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (CASSANDRA-1983) Make sstable filenames contain a UUID instead of increasing integer

2014-02-18 Thread Daniel Shelepov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904441#comment-13904441
 ] 

Daniel Shelepov commented on CASSANDRA-1983:


OK, but distinct symlinks are much easier to make unique, because they won't 
affect all that code that expects to find sstables under well-known names 
(regular names still being used in regular sstable storage).  The fact that 
they're symlinks allows decoupling the problem from internal naming 
requirements.



[jira] [Commented] (CASSANDRA-1983) Make sstable filenames contain a UUID instead of increasing integer

2014-02-13 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900892#comment-13900892
 ] 

Jonathan Ellis commented on CASSANDRA-1983:
---

Yes.



[jira] [Commented] (CASSANDRA-1983) Make sstable filenames contain a UUID instead of increasing integer

2014-02-12 Thread Daniel Shelepov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900054#comment-13900054
 ] 

Daniel Shelepov commented on CASSANDRA-1983:


Is this still needed?  Naming in 2.0+ is still incremental as far as I can 
tell.  

I'd like to work on this fix while I'm learning the codebase.



[jira] [Commented] (CASSANDRA-1983) Make sstable filenames contain a UUID instead of increasing integer

2012-04-26 Thread Robert Coli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262809#comment-13262809
 ] 

Robert Coli commented on CASSANDRA-1983:


+1 here. Global uniqueness for sstable names would make many copy-the-sstables 
style maintenance operations easier, as you wouldn't have to manually resolve 
the namespace conflict. Just now I saw someone in #cassandra who was setting up 
a cluster with a copy of data get confused by non-unique filenames being 
overwritten on his new cluster. The only downside seems to be longer sstable 
file names.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (CASSANDRA-1983) Make sstable filenames contain a UUID instead of increasing integer

2011-01-13 Thread Ryan King (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981507#action_12981507
 ] 

Ryan King commented on CASSANDRA-1983:
--

Alternatively, since we'll need a host-uuid mapping for counters we can put 
that uuid in the filename along with a serial integer (make it a long and we 
should be ok, right?)
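
A minimal sketch of that alternative, assuming a per-host UUID is already available: the name combines the host's UUID with a serial number held in a long. The class HostScopedNamer, its method, and the `<cf>-<host uuid>-<serial>-<component>.db` layout are hypothetical and only illustrate the shape of the proposal.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; the naming layout is an assumption, not Cassandra's format.
final class HostScopedNamer {
    private final UUID hostId;
    // A long serial makes overflow a non-issue in practice.
    private final AtomicLong serial = new AtomicLong();

    HostScopedNamer(UUID hostId) {
        this.hostId = hostId;
    }

    // Produces <cf>-<host uuid>-<serial>-<component>.db with a fresh serial each call.
    String next(String cfName, String component) {
        return cfName + "-" + hostId + "-" + serial.incrementAndGet() + "-" + component + ".db";
    }
}
```

Two hosts with different UUIDs can then hand out the same serial without producing the same file name, which is the property the rsync use case needs.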

 Make sstable filenames contain a UUID instead of increasing integer
 ---

 Key: CASSANDRA-1983
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1983
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: David King
Priority: Minor


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CASSANDRA-1983) Make sstable filenames contain a UUID instead of increasing integer

2011-01-13 Thread David King (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981509#action_12981509
 ] 

David King commented on CASSANDRA-1983:
---

As long as the host is still willing to read filenames without its own uuid, 
sure.
