[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-24 Thread Jonathan Ellis (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192294#comment-13192294 ]

Jonathan Ellis commented on CASSANDRA-1391:
---

Is this going to be done today or tomorrow for 1.1 freeze?

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Stu Hood
Assignee: Pavel Yaskevich
Priority: Critical
 Fix For: 1.1

 Attachments: 0001-CASSANDRA-1391-main.patch, 
 0002-CASSANDRA-1391-fixes.patch, 1391-rebased.txt, CASSANDRA-1391.patch


 CASSANDRA-1292 fixed multiple migrations started from the same node to 
 properly queue themselves, but it is still possible for migrations initiated 
 on different nodes to conflict and leave the cluster in a bad state. Since 
 the system_add/drop/rename methods are accessible directly from the client 
 API, they should be completely safe for concurrent use.
 It should be possible to allow for most types of concurrent migrations by 
 converting the UUID schema ID into a VersionVectorClock (as provided by 
 CASSANDRA-580).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-24 Thread Pavel Yaskevich (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192300#comment-13192300 ]

Pavel Yaskevich commented on CASSANDRA-1391:


Absolutely! I'm finishing up a few last things and will attach a patch in a few 
hours.





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-24 Thread Brandon Williams (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192487#comment-13192487 ]

Brandon Williams commented on CASSANDRA-1391:
-

No exceptions while testing v2.

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Stu Hood
Assignee: Pavel Yaskevich
Priority: Critical
 Fix For: 1.1

 Attachments: 0001-CASSANDRA-1391-main.patch, 
 0002-CASSANDRA-1391-fixes.patch, 1391-rebased.txt, CASSANDRA-1391-v2.patch, 
 CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-24 Thread Jonathan Ellis (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192729#comment-13192729 ]

Jonathan Ellis commented on CASSANDRA-1391:
---

LGTM, +1.





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-21 Thread Pavel Yaskevich (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190403#comment-13190403 ]

Pavel Yaskevich commented on CASSANDRA-1391:


Jonathan: Sure, I will do that, although would it be better to name 
ColumnFamilies using camel-case like SchemaKeyspaces, SchemaColumnFamilies, 
SchemaColumns instead?

Brandon: can you please describe the situation when that happened? Did you 
delete all of the columns in the update? It seems I just forgot to add an 
if (columnDefs == null) return empty map; case in the 
ColumnDefinition.toMap(List<ColumnDef>) method.
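
For illustration only, the guard described above might look roughly like the following. This is a sketch under assumptions: ColumnDefSketch and ColumnDefinitionSketch are hypothetical stand-ins, not the Thrift-generated ColumnDef or the real ColumnDefinition class, and the actual toMap signature and return type may differ.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a column definition; not Cassandra's actual class.
final class ColumnDefSketch {
    final String name;
    final String validationClass;

    ColumnDefSketch(String name, String validationClass) {
        this.name = name;
        this.validationClass = validationClass;
    }
}

final class ColumnDefinitionSketch {
    // Returns an empty map instead of throwing a NullPointerException
    // when no column definitions were supplied.
    static Map<String, ColumnDefSketch> toMap(List<ColumnDefSketch> columnDefs) {
        if (columnDefs == null) {
            return Collections.emptyMap();
        }
        Map<String, ColumnDefSketch> byName = new LinkedHashMap<>();
        for (ColumnDefSketch def : columnDefs) {
            byName.put(def.name, def);
        }
        return byName;
    }
}
```

With a guard like this, a diff between two column family definitions can treat "no columns" and "null columns" the same way.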





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-21 Thread Brandon Williams (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190423#comment-13190423 ]

Brandon Williams commented on CASSANDRA-1391:
-

bq. can you please describe the situation when that happened

It was pretty simple, I was just getting warmed up :)  I issued a creation of a 
CF on three machines at once; one got a schema disagreement and the other two 
received this exception.





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-21 Thread Pavel Yaskevich (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190427#comment-13190427 ]

Pavel Yaskevich commented on CASSANDRA-1391:


Ok, gotcha :) I will add the null check to ColumnDefinition that I mentioned 
previously, and re-test everything once I'm done with the changes requested 
by Jonathan.





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-21 Thread Sylvain Lebresne (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190476#comment-13190476 ]

Sylvain Lebresne commented on CASSANDRA-1391:
-

bq. although would it be better to name ColumnFamilies using camel-case

Making them fully lowercase internally would make them case-insensitive without 
any work using the current patch for CASSANDRA-3761. It's not really a big deal 
in any case because 1) the patch for CASSANDRA-3761 is not yet committed so it 
could change and 2) it won't be very hard to add some special casing for those 
if we wish. But if nobody has a preference, I would suggest calling them 
'keyspaces' and 'columnfamilies' directly.





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-21 Thread Pavel Yaskevich (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190501#comment-13190501 ]

Pavel Yaskevich commented on CASSANDRA-1391:


I'm fine dropping schema_ prefix and going with keyspaces, columnfamilies 
but how do we name columns cf then, something like columnfamily_columns?





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-21 Thread Jonathan Ellis (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190533#comment-13190533 ]

Jonathan Ellis commented on CASSANDRA-1391:
---

It's not a big deal, but IMO underscore + lowercase fits better with CQL 3.0 
making everything case-insensitive by default.





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-21 Thread Pavel Yaskevich (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190534#comment-13190534 ]

Pavel Yaskevich commented on CASSANDRA-1391:


Works for me, so be it schema_keyspaces, schema_columnfamilies and 
schema_columns.





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-20 Thread Jonathan Ellis (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13189916#comment-13189916 ]

Jonathan Ellis commented on CASSANDRA-1391:
---

Thanks, that helps a lot.

I think we can make life easier for clients dealing with CASSANDRA-2477 if we 
split out the columns into a separate CF, and adjust how we use composites for 
the columnfamilies cf:

{noformat}
schema_keyspaces
----------------
RowKey: ks
  => (column=durable_writes, value=true, timestamp=1327061028312185000)
  => (column=name, value=ks, timestamp=1327061028312185000)
  => (column=replication_factor, value=0, timestamp=1327061028312185000)
  => (column=strategy_class, value=org.apache.cassandra.locator.NetworkTopologyStrategy, timestamp=1327061028312185000)
  => (column=strategy_options, value={datacenter1:1}, timestamp=1327061028312185000)

schema_columnfamilies
---------------------
RowKey: ks
  => (column=cf:bloom_filter_fp_chance, value=0.0, timestamp=1327061105833119000)
  => (column=cf:caching, value=NONE, timestamp=1327061105833119000)
  => (column=cf:column_type, value=Standard, timestamp=1327061105833119000)
  => (column=cf:comment, value=ColumnFamily, timestamp=1327061105833119000)
  => (column=cf:default_validation_class, value=org.apache.cassandra.db.marshal.BytesType, timestamp=1327061105833119000)
  => (column=cf:gc_grace_seconds, value=864000, timestamp=1327061105833119000)
  => (column=cf:id, value=1000, timestamp=1327061105833119000)
  => (column=cf:key_alias, value=S0VZ, timestamp=1327061105833119000)

schema_columns
--------------
RowKey: ks
  => (column=cf:c:index_name, value=null, timestamp=1327061105833119000)
  => (column=cf:c:index_options, value=null, timestamp=1327061105833119000)
  => (column=cf:c:index_type, value=null, timestamp=1327061105833119000)
  => (column=cf:c:name, value=aGVsbG8=, timestamp=1327061105833119000)
  => (column=cf:c:validation_class, value=org.apache.cassandra.db.marshal.AsciiType, timestamp=1327061105833119000)
{noformat}

This will be more forwards-compatible with CASSANDRA-2474/CQL 3.0, since these 
correspond to tables having PRIMARY KEY (keyspace, columnfamily) and PRIMARY 
KEY (keyspace, columnfamily, column), respectively.

This also has the side benefit of grouping everything for a single keyspace 
under the same row key, which means it will be a single atomic RowMutation.

I think leaving strategy_options as json is fine.
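
To make the composite naming above concrete, here is a rough sketch of one schema row keyed by keyspace, with column names built from components joined by ':' as in the listing. It is only a model under assumptions: SchemaRowSketch, composite and slice are hypothetical names, plain strings stand in for real composite column names, and nothing here is Cassandra's actual storage code.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical model of a single schema row (row key = keyspace name).
final class SchemaRowSketch {
    // Column name -> value, kept sorted so columns sharing a prefix sit together.
    private final TreeMap<String, String> columns = new TreeMap<>();

    // Joins components with ':' as in the listing, e.g. "cf:c:validation_class".
    static String composite(String... components) {
        return String.join(":", components);
    }

    void put(String value, String... components) {
        columns.put(composite(components), value);
    }

    // Returns every column whose composite name starts with the given components.
    SortedMap<String, String> slice(String... prefixComponents) {
        String prefix = composite(prefixComponents) + ":";
        return columns.subMap(prefix, prefix + Character.MAX_VALUE);
    }
}
```

Because the row is sorted by composite name, everything for one column family (or one column of it) is a contiguous slice, which is roughly what PRIMARY KEY (keyspace, columnfamily, column) would give on the CQL side.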





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-20 Thread Brandon Williams (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190100#comment-13190100 ]

Brandon Williams commented on CASSANDRA-1391:
-

I get exceptions while inducing concurrent schema changes:

{noformat}
ERROR 20:58:10,904 Fatal exception in thread Thread[MigrationStage:1,5,main]
java.lang.NullPointerException
    at org.apache.cassandra.config.ColumnDefinition.toMap(ColumnDefinition.java:150)
    at org.apache.cassandra.config.CFMetaData.diff(CFMetaData.java:978)
    at org.apache.cassandra.db.migration.MigrationHelper.updateColumnFamily(MigrationHelper.java:285)
    at org.apache.cassandra.db.migration.MigrationHelper.updateColumnFamily(MigrationHelper.java:200)
    at org.apache.cassandra.db.DefsTable.mergeRemoteSchema(DefsTable.java:360)
    at org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:48)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
{noformat}





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-19 Thread Pavel Yaskevich (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13189511#comment-13189511 ]

Pavel Yaskevich commented on CASSANDRA-1391:


bq. validateSchemaAgreement is unnecessary now right?

I think it's still a good idea to validate if all nodes have the same schema.

bq. the old Migration infrastructure feels unnecessarily heavyweight now. Can 
we move the validation into the CassandraServer methods, and then just invoke a 
MigrationHelper method from a runnable there?

I tried to optimize it as much as possible, because I still think there is a 
reason to keep it: it encapsulates all of the announce, apply and validation 
logic pretty well. I tried moving the validation and related code into 
CassandraServer, but the result was hard to read and heavyweight.

bq. should we snapshot the old avro schema before nuking it?

MigrationHelper.dropColumnFamily, which I call to remove the Migrations and 
Schema CFs, takes a snapshot of the data.

bq. SystemTable.dropOldSchemaTables is a no-op. I think we can take this out 
entirely since loadSchema/fromAvro takes care of it?

Ugh, I forgot to remove it from the final version of the patch, sorry...

bq. Can you add a comment describing the layout of the new schema CFs to 
defstable or systemtable?

Sure, I will do that in SystemTable.

bq. I'd prefer to leave the low level slicing / deserialize in SystemTable 
class instead of scattered between Schema and DefsTable

Sure, I will move the serialize and serialized methods from Schema to 
SystemTable; DefsTable.readSchemaRow and getSchema will also go there.

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Stu Hood
Assignee: Pavel Yaskevich
Priority: Critical
 Fix For: 1.1

 Attachments: 1391-rebased.txt, CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-06 Thread Sylvain Lebresne (Commented) (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181243#comment-13181243 ]

Sylvain Lebresne commented on CASSANDRA-1391:
-

Besides, I really think that 'get rid of thrift (or avro) internally' is a 
long-run win that would justify doing it with apply/diff even if there were no 
other argument. But, like Jonathan, I see no reason not to use apply/diff if we 
can, instead of rewriting equivalent methods.

Moreover, it seems to me that the apply/diff approach would mean that a schema 
change would basically be 'send the new schema as a batch mutation to all 
nodes', and 'do node1 and node2 agree on the schema' is just 'read the schema 
row of node1 and node2, diff the result and send a batch mutation with whatever 
each node needs'. There is no need for 'schema versions' or anything; column 
timestamps just deal with that problem. You only ever keep one row for the 
schema and that's it. So it seems to me that it basically makes concurrency a 
non-issue (because we already handle concurrency internally through the use of 
column timestamps), while I don't see how that concurrency handling can come 
for free if you use some thrift serialization.

As Jonathan said, there may be a fundamental problem with the apply/diff 
approach, but I don't see any right away (and truth is I'm much more confident 
in C*'s existing data and concurrency model to handle conflicts during schema 
updates than in the current (or any) ad hoc migration thingy).
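
As a rough illustration of why column timestamps could settle concurrent schema changes, here is a sketch of merging two copies of a schema row where the newer timestamp wins per column. Everything in it is an assumption for the sake of the example: TimestampedValue and SchemaMergeSketch are hypothetical names, and on equal timestamps this sketch simply keeps the local value, which is a simplification rather than Cassandra's actual reconciliation rule.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical column value carrying its write timestamp.
final class TimestampedValue {
    final String value;
    final long timestamp;

    TimestampedValue(String value, long timestamp) {
        this.value = value;
        this.timestamp = timestamp;
    }
}

final class SchemaMergeSketch {
    // Merges a remote copy of the schema row into the local one.
    // For each column, the value with the higher timestamp wins;
    // on a tie the local value is kept (a simplification).
    static Map<String, TimestampedValue> merge(Map<String, TimestampedValue> local,
                                               Map<String, TimestampedValue> remote) {
        Map<String, TimestampedValue> merged = new HashMap<>(local);
        for (Map.Entry<String, TimestampedValue> entry : remote.entrySet()) {
            TimestampedValue current = merged.get(entry.getKey());
            if (current == null || entry.getValue().timestamp > current.timestamp) {
                merged.put(entry.getKey(), entry.getValue());
            }
        }
        return merged;
    }
}
```

When timestamps are distinct, merging node1 into node2 and node2 into node1 gives the same column values, so the two nodes converge without any separate schema version bookkeeping.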

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.1

 Attachments: 
 0001-new-migration-schema-and-avro-methods-cleanup.patch, 
 0002-avro-removal.patch, 
 0003-oldVersion-removed-new-migration-distribution-schema.patch, 
 CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-06 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181316#comment-13181316
 ] 

Pavel Yaskevich commented on CASSANDRA-1391:


Let me get this clear - migrations use apply/diff internally for their actions 
upon KEYSPACES_CF. The 0003 patch introduces a content-based schema version, 
which is calculated from KEYSPACES_CF.

KEYSPACES_CF layout

{noformat}
name: { // key
  'keyspace': str,
  'comparator': str,
  ...
  'columns': { // composite!
    column name: {
      'validation_class': str,
      'index_type': str,
      'index_name': str,
      'index_options': { }
    }
  }
}
{noformat}
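
A content-based version computed from a layout like the one above might be sketched as follows. This is only an illustration in Python: the `schema_version` and `_canonical` helpers, the MD5 digest, and the dict representation are assumptions made for the sketch, not what the 0003 patch actually does.

```python
import hashlib
import uuid


def _canonical(value):
    """Sort nested dicts so that equal schemas always serialize identically."""
    if isinstance(value, dict):
        return tuple((key, _canonical(value[key])) for key in sorted(value))
    return value


def schema_version(keyspaces):
    """Derive a version id from schema content alone (hypothetical helper).

    `keyspaces` maps a row key to a dict of attributes, mirroring the layout.
    """
    digest = hashlib.md5()
    for name in sorted(keyspaces):
        row = repr((name, _canonical(keyspaces[name])))
        digest.update(row.encode("utf-8"))
    # An MD5 digest is 16 bytes, which is exactly the size of a UUID.
    return uuid.UUID(bytes=digest.digest())
```

Two nodes holding the same schema content would then report the same version regardless of the order in which they learned about it, which is the property a content-based version is meant to provide.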

Schema distribution is switched to be pull-oriented: node A (let's call it the 
coordinator) applies a migration locally and gossips its new (content-based) 
version to the ring. Node B checks whether its current version differs from the 
new version of node A and, if so, makes a migration request to the coordinator 
by sending a MIGRATION_REQUEST message with the list of its local migrations 
attached. On receiving that message, the coordinator diffs B's migrations 
against its own and replies to B with the missing ones. The last thing for B to 
do is deserialize the received migrations and apply them one by one. On 
startup, a node uses the onAlive gossip handler to check versions on other 
nodes and request missing migrations if needed. 

This feels better than sending the whole KEYSPACES_CF on each schema change 
and letting the receiver decide what actions to take.
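
The pull-oriented exchange described above can be sketched as a toy model. Everything here is hypothetical: the `Node` class, its method names, and the use of the migration list itself as a stand-in for the content-based version are assumptions for illustration, not the patch's real classes or messages.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.migrations = []  # ids of migrations applied locally, in order

    def version(self):
        # Stand-in for the content-based version: depends only on what
        # has been applied so far.
        return tuple(self.migrations)

    def apply_local(self, migration):
        self.migrations.append(migration)

    def handle_migration_request(self, remote_migrations):
        """Coordinator side: reply with the migrations the requester lacks."""
        known = set(remote_migrations)
        return [m for m in self.migrations if m not in known]

    def on_gossip(self, coordinator):
        """Requester side: called when gossip reports the coordinator's version."""
        if coordinator.version() == self.version():
            return []  # already in sync, nothing to pull
        missing = coordinator.handle_migration_request(self.migrations)
        for migration in missing:
            self.apply_local(migration)
        return missing
```

In this model a node that is already in sync sends no request at all, and a node that is behind receives only the migrations it lacks.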





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-06 Thread Sylvain Lebresne (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181379#comment-13181379
 ] 

Sylvain Lebresne commented on CASSANDRA-1391:
-

bq. Let me get this clear - migrations use apply/diff internally for their 
actions upon KEYSPACES_CF

But then why do the patches still use Thrift internally? (I have only quickly 
eyeballed the patches, but they do seem to use Thrift, which seems confirmed by 
Jonathan's comments.)

bq. sending MIGRATION_REQUEST message with list of its local migrations attached

Does that mean we still keep the list of all migrations (diffs) that have ever 
been applied? If so, I would be in favor of getting rid of it, as it seems to 
me we can do without it (a node could use the diff between its schema and 
another node's schema and base whatever actions have to be done (directory 
creation, etc.) on that, but we wouldn't keep the diff afterwards).

bq. Current schema distribution is switched to be pull oriented

This is probably not a huge deal but it means that schema changes will be a tad 
slower, based on gossip reactivity.
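
The suggestion above, deriving the required actions from a diff between two schemas rather than from a stored migration history, might look roughly like this. The `diff_schemas` function, the action names, and the nested-dict schema shape are all assumptions made for this sketch.

```python
def diff_schemas(local, remote):
    """Return the actions needed to turn `local` into `remote`.

    Both arguments map keyspace name -> {column family name -> attributes}.
    Creating a keyspace is assumed to imply creating its column families.
    """
    actions = []
    for ks in sorted(remote.keys() - local.keys()):
        actions.append(("create_keyspace", ks))
    for ks in sorted(local.keys() - remote.keys()):
        actions.append(("drop_keyspace", ks))
    for ks in sorted(local.keys() & remote.keys()):
        local_cfs, remote_cfs = local[ks], remote[ks]
        for cf in sorted(remote_cfs.keys() - local_cfs.keys()):
            actions.append(("create_cf", ks, cf))
        for cf in sorted(local_cfs.keys() - remote_cfs.keys()):
            actions.append(("drop_cf", ks, cf))
        for cf in sorted(local_cfs.keys() & remote_cfs.keys()):
            if local_cfs[cf] != remote_cfs[cf]:
                actions.append(("update_cf", ks, cf))
    return actions
```

The diff is computed on demand and then discarded, so nothing resembling a migration log has to be persisted.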





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-06 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181392#comment-13181392
 ] 

Pavel Yaskevich commented on CASSANDRA-1391:


bq. But then why do the patches still use Thrift internally? (I have only 
quickly eyeballed the patches, but they do seem to use Thrift, which seems 
confirmed by Jonathan's comments.)

Thrift is only used to keep the current/new state of the ks/cf inside the 
migration, to be used for the diff upon apply. It doesn't really matter how we 
keep that: Thrift, JSON, even comma-separated plain text. I was guided by the 
fact that we already use Thrift internally and it makes it easy to get 
attributes of the object when needed.

bq. Does that mean we still keep the list of all migrations (diffs) that have 
ever been applied? If so, I would be in favor of getting rid of it, as it seems 
to me we can do without it (a node could use the diff between its schema and 
another node's schema and base whatever actions have to be done (directory 
creation, etc.) on that, but we wouldn't keep the diff afterwards).

I still don't get why it's such a big deal that we need to re-implement the 
migration logic that way; what does it really give us compared to migrations? 
Migrations already give us a straightforward way to tell a node what to do 
without involving any schema transfers or decision making.





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-06 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181393#comment-13181393
 ] 

Jonathan Ellis commented on CASSANDRA-1391:
---

I think pull is the right fit for a content-based schema version:

bq. Currently, schema changes are push-only (see: SS.pushMigrations). So 
without changing that, yes, both nodes will send schema to each other (which 
will be a no-op on the newer one). That's not a blocker for me. I'd be fine 
switching to a pull model in either this or a followup ticket, which would let 
the newer side skip its pull if it recognizes the remote version as one it used 
to have (which would reliably indicate it's older even if timestamp ties are 
involved).
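
The idea in the quoted comment, letting the newer side skip its pull when it recognizes the remote version as one it used to have, can be sketched as below. The `SchemaState` class and its in-memory set of earlier versions are hypothetical bookkeeping invented for this illustration.

```python
class SchemaState:
    def __init__(self, version):
        self.current = version
        self.previous = set()  # versions this node has held in the past

    def advance(self, new_version):
        """Record a local schema change."""
        if new_version != self.current:
            self.previous.add(self.current)
            self.current = new_version

    def should_pull(self, remote_version):
        if remote_version == self.current:
            return False  # already in sync
        if remote_version in self.previous:
            return False  # remote holds a version we already moved past
        return True  # unknown version: the remote may know something we don't
```

Because the check relies only on version identity, it does not depend on comparing timestamps, which matches the remark that it stays reliable even when timestamp ties are involved.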





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-06 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181398#comment-13181398
 ] 

Jonathan Ellis commented on CASSANDRA-1391:
---

bq. what does it really give us compared to migrations? 

It should be equivalent to one big migration when we use apply/diff, which 
is a lot simpler than recording each change since the beginning of time and 
replaying them, and a lot more convenient when adding new nodes, since they can 
do it all at once.





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-06 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181606#comment-13181606
 ] 

Pavel Yaskevich commented on CASSANDRA-1391:


It doesn't feel that simple to me to do it in one big run, where you would need 
to go through the whole schema (possibly a few times), compared to applying a 
single well-defined change. We are not expecting migrations to be very 
frequent, e.g. because special conditions have to be ensured for the ks/cf 
getting updated, so even with 1000 migrations (I don't know if that's even 
realistic) it's not a problem for a new node to get them in one request and 
apply them sequentially.  





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-06 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13181646#comment-13181646
 ] 

Jonathan Ellis commented on CASSANDRA-1391:
---

But if the migration is modeled as just a RowMutation whose CF is in the same 
format we use to store the schema internally, there is no extra code involved 
to do both.  You can have a small migration RowMutation for when the user adds 
a column, and just send the entire schema at once for new nodes joining.
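
A toy model of this point: if a migration has the same shape as the stored schema, one merge routine can serve both the small-change case and the whole-schema case. The `apply_mutation` function and the dict-of-columns representation are assumptions for illustration, and breaking timestamp ties by comparing values is a choice made in this sketch so that the merge is order-independent.

```python
def apply_mutation(schema, mutation):
    """Merge `mutation` into `schema` in place, column by column.

    Both map row key -> {column name -> (value, timestamp)}. The column with
    the higher timestamp wins; on a tie the greater value wins (sketch-only
    rule), so applying the same mutations in any order gives the same result.
    """
    for row, columns in mutation.items():
        target = schema.setdefault(row, {})
        for name, (value, timestamp) in columns.items():
            if name in target:
                old_value, old_timestamp = target[name]
                if (timestamp, value) <= (old_timestamp, old_value):
                    continue  # existing column is at least as new
            target[name] = (value, timestamp)
    return schema
```

A single added column and a full schema dump for a joining node are then both just arguments to `apply_mutation`, and re-applying either one is a no-op.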





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-05 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180527#comment-13180527
 ] 

Jonathan Ellis commented on CASSANDRA-1391:
---

While getting rid of Avro is great, replacing serialize-to-Avro with 
serialize-to-Thrift isn't as much of an improvement as I was hoping for.  I 
thought we were on board with modeling the schema natively in system 
columnfamilies as sketched in 
https://issues.apache.org/jira/browse/CASSANDRA-1391?focusedCommentId=13149875&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13149875.
That would allow the apply/diff design I keep talking about instead of 
having to do that manually in UCF, allow dropping the last_migration_key 
indirection, pave the way for CASSANDRA-2477, and probably allow more 
simplifications.

Is there a reason that doesn't work?





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-05 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180559#comment-13180559
 ] 

Pavel Yaskevich commented on CASSANDRA-1391:


I still don't see how/why apply/diff would be better than migrations (which are 
localized actions on the schema)...

bq. allow dropping the last_migration_key indirection

We can drop it even with my current patches, but what does it give us? Instead 
of converting deserialized Thrift objects to KSMetaData, we would need to 
initialize KSMetaData/CFMetaData and populate those with parameters from the 
db. Is it really better?

bq. pave the way for CASSANDRA-2477

With the current patch, users will be able to do SELECT * FROM system.keyspaces 
and other queries, but only after CASSANDRA-2474 is done, because 
`system.keyspaces` uses composite columns.





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-05 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180604#comment-13180604
 ] 

Jonathan Ellis commented on CASSANDRA-1391:
---

bq. Instead of converting deserialized thrift objects to KSMetaData we would 
need to initialize KSMetaData/CFMetaData and populate those with parameters 
from db, is it really better?

At the least, it lets you re-use apply/diff instead of rewriting that, and 
makes CASSANDRA-2477 easy.

bq. With the current patch, users will be able to do SELECT * FROM 
system.keyspaces and other queries, but only after CASSANDRA-2474 is done, 
because `system.keyspaces` uses composite columns

Does it?  It sure looks like it uses serialized Thrift objects to me.  (Looking 
at DefsTable.loadFromStorage.)





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-05 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180655#comment-13180655
 ] 

Pavel Yaskevich commented on CASSANDRA-1391:


There is a SCHEMA_CF (where we store the serialized schema state after each of 
the migrations) and a KEYSPACES_CF, which is involved in the ks/cf attribute 
diff process. You can create a keyspace, open the CLI and run `use system; list 
'keyspaces';` to see that I'm not lying about that :)





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-05 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180703#comment-13180703
 ] 

Jonathan Ellis commented on CASSANDRA-1391:
---

From experience, that sounds like a great way to get bugs that update one but 
not the other.  We really need a single source of truth.





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2012-01-05 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180707#comment-13180707
 ] 

Pavel Yaskevich commented on CASSANDRA-1391:


I can remove SCHEMA_CF, rename KEYSPACES_CF to SCHEMA_CF, and use it to build 
the schema from the db on startup easily.





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-12-19 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172502#comment-13172502
 ] 

Jonathan Ellis commented on CASSANDRA-1391:
---

I'd rather do them together in this case, it's pretty hard to work in trunk w/o 
schema announce working.

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.1

 Attachments: 
 0001-new-migration-schema-and-avro-methods-cleanup.patch, 
 0002-avro-removal.patch, 0003-oldVersion-removed-nit-fixed.patch, 
 CASSANDRA-1391.patch


 CASSANDRA-1292 fixed multiple migrations started from the same node to 
 properly queue themselves, but it is still possible for migrations initiated 
 on different nodes to conflict and leave the cluster in a bad state. Since 
 the system_add/drop/rename methods are accessible directly from the client 
 API, they should be completely safe for concurrent use.
 It should be possible to allow for most types of concurrent migrations by 
 converting the UUID schema ID into a VersionVectorClock (as provided by 
 CASSANDRA-580).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-12-18 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171866#comment-13171866
 ] 

Pavel Yaskevich commented on CASSANDRA-1391:


Sounds reasonable, and this would also change how we announce the new schema, 
so if it's possible, could we do it in a separate ticket (or subticket)? 
This one is getting really big and I'd like to settle on the local migration 
handling code before we start with schema propagation changes...





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-12-17 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13171781#comment-13171781
 ] 

Jonathan Ellis commented on CASSANDRA-1391:
---

Feels more like a simplification than a complication to me.  Syncing schema 
becomes a matter of always sending exactly one message instead of potentially 
hundreds/thousands.  And we get full concurrency support instead of mostly-full 
support (the equal-timestamp weakness I mentioned).  Seems worth it to me.

Currently, schema changes are push-only (see: SS.pushMigrations).  So without 
changing that, yes, both nodes will send schema to each other (which will be a 
no-op on the newer one).  That's not a blocker for me.  I'd be fine switching 
to a pull model in either this or a followup ticket, which would let the newer 
side skip its pull if it recognizes the remote version as one it used to have 
(which would reliably indicate it's older even if timestamp ties are 
involved).
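
A rough sketch of the pull-side check described above, assuming each node 
remembers the schema versions it has previously held.  SchemaPullDecision, 
recordVersion and shouldPull are hypothetical names used only for 
illustration; they are not existing Cassandra APIs.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: tracks schema versions this node has held and decides
// whether pulling a peer's schema could teach it anything new.
class SchemaPullDecision {
    private final Set<String> previousVersions = new HashSet<>();
    private String currentVersion;

    SchemaPullDecision(String initialVersion) {
        this.currentVersion = initialVersion;
    }

    // Called after a local schema change: remember the version we are leaving.
    void recordVersion(String newVersion) {
        if (!newVersion.equals(currentVersion)) {
            previousVersions.add(currentVersion);
            currentVersion = newVersion;
        }
    }

    // Skip the pull when the peer has our version, or one we already moved past.
    boolean shouldPull(String remoteVersion) {
        return !remoteVersion.equals(currentVersion)
            && !previousVersions.contains(remoteVersion);
    }

    public static void main(String[] args) {
        SchemaPullDecision node = new SchemaPullDecision("v1");
        node.recordVersion("v2");
        System.out.println(node.shouldPull("v1")); // peer holds a version we used to have
        System.out.println(node.shouldPull("v3")); // peer holds a version we have never seen
    }
}
```

Because the check only asks "have I ever held this version?", it does not 
need timestamps at all, which is why it would stay reliable when timestamp 
ties are involved.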

Traffic from schema changes will be negligible for any known workload.  We can 
optimize for that if/when it becomes a problem (my prediction: never).

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.1

 Attachments: 
 0001-new-migration-schema-and-avro-methods-cleanup.patch, 
 0002-avro-removal.patch, 0003-oldVersion-removed-nit-fixed.patch, 
 CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-12-16 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13170908#comment-13170908
 ] 

Pavel Yaskevich commented on CASSANDRA-1391:


Isn't that an over-complication? Starting from step 2 in your previous comment, 
a node would always need to diff all of the CF objects and also determine 
whether any keyspaces were added or deleted, whereas migrations give us that 
for free because we always know exactly what a migration modifies. Also, when 
a node starts with such a content-based version and its version does not match 
the others, does it really know what to do: send its own schema or request 
one? I also think that once the ring reaches some frequency of schema changes, 
this would create noticeable traffic and nodes won't be able to keep up with 
migrating anymore... 

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.1

 Attachments: 
 0001-new-migration-schema-and-avro-methods-cleanup.patch, 
 0002-avro-removal.patch, 0003-oldVersion-removed-nit-fixed.patch, 
 CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-12-15 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13170322#comment-13170322
 ] 

Jonathan Ellis commented on CASSANDRA-1391:
---

That doesn't work, though: what if we have two updates at the same timestamp?  
I think it really does need to be content-based.

Also, I still think using Table.apply and CF.diff is the right way to do 
this, instead of effectively duplicating that code as a special case.  Are 
there any downsides to that approach I'm missing?

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.1

 Attachments: 
 0001-new-migration-schema-and-avro-methods-cleanup.patch, 
 0002-avro-removal.patch, 0003-oldVersion-removed-nit-fixed.patch, 
 CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-12-15 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13170336#comment-13170336
 ] 

Pavel Yaskevich commented on CASSANDRA-1391:


We could compare UUIDs instead in the isMergingMigration method.

How does a node determine whether it is ahead of or behind the ring with 
content-based versioning? Even if it is able to determine its state, how do 
you find out which migrations the node needs to send/receive to get the ring 
in sync?

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.1

 Attachments: 
 0001-new-migration-schema-and-avro-methods-cleanup.patch, 
 0002-avro-removal.patch, 0003-oldVersion-removed-nit-fixed.patch, 
 CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-12-15 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13170791#comment-13170791
 ] 

Jonathan Ellis commented on CASSANDRA-1391:
---

You don't need to -- just send it the current schema, and apply/diff will take 
care of any redundancies.  (This means we don't need to worry about schema 
propagation taking a long time during bootstrap or rebuild of a new node, 
either, as in CASSANDRA-3629 or CASSANDRA-2056.)

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.1

 Attachments: 
 0001-new-migration-schema-and-avro-methods-cleanup.patch, 
 0002-avro-removal.patch, 0003-oldVersion-removed-nit-fixed.patch, 
 CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-12-13 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13168969#comment-13168969
 ] 

Jonathan Ellis commented on CASSANDRA-1391:
---

Thanks, Pavel.  This is getting closer.  But I think continuing to use UUIDs is 
the wrong approach.  In particular, code like this means we've failed to 
achieve our goal:

{code}
if (newVersion.timestamp() <= lastVersion.timestamp())
    throw new ConfigurationException("New version timestamp is not newer than the current version timestamp.");
{code}

If two migrations X and Y propagate through the cluster concurrently from 
different coordinators, some nodes will apply X first, some Y; whichever 
migration has a lower timestamp will then error out on the remaining nodes and 
we'll end up with the same kind of version conflict snafu we encounter now.

Here's how I think it should work:

* Coordinator turns KsDef and CfDef objects into RowMutations by applying them 
to the existing (local) schema.  Maybe you use something like your 
attributesToCheck code since you already have that written.  Give that mutation 
a normal local timestamp (FBU.timestampMicros).

Then each node applying the change:
* makes a deep copy of the existing schema ColumnFamily objects
* calls Table.apply on the migration RowMutations
* calls ColumnFamily.diff on the new schema ColumnFamily object vs the copied 
one.  (This is where I was going above by saying "let the existing resolve 
code do the work."  No matter which order nodes apply X and Y in, they will 
always agree on the result after applying both.  Note that this does not depend 
on X and Y getting correctly ordered timestamps, either.)
* makes the appropriate Table + CFS + Schema changes dictated by the diff
* (the above obviously needs to be synchronized at least against the Table/CFS 
objects affected)
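
The copy / apply / diff steps above can be sketched as follows.  This is only 
an illustration under simplifying assumptions: SchemaColumn and 
SchemaRowSketch are hypothetical stand-ins, not Cassandra's ColumnFamily, 
RowMutation or Table.apply, and deletions (tombstones) are left out.  The 
tie-break on equal timestamps (larger value wins) is likewise an assumption 
chosen to make the result order-independent.

```java
import java.util.Map;
import java.util.TreeMap;

// Minimal stand-in for one schema attribute stored as a column.
class SchemaColumn {
    final String value;
    final long timestamp;

    SchemaColumn(String value, long timestamp) {
        this.value = value;
        this.timestamp = timestamp;
    }

    // Higher timestamp wins; on a tie the larger value wins, so the outcome
    // never depends on the order in which updates arrive.
    static SchemaColumn reconcile(SchemaColumn a, SchemaColumn b) {
        if (a.timestamp != b.timestamp)
            return a.timestamp > b.timestamp ? a : b;
        return a.value.compareTo(b.value) >= 0 ? a : b;
    }
}

class SchemaRowSketch {
    // Step 1: copy the existing row (columns are immutable, so copying the map suffices).
    static Map<String, SchemaColumn> copy(Map<String, SchemaColumn> row) {
        return new TreeMap<>(row);
    }

    // Step 2: apply a mutation in place, resolving each column independently.
    static void apply(Map<String, SchemaColumn> row, Map<String, SchemaColumn> mutation) {
        for (Map.Entry<String, SchemaColumn> e : mutation.entrySet())
            row.merge(e.getKey(), e.getValue(), SchemaColumn::reconcile);
    }

    // Step 3: diff the updated row against the copy; only these attributes
    // would need ALTER-style work on the live Table/CFS objects.
    static Map<String, String> diff(Map<String, SchemaColumn> before,
                                    Map<String, SchemaColumn> after) {
        Map<String, String> changed = new TreeMap<>();
        for (Map.Entry<String, SchemaColumn> e : after.entrySet()) {
            SchemaColumn old = before.get(e.getKey());
            if (old == null || !old.value.equals(e.getValue().value))
                changed.put(e.getKey(), e.getValue().value);
        }
        return changed;
    }

    public static void main(String[] args) {
        Map<String, SchemaColumn> row = new TreeMap<>();
        row.put("comment", new SchemaColumn("old", 1L));
        Map<String, SchemaColumn> x = new TreeMap<>();
        x.put("comment", new SchemaColumn("from X", 5L));

        Map<String, SchemaColumn> before = copy(row);
        apply(row, x);
        System.out.println(diff(before, row));
    }
}
```

In this sketch a stale mutation produces an empty diff, so applying it is a 
no-op, and two nodes that apply X and Y in opposite orders end up with the 
same columns.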

Schema version may then be computed as an md5 of the Schema objects.  (Again: 
goal is that nodes can apply X and Y in any order, and we don't care.  So 
version needs to be entirely content-based, not time-based.)  Probably the 
easiest way to do this is to just use CF.updateDigest.  We can cut this down to 
the first 16 bytes if we need to cram it into a UUID, but I don't see a reason 
for that (the Thrift API uses Strings already).
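
As a hedged illustration of a purely content-based version, the sketch below 
hashes schema attributes in sorted order with java.security.MessageDigest.  
It does not use CF.updateDigest; SchemaDigestSketch and the flat 
"name -> value" map are assumptions made to keep the example self-contained.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

class SchemaDigestSketch {
    // Hash every (name, value) pair in sorted key order so that two nodes
    // holding the same schema content always compute the same version.
    static String version(Map<String, String> schema) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            for (Map.Entry<String, String> e : new TreeMap<>(schema).entrySet()) {
                md5.update(e.getKey().getBytes(StandardCharsets.UTF_8));
                md5.update((byte) 0); // separator so "ab"+"c" differs from "a"+"bc"
                md5.update(e.getValue().getBytes(StandardCharsets.UTF_8));
                md5.update((byte) 0);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md5.digest())
                hex.append(String.format("%02x", b));
            return hex.toString();
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException("MD5 should be available on every Java platform", ex);
        }
    }

    public static void main(String[] args) {
        Map<String, String> schema = new TreeMap<>();
        schema.put("ks1.cf1.comment", "example");
        System.out.println(version(schema));
    }
}
```

Nothing in the computation depends on when or in what order the attributes 
were written, only on what they currently contain.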

Nit: flushSystemCFs could use FBUtilities.waitOnFutures(flushes) instead of 
rolling its own multi-future wait.


 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.1

 Attachments: 
 0001-new-migration-schema-and-avro-methods-cleanup.patch, 
 0002-avro-removal.patch, CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-11-28 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13158469#comment-13158469
 ] 

T Jake Luciani commented on CASSANDRA-1391:
---

If you remove Avro, how do people upgrade?

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.1

 Attachments: 
 0001-new-migration-schema-and-avro-methods-cleanup.patch, 
 0002-avro-removal.patch, CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-11-28 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13158473#comment-13158473
 ] 

Pavel Yaskevich commented on CASSANDRA-1391:


I'm planning to add a special tool, in a separate patch, that would help 
convert the schema from Avro to the new model. I don't see a reason to keep 
Avro in lib, as the core code does not use it.

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.1

 Attachments: 
 0001-new-migration-schema-and-avro-methods-cleanup.patch, 
 0002-avro-removal.patch, CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-11-14 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149691#comment-13149691
 ] 

Pavel Yaskevich commented on CASSANDRA-1391:


It seems like what we really want from migrations is the schema state before 
any given migration and the actual modifications the migration makes, like 
"add keyspace ks with attributes = ...", "update cf with attributes = ...".

As all of the migrations are user-initiated, we can easily calculate what 
modifications a migration makes and propagate only those, keeping the TimeUUID 
as the ID of the migration to identify apply order. As it's okay for us to 
require a full cluster update before accepting schema modifications, merging 
becomes a trivial task where modifications are applied one-by-one on some 
initial state of the schema (that also allows us to remove the Avro overhead 
from migrations). Abandoning Avro would make things less fragile because there 
would be no need to modify CFMetaData or any other classes to support new (or 
deleted) attributes.
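
A minimal sketch of the ordered replay described above, assuming each 
modification carries a numeric ID standing in for the TimeUUID timestamp.  
ModificationSketch and OrderedReplaySketch are hypothetical names, not classes 
from the patch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// One attribute change carried by a migration.
class ModificationSketch {
    final long id;          // stand-in for the TimeUUID timestamp
    final String attribute;
    final String value;

    ModificationSketch(long id, String attribute, String value) {
        this.id = id;
        this.attribute = attribute;
        this.value = value;
    }
}

class OrderedReplaySketch {
    // Sort a copy of the modifications by ID, then apply them one by one
    // on top of the initial schema state.
    static Map<String, String> replay(Map<String, String> initial,
                                      List<ModificationSketch> mods) {
        List<ModificationSketch> ordered = new ArrayList<>(mods);
        ordered.sort(Comparator.comparingLong((ModificationSketch m) -> m.id));
        Map<String, String> state = new TreeMap<>(initial);
        for (ModificationSketch m : ordered)
            state.put(m.attribute, m.value);
        return state;
    }

    public static void main(String[] args) {
        List<ModificationSketch> mods = new ArrayList<>();
        mods.add(new ModificationSketch(20L, "comment", "second"));
        mods.add(new ModificationSketch(10L, "comment", "first"));
        System.out.println(replay(new TreeMap<String, String>(), mods));
    }
}
```

Note that the sort is stable, so two modifications with equal IDs are applied 
in arrival order; this sketch does nothing to resolve such ties.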

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.1

 Attachments: CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-11-14 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149743#comment-13149743
 ] 

T Jake Luciani commented on CASSANDRA-1391:
---

When moving to CF-based migration logic, it would be very useful to have the 
logic abstracted so it can be used for other use cases.

Migrations give you the following:

  * RF = N where N is the size of the ring.
  * All changes are pushed to new nodes when they join the ring.
  * previously sent data is available locally on startup



 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.1

 Attachments: CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-11-14 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149780#comment-13149780
 ] 

Jonathan Ellis commented on CASSANDRA-1391:
---

It seems to make sense eventually.  But for this ticket let's stick with 
LocalPartitioner and the existing schema replication.

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.1

 Attachments: CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-11-14 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149788#comment-13149788
 ] 

Pavel Yaskevich commented on CASSANDRA-1391:


Then we need to preserve the order of the migrations that we accept from 
remote nodes; otherwise we don't have sufficient information to apply 
modifications, or am I missing something? Can you please briefly describe the 
process as you see it?

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.1

 Attachments: CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-11-14 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13149925#comment-13149925
 ] 

Jonathan Ellis commented on CASSANDRA-1391:
---

I'm enthusiastic about this approach for several reasons:

- ultimately we end up with simpler code with fewer special cases, although the 
migration (*cough*) from Avro-based schema will be a pain initially
- gets rid of Avro dependency!
- fixes CASSANDRA-2477 (SELECT * FROM keyspaces)


 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.1

 Attachments: CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-11-11 Thread T Jake Luciani (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13148656#comment-13148656
 ] 

T Jake Luciani commented on CASSANDRA-1391:
---

I think the patch is a good start, but I do like Jonathan's idea of moving to 
native CFs too.

We just want to make sure this gets into 1.1 since it's a problem a lot of 
people run into.

Regarding the current impl, my concern is fields that get added to the 
migration structs over time being missed, as happened a lot in the CFMetaData 
conversion code.

Could you add a test that verifies all migration struct fields are accounted 
for in the merge logic? That way, if someone adds a new field and doesn't 
update the migration merge logic, the test would fail.
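
One possible shape for such a test is a reflection check like the sketch 
below.  CfDefSketch, MERGED_FIELDS and unmergedFields are hypothetical; a 
real test would point at the actual migration structs and at whatever list of 
attributes the merge code maintains.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical migration struct; the real one would come from the patch.
class CfDefSketch {
    String name;
    String comparatorType;
    int gcGraceSeconds;
}

class MergeCoverageCheck {
    // Hypothetical list of fields the merge logic knows how to handle.
    static final Set<String> MERGED_FIELDS =
        new HashSet<>(Arrays.asList("name", "comparatorType", "gcGraceSeconds"));

    // Returns the declared instance fields that the merge logic does not cover.
    static Set<String> unmergedFields(Class<?> struct, Set<String> merged) {
        Set<String> missing = new TreeSet<>();
        for (Field f : struct.getDeclaredFields()) {
            // Ignore constants and compiler- or tool-generated fields.
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic())
                continue;
            if (!merged.contains(f.getName()))
                missing.add(f.getName());
        }
        return missing;
    }

    public static void main(String[] args) {
        Set<String> missing = unmergedFields(CfDefSketch.class, MERGED_FIELDS);
        if (!missing.isEmpty())
            throw new AssertionError("Fields missing from merge logic: " + missing);
        System.out.println("all fields covered");
    }
}
```

Adding a field to the struct without updating the merged-field list makes the 
check fail, which is the behaviour asked for here.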



 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.1

 Attachments: CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-11-09 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13147216#comment-13147216
 ] 

Jonathan Ellis commented on CASSANDRA-1391:
---

bq. My preference would be to model this on our Row conflict resolution

I think we can make it even simpler, by moving schema out of avro and into 
native CFs.

Each KS/CF would be a row.  Attributes would be columns.  When columns change 
you run the appropriate ALTER code.  If you get an update that's obsolete 
(applying it does not change the schema b/c it has an older timestamp) then it 
is a no-op.

Schema version would become some kind of md5 or sha of the CF contents (all 
rows + all columns).

The only problem is you need to be a little careful to open the schema CFs 
before anything else, but that's relatively easy I think.

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.1

 Attachments: CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-09-06 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13098236#comment-13098236
 ] 

Jonathan Ellis commented on CASSANDRA-1391:
---

bq. if we detect that current migration is outdated and should be merged we do 
the following actions

Why do we need to do two code paths here?

My preference would be to model this on our Row conflict resolution: for rows, 
we have a single code path where distinct columns are simply merged, and for 
conflicting columns we pick a winner based on user-provided timestamp and, if 
necessary, value contents.  So the result is guaranteed to be the same on all 
replicas no matter what order updates were received in.

Similarly, I'd like to see schemas merge field-at-a-time in 
KSMetadata/CFMetadata, with commutative conflict resolution.  (I suggested byte 
ordering of the field contents; Gary suggested using some clock value created 
by coordinator.)

Seems to me this would make the isolation complexity go away.
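
The field-at-a-time idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: `SchemaField` and `FieldMerge` are hypothetical names, schema fields are modeled as plain strings with a timestamp, and the tie-break is the byte ordering of the field contents mentioned above. It is not the code from any attached patch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical model of one schema attribute: a value plus the timestamp of the change. */
final class SchemaField {
    final String value;
    final long timestamp;

    SchemaField(String value, long timestamp) {
        this.value = value;
        this.timestamp = timestamp;
    }

    /** Picks a winner: newer timestamp first, then the larger value in unsigned byte order. */
    static SchemaField reconcile(SchemaField a, SchemaField b) {
        if (a.timestamp != b.timestamp)
            return a.timestamp > b.timestamp ? a : b;
        return compareBytes(a.value, b.value) >= 0 ? a : b;
    }

    /** Unsigned lexicographic comparison of the UTF-8 encodings. */
    private static int compareBytes(String x, String y) {
        byte[] l = x.getBytes(StandardCharsets.UTF_8);
        byte[] r = y.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < Math.min(l.length, r.length); i++) {
            int cmp = Integer.compare(l[i] & 0xff, r[i] & 0xff);
            if (cmp != 0)
                return cmp;
        }
        return Integer.compare(l.length, r.length);
    }
}

final class FieldMerge {
    /** Distinct fields are simply unioned; conflicting fields go through reconcile(). */
    static Map<String, SchemaField> merge(Map<String, SchemaField> left,
                                          Map<String, SchemaField> right) {
        Map<String, SchemaField> merged = new TreeMap<>(left);
        for (Map.Entry<String, SchemaField> e : right.entrySet())
            merged.merge(e.getKey(), e.getValue(), SchemaField::reconcile);
        return merged;
    }
}
```

Because `reconcile` depends only on the two candidates and not on arrival order, `merge(a, b)` and `merge(b, a)` yield the same field values, which is the property the comment asks for.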

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-09-06 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13098252#comment-13098252
 ] 

Pavel Yaskevich commented on CASSANDRA-1391:


bq. Why do we need to do two code paths here? My preference would be to model 
this on our Row conflict resolution: for rows, we have a single code path where 
distinct columns are simply merged, and for conflicting columns we pick a 
winner based on user-provided timestamp and, if necessary, value contents. So 
the result is guaranteed to be the same on all replicas no matter what order 
updates were received in.

Migration merging is a more complex process than row merging, which is 
pretty straightforward. The current approach easily handles all possible conflicts 
without any tie-breakers or coordinators because it simply detects what 
modifications were made by each of the migrations, starting from the merging one, 
combines them (the modifications) in an isolated schema, and updates 
Schema.instance _so the resulting schema is guaranteed to be the same on all 
replicas no matter what order migrations were received in_.

bq. Seems to me this would make the isolation complexity go away.

I still think that this is the simplest of all the proposed solutions because the actual 
modifications are: KSMetaData/CFMetaData.diff(...) methods to detect modified 
fields, one flag member Migration.isolated to indicate that a migration is 
running in isolated mode, and one method to update the system Schema.instance 
with the resulting Schema after all migrations were applied.
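
A minimal sketch of what a field-level diff could look like, assuming metadata is modeled as a flat map of field name to value. The real KSMetaData/CFMetaData.diff(...) signatures are not shown in this thread, so `SchemaDiffSketch`, `diff` and `apply` below are hypothetical stand-ins, and removed fields are deliberately not handled.

```java
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

final class SchemaDiffSketch {
    /** Returns the fields whose value in 'updated' differs from 'base', mapped to the new value. */
    static Map<String, String> diff(Map<String, String> base, Map<String, String> updated) {
        Map<String, String> changed = new TreeMap<>();
        for (Map.Entry<String, String> e : updated.entrySet()) {
            if (!Objects.equals(base.get(e.getKey()), e.getValue()))
                changed.put(e.getKey(), e.getValue());
        }
        return changed;
    }

    /** Applies a set of changed fields on top of 'target' and returns a new map. */
    static Map<String, String> apply(Map<String, String> target, Map<String, String> changed) {
        Map<String, String> result = new TreeMap<>(target);
        result.putAll(changed);
        return result;
    }
}
```

When two migrations touch disjoint fields, applying their diffs in either order gives the same map. Overlapping fields would still need a separate rule to pick a winner.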

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-09-02 Thread Gary Dusbabek (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13096031#comment-13096031
 ] 

Gary Dusbabek commented on CASSANDRA-1391:
--

bq. Merge algorithm is based on isolated schema initialized from merging 
migration lastVersion point: merging migration applied first then all older 
migrations, after that Schema.instance gets safely updated.

Could you clarify what you mean by merging migration applied first, then all 
older migrations...?

It seems like a side effect of applying a migration is that it can apply other 
migrations.  Does MigrationManager.applyMigrations() need to be updated because 
of this?

What does isolated indicate?

Try to put things like flushSystemTables() in a separate patch (ok on the same 
ticket) to make reviewing the actual changes easier.

Would it be possible to create some unit tests for CFMD.diff()?

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-09-02 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13096045#comment-13096045
 ] 

Pavel Yaskevich commented on CASSANDRA-1391:


bq. Could you clarify what you mean by merging migration applied first, then 
all older migrations...?

Take a look at Migration.apply(), starting from line 114, and the 
Migration.tryMerge method.

If we detect that the current migration is outdated and should be merged, we do the 
following:

  - initialize an isolated Schema from the point of the migration's lastVersion (this 
    sets isolated = true)
  - reload the migration's system definition to reflect that isolated schema
  - call applyModels on the merging migration to apply its schema changes
  - merge phase:
    - read from SystemTable.Migrations all migrations that come after the current one
    - for each of those migrations:
      - replace its schema with the isolated one (from the merging migration) and 
        reload the system definition
      - call the apply() method to rewrite the records in SystemTable.Migrations and 
        SystemTable.Schema
  - after all migrations have been applied, try to merge the isolated schema with the 
    current system schema (Schema.instance)
  - flush system tables to persist the changes
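
The steps above can be pictured with a much simplified sketch. It assumes a schema is just a map of field names to values and a migration is a function that edits such a map. `IsolatedMergeSketch` and `replay` are hypothetical names. The rewriting of SystemTable.Migrations and SystemTable.Schema, the final merge into Schema.instance and the flush are left out.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;

final class IsolatedMergeSketch {
    /**
     * Replays the outdated ("merging") migration and every migration recorded after it
     * against a private copy of the schema, leaving the caller's schema untouched.
     */
    static Map<String, String> replay(Map<String, String> schemaAtLastVersion,
                                      Consumer<Map<String, String>> mergingMigration,
                                      List<Consumer<Map<String, String>>> newerMigrations) {
        // Isolated schema: starts from the state the merging migration was based on.
        Map<String, String> isolated = new TreeMap<>(schemaAtLastVersion);

        // The merging migration is applied first.
        mergingMigration.accept(isolated);

        // Then every migration that came after it is re-applied on top.
        for (Consumer<Map<String, String>> migration : newerMigrations)
            migration.accept(isolated);

        // The caller decides when to publish the result to the live schema.
        return isolated;
    }
}
```

Working on a copy is what keeps the replay free of side effects: nothing visible changes until the caller swaps in the returned map.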

bq. It seems like a side effect of applying a migration is that it can apply 
other migrations. Does MigrationManager.applyMigrations() need to be updated 
because of this?

No, because all modifications are done using the isolated schema.

bq. What does isolated indicate?

Isolated indicates that the migration will be applied with an isolated Schema, so no 
real file operations are going to be made, such as taking a snapshot, creating the 
keyspace directory, or removing SSTable files.
 
bq. Try to put things like flushSystemTables() in a separate patch (ok on the 
same ticket) to make reviewing the actual changes easier.

I see only one such refactoring change; is it really worth splitting the current 
patch?

bq. Would it be possible to create some unit tests for CFMD.diff()

CFMD.diff is used all over the place, so if it were broken other tests would fail, 
but if you think that this is necessary I can do that.

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-09-01 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13095206#comment-13095206
 ] 

Pavel Yaskevich commented on CASSANDRA-1391:


The isolated flag is used to indicate that a migration will be applied using an isolated 
schema, so the SystemTable.Schema record for LAST_MIGRATION_KEY won't be updated and 
no real file operations will be made. This allows us to deserialize old 
migrations and re-apply them one by one in the case of a merging migration (after 
each re-apply, the information about that migration is updated in 
SystemTable.Schema and SystemTable.Migrations).

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-08-31 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13095098#comment-13095098
 ] 

Jonathan Ellis commented on CASSANDRA-1391:
---

what is the isIsolated mode stuff for?

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-08-30 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13094015#comment-13094015
 ] 

Pavel Yaskevich commented on CASSANDRA-1391:


I have figured out a problem with Avro: it can't correctly handle map 
comparison (org.apache.avro.AvroRuntimeException: Can't compare maps!). I will 
try to fix this issue and re-attach a working patch.

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: CASSANDRA-1391.patch, CASSANDRA-1391.patch, 
 CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-08-29 Thread Gary Dusbabek (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13092834#comment-13092834
 ] 

Gary Dusbabek commented on CASSANDRA-1391:
--

Pavel, can you please rebase?

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: CASSANDRA-1391.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-08-22 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088774#comment-13088774
 ] 

Pavel Yaskevich commented on CASSANDRA-1391:


OK, I will do the renaming and inlining and commit this, thanks!

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: 
 0001-CASSANDRA-1391-refactoring-of-the-DatabaseDescriptor.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-08-22 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13089138#comment-13089138
 ] 

Pavel Yaskevich commented on CASSANDRA-1391:


Committed 0001 with the changes proposed by Jonathan. 0002 will follow up with the 
migration merge strategy.

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: 
 0001-CASSANDRA-1391-refactoring-of-the-DatabaseDescriptor.patch






[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-08-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13089181#comment-13089181
 ] 

Hudson commented on CASSANDRA-1391:
---

Integrated in Cassandra #1043 (See 
[https://builds.apache.org/job/Cassandra/1043/])
Refactoring of the DatabaseDescriptor/CFMetadata/Table to support 
o.a.c.config.Schema which will be handling all schema querying/mutations
patch by Pavel Yaskevich; reviewed by Jonathan Ellis for CASSANDRA-1391

xedin : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1160494
Files : 
* /cassandra/trunk/src/java/org/apache/cassandra/cql/DropIndexStatement.java
* 
/cassandra/trunk/test/unit/org/apache/cassandra/db/migration/SerializationsTest.java
* 
/cassandra/trunk/test/long/org/apache/cassandra/db/compaction/LongCompactionSpeedTest.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/Table.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/db/migration/UpdateColumnFamily.java
* /cassandra/trunk/src/java/org/apache/cassandra/cql/UpdateStatement.java
* /cassandra/trunk/test/unit/org/apache/cassandra/db/SerializationsTest.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLog.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
* /cassandra/trunk/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
* /cassandra/trunk/src/java/org/apache/cassandra/service/StorageService.java
* /cassandra/trunk/src/java/org/apache/cassandra/cache/AutoSavingCache.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamily.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/RowMutation.java
* 
/cassandra/trunk/test/unit/org/apache/cassandra/thrift/ThriftValidationTest.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/migration/DropKeyspace.java
* /cassandra/trunk/src/java/org/apache/cassandra/thrift/ThriftValidation.java
* /cassandra/trunk/src/java/org/apache/cassandra/cql/AlterTableStatement.java
* /cassandra/trunk/src/java/org/apache/cassandra/dht/BootStrapper.java
* 
/cassandra/trunk/test/unit/org/apache/cassandra/service/ConsistencyLevelTest.java
* /cassandra/trunk/src/java/org/apache/cassandra/tools/SSTableImport.java
* /cassandra/trunk/test/unit/org/apache/cassandra/db/DefsTest.java
* 
/cassandra/trunk/test/unit/org/apache/cassandra/service/AntiEntropyServiceTestAbstract.java
* /cassandra/trunk/test/unit/org/apache/cassandra/SchemaLoader.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilySerializer.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/db/migration/DropColumnFamily.java
* 
/cassandra/trunk/test/unit/org/apache/cassandra/locator/SimpleStrategyTest.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java
* /cassandra/trunk/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/db/migration/RenameKeyspace.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/dht/OrderPreservingPartitioner.java
* 
/cassandra/trunk/test/unit/org/apache/cassandra/config/DatabaseDescriptorTest.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/SchemaCheckVerbHandler.java
* 
/cassandra/trunk/test/unit/org/apache/cassandra/service/LeaveAndBootstrapTest.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/migration/Migration.java
* /cassandra/trunk/src/java/org/apache/cassandra/cql/DeleteStatement.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/db/DefinitionsUpdateVerbHandler.java
* /cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java
* /cassandra/trunk/src/java/org/apache/cassandra/service/MigrationManager.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java
* /cassandra/trunk/src/java/org/apache/cassandra/tools/SSTableExport.java
* /cassandra/trunk/src/java/org/apache/cassandra/config/Schema.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/migration/AddKeyspace.java
* /cassandra/trunk/src/java/org/apache/cassandra/thrift/CassandraServer.java
* /cassandra/trunk/src/java/org/apache/cassandra/service/ReadCallback.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/dht/AbstractByteOrderedPartitioner.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/db/migration/RenameColumnFamily.java
* /cassandra/trunk/test/unit/org/apache/cassandra/dht/BootStrapperTest.java
* /cassandra/trunk/test/unit/org/apache/cassandra/service/MoveTest.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/DefsTable.java
* /cassandra/trunk/src/java/org/apache/cassandra/service/StorageProxy.java
* /cassandra/trunk/src/java/org/apache/cassandra/cql/SelectStatement.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/db/migration/UpdateKeyspace.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/db/migration/AddColumnFamily.java
* 

[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-07-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13073124#comment-13073124
 ] 

Jonathan Ellis commented on CASSANDRA-1391:
---

Thinking about this more, the really important part is that all nodes agree on 
the same schema no matter what order they get the Migrations in.  If we can 
make that guarantee, the actual conflict resolution doesn't have to be 
particularly good (since it will still be a rare occurrence).

So what is the simplest thing that can work here?

I think we need to be able to merge Migrations at a finer granularity.

If we do not, we have problems like this:

- Mutation 1 (M1) says set default_validation_class=ascii, comment='foo' at 
time T1.
- M2 says set row_cache_size=100 at time T0 < T1.

If node A gets M2 first, applies it, then gets M1, it has all 3 changes made.  
If node B however gets M1 first, then rejects M2 because T0 < T1 (for whatever 
kind of clock/comparator we are talking about), nodes A and B will end up with 
different schemas.

I think wall-clock-time plus content-based tiebreaker (like we currently do 
with Column values) will be just as good as more complex ordering, as long as 
we have the fine-grained merging.
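
The M1/M2 scenario above can be replayed with a small sketch. Everything here is a hypothetical model rather than Cassandra code: `Migration` carries a timestamp and the fields it sets, `applyWhole` stands for the coarse rule that rejects a whole migration older than one already applied, and `applyPerField` stands for fine-grained merging with a timestamp and a content-based tiebreaker (plain `String.compareTo` here).

```java
import java.util.Map;
import java.util.TreeMap;

final class MigrationOrderSketch {
    /** Hypothetical migration: a timestamp plus the schema fields it sets. */
    static final class Migration {
        final long timestamp;
        final Map<String, String> fields;

        Migration(long timestamp, Map<String, String> fields) {
            this.timestamp = timestamp;
            this.fields = fields;
        }
    }

    /** Hypothetical node state: current field values and when each one was written. */
    static final class Node {
        final Map<String, String> values = new TreeMap<>();
        final Map<String, Long> writtenAt = new TreeMap<>();
        long newestSeen = Long.MIN_VALUE;

        /** Coarse rule: drop the whole migration if it is older than one already applied. */
        void applyWhole(Migration m) {
            if (m.timestamp < newestSeen)
                return;
            newestSeen = m.timestamp;
            values.putAll(m.fields);
        }

        /** Fine-grained rule: every field is decided on its own. */
        void applyPerField(Migration m) {
            for (Map.Entry<String, String> e : m.fields.entrySet()) {
                String field = e.getKey();
                Long previous = writtenAt.get(field);
                boolean wins;
                if (previous == null)
                    wins = true;                                             // field never set before
                else if (m.timestamp != previous)
                    wins = m.timestamp > previous;                           // newer write wins
                else
                    wins = e.getValue().compareTo(values.get(field)) > 0;    // content tiebreak
                if (wins) {
                    values.put(field, e.getValue());
                    writtenAt.put(field, m.timestamp);
                }
            }
        }
    }
}
```

With M1 at T1 and M2 at T0 < T1, two nodes using `applyWhole` in opposite orders end up with three and two fields respectively, while two nodes using `applyPerField` end up with the same three fields.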

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Gary Dusbabek





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-07-13 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13064560#comment-13064560
 ] 

Jonathan Ellis commented on CASSANDRA-1391:
---

What would have to change?

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Gary Dusbabek





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-07-13 Thread Gary Dusbabek (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13064581#comment-13064581
 ] 

Gary Dusbabek commented on CASSANDRA-1391:
--

Bottom line: we need to solve the problem of how to handle conflicts.  I view 
this as a similar problem of handling merge conflicts in change sets: a good 
subset of the conflicts can be merged automatically because they are 
independent of each other.  But every once in a while there is a conflict that 
needs a manual edit: the solution is not computable because it isn't 
deterministic.

This is currently addressed by strictly enforcing the relationship of 
a migration with its predecessor to ensure that all migrations are applied 
serially.

There is probably a pragmatic approach that I'm unable to see because the 
merge-conflict problem makes it a non-starter for me.

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Gary Dusbabek





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-07-13 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13064598#comment-13064598
 ] 

Jonathan Ellis commented on CASSANDRA-1391:
---

Seems like if we could come up with some way of ordering schema components we 
could solve that: user A says set default_validation_class=ascii, B says 
d_v_c=utf8, but we semi-arbitrarily decide that ascii is higher priority so 
when they conflict ascii wins.

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Gary Dusbabek





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-07-13 Thread Gary Dusbabek (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13064626#comment-13064626
 ] 

Gary Dusbabek commented on CASSANDRA-1391:
--

bq. Seems like if we could come up with some way of ordering schema components

A lamport-ish clock consisting of node name and counter/timestamp would 
probably be sufficient for ordering.

This could be used with an approach that quarantines schema changes for a 
period of time.  This would allow for changes to come in from throughout the 
cluster and would allow them to be reordered before being applied.

I think this is sensible, but still gives some rope from which we can hang 
ourselves--the strict predecessor relationship is gone and we'd have to trust 
that nodes would be doing the right thing (by applying migrations) 
independently after the quarantine is over.
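
A rough sketch of the ordering idea, assuming a stamp made of a counter and a node name, and a buffer that holds changes during the quarantine and releases them sorted. `LamportStamp` and `QuarantineBuffer` are hypothetical names; how the counter is advanced and how long the quarantine lasts are not specified in this thread and are left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical Lamport-style stamp: a counter, with the node name as tiebreaker. */
final class LamportStamp implements Comparable<LamportStamp> {
    final long counter;
    final String node;

    LamportStamp(long counter, String node) {
        this.counter = counter;
        this.node = node;
    }

    /** Orders by counter, then by node name, so stamps from different nodes never tie. */
    @Override
    public int compareTo(LamportStamp other) {
        int cmp = Long.compare(counter, other.counter);
        return cmp != 0 ? cmp : node.compareTo(other.node);
    }

    @Override
    public String toString() {
        return counter + "@" + node;
    }
}

/** Hypothetical quarantine: buffers incoming changes, then releases them in stamp order. */
final class QuarantineBuffer {
    private final List<LamportStamp> pending = new ArrayList<>();

    void receive(LamportStamp stamp) {
        pending.add(stamp);
    }

    /** Called when the quarantine window ends: returns the buffered stamps in apply order. */
    List<LamportStamp> drainInOrder() {
        List<LamportStamp> ordered = new ArrayList<>(pending);
        Collections.sort(ordered);
        pending.clear();
        return ordered;
    }
}
```

Two nodes that received the same stamps in different orders drain them in the same sequence, which is what lets each node apply migrations independently once the quarantine is over.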

 Allow Concurrent Schema Migrations
 --

 Key: CASSANDRA-1391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1391
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Gary Dusbabek





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-07-13 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13064637#comment-13064637
 ] 

Jonathan Ellis commented on CASSANDRA-1391:
---

bq. A lamport-ish clock consisting of node name and counter/timestamp would 
probably be sufficient for ordering

That would probably work too, although it feels weird to base it on who 
generated the migration, rather than the migration contents.

bq. we'd have to trust that nodes would be doing the right thing (by applying 
migrations) independently after the quarantine is over

Right.  I'm totally comfortable with this; it feels like a good fit with how 
the rest of the system works.





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-07-12 Thread Gary Dusbabek (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13064363#comment-13064363
 ] 

Gary Dusbabek commented on CASSANDRA-1391:
--

Not with the current state of things.





[jira] [Commented] (CASSANDRA-1391) Allow Concurrent Schema Migrations

2011-05-16 Thread Marko Mikulicic (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13034107#comment-13034107
 ] 

Marko Mikulicic commented on CASSANDRA-1391:


Is there a way to fix this bad state? I'm not sure whether this bug or 
something similar is affecting me, but my cluster cannot create new keyspaces. 
It reports:

Cluster schema does not yet agree

I tried dropping all the nodes but one, but it still complains. Any ideas?



[jira] Commented: (CASSANDRA-1391) Allow Concurrent Schema Migrations

2010-08-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12904304#action_12904304
 ] 

Jonathan Ellis commented on CASSANDRA-1391:
---

ISTM we can allow concurrent migrations by computing the schema ID as an MD5 of 
the keyspaces and CFs, instead of pushing that to VC.  It's exactly the kind of 
set-merge problem that both columns-within-a-CF and VC can handle.
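
A minimal sketch of a content-derived schema ID, assuming the schema can be 
reduced to a mapping of keyspace name to column family names. The function 
name schema_id and the input shape are hypothetical; a real definition would 
also have to hash each CF's attributes, not just its name.

```python
import hashlib


def schema_id(keyspaces):
    """Return an MD5 hex digest that depends only on the schema contents.

    keyspaces: dict mapping keyspace name -> iterable of CF names
    (an assumed, simplified shape for illustration).
    """
    digest = hashlib.md5()
    # Sorting makes the ID independent of the order in which the
    # keyspaces and CFs were added.
    for ks in sorted(keyspaces):
        digest.update(b"ks:" + ks.encode("utf-8") + b"\0")
        for cf in sorted(keyspaces[ks]):
            digest.update(b"cf:" + cf.encode("utf-8") + b"\0")
    return digest.hexdigest()
```

Two nodes that end up with the same set of keyspaces and CFs compute the same 
ID regardless of the order in which the migrations reached them.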




[jira] Commented: (CASSANDRA-1391) Allow Concurrent Schema Migrations

2010-08-13 Thread Stu Hood (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12898428#action_12898428
 ] 

Stu Hood commented on CASSANDRA-1391:
-

But, vitally, if concurrent Migrations do conflict in a way that can't be 
resolved, keyspaces that didn't conflict should not be affected.
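
One way to picture this requirement is a three-way merge done per keyspace. 
This is an illustrative technique, not something proposed in the ticket; the 
function name merge_concurrent and the dict-based schema shape are assumptions.

```python
def merge_concurrent(base, ours, theirs):
    """Merge two concurrent schema versions keyspace by keyspace.

    Each argument maps keyspace name -> definition (any comparable value);
    a missing key means the keyspace does not exist in that version.
    Returns (merged, conflicts), where conflicts is a set of keyspace names.
    """
    merged, conflicts = {}, set()
    for ks in set(base) | set(ours) | set(theirs):
        b, o, t = base.get(ks), ours.get(ks), theirs.get(ks)
        if o == t:
            result = o          # both sides agree
        elif o == b:
            result = t          # only "theirs" changed this keyspace
        elif t == b:
            result = o          # only "ours" changed this keyspace
        else:
            conflicts.add(ks)   # both changed it differently
            result = b          # leave the conflicting keyspace as it was
        if result is not None:
            merged[ks] = result
    return merged, conflicts
```

A conflict is recorded only for the keyspace that both sides changed; every 
other keyspace still takes whichever change was made to it.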
