[jira] Commented: (CASSANDRA-1715) More schema migration race conditions

Jonathan Ellis (JIRA) Sun, 14 Nov 2010 13:27:36 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-1715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931890#action_12931890
 ]


Jonathan Ellis commented on CASSANDRA-1715:
-------------------------------------------

The v2 approach looks great.  I think the main improvement we need is to not do 
blocking flushes while the locks are held.  For the purposes of creating a new 
memtable, a nonblocking flush is fine.  For creating indexes, we'll need to set 
up a callback to do the index building after the flush completes.  (We used to 
have code that took a callback arg as part of the flush call; I think I took it 
out, but it should be relatively easy to resurrect.)  I agree that it touches a 
lot of code, but the core changes (i.e. excluding one-line things like 
encapsulating gcgraceseconds, which are messy but not dangerous) aren't much 
larger than v1.  The huge improvement over waiting to re-sample indexes after 
UpdateCF is worth it imo.
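
To illustrate the nonblocking-flush-plus-callback idea, here is a minimal 
sketch.  It is not the patch and not Cassandra's API: the class, 
migrationLock, flushAsync, and the event names are all hypothetical stand-ins. 
It only shows the shape: the flush is submitted while the lock is held, the 
lock is released without waiting, and the index build runs as a callback once 
the flush has finished.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; none of these names are Cassandra's actual API.
public class FlushCallbackSketch {
    // Stand-in for whatever lock a migration holds while swapping CF state.
    private final ReentrantLock migrationLock = new ReentrantLock();
    // Stand-in for the thread that writes memtables to disk.
    private final ExecutorService flushExecutor = Executors.newSingleThreadExecutor();
    private final CountDownLatch indexBuilt = new CountDownLatch(1);
    // Records what happened, so the ordering can be inspected afterwards.
    public final List<String> events = new CopyOnWriteArrayList<>();

    // Submits the flush and returns immediately. The callback runs on the
    // flush thread, after the (simulated) flush work is done.
    void flushAsync(Runnable afterFlush) {
        flushExecutor.execute(() -> {
            events.add("flush-complete"); // placeholder for the real flush
            afterFlush.run();
        });
    }

    // Holds the lock only long enough to schedule the flush; never waits
    // for the flush or the index build while the lock is held.
    void applyIndexMigration() {
        migrationLock.lock();
        try {
            events.add("lock-acquired");
            flushAsync(() -> {
                events.add("index-built"); // placeholder for index building
                indexBuilt.countDown();
            });
        } finally {
            migrationLock.unlock();
        }
        events.add("lock-released");
    }

    // Lets a caller wait for the index build outside the lock.
    boolean awaitIndexBuilt(long timeoutSeconds) {
        try {
            return indexBuilt.await(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    void shutdown() {
        flushExecutor.shutdown();
    }

    public static void main(String[] args) {
        FlushCallbackSketch sketch = new FlushCallbackSketch();
        sketch.applyIndexMigration();
        boolean built = sketch.awaitIndexBuilt(5);
        sketch.shutdown();
        System.out.println("index built: " + built + ", events: " + sketch.events);
    }
}
```

In this sketch the only ordering guaranteed is that the index build follows 
the flush; whether the flush finishes before or after the lock is released 
depends on thread scheduling, which is the point of not blocking on it.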

I'm also fine with saying that changing the CFS will blow away any JMX-applied 
changes and reset values to what the new CFM says they should be.  But if you 
are happy with the Default* approach I am too.

bq. If left for 0.7.1, I need to explain that it changes the serialization 
format for Migrations in a non-backwards compatible way, which is not desirable

Is this saying that we'd need to tell beta3 users to rebuild their schemas if 
this goes in?  I am fine with that, I just want to make sure I understand.

> More schema migration race conditions
> -------------------------------------
>
>                 Key: CASSANDRA-1715
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1715
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.7 beta 1
>            Reporter: Jonathan Ellis
>            Assignee: Gary Dusbabek
>            Priority: Critical
>             Fix For: 0.7.0
>
>         Attachments: v1-0001-take-drop-off-CompactionManager.txt, 
> v1-0002-compaction-lock.txt, v1-0003-migration-uses-locks.txt, 
> v1-0004-handle-moved-dropped-CF-prior-to-pending-compaction-st.txt, 
> v2-0001-take-drop-off-CompactionManager.txt, v2-0002-compaction-lock.txt, 
> v2-0003-migration-uses-locks.txt, 
> v2-0004-handle-moved-dropped-CF-prior-to-pending-compaction-st.txt, 
> v2-0005-CFS.reload-assumes-metadata-is-mutable.txt, 
> v2-0006-replace-modifiable-CFM-members-with-private-fields-and.txt, 
> v2-0007-updateColumnFamily-uses-reload-remove-unneccesary-stru.txt
>
>
> Related to CASSANDRA-1631.
> This is still a bug with schema updates to an existing CF, since reloadCf is 
> doing an unload/init cycle. So flushing + compaction is an issue there as 
> well. Here is a stacktrace from during an index creation where it stubbed its 
> toe on an incomplete sstable from an in-progress compaction (path names 
> anonymized):
> {code}
> INFO [CompactionExecutor:1] 2010-11-02 16:31:00,553 CompactionManager.java 
> (line 224) Compacting 
> [org.apache.cassandra.io.sstable.SSTableReader(path='Standard1-e-6-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='Standard1-e-7-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='Standard1-e-8-Data.db'),org.apache.cassandra.io.sstable.SSTableReader(path='Standard1-e-9-Data.db')]
> ...
> ERROR [MigrationStage:1] 2010-11-02 16:31:10,939 ColumnFamilyStore.java (line 
> 244) Corrupt sstable Standard1-tmp-e-10-<>=[Data.db, Index.db]; skipped
> java.io.EOFException
>         at 
> org.apache.cassandra.utils.FBUtilities.skipShortByteArray(FBUtilities.java:308)
>         at 
> org.apache.cassandra.io.sstable.SSTable.estimateRowsFromIndex(SSTable.java:231)
>         at 
> org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:286)
>         at 
> org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:202)
>         at 
> org.apache.cassandra.db.ColumnFamilyStore.<init>(ColumnFamilyStore.java:235)
>         at 
> org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:443)
>         at 
> org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:431)
>         at org.apache.cassandra.db.Table.initCf(Table.java:335)
>         at org.apache.cassandra.db.Table.reloadCf(Table.java:343)
>         at 
> org.apache.cassandra.db.migration.UpdateColumnFamily.applyModels(UpdateColumnFamily.java:89)
>         at 
> org.apache.cassandra.db.migration.Migration.apply(Migration.java:158)
>         at 
> org.apache.cassandra.thrift.CassandraServer$2.call(CassandraServer.java:672)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> ...
>  INFO [CompactionExecutor:1] 2010-11-02 16:31:31,970 CompactionManager.java 
> (line 303) Compacted to Standard1-tmp-e-10-Data.db.  213,657,983 to 
> 213,657,983 (~100% of original) bytes for 626,563 keys.  Time: 31,416ms.
> {code}
> There is also a race between schema modification and streaming.

