[ 
https://issues.apache.org/jira/browse/CASSANDRA-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-5064:
----------------------------------------

    Attachment: 5064-v2.txt

I'm slightly confused by your branch. Do we agree that it doesn't do the 
"change maybeSwitchMemtable to unconditionally switch"? Cause I don't see it.

Now about the memtable reference counting, I see at least a few problems:
# I see nothing that prevents flushing the same memtable multiple times.
# getting the commit log context and switching the memtable is not done 
atomically with respect to writes. So a write can be pushed in the commit log 
after the context we're getting but still reach the memtable we're about to 
flush. For normal updates, this is mostly inefficient in that we'll keep commit 
logs around longer than necessary and potentially replay some updates 
unnecessarily, but for counters this is a bug (replaying a counter increment is 
not idempotent, so it would be applied twice).
# it's also possible for postFlush tasks not to be scheduled in the order 
the commit log contexts were acquired. So we could discard a commitlog for which 
the data is not yet fully flushed.
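
To make the 2nd point concrete, here is a minimal sketch of what "atomic with 
respect to writes" means when done with a read/write lock. All names 
(SwitchSketch, Flushable, switchLock, commitLogPosition) are hypothetical 
stand-ins for illustration, not the actual Cassandra classes or the code of 
either patch:

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-ins for illustration; not Cassandra's actual classes.
class SwitchSketch {
    // A memtable handed off for flushing, plus the commit log position covering it.
    static final class Flushable {
        final long context;
        final Queue<String> memtable;

        Flushable(long context, Queue<String> memtable) {
            this.context = context;
            this.memtable = memtable;
        }
    }

    private final ReentrantReadWriteLock switchLock = new ReentrantReadWriteLock();
    // Stand-in for the commit log context: counts the updates appended so far.
    private final AtomicLong commitLogPosition = new AtomicLong();
    private Queue<String> memtable = new ConcurrentLinkedQueue<>();

    // Writers share the read lock, so a switch can never happen between the
    // commit log append and the memtable insert of the same update.
    void write(String update) {
        switchLock.readLock().lock();
        try {
            commitLogPosition.incrementAndGet(); // "append" to the commit log
            memtable.add(update);                // then apply to the live memtable
        } finally {
            switchLock.readLock().unlock();
        }
    }

    // The switch takes the write lock: the context it reads covers exactly the
    // updates in the memtable it hands off, and each memtable is handed off once.
    Flushable switchMemtable() {
        switchLock.writeLock().lock();
        try {
            Flushable toFlush = new Flushable(commitLogPosition.get(), memtable);
            memtable = new ConcurrentLinkedQueue<>();
            return toFlush;
        } finally {
            switchLock.writeLock().unlock();
        }
    }
}
```

Because the swap happens under the exclusive lock, the same sketch also avoids 
the 1st problem: a given memtable can only ever be returned by one call to 
switchMemtable().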

Tbh, it may be possible to fix those problems (though for the 2nd one I don't 
have much idea), but I doubt we'll end up with something simpler than the 
current implementation. That might still be worth it ultimately for the 
performance improvement (though I'm not sure it's one of our bottlenecks, so 
keeping the lock that is easier to reason about may be better), but I'm really 
doubtful that changing such an important piece of code just before a release is 
a good idea (again, supposing we even have a solution for the problems above).

I also don't think using reference counting really makes this issue simpler.  
But I do agree that the looping in CFS.reload is a bit silly. And more 
generally, the "maybe" part in maybeSwitchMemtable is probably not justified 
anymore. It was useful when this was called on the write path, where all we 
wanted was to avoid OOM and if some other thread was already flushing it was 
fine. But now that we don't do that, it probably does more harm than good. What 
I mean is that if you call forceFlush, you expect that any write that 
happens-before your call has been flushed. But currently, if some other thread 
flushes the memtable, you'll return immediately while some data may be 
currently under flush (and so not flushed yet). So attaching a v2 that removes 
the freeze business (thus getting rid of the CFS.reload loop). I've kept the 
forceSwitch flag of the first patch to avoid recreating a memtable object if 
not needed in the normal flush path, but we can also remove it and make 
forceSwitch the default if we prefer.
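
For reference, the semantics I have in mind look roughly like the sketch below. 
This is not the code of the v2 patch: FlushSketch, flushWriter and the 
String-based "memtable" are made up for the example, and the single-threaded 
executor is an assumption that keeps flushes ordered. The point is that the 
returned future only completes once every write that happened before the call 
is flushed, and that a clean memtable is only recreated when forceSwitch is set:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-ins for illustration; not Cassandra's actual classes.
class FlushSketch {
    // Assumption: a single flush thread, so flushes complete in submission order.
    private final ExecutorService flushWriter = Executors.newSingleThreadExecutor();
    private final List<String> flushed = new CopyOnWriteArrayList<>();
    private List<String> memtable = new ArrayList<>();

    synchronized void write(String update) {
        memtable.add(update);
    }

    synchronized List<String> currentMemtable() {
        return memtable;
    }

    List<String> flushedUpdates() {
        return flushed;
    }

    // No "maybe": the caller always gets a future that completes only after
    // every update written before this call has been flushed.
    synchronized Future<?> switchMemtable(boolean forceSwitch) {
        if (memtable.isEmpty() && !forceSwitch) {
            // Clean memtable: keep the same object, but still queue behind any
            // flush that another thread started earlier.
            return flushWriter.submit(() -> { });
        }
        List<String> toFlush = memtable;
        memtable = new ArrayList<>();
        return flushWriter.submit(() -> { flushed.addAll(toFlush); });
    }

    void shutdown() {
        flushWriter.shutdown();
    }
}
```

In the sketch, a caller of switchMemtable(false).get() on a clean memtable 
still waits for any flush already in progress instead of returning 
immediately, which is the behavior the "maybe" version doesn't give you.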

                
> 'Alter table' when it includes collections makes cqlsh hang
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-5064
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5064
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.2.0 beta 3, 1.2.0
>         Environment: Ubuntu 12.04 LTS
> 3.2.0-23-virtual
>            Reporter: Ryan McGuire
>            Assignee: Sylvain Lebresne
>            Priority: Critical
>             Fix For: 1.2.0
>
>         Attachments: 5064.txt, 5064-v2.txt
>
>
> Having just installed 1.2.0-beta3 issue the following CQL into cqlsh:
> {code}
> drop keyspace test;
> create keyspace test with replication = {
>           'class': 'SimpleStrategy',
>           'replication_factor': '1'
>         };
> use test;
> create table users (
>             user_id text PRIMARY KEY,
>             first_name text,
>             last_name text,
>             email_addresses set<text>
>         );
> alter table users add mailing_address_lines list<text>;
> {code}
> As soon as you issue the alter table statement cqlsh hangs, and the java 
> process hosting Cassandra consumes 100% of a single core's CPU.
> If the alter table doesn't include a collection, it runs fine.
