Okay, so the fundamental problem is that deserializing a supercolumn with 30k subcolumns is very slow. (As we say on http://wiki.apache.org/cassandra/CassandraLimitations, "avoid a data model that requires large numbers of subcolumns.")
But we were also being needlessly inefficient after deserialization; I've attached a patch (against trunk) to https://issues.apache.org/jira/browse/CASSANDRA-510. The patch gives a 30-50% improvement in my tests. You're looking for more like an order of magnitude improvement though, so I would say splitting each supercolumn off into its own row is probably the way to go.

-Jonathan
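
To make the "one row per supercolumn" suggestion concrete, here is a rough sketch of the re-keying in Python. It models the data with plain dictionaries only and is not Cassandra client code; the function name `split_supercolumns`, the `:` separator, and the sample keys are all made up for illustration.

```python
# Hypothetical in-memory model of the layout change; no Cassandra API is used.
SEP = ":"  # assumed separator; choose one that cannot appear in your row keys


def split_supercolumns(row_key, supercolumns):
    """Turn {supercolumn_name: {subcolumn: value}} stored under one row key
    into {composite_row_key: {column: value}}, one row per supercolumn."""
    return {
        f"{row_key}{SEP}{sc_name}": dict(subcolumns)  # copy the subcolumns
        for sc_name, subcolumns in supercolumns.items()
    }


# Sample data (invented): one row holding two supercolumns.
wide_row = {
    "2009-10-01": {"event1": "a", "event2": "b"},
    "2009-10-02": {"event3": "c"},
}

rows = split_supercolumns("user42", wide_row)
```

Each former supercolumn becomes its own row under a composite key such as `user42:2009-10-01`, and its subcolumns become ordinary columns of that row.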
