[jira] [Commented] (CASSANDRA-10309) Avoid always looking up column type

Sylvain Lebresne (JIRA) Mon, 14 Sep 2015 06:25:37 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743503#comment-14743503
 ]


Sylvain Lebresne commented on CASSANDRA-10309:
----------------------------------------------

bq. we could set the map to null in 
{{SerializationHeader.Component.toHeader}} if all of the types match

We can't. We don't know when a schema change will be made and the type of the 
column definition could change after the {{SerializationHeader}} object has 
been built.

bq. We could construct a special ColumnDefinition type, that contains the type 
we read from the sstable, but also stores the "real" ColumnDefinition as a field

Abusing {{ColumnDefinition}} itself might be a bit ugly, but we can easily add 
a new class that simply groups both the {{ColumnDefinition}} and the type from 
the sstable. So no need for {{instanceof}}.
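
A minimal sketch of what such a grouping class could look like. {{Type}}, 
{{ColumnDef}} and {{SSTableColumn}} are hypothetical stand-ins invented for 
this illustration (they are not Cassandra's {{AbstractType}} or 
{{ColumnDefinition}}), and the int32-to-blob change is just an assumed example 
of a schema alteration:

```java
public class SSTableColumnSketch {
    /** Hypothetical stand-in for a column type (not Cassandra's AbstractType). */
    static final class Type {
        final String name;
        Type(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for a column definition whose type may change with the schema. */
    static final class ColumnDef {
        final String name;
        volatile Type type;
        ColumnDef(String name, Type type) {
            this.name = name;
            this.type = type;
        }
    }

    /** Groups the live definition with the type that was recorded when the sstable was written. */
    static final class SSTableColumn {
        final ColumnDef definition;
        final Type serializedType;
        SSTableColumn(ColumnDef definition, Type serializedType) {
            this.definition = definition;
            this.serializedType = serializedType;
        }
    }

    public static void main(String[] args) {
        ColumnDef def = new ColumnDef("v", new Type("int32"));
        // Capture the type at the time the (imaginary) sstable was written.
        SSTableColumn column = new SSTableColumn(def, def.type);

        // Simulated schema change after the sstable was written.
        def.type = new Type("blob");

        // The reader still sees the type used at write time, with no lookup and no instanceof.
        System.out.println("deserialize with: " + column.serializedType.name);
        System.out.println("current schema type: " + column.definition.type.name);
    }
}
```

The reader would carry the {{SSTableColumn}} around instead of the bare 
definition, so the serialized type is available without a map lookup.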

Alternatively, since the {{Columns}} are "indexable" (and dealt with in order), 
we could pass the column index down to {{BufferCell.deserialize}} (and 
ultimately to {{SerializationHeader.getType}}). The type map in 
{{SerializationHeader}} would then just be an array rather than a map.
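
A rough sketch of the index-based lookup, under the assumption that the reader 
iterates columns in the same order the header stored them. The {{Header}} 
class, its {{getType(int)}} signature and the {{describe}} helper are 
hypothetical and only illustrate replacing the map with an array:

```java
public class IndexedTypeLookupSketch {
    /** Hypothetical header that keeps serialized types in column order. */
    static final class Header {
        private final String[] typesByIndex;

        Header(String... typesInColumnOrder) {
            this.typesByIndex = typesInColumnOrder.clone();
        }

        /** Array access by column index instead of a map lookup by column name. */
        String getType(int columnIndex) {
            return typesByIndex[columnIndex];
        }
    }

    /** Walks the columns in order and passes the index down to the type lookup. */
    static String describe(Header header, String[] columnNames) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < columnNames.length; i++) {
            if (i > 0)
                sb.append(", ");
            sb.append(columnNames[i]).append(':').append(header.getType(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Header header = new Header("int32", "text");
        System.out.println(describe(header, new String[] { "a", "b" }));
    }
}
```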

Both options are pretty much equivalent, with the second one saving an 
allocation (and avoiding the need for a new class) but being maybe slightly 
less clean.

bq. It looks like we permit type changes that have different representations

Depends on what you mean by representations. Bugs notwithstanding, we only 
permit a type change if allowed by {{AbstractType.isValueCompatibleWith}} (or, 
for clustering, the even more constrained {{AbstractType.isCompatibleWith}}), 
which by definition should guarantee us that values from the old type are valid 
values of the new type. So there is no particular conversion to do.

Now, in 3.0, we use different serializations for fixed width types and 
non-fixed ones (we write a size for the latter but not the former), and _it is_ 
allowed to switch from a fixed width type to a non-fixed one (or the reverse). 
And that's the reason for this code to exist: we must guarantee that the type 
used to deserialize is the one used to serialize. But once deserialized, we 
don't care if the type has changed, since the value should still be valid for 
that new type.
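
To illustrate why the serialized type matters, here is a simplified sketch. It 
is not the actual 3.0 encoding: the helper names are made up, and the 4-byte 
length prefix is an assumption chosen for brevity. It only shows that bytes 
written without a size cannot be read back by a reader that expects one:

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

public class FixedVsVariableSketch {
    /** Fixed-width encoding: just the 4 value bytes, no size prefix. */
    static void writeFixedInt(ByteBuffer out, int value) {
        out.putInt(value);
    }

    static int readFixedInt(ByteBuffer in) {
        return in.getInt();
    }

    /** Variable-width encoding: a size prefix followed by the value bytes. */
    static void writeVariable(ByteBuffer out, byte[] value) {
        out.putInt(value.length);
        out.put(value);
    }

    static byte[] readVariable(ByteBuffer in) {
        byte[] value = new byte[in.getInt()];
        in.get(value);
        return value;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        writeFixedInt(buf, 42); // written while the column was fixed width
        buf.flip();

        // Reading with the type used at write time works.
        System.out.println("fixed read: " + readFixedInt(buf.duplicate()));

        // Reading with a variable-width type treats the value bytes as a size.
        try {
            readVariable(buf.duplicate());
            System.out.println("variable read: unexpectedly succeeded");
        } catch (BufferUnderflowException e) {
            System.out.println("variable read: failed, value bytes were taken as a size");
        }
    }
}
```

Once the 4 bytes have been read with the fixed-width reader, they can be handed 
to the new type as-is, which is why no conversion step is needed after 
deserialization.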


> Avoid always looking up column type
> -----------------------------------
>
>                 Key: CASSANDRA-10309
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10309
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: T Jake Luciani
>            Priority: Minor
>             Fix For: 3.x
>
>
> Doing some read profiling I noticed we always seem to look up the type of a 
> column from the schema metadata when we have the type already in the column 
> class.
> This one simple change to SerializationHeader improves read performance 
> non-trivially.
> https://github.com/tjake/cassandra/commit/69b94c389b3f36aa035ac4619fd22d1f62ea80b2
> http://cstar.datastax.com/graph?stats=3fb1ced4-58c7-11e5-9faf-42010af0688f&metric=op_rate&operation=2_read&smoothing=1&show_aggregates=true&xmin=0&xmax=357.94&ymin=0&ymax=157416.6
> I assume we are looking this up to deal with schema changes. But I'm sure 
> there is a more performant way of doing this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
