[
https://issues.apache.org/jira/browse/CASSANDRA-10309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743503#comment-14743503
]
Sylvain Lebresne commented on CASSANDRA-10309:
----------------------------------------------
bq. we could set the map to null in
{{SerializationHeader.Component.toHeader}} if all of the types match
We can't. We don't know when a schema change will be made and the type of the
column definition could change after the {{SerializationHeader}} object has
been built.
bq. We could construct a special ColumnDefinition type, that contains the type
we read from the sstable, but also stores the "real" ColumnDefinition as a field
Abusing {{ColumnDefinition}} itself might be a bit ugly, but we can easily add
a new class that simply groups both the {{ColumnDefinition}} and the type from
the sstable. So no need for {{instanceof}}.
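A minimal sketch of such a grouping class follows. The names {{ColumnDef}} and {{ColumnWithSSTableType}} are hypothetical stand-ins invented for illustration (types are plain strings here), not Cassandra's actual classes.

```java
// Hypothetical stand-ins for illustration; these are NOT Cassandra's real classes.
final class SSTableColumnSketch {

    /** Stand-in for a schema column definition; its type can change after a schema update. */
    static final class ColumnDef {
        final String name;
        volatile String currentType; // replaced when the schema changes

        ColumnDef(String name, String currentType) {
            this.name = name;
            this.currentType = currentType;
        }
    }

    /** Groups the live definition with the type that was recorded in the sstable. */
    static final class ColumnWithSSTableType {
        final ColumnDef definition; // the "real" definition, always up to date
        final String sstableType;   // the type the values were serialized with

        ColumnWithSSTableType(ColumnDef definition, String sstableType) {
            this.definition = definition;
            this.sstableType = sstableType;
        }
    }
}
```

Since the reader always receives the pair, it can pick the sstable type directly, with no {{instanceof}} check and no per-cell map lookup.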
Alternatively, since the {{Columns}} are "indexable" (and dealt with in order), we
could pass the column index down to {{BufferCell.deserialize}} (and ultimately
to {{SerializationHeader.getType}}). The type map in {{SerializationHeader}}
would then just be an array rather than a map.
Both options are pretty much equivalent, with the second one saving an
allocation (and the existence of a new class) but being maybe slightly less
clean.
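The index-based option could look roughly like the sketch below. {{IndexedTypesSketch}} and its {{getType(int)}} method are hypothetical names, and types are again plain strings; the real {{SerializationHeader.getType}} signature is not assumed here.

```java
// Hypothetical sketch; not the actual SerializationHeader API.
final class IndexedTypesSketch {
    // One entry per column, in the same order the columns are iterated.
    private final String[] typesByIndex;

    IndexedTypesSketch(String... sstableTypes) {
        this.typesByIndex = sstableTypes.clone(); // defensive copy
    }

    /** Array access by column index replaces the map lookup by column name. */
    String getType(int columnIndex) {
        return typesByIndex[columnIndex];
    }
}
```

The caller would have to thread the column index down to the deserialization code, which is the "slightly less clean" part.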
bq. It looks like we permit type changes that have different representations
Depends on what you mean by representations. Bugs notwithstanding, we only
permit a type change if allowed by {{AbstractType.isValueCompatibleWith}} (or,
for clustering, the even more constrained {{AbstractType.isCompatibleWith}}),
which by definition should guarantee us that values from the old type are valid
values of the new type. So there is no particular conversion to do.
Now, in 3.0, we use different serialization for fixed-width types and non-fixed
ones (we have a size for the latter, not the former), and _it is_ allowed to
switch from a fixed-width type to a non-fixed one (or the reverse). And that's
the reason for this code to exist: we must guarantee the type used to
deserialize is the one used to serialize. But once deserialized, we don't care
if the type has changed, since the value should still be valid for that new type.
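To illustrate why the on-disk type matters, here is a simplified sketch of the two encodings. {{WidthSketch}}, {{ValueType}}, {{FIXED_4}} and {{VARIABLE}} are hypothetical names, and the 4-byte length prefix is an assumption made for simplicity; the actual 3.0 format is not reproduced here.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of fixed-width vs. length-prefixed encoding.
final class WidthSketch {

    interface ValueType {
        void write(ByteBuffer value, ByteBuffer out);
        ByteBuffer read(ByteBuffer in);
    }

    /** Fixed width: the size (4 bytes) is implied by the type, so no length is written. */
    static final ValueType FIXED_4 = new ValueType() {
        public void write(ByteBuffer value, ByteBuffer out) {
            out.put(value.duplicate()); // duplicate() leaves the caller's position untouched
        }
        public ByteBuffer read(ByteBuffer in) {
            byte[] bytes = new byte[4];
            in.get(bytes);
            return ByteBuffer.wrap(bytes);
        }
    };

    /** Variable width: a length prefix (a plain int in this sketch) precedes the bytes. */
    static final ValueType VARIABLE = new ValueType() {
        public void write(ByteBuffer value, ByteBuffer out) {
            out.putInt(value.remaining());
            out.put(value.duplicate());
        }
        public ByteBuffer read(ByteBuffer in) {
            byte[] bytes = new byte[in.getInt()];
            in.get(bytes);
            return ByteBuffer.wrap(bytes);
        }
    };
}
```

The same 4-byte value occupies 4 bytes under {{FIXED_4}} and 8 bytes under {{VARIABLE}} in this sketch, so reading with the wrong one misinterprets the stream even though the value bytes themselves are identical.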
> Avoid always looking up column type
> -----------------------------------
>
> Key: CASSANDRA-10309
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10309
> Project: Cassandra
> Issue Type: Improvement
> Reporter: T Jake Luciani
> Priority: Minor
> Fix For: 3.x
>
>
> Doing some read profiling I noticed we always seem to look up the type of a
> column from the schema metadata when we have the type already in the column
> class.
> This one simple change to SerializationHeader improves read performance
> non-trivially.
> https://github.com/tjake/cassandra/commit/69b94c389b3f36aa035ac4619fd22d1f62ea80b2
> http://cstar.datastax.com/graph?stats=3fb1ced4-58c7-11e5-9faf-42010af0688f&metric=op_rate&operation=2_read&smoothing=1&show_aggregates=true&xmin=0&xmax=357.94&ymin=0&ymax=157416.6
> I assume we are looking this up to deal with schema changes. But I'm sure
> there is a more performant way of doing this.