[ 
https://issues.apache.org/jira/browse/CASSANDRA-8959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358508#comment-14358508
 ] 

Benedict commented on CASSANDRA-8959:
-------------------------------------

bq. and what Aleksey describes is pretty close to things that CASSANDRA-8099 
currently does as it happens

Yes, although it is a little closer to what I outlined in CASSANDRA-7447; they 
are all variations on the same idea. This was kind of my point: we should 
design an approach that we consider optimal, preferably abstract it, and use 
the same tool in each place, including here.

FTR, I think the encoding (for both) should include a flag in the row header 
that identifies how field presence is encoded. That encoding should be one of: 
a header bitmap of included fields, a bitmap of excluded fields, or a sequence 
of name ids (denormalised in the same manner as column names for a table). The 
same approach should be used for the table format as well. I think it should 
be based on the schema at the time of writing, not the fields present in the 
table, so that this mapping can be "interned".
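
A minimal sketch of the flagged encoding described above, to make the three 
options concrete. Everything named here is hypothetical: the flag values, the 
function names, and the rule for choosing between encodings are assumptions 
for illustration, not anything from the Cassandra code base. Field ids are 
positions in the schema as it stood when the row was written, and the sketch 
assumes fewer than 256 fields. With fixed-width bitmaps the inclusion and 
exclusion forms are the same size, so picking the one with fewer set bits 
would only pay off if the bitmap were compressed further.

```python
from typing import Iterable, List

# Hypothetical flag values stored in the row header.
INCLUDE_BITMAP, EXCLUDE_BITMAP, ID_LIST = 0, 1, 2


def bitmap(ids: Iterable[int], schema_size: int) -> bytes:
    """Fixed-width bitmap with one bit per schema position."""
    out = bytearray((schema_size + 7) // 8)
    for i in ids:
        out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def encode_presence(present: Iterable[int], schema_size: int) -> bytes:
    """Return a flag byte followed by the chosen presence encoding."""
    present = sorted(present)
    # Assumption: fewer than 256 fields, so count and ids fit in one byte each.
    id_list = bytes([len(present)]) + bytes(present)
    if len(id_list) < (schema_size + 7) // 8:
        return bytes([ID_LIST]) + id_list
    present_set = set(present)
    absent = [i for i in range(schema_size) if i not in present_set]
    if len(absent) < len(present):
        return bytes([EXCLUDE_BITMAP]) + bitmap(absent, schema_size)
    return bytes([INCLUDE_BITMAP]) + bitmap(present, schema_size)


def decode_presence(data: bytes, schema_size: int) -> List[int]:
    """Recover the sorted list of present schema positions."""
    kind, payload = data[0], data[1:]
    if kind == ID_LIST:
        count = payload[0]
        return list(payload[1:1 + count])
    marked = [i for i in range(schema_size) if (payload[i // 8] >> (i % 8)) & 1]
    if kind == INCLUDE_BITMAP:
        return marked
    marked_set = set(marked)
    return [i for i in range(schema_size) if i not in marked_set]
```

For a 64-field schema with only field 3 present, this sketch picks the id 
list (3 bytes in total, against 9 for either bitmap form).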

> More efficient frozen UDT and tuple serialization format
> --------------------------------------------------------
>
>                 Key: CASSANDRA-8959
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8959
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Aleksey Yeschenko
>              Labels: performance
>             Fix For: 3.1
>
>
> The current serialization format for UDTs has a fixed overhead of 4 bytes per 
> defined field (encoding the size of the field).
> It is inefficient for sparse UDTs - ones with many defined fields, but few of 
> them present. We could keep a bitset to indicate the missing fields, if any.
> It's also sub-optimal for encoding UDTs with all the values present. We 
> could use varint encoding for the field sizes of blob/text fields and encode 
> fixed-size types directly, without the 4-byte size prefix.
> That or something more brilliant. Any improvement right now is low-hanging fruit.
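
To illustrate the overhead the description refers to, here is a rough size 
comparison. It is a sketch under stated assumptions: the function names are 
hypothetical, a missing field is modelled as `None`, the proposed layout is 
taken to be a one-bit-per-field bitset plus the values, and the varint is a 
generic 7-bits-per-byte unsigned encoding rather than any particular 
Cassandra format.

```python
from typing import List, Optional


def unsigned_varint_len(n: int) -> int:
    """Bytes needed for a generic unsigned varint (7 payload bits per byte)."""
    size = 1
    while n >= 0x80:
        n >>= 7
        size += 1
    return size


def current_size(values: List[Optional[bytes]]) -> int:
    """4-byte size per defined field, whether or not the field is present."""
    return sum(4 + (len(v) if v is not None else 0) for v in values)


def proposed_size(values: List[Optional[bytes]], fixed_width: List[bool]) -> int:
    """Presence bitset, varint sizes for variable-width fields, raw fixed-width fields."""
    size = (len(values) + 7) // 8  # one bit per defined field
    for value, is_fixed in zip(values, fixed_width):
        if value is None:
            continue  # absent fields cost only their bit in the bitset
        if not is_fixed:
            size += unsigned_varint_len(len(value))
        size += len(value)
    return size


# Sparse case: 20 defined fields, a single 4-byte int present.
sparse = [b"\x00\x00\x00\x2a"] + [None] * 19
sparse_fixed = [True] * 20

# Dense case: int, text, bigint, all present.
dense = [b"\x00\x00\x00\x01", b"hello", b"\x00" * 8]
dense_fixed = [True, False, True]

print(current_size(sparse), proposed_size(sparse, sparse_fixed))  # 84 7
print(current_size(dense), proposed_size(dense, dense_fixed))     # 29 19
```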



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
