[ 
https://issues.apache.org/jira/browse/THRIFT-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665323#action_12665323
 ] 

Ben Maurer commented on THRIFT-110:
-----------------------------------

I think it might be a net win to drop some of the bigger uses of the 
field-type-and-value encoding and instead reserve the upper 2-3 bits of the 
field-type for a compact field ID. The field IDs can be encoded as deltas, so 
you only have to use a larger field ID encoding if you skip 3 or 7 items 
(depending on how many bits you take). I think that in the vast majority of 
use cases this will save 1 byte per field, whereas the extra types that would 
have to be given up only save an extra byte some of the time.
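
To make the idea concrete, here is a minimal sketch of the delta scheme, not 
code from any of the attached patches. It assumes a one-byte field header with 
3 delta bits on top and 5 type bits below, a delta of 0 as the escape value, 
and a 16-bit big-endian field ID after an escaped header. The names 
write_field_header and read_field_header are hypothetical.

```python
import struct

# Assumed layout of the one-byte field header (illustrative only):
#   upper DELTA_BITS bits: field ID delta from the previous field (0 = escape)
#   lower TYPE_BITS bits:  field type
DELTA_BITS = 3
TYPE_BITS = 8 - DELTA_BITS
MAX_DELTA = (1 << DELTA_BITS) - 1   # 7 with 3 bits, 3 with 2 bits
TYPE_MASK = (1 << TYPE_BITS) - 1


def write_field_header(field_type, field_id, last_field_id):
    """Return the header bytes for one field (hypothetical helper)."""
    if not 0 <= field_type <= TYPE_MASK:
        raise ValueError("field type does not fit in the type bits")
    delta = field_id - last_field_id
    if 1 <= delta <= MAX_DELTA:
        # Common case: type and delta share a single byte.
        return bytes([(delta << TYPE_BITS) | field_type])
    # Escape: delta bits are zero and the full 16-bit field ID follows.
    return bytes([field_type]) + struct.pack(">h", field_id)


def read_field_header(buf, offset, last_field_id):
    """Return (field_type, field_id, next_offset) (hypothetical helper)."""
    head = buf[offset]
    field_type = head & TYPE_MASK
    delta = head >> TYPE_BITS
    if delta:
        return field_type, last_field_id + delta, offset + 1
    (field_id,) = struct.unpack_from(">h", buf, offset + 1)
    return field_type, field_id, offset + 3
```

With this layout, a struct whose field IDs increase by at most 7 from one 
field to the next needs only a single header byte per field, and only larger 
gaps (or out-of-order IDs) fall back to the three-byte form.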

It's a shame that, due to the API, there are redundant types (e.g., 
positive-i64 is a superset of positive-i16). Since TProtocol is only called 
from the generated code, it seems like it might be worth breaking the API to 
get a cleaner (and potentially more compact) protocol.

> A more compact format 
> ----------------------
>
>                 Key: THRIFT-110
>                 URL: https://issues.apache.org/jira/browse/THRIFT-110
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: Noble Paul
>         Attachments: compact_proto_spec.txt, compact_proto_spec.txt, 
> thrift-110-v2.patch, thrift-110-v3.patch, thrift-110-v4.patch, 
> thrift-110-v5.patch, thrift-110.patch
>
>
> Thrift is not as compact in writing out data as (say) protobuf. It does 
> not have the concept of variable-length integers or various other possible 
> optimizations. In Solr we use a lot of such optimizations to make a 
> very compact payload. Thrift has a lot in common with that format.
> It is all done in a single class:
> http://svn.apache.org/viewvc/lucene/solr/trunk/src/java/org/apache/solr/common/util/NamedListCodec.java?revision=685640&view=markup
> The other optimizations include writing type/value in the same byte, very 
> fast writes of Strings, externalizable strings, etc. 
> We could use a Thrift format for non-Java clients, and I would like to see it 
> as compact as the current Java version.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
