Re: [protobuf] Finding "ByteSize" of collection of messages

2014-01-09 Thread jonathan . wolk
Sorry, I deleted this post a while back. But your answer is appreciated. 
 

-- 
You received this message because you are subscribed to the Google Groups 
"Protocol Buffers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to protobuf+unsubscr...@googlegroups.com.
To post to this group, send email to protobuf@googlegroups.com.
Visit this group at http://groups.google.com/group/protobuf.
For more options, visit https://groups.google.com/groups/opt_out.


Re: [protobuf] Finding "ByteSize" of collection of messages

2014-01-09 Thread Feng Xiao
On Fri, Jan 3, 2014 at 12:59 PM,  wrote:

> Sorry if this has been covered before. I searched but couldn't find a
> complete answer (or at least what I thought was complete). I'll give a
> little background and some small example code as I believe it will help.
> I'm using C++ and optimizing my protocol buffer messages for LITE_RUNTIME.
>
> I have a class called Resource and it currently looks something like this
>
> class Resource
> {
> public:
> // other stuff that isn't necessary for this example
> ::google::protobuf::MessageLite* createNewProtoBuff() const;
>
> // More stuff
> };
>
> As you can see the Resource class has a function which creates a new
> protocol buffer "message" in memory.
>
> I also have a class called ResourceMap which serves as a collection of
> Resources. A ResourceMap can hold any number of Resources, and I don't
> want to exceed the 1 MB "limit" (not a hard limit, I know) for a single
> message. So when I serialize a ResourceMap out to a file, I first write a
> "header" protocol buffer message and then each Resource's protocol buffer
> message.
>
> The header looks like this:
>
> message ResourceMapHeaderPB {
>   message ResourceEntryPB {
>     optional string resource_name = 1;
>     // Used if we want to deserialize a single Resource in a ResourceMap
>     // and not all
>     optional uint32 resource_byte_offset_from_header = 2;
>     // Used to set CodedInputStream limit upon deserialization
>     optional uint32 resource_size_bytes = 3;
>   }
>   repeated ResourceEntryPB resource_entry = 1;
> }
>
> The serialization to file code for ResourceMap looks like this:
>
>
> // Loop over all Resources in the ResourceMap, creating a MessageLite* from
> // each resource
> for ( iter = allResources.begin(); iter != allResources.end(); ++iter )
> {
>     // Assumed step, not shown in the original snippet: build the message
>     // for this resource and keep it for the write loop below
>     resourcePB = iter->createNewProtoBuff();
>     resourcePBs.push_back(resourcePB);
>
>     // Create new entry in the header
>     entryPB = headerPB->add_resource_entry();
>     entryPB->set_resource_name(iter->Name);
>     entryPB->set_resource_size_bytes(resourcePB->ByteSize());
>     entryPB->set_resource_byte_offset_from_header(currentByteOffsetFromHeader);
>     currentByteOffsetFromHeader += resourcePB->ByteSize();
> }
>
> // Once all MessageLite* are constructed, write header and messages to
> // CodedOutputStream
> codedOut->WriteLittleEndian32(headerPB->ByteSize());
> headerPB->SerializeToCodedStream(codedOut);
> for ( auto iter = resourcePBs.begin(); resourcePBs.end() != iter; ++iter )
> {
>     (*iter)->SerializeToCodedStream(codedOut);
> }
>
> As you can see, the header needs to know the size of the Message in bytes
> to support a feature (deserializing a single resource instead of all in a
> given file).
>
> Right now, the code works fine and is great. But, there is one problem, *it
> is possible for the Messages created from a Resource to be > 1 MB*.
>
> In this scenario, where I don't want a single MessageLite to be > 1 MB,
> what can I do?
>
> I was thinking of making a ResourceSerialization class that would be
> something like
>
> class ResourceSerialization
> {
> public:
>
> uint32 GetSerializationSizeBytes();
>
> void SerializeToCodedStream(CodedOutputStream* codedOut);
>
> };
>
> For Resources which could serialize to a single message under 1 MB, the
> size function above could just return the size of a wrapped MessageLite.
> In other scenarios, the ResourceSerialization might write out the size of
> a message, then the message, then the size of the next message, then the
> next message, and so on (as described in the Google protocol buffer
> documentation under "Techniques"). In that case, how would I find the
> "total" size of what would be serialized? Is there a way to know the size
> that would be written by CodedOutputStream::WriteVarint32(uint32 value)
> with a particular value?
>
CodedOutputStream::VarintSize32(uint32 value)?
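
A minimal sketch of how that could feed into a "total size" calculation, assuming each sub-message is written as a varint length followed by its bytes. To keep it standalone it does not link against protobuf: VarintLength32 is a hypothetical stand-in that mirrors what CodedOutputStream::VarintSize32 reports, and DelimitedTotalSize and chunk_sizes are illustrative names, not part of the library. In real code you would call VarintSize32 directly and take each size from ByteSize().

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes needed to encode `value` as a base-128 varint: each byte carries
// 7 payload bits, so values up to 127 take 1 byte and any uint32 fits in 5.
// Hypothetical stand-in for CodedOutputStream::VarintSize32.
inline std::size_t VarintLength32(std::uint32_t value) {
  std::size_t bytes = 1;
  while (value >= 0x80) {
    value >>= 7;
    ++bytes;
  }
  return bytes;
}

// Total bytes produced when every chunk is written as
// [varint length][payload], i.e. WriteVarint32(size) followed by the message.
// `chunk_sizes` is assumed to hold each sub-message's ByteSize().
inline std::uint64_t DelimitedTotalSize(
    const std::vector<std::uint32_t>& chunk_sizes) {
  std::uint64_t total = 0;
  for (std::uint32_t size : chunk_sizes) {
    total += VarintLength32(size) + size;
  }
  return total;
}
```

GetSerializationSizeBytes() could then return something like DelimitedTotalSize(sizes). The sum is kept in 64 bits here, so it would need a range check before being stored in a uint32 header field such as resource_size_bytes.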


>
> Sorry if this is all convoluted and confusing. I can clarify as needed.
> Thanks ahead of time!
>
> -Jonathan
