Hello,

I'm using the official protobuf C++ implementation with proto3 syntax.

My organization is interested in using protocol buffers to exchange
messages between services.  We solve physics simulation problems and deal
with a mix of structured metadata and large amounts of numerical data (on
the order of 1-10GB).

I ran some quick tests to investigate the feasibility of doing this with
protobuf, using the following message:

message ByteContainer {
  string name = 1;
  bytes payload = 2;
  string other_data = 3;
}

What I found was surprising.  Here are the timings for a bytes payload of
800 million bytes, with plain memory operations as a baseline:

   - resizing a std::vector<uint8_t> to 800,000,000 elements: 416 ms
   - memcpy of an initialized char* (named buffer) of the same size into
   that vector: 190 ms
   - byte_container.set_payload(buffer, length): 1004 ms
   - serializing the protobuf: 2000 ms
   - deserializing the protobuf: 1800 ms
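
For reference, the numbers came from a quick harness roughly like the
sketch below (simplified; warm-up and error checking are omitted, and the
generated header name assumes the message lives in byte_container.proto):

#include <chrono>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

#include "byte_container.pb.h"  // generated from the message above

using Clock = std::chrono::steady_clock;

static long ms_since(Clock::time_point start) {
  return std::chrono::duration_cast<std::chrono::milliseconds>(
             Clock::now() - start).count();
}

int main() {
  const size_t length = 800000000;
  std::vector<char> buffer(length, 'x');  // pre-initialized source buffer

  // 1. Resize a plain vector to the payload size.
  auto t = Clock::now();
  std::vector<uint8_t> dest;
  dest.resize(length);
  std::printf("resize:      %ld ms\n", ms_since(t));

  // 2. memcpy the source buffer into it.
  t = Clock::now();
  std::memcpy(dest.data(), buffer.data(), length);
  std::printf("memcpy:      %ld ms\n", ms_since(t));

  // 3. Copy the same buffer into the protobuf bytes field.
  ByteContainer byte_container;
  t = Clock::now();
  byte_container.set_payload(buffer.data(), length);
  std::printf("set_payload: %ld ms\n", ms_since(t));

  // 4. Serialize the whole message to a string.
  std::string wire;
  t = Clock::now();
  byte_container.SerializeToString(&wire);
  std::printf("serialize:   %ld ms\n", ms_since(t));

  // 5. Parse it back.
  ByteContainer round_trip;
  t = Clock::now();
  round_trip.ParseFromString(wire);
  std::printf("parse:       %ld ms\n", ms_since(t));
  return 0;
}

The absolute numbers will of course vary by machine; the gap between
set_payload/serialize and a plain memcpy is the point.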

I understand that protobuf is not intended for messages of this scale
(the documentation warns to keep messages under 1MB), and that the
library presumably uses a memory allocation strategy that is optimized
in a different direction.

I think that for bytes payloads, it is reasonable to expect performance on
the same order of magnitude as memcpy.  This is the case with Avro
(although we strongly dislike the Avro C++ API).

Is this possible to fix in the proto library?  If not for the general
'bytes' type, what about adding a field option like:

bytes payload = 2 [huge = true];
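
For what it's worth, the syntax slot already exists: something like this
could be prototyped today as a custom field option (the option name and
extension number below are placeholders I made up):

import "google/protobuf/descriptor.proto";

extend google.protobuf.FieldOptions {
  // Placeholder option: marks a field that may carry very large payloads.
  bool huge = 50001;
}

message ByteContainer {
  string name = 1;
  bytes payload = 2 [(huge) = true];
  string other_data = 3;
}

Of course, the option alone does nothing; the generated C++ code would
have to change its allocation and copying strategy for fields marked this
way.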


Thanks!

Mohamed Koubaa
Software Developer
ANSYS Inc
