I had the same idea (writing the size of the message first and then
the message). Here is a simple reader and writer for Python based on
this idea. Note that I assume the message size fits in an unsigned
4-byte integer, so messages of 4 GiB or more won't work. I am
relying on the assumption that messages aren't that large. If they
are, you probably need a different format for storing the data
anyway (for example, breaking it up into many protocol buffers).

Comments welcome.

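import struct

class ProtocolBufferFileReader:
        def __init__(self, input_filename, message_constructor):
                self.file = open(input_filename, 'rb')
                self.message_constructor = message_constructor

        def next(self):
                # Each record is a 4-byte unsigned size followed by the message.
                read_byte = self.file.read(4)
                if len(read_byte) == 0:
                        # End of file: no more records.
                        raise StopIteration
                # 'I' is a native-order unsigned int (4 bytes on common platforms).
                size = struct.unpack('I', read_byte)[0]

                message = self.message_constructor()
                message.MergeFromString(self.file.read(size))

                return message

        __next__ = next  # Python 3 iterator protocol

        def __iter__(self):
                return self

        def close(self):
                self.file.close()

class ProtocolBufferFileWriter:
        def __init__(self, output_filename):
                self.file = open(output_filename, 'wb')

        def write(self, message):
                string_to_write = message.SerializeToString()
                size = struct.pack('I', message.ByteSize())
                # Size prefix first, then the serialized message.
                self.file.write(size)
                self.file.write(string_to_write)

        def flush(self):
                self.file.flush()

        def close(self):
                self.file.close()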
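
If you want to try the framing on its own, without the protobuf
library, here is a minimal sketch that uses plain byte strings as
payloads. The helper names write_frame and read_frames are made up
for this example and are not part of any library. It also makes two
assumptions that differ from the classes above: the size prefix is
packed as '<I' (fixed little-endian, so the file reads the same on
any machine), and a truncated record raises an error.

```python
import io
import struct

# Assumption: fixed little-endian 4-byte length prefix ('<I') so the data
# is portable across machines; the classes above use native order ('I').
HEADER = struct.Struct('<I')


def write_frame(stream, payload):
    """Write one length-prefixed record (hypothetical helper)."""
    stream.write(HEADER.pack(len(payload)))
    stream.write(payload)


def read_frames(stream):
    """Yield each payload until end of stream; raise on a truncated record."""
    while True:
        header = stream.read(HEADER.size)
        if not header:
            return  # clean end of stream
        if len(header) < HEADER.size:
            raise EOFError('truncated length prefix')
        (size,) = HEADER.unpack(header)
        payload = stream.read(size)
        if len(payload) < size:
            raise EOFError('truncated record')
        yield payload


# Round trip through an in-memory buffer instead of a real file.
buf = io.BytesIO()
for record in (b'first', b'', b'third record'):
    write_frame(buf, record)
buf.seek(0)
records = list(read_frames(buf))
```

With real messages, the payload would be message.SerializeToString()
on the writing side, and each payload from read_frames would go to
ParseFromString on a fresh message object.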

