I had the same idea: write the size of each message first, then the message itself. Here is a simple reader and writer in Python built on that idea. Note that I assume the message size fits in a 4-byte unsigned integer, so a message longer than 2^32 - 1 bytes (about 4 GB) won't work. I am relying on the assumption that messages aren't that large. If they are, you probably need a different format for storing the data anyway (for example, breaking it up into many protocol buffers).
Comments welcome.

Mark

    import struct


    class ProtocolBufferFileReader:
        """Iterates over size-prefixed messages stored in a file."""

        def __init__(self, input_filename, message_constructor):
            self.file = open(input_filename, 'rb')
            self.message_constructor = message_constructor

        def next(self):
            header = self.file.read(4)
            if len(header) == 0:
                raise StopIteration  # clean end of file
            if len(header) < 4:
                raise IOError('truncated size prefix')
            # '<I' pins the prefix to 4 bytes, little-endian; plain 'I'
            # would use the machine's native byte order.
            size = struct.unpack('<I', header)[0]
            data = self.file.read(size)
            if len(data) < size:
                raise IOError('truncated message')
            message = self.message_constructor()
            message.MergeFromString(data)
            return message

        __next__ = next  # Python 3 spelling of the iterator method

        def __iter__(self):
            return self

        def close(self):
            self.file.close()


    class ProtocolBufferFileWriter:
        """Writes each message as a 4-byte size followed by its bytes."""

        def __init__(self, output_filename):
            self.file = open(output_filename, 'wb')

        def write(self, message):
            # Serialize once and take the size from the result.
            string_to_write = message.SerializeToString()
            self.file.write(struct.pack('<I', len(string_to_write)))
            self.file.write(string_to_write)

        def flush(self):
            self.file.flush()

        def close(self):
            self.file.close()
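
P.S. For anyone who wants to check the size-prefix framing without a compiled .proto at hand, the same idea can be exercised on plain byte strings in memory. This is only a sketch under my own assumptions: write_frame and read_frames are made-up helper names, not part of the protobuf library, and the '<I' (4-byte, little-endian, unsigned) prefix is a choice I made here for portability.

```python
import io
import struct

# Assumed prefix layout: 4-byte unsigned integer, little-endian.
PREFIX = struct.Struct('<I')


def write_frame(stream, payload):
    """Write one record: its length, then the payload bytes."""
    stream.write(PREFIX.pack(len(payload)))
    stream.write(payload)


def read_frames(stream):
    """Yield each payload until the stream ends cleanly."""
    while True:
        header = stream.read(PREFIX.size)
        if not header:
            return  # clean end of stream, between two records
        if len(header) < PREFIX.size:
            raise EOFError('truncated length prefix')
        (size,) = PREFIX.unpack(header)
        payload = stream.read(size)
        if len(payload) < size:
            raise EOFError('truncated payload')
        yield payload


# Round trip through an in-memory buffer, including an empty record.
buffer = io.BytesIO()
for record in (b'first', b'', b'third record'):
    write_frame(buffer, record)
buffer.seek(0)
records = list(read_frames(buffer))
```

With protocol buffers, the payload passed to write_frame would be the result of message.SerializeToString(), and each payload coming out of read_frames would be handed to ParseFromString on a fresh message object.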