I am trying to read chunks of a file that contains a sequence of PB blocks. 
Is there a way to detect where a block starts?

A little bit of context:
The file is huge (around 60 GB).
The file format is a sequence of [[Block header][Block content]] pairs. In 
reality it is a little more complex, but this is enough as an example.
The [Block header] contains the length of the [Block content] that follows it.
So the file has to be read sequentially.
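
To make the sequential read concrete, here is a minimal sketch of it. The 
real header layout is not shown in this message, so the sketch assumes a 
hypothetical header that is just a 4-byte big-endian length; HEADER_SIZE, 
read_blocks and write_block are illustrative names, not part of my connector.

```python
import io
import struct

# Assumption for illustration only: the header is a 4-byte big-endian length.
HEADER_SIZE = 4


def write_block(stream, payload):
    """Append one [Block header][Block content] pair to the stream."""
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)


def read_blocks(stream):
    """Yield (offset, payload) for each block, reading strictly in order."""
    offset = 0
    while True:
        header = stream.read(HEADER_SIZE)
        if not header:
            return  # clean end of file
        if len(header) < HEADER_SIZE:
            raise ValueError(f"truncated header at offset {offset}")
        (length,) = struct.unpack(">I", header)
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError(f"truncated block at offset {offset}")
        yield offset, payload
        # The next header starts right after this block's content.
        offset += HEADER_SIZE + length


if __name__ == "__main__":
    buf = io.BytesIO()
    write_block(buf, b"abc")
    write_block(buf, b"hello")
    buf.seek(0)
    for block_offset, block in read_blocks(buf):
        print(block_offset, block)
```

With this layout, the offset of each header is only known after walking 
every block before it, which is why the file is read from the beginning.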

I wrote a Spark connector. The first version also reads the file 
sequentially.

In the next version, I want to process the file in splits, as Spark 
provides them, so I will get chunks of the file.
I need to find where a [Block header] starts within a chunk, so that I can 
read sequentially from that point.
So, how can I find this first block? Any ideas?
