I am trying to read chunks of a file that contains a sequence of PB blocks. Is there a way to detect where a block starts?
A little bit of context: it is a huge file (around 60 GB). The file format is a sequence of [[Block header][Block content]]. In reality it is a little more complex, but this is enough as an example. The [Block header] contains the length of the next [Block content], so the file has to be read sequentially. I wrote a Spark connector. The first version also reads the file sequentially. In the next version, I want to process the file in splits, as Spark provides them, so I will get chunks of the file. I need to find where a [Block header] starts, to be able to read sequentially from that point. So, how can I find this first block? Any ideas?
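
To make the layout concrete, here is a minimal sketch of the sequential read described above. It is not the real format: the 4-byte big-endian length header and the names HEADER, read_blocks and write_block are assumptions made up for illustration, since the actual header is more complex.

```python
import io
import struct

# Hypothetical header: a single 4-byte big-endian unsigned length.
# The real [Block header] is more complex; this is only an illustration.
HEADER = struct.Struct(">I")


def write_block(stream, content):
    """Write one [Block header][Block content] pair."""
    stream.write(HEADER.pack(len(content)))
    stream.write(content)


def read_blocks(stream):
    """Yield each block's content, reading headers one after another."""
    while True:
        header = stream.read(HEADER.size)
        if not header:
            return  # clean end of file
        if len(header) < HEADER.size:
            raise EOFError("truncated block header")
        (length,) = HEADER.unpack(header)
        content = stream.read(length)
        if len(content) < length:
            raise EOFError("truncated block content")
        yield content


# Build a small in-memory file and read it back from offset 0.
buf = io.BytesIO()
for payload in (b"first", b"", b"third block"):
    write_block(buf, payload)
buf.seek(0)
blocks = list(read_blocks(buf))
```

Starting the same loop at an arbitrary offset would treat whatever bytes happen to be there as a header, which is why each split needs a reliable way to locate a real block start.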
