GitHub user ilsley added a comment to the discussion: Seek implementation has unexpected behaviour
Thank you for considering this. I have a summary paragraph at the end, but here is some detail should you find it useful.

I am parsing selected subsets of large binary files using binrw. The binary file has parts of variable size, which refer to other parts of the file, so I need to parse it incrementally: read in a chunk (with a minimum size), parse it (binrw needs Read and Seek), and see how much more is needed. If I have enough, I parse again; otherwise, I fetch more, add it to the existing buffer, and then parse. Throughout, I need to keep track of my position in the file, since parts of the file refer to other parts by position.

More specifically:

- My previous implementation used a Cursor of Bytes. Before appending newly fetched Bytes, I would extract the position from the Cursor before creating a new Bytes and Cursor, and would update my offset in the file accordingly (because I discard the already parsed Bytes). I naively did the same when I tried Buffer, and it worked on most of the files I tested. However, it failed on a new one recently, and it took me a bit of time to find the bug, which was caused by needing to fetch more data plus my assumption that Seek would not change the length of the Buffer.
- I can fix this error, but binrw does have directives that rely on Seek, so I would need to check what assumptions binrw makes regarding Seek if I go down this path (which does not seem necessary).

To be clear, I like the flexibility of Buffer and its overall approach, but:

a) I wanted to point out this aspect, which was unexpected, at least for me.
b) I wondered whether it might be useful to demarcate different traits, e.g. using the approach of bytes Buf with the reader method (e.g. https://docs.rs/bytes/latest/bytes/buf/struct.Reader.html), which creates a Reader that can be returned to the underlying type (into_inner) when necessary.
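For reference, the Cursor-based bookkeeping from the first point above can be sketched roughly as follows. This is a minimal illustration, not the actual code: it uses a std `Vec<u8>` in place of Bytes so that it needs no external crates, and `Window`, `append`, `file_pos` and `demo` are hypothetical names chosen for the example.

```rust
use std::io::{Cursor, Read, Seek, SeekFrom};

/// Hypothetical helper: a window over part of a larger file.
struct Window {
    /// Absolute file offset of the first byte held in `cursor`.
    base: u64,
    /// Unparsed bytes plus the read position within them.
    cursor: Cursor<Vec<u8>>,
}

impl Window {
    fn new(base: u64, data: Vec<u8>) -> Self {
        Window { base, cursor: Cursor::new(data) }
    }

    /// Absolute position in the file of the next byte to be read.
    fn file_pos(&self) -> u64 {
        self.base + self.cursor.position()
    }

    /// Discard the bytes before the cursor position, advance `base`
    /// by the same amount, and append the newly fetched bytes.
    fn append(&mut self, fetched: &[u8]) {
        let buf_len = self.cursor.get_ref().len();
        let consumed = (self.cursor.position() as usize).min(buf_len);
        let mut rest = self.cursor.get_ref()[consumed..].to_vec();
        rest.extend_from_slice(fetched);
        self.base += consumed as u64;
        self.cursor = Cursor::new(rest);
    }
}

fn demo() -> (u64, u64, usize, [u8; 2]) {
    // Pretend the file holds the bytes 0, 1, 2, ... and the first fetch
    // returned the first four of them.
    let mut w = Window::new(0, vec![0, 1, 2, 3]);
    let mut head = [0u8; 3];
    w.cursor.read_exact(&mut head).unwrap(); // "parse" three bytes

    // Seeking a Cursor only moves the position; the buffer keeps its length.
    w.cursor.seek(SeekFrom::Start(1)).unwrap();
    let len_after_seek = w.cursor.get_ref().len();
    w.cursor.seek(SeekFrom::Start(3)).unwrap();

    let before = w.file_pos();
    w.append(&[4, 5, 6, 7]); // second fetch
    let after = w.file_pos();

    let mut next = [0u8; 2];
    w.cursor.read_exact(&mut next).unwrap();
    (before, after, len_after_seek, next)
}

fn main() {
    let (before, after, len_after_seek, next) = demo();
    println!("{before} {after} {len_after_seek} {next:?}");
}
```

The bookkeeping in `append` is only correct if seeking leaves the buffer length unchanged, which holds for a Cursor; that is the assumption that did not carry over to Buffer.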
I can see the value of implementing BufRead on Buffer directly, so I mention the Reader approach more as an example for future traits that you might provide, e.g. a "non-consuming" Read & Seek, should that be more generally useful. For my purposes, I think the consume method of BufRead is clearer than a forward-only Seek.

So tl;dr: I don't think the current implementation of Seek provides extra functionality for my use case over what's provided by BufRead. I suppose I should create a contiguous Bytes and use a Cursor or similar, and it's likely that the performance of this will be good.

GitHub link: https://github.com/apache/opendal/discussions/7113#discussioncomment-15376632
