RussellSpitzer commented on PR #13880: URL: https://github.com/apache/iceberg/pull/13880#issuecomment-3228453598
> Why did the type of the vector change from IntVectors to BaseVarWidthVectors?

The vector type changes because dictionary-encoded pages are a sequence of ints, {1, 2, 3, 4}, that refer to entries in the dictionary, which maps each int to the actual column value: {1: "foo", 2: "bar", ...}. Other pages store literal representations of the values as binary: {foo, bar, bazz}. So you have to switch vector types when you alternate between the two kinds of page.

> If we clear out "this.vec" if it is set, wouldn't this type change in the vector cause problems? Shouldn't we explicitly close the `this.vec` if it is not null, before setting it to a new vector?

No. To be clear, the code has *always* cleared out `this.vec`, and we don't have correctness issues because essentially what is happening is:

1. Reader looks to see if it can read the page
2. If it can't re-use the container, do an allocate for the correct container

What is missing here is

2.a If I previously had a container but it cannot be re-used, clear it
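To make the steps above concrete, here is a minimal, self-contained sketch of the reuse-or-allocate flow. It does not use the Arrow or Iceberg classes: `Container`, `PageKind`, `Reader`, and `leakedAfter` are hypothetical stand-ins, and the sketch assumes that "clear it" in step 2.a means releasing the old container's memory before the reference is overwritten.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-ins for illustration only; not the Arrow or Iceberg API. */
class ContainerReuseSketch {
  /** Number of containers allocated but not yet closed. */
  static final AtomicInteger OPEN = new AtomicInteger();

  /** The two page layouts described above: dictionary ids vs. literal binary values. */
  enum PageKind { DICTIONARY_IDS, PLAIN_BINARY }

  /** Stand-in for a vector that holds memory until close() is called. */
  static final class Container implements AutoCloseable {
    final PageKind kind;
    private boolean closed;

    Container(PageKind kind) {
      this.kind = kind;
      OPEN.incrementAndGet();
    }

    @Override
    public void close() {
      if (!closed) {
        closed = true;
        OPEN.decrementAndGet();
      }
    }
  }

  static final class Reader {
    private Container vec; // plays the role of this.vec
    private final boolean releasePrevious; // toggles step 2.a

    Reader(boolean releasePrevious) {
      this.releasePrevious = releasePrevious;
    }

    Container prepare(PageKind kind) {
      // Step 1: check whether the current container fits this page.
      if (vec != null && vec.kind == kind) {
        return vec;
      }
      // Step 2.a: a previous container exists but cannot be re-used, so release it.
      if (vec != null && releasePrevious) {
        vec.close();
      }
      // Step 2: allocate the correct container for this page.
      vec = new Container(kind);
      return vec;
    }

    void close() {
      if (vec != null) {
        vec.close();
        vec = null;
      }
    }
  }

  /** Reads the given pages in order and returns how many containers were never closed. */
  static int leakedAfter(boolean releasePrevious, PageKind... pages) {
    OPEN.set(0);
    Reader reader = new Reader(releasePrevious);
    for (PageKind page : pages) {
      reader.prepare(page);
    }
    reader.close();
    return OPEN.get();
  }

  public static void main(String[] args) {
    PageKind[] pages = {
      PageKind.DICTIONARY_IDS, PageKind.PLAIN_BINARY, PageKind.PLAIN_BINARY, PageKind.DICTIONARY_IDS
    };
    System.out.println("without 2.a: " + leakedAfter(false, pages));
    System.out.println("with 2.a: " + leakedAfter(true, pages));
  }
}
```

In this toy model every page is still read into a container of the right kind either way, which matches the point that correctness is unaffected. The only difference is that, without step 2.a, each switch between page kinds leaves the previous container unreleased.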