zanmato1984 commented on PR #39685: URL: https://github.com/apache/arrow/pull/39685#issuecomment-1941617195
> So, there are two things:
>
> * all offsets of variable-width types (e.g. Binary, List...) are signed in Arrow; they should not be treated as unsigned, regardless of their widths
>
> * if some other 32-bit fields might overflow because they denote a number of rows or a number of bytes, then they should be converted to 64-bit instead of pedantically insisting on unsigned 32-bit

+1

Maybe I can work out some edge test cases to identify what the blockers are if we have both signed offset types and over 2GB hash join ability.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
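As a rough illustration of the two quoted points, here is a minimal sketch in standard C++ (not code from the PR or from Arrow itself). The helper name `BuildOffsets32` is hypothetical, and value lengths are assumed to be non-negative. It keeps the offsets signed 32-bit, accumulates the running byte count in 64 bits, and reports failure once the total no longer fits in a signed 32-bit offset.

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Hypothetical helper, for illustration only (not an Arrow API).
// Builds signed 32-bit offsets from per-value byte lengths.
// Assumes every length is non-negative.
std::optional<std::vector<int32_t>> BuildOffsets32(
    const std::vector<int64_t>& lengths) {
  std::vector<int32_t> offsets;
  offsets.reserve(lengths.size() + 1);
  offsets.push_back(0);

  // The running total is a byte count, so it is kept in 64 bits.
  int64_t total = 0;
  for (int64_t len : lengths) {
    total += len;
    // Offsets are signed: the largest representable value is INT32_MAX,
    // not UINT32_MAX.
    if (total > std::numeric_limits<int32_t>::max()) {
      return std::nullopt;
    }
    offsets.push_back(static_cast<int32_t>(total));
  }
  return offsets;
}
```

With lengths `{3, 0, 5}` the sketch yields offsets `{0, 3, 3, 8}`. With lengths `{2147483647, 1}` it returns an empty optional, which is the kind of edge case a test around the 2GB boundary would need to exercise.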
