We are using std::copy to cast the values on the write side (from
int16_t to int32_t for storage in Parquet):

https://github.com/apache/parquet-cpp/blob/master/src/parquet/arrow/writer.cc#L336

and then casting back on read.

On Wed, Jul 26, 2017 at 2:28 PM, Felipe Aramburu <[email protected]> wrote:
> How does this work when you are trying to move from a representation like
> int32 to int16? reinterpret_cast can exhibit undefined behavior if you are
> trying to cast between types that have different sizes. Should I just get
> the first 4 bytes and handle it manually or is there a more concise way to
> do that?
>
> On Wed, Jul 26, 2017 at 12:20 PM, Felipe Aramburu <[email protected]>
> wrote:
>
>> perfect, that's what I was hoping for :)
>>
>> On Wed, Jul 26, 2017 at 11:33 AM, Wes McKinney <[email protected]>
>> wrote:
>>
>>> hi Felipe,
>>>
>>> In C++ it is the equivalent of
>>>
>>> uint64_t val = ...;
>>> int64_t encoded_val = *reinterpret_cast<int64_t*>(&val);
>>>
>>> So no alteration of the bit pattern
>>>
>>> - Wes
>>>
>>> On Wed, Jul 26, 2017 at 12:18 PM, Felipe Aramburu <[email protected]>
>>> wrote:
>>> > https://github.com/Parquet/parquet-format/blob/master/src/thrift/parquet.thrift
>>> >
>>> >
>>> > This file doesn't really specify how to interpret an unsigned type
>>> > stored in a signed type.
>>> >
>>> > So if I make a UINT64 as my logical type but it's being stored as an
>>> > int64, are you shifting the value or are you storing the BYTE
>>> > representation of the UINT64 inside of an int64, or is it something else?
>>> >
>>> > I can't seem to find the code that actually converts from the physical
>>> > types to the logical types which would also help explain how this
>>> > happens.
>>> >
>>> > Felipe
>>> >
>>
>>
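
For reference, the two cases in this thread can be sketched roughly as below. This is only an illustration, not the parquet-cpp code: the function names (WidenForStorage, NarrowOnRead, EncodeUnsigned, DecodeUnsigned) are hypothetical, and std::memcpy is swapped in for the reinterpret_cast shown in the quoted message as a conservative way to copy the bits unchanged. When the sizes differ (int16 vs. int32) the conversion is done element by element on the values, so no reinterpret_cast is involved.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>
#include <vector>

// Write side: widen each int16_t to int32_t. std::copy converts element by
// element, so every value (including negatives) is preserved.
std::vector<int32_t> WidenForStorage(const std::vector<int16_t>& in) {
  std::vector<int32_t> out(in.size());
  std::copy(in.begin(), in.end(), out.begin());
  return out;
}

// Read side: narrow each int32_t back to int16_t. This assumes the stored
// values originally came from int16_t, so they are all in range.
std::vector<int16_t> NarrowOnRead(const std::vector<int32_t>& in) {
  std::vector<int16_t> out(in.size());
  std::transform(in.begin(), in.end(), out.begin(),
                 [](int32_t v) { return static_cast<int16_t>(v); });
  return out;
}

// Same-width case: store a uint64_t in an int64_t without altering the bits.
int64_t EncodeUnsigned(uint64_t val) {
  int64_t encoded;
  std::memcpy(&encoded, &val, sizeof(encoded));
  return encoded;
}

// Inverse of EncodeUnsigned: recover the uint64_t from the stored int64_t.
uint64_t DecodeUnsigned(int64_t encoded) {
  uint64_t val;
  std::memcpy(&val, &encoded, sizeof(val));
  return val;
}
```

A round trip such as NarrowOnRead(WidenForStorage(values)) should give back the original int16 values, and DecodeUnsigned(EncodeUnsigned(x)) should give back x for any uint64_t.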
