We use std::copy to convert the values element by element on the write
side (from int16_t to int32_t for storage in Parquet), so no
reinterpret_cast between differently sized types is involved:

https://github.com/apache/parquet-cpp/blob/master/src/parquet/arrow/writer.cc#L336

and then casting back to the narrower type on read.
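
As a rough illustration of that round trip (not the actual parquet-cpp
code), here is a minimal sketch. The helper names WidenToInt32 and
NarrowToInt16 are made up for this example, and using std::transform
with a range check on the read side is my own assumption about how one
might narrow safely:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper: widen int16_t values to the int32_t physical type.
// std::copy assigns element by element, so each value is converted
// (value-preserving), not reinterpreted byte-wise.
std::vector<int32_t> WidenToInt32(const std::vector<int16_t>& in) {
  std::vector<int32_t> out(in.size());
  std::copy(in.begin(), in.end(), out.begin());
  return out;
}

// Hypothetical helper: narrow int32_t values back to int16_t on read.
// The assert documents the assumption that every stored value came from
// an int16_t originally and therefore fits in the narrower type.
std::vector<int16_t> NarrowToInt16(const std::vector<int32_t>& in) {
  std::vector<int16_t> out(in.size());
  std::transform(in.begin(), in.end(), out.begin(), [](int32_t v) {
    assert(v >= std::numeric_limits<int16_t>::min() &&
           v <= std::numeric_limits<int16_t>::max());
    return static_cast<int16_t>(v);
  });
  return out;
}
```

Because each element is converted by value, the sizes of the two types
never have to match, which is why this avoids the reinterpret_cast
problem you describe.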

On Wed, Jul 26, 2017 at 2:28 PM, Felipe Aramburu <[email protected]> wrote:
> How does this work when you are trying to move from a representation like
> int32 to int16? reinterpret_cast can exhibit undefined behavior if you are
> trying to cast between types that have different sizes. Should I just get
> the first 4 bytes and handle it manually or is there a more concise way to
> do that?
>
> On Wed, Jul 26, 2017 at 12:20 PM, Felipe Aramburu <[email protected]>
> wrote:
>
>> perfect, that's what I was hoping for :)
>>
>> On Wed, Jul 26, 2017 at 11:33 AM, Wes McKinney <[email protected]>
>> wrote:
>>
>>> hi Felipe,
>>>
>>> In C++ it is the equivalent of
>>>
>>> uint64_t val = ...;
>>> int64_t encoded_val = *reinterpret_cast<int64_t*>(&val);
>>>
>>> So no alteration of the bit pattern
>>>
>>> - Wes
>>>
>>> On Wed, Jul 26, 2017 at 12:18 PM, Felipe Aramburu <[email protected]>
>>> wrote:
>>> > https://github.com/Parquet/parquet-format/blob/master/src/thrift/parquet.thrift
>>> >
>>> >
>>> > This file doesn't really specify how to interpret an unsigned type
>>> > stored in a signed type.
>>> >
>>> > So if I make a UINT64 my logical type but it's being stored as an
>>> > int64, are you shifting the value, or are you storing the byte
>>> > representation of the UINT64 inside of an int64, or is it something
>>> > else?
>>> >
>>> > I can't seem to find the code that actually converts from the physical
>>> > types to the logical types which would also help explain how this
>>> > happens.
>>> >
>>> > Felipe
>>>
>>
>>
