[GitHub] [arrow] kiszk commented on issue #6981: PARQUET-1845: [C++] Add expected results of Int96 in big-endian

2020-04-22 Thread GitBox
kiszk commented on issue #6981: URL: https://github.com/apache/arrow/pull/6981#issuecomment-617935417 @wesm do you have any comment on this change? This is an automated message from the Apache Git Service. To respond to the m

[GitHub] [arrow] kiszk commented on issue #6981: PARQUET-1845: [C++] Add expected results of Int96 in big-endian

2020-04-21 Thread GitBox
kiszk commented on issue #6981: URL: https://github.com/apache/arrow/pull/6981#issuecomment-617369335 After the 1-day investigation, I knew the implementation looks a little complicated. Regarding encoding, `TypedBufferBuilder` is used in some test cases, but it is not in some test c

[GitHub] [arrow] kiszk commented on issue #6981: PARQUET-1845: [C++] Add expected results of Int96 in big-endian

2020-04-21 Thread GitBox
kiszk commented on issue #6981: URL: https://github.com/apache/arrow/pull/6981#issuecomment-617149858 I agree with it. Once it is stable, it looks good. Under the development, developers want the reader and writer independently. At least, for me. --

[GitHub] [arrow] kiszk commented on issue #6981: PARQUET-1845: [C++] Add expected results of Int96 in big-endian

2020-04-21 Thread GitBox
kiszk commented on issue #6981: URL: https://github.com/apache/arrow/pull/6981#issuecomment-617113951 Thank you for your suggestion. I think that these files are used only for read test now. This is an automated message from

[GitHub] [arrow] kiszk commented on issue #6981: PARQUET-1845: [C++] Add expected results of Int96 in big-endian

2020-04-21 Thread GitBox
kiszk commented on issue #6981: URL: https://github.com/apache/arrow/pull/6981#issuecomment-617087421 I think that the current test cases for parquet writer do not have tests to verify the bit pattern of the generated parquet file. I will also create the test case in another PR since they

[GitHub] [arrow] kiszk commented on issue #6981: PARQUET-1845: [C++] Add expected results of Int96 in big-endian

2020-04-20 Thread GitBox
kiszk commented on issue #6981: URL: https://github.com/apache/arrow/pull/6981#issuecomment-616759329 Thank you for staring the discussion. I will watch at the thread. Yeah, `LETypedBufferBuilder` makes sense. It looks better than adding `AppendLE`. Regarding `Serialize`, it l

[GitHub] [arrow] kiszk commented on issue #6981: PARQUET-1845: [C++] Add expected results of Int96 in big-endian

2020-04-20 Thread GitBox
kiszk commented on issue #6981: URL: https://github.com/apache/arrow/pull/6981#issuecomment-616623385 I have been thinking about place candidates of the interface between the native endian and a PARQUET little-endian. One of the good candidates is `Serialize()` in `parquet/column_wr

[GitHub] [arrow] kiszk commented on issue #6981: PARQUET-1845: [C++] Add expected results of Int96 in big-endian

2020-04-20 Thread GitBox
kiszk commented on issue #6981: URL: https://github.com/apache/arrow/pull/6981#issuecomment-616540298 Sure, at first, I derived this layout from [this method](https://github.com/apache/arrow/blob/master/cpp/src/parquet/types.h#L576). Other resources: https://github.com/apache/ar

[GitHub] [arrow] kiszk commented on issue #6981: PARQUET-1845: [C++] Add expected results of Int96 in big-endian

2020-04-20 Thread GitBox
kiszk commented on issue #6981: URL: https://github.com/apache/arrow/pull/6981#issuecomment-616507292 At first, this PR address only test cases. This PR does not address the routines (like `column_writer.h`) yet. To update test cases can clarify the specification and implementation like th

[GitHub] [arrow] kiszk commented on issue #6981: PARQUET-1845: [C++] Add expected results of Int96 in big-endian

2020-04-20 Thread GitBox
kiszk commented on issue #6981: URL: https://github.com/apache/arrow/pull/6981#issuecomment-616462606 Thank you for your clarification. Based on this, `Int96` structure in memory can be represented in native endian. When it will be written into a file, we carefully have to keep it in a lit

[GitHub] [arrow] kiszk commented on issue #6981: PARQUET-1845: [C++] Add expected results of Int96 in big-endian

2020-04-20 Thread GitBox
kiszk commented on issue #6981: URL: https://github.com/apache/arrow/pull/6981#issuecomment-616453579 I understand your point. First, I implemented the approach ` entirely little-endian`. Then, I reconsidered it. Each primitive type should be represented in a little-endian as shown