[GitHub] drill pull request #697: DRILL-5097: Using store.parquet.reader.int96_as_tim...
Github user vdiravka commented on a diff in the pull request: https://github.com/apache/drill/pull/697#discussion_r96213729 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/NullableFixedByteAlignedReaders.java --- @@ -132,6 +137,9 @@ protected void readField(long recordsToReadInThisPass) { valueVec.getMutator().setSafe(valuesReadInCurrentPass + i, getDateTimeValueFromBinary(binaryTimeStampValue)); } } + // The nanos precision is cut to millis. Therefore the length of single timestamp value is 8 bytes(s) + // instead of 12 byte(s). + dataTypeLengthInBits = timestampLengthInBits; --- End diff -- Yes, it is. Because when [`PARQUET_READER_INT96_AS_TIMESTAMP` is set to false]( https://github.com/apache/drill/blob/83513daf0903e0d94fcaad7b1ae4e8ad6272b494/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/ColumnReaderFactory.java#L246) `NullableFixedBinaryAsTimeStampReader` will not used. Instead of that `NullableFixedBinaryReader` will used. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] drill pull request #697: DRILL-5097: Using store.parquet.reader.int96_as_tim...
Github user vdiravka commented on a diff in the pull request: https://github.com/apache/drill/pull/697#discussion_r96212464 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/NullableFixedByteAlignedReaders.java --- @@ -110,9 +110,14 @@ protected void readField(long recordsToReadInThisPass) { /** * Class for reading parquet fixed binary type INT96, which is used for storing hive, - * impala timestamp values with nanoseconds precision. So it reads such values as a drill timestamp. + * impala timestamp values with nanoseconds precision (12 bytes). So it reads such values as a drill timestamp (8 bytes). */ static class NullableFixedBinaryAsTimeStampReader extends NullableFixedByteAlignedReader { +/** + * The width of each element of the TimeStampVector is 8 byte(s). + */ +private static final int timestampLengthInBits = 64; --- End diff -- Agree. Done. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] drill pull request #697: DRILL-5097: Using store.parquet.reader.int96_as_tim...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/697#discussion_r94350967 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/NullableFixedByteAlignedReaders.java --- @@ -132,6 +137,9 @@ protected void readField(long recordsToReadInThisPass) { valueVec.getMutator().setSafe(valuesReadInCurrentPass + i, getDateTimeValueFromBinary(binaryTimeStampValue)); } } + // The nanos precision is cut to millis. Therefore the length of single timestamp value is 8 bytes(s) + // instead of 12 byte(s). + dataTypeLengthInBits = timestampLengthInBits; --- End diff -- Just trying to understand this better: Is it OK to set the length to 64 bits even if PARQUET_READER_INT96_AS_TIMESTAMP is set to false? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] drill pull request #697: DRILL-5097: Using store.parquet.reader.int96_as_tim...
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/697#discussion_r94350780 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/NullableFixedByteAlignedReaders.java --- @@ -110,9 +110,14 @@ protected void readField(long recordsToReadInThisPass) { /** * Class for reading parquet fixed binary type INT96, which is used for storing hive, - * impala timestamp values with nanoseconds precision. So it reads such values as a drill timestamp. + * impala timestamp values with nanoseconds precision (12 bytes). So it reads such values as a drill timestamp (8 bytes). */ static class NullableFixedBinaryAsTimeStampReader extends NullableFixedByteAlignedReader { +/** + * The width of each element of the TimeStampVector is 8 byte(s). + */ +private static final int timestampLengthInBits = 64; --- End diff -- I would recommend the use of uppercase in naming constants. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] drill pull request #697: DRILL-5097: Using store.parquet.reader.int96_as_tim...
GitHub user vdiravka opened a pull request: https://github.com/apache/drill/pull/697 DRILL-5097: Using store.parquet.reader.int96_as_timestamp gives IOOB whereas convert_from works When the int96 value is converted into timestamp (long in java) we cut the nanos precision to millis. But need to change dataTypeLengthInBits of columnReader (while converting parquet fixed binary type INT96 into drill TimeStamp) from 12 byte(s) to 8 byte(s) as well. You can merge this pull request into a Git repository by running: $ git pull https://github.com/vdiravka/drill DRILL-5097 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/697.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #697 commit ab6bad27469234587a059cccd7b5497279828450 Author: Vitalii DiravkaDate: 2016-12-14T16:24:08Z DRILL-5097: Using store.parquet.reader.int96_as_timestamp gives IOOB whereas convert_from works --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---