[ https://issues.apache.org/jira/browse/DRILL-8423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17711761#comment-17711761 ]
Peter Franzen commented on DRILL-8423:
--------------------------------------

The problem is caused by the column values being read as 32-bit values, not 64-bit values, in
{code:java}
org.apache.drill.exec.store.parquet.columnreaders.ParquetFixedWidthDictionaryReaders.DictionaryTimeMicrosReader::readField(long)
{code}
line 171:
{code:java}
valueVec.getMutator().setSafe(valuesReadInCurrentPass + i, valReader.readInteger() / 1000);
{code}
and line 176:
{code:java}
int value = pageReader.pageData.getInt((int) readStartInBytes + i * dataTypeLengthInBytes);
{code}
The bug is also present in
{code:java}
org.apache.drill.exec.store.parquet.columnreaders.NullableFixedByteAlignedReaders.NullableDictionaryTimeMicrosReader::readField(long)
{code}
The problem should be fixed by using the same read logic as for TIMESTAMP_MICROS in {{DictionaryTimeStampMicrosReader}}.

> Parquet TIME_MICROS columns with values > Integer.MAX_VALUE are not displayed correctly
> ---------------------------------------------------------------------------------------
>
>                 Key: DRILL-8423
>                 URL: https://issues.apache.org/jira/browse/DRILL-8423
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 1.20.3
>            Reporter: Peter Franzen
>            Priority: Major
>
> Assume a parquet file in a directory "Test" with a column _timeCol_ having
> the type {{org.apache.parquet.schema.OriginalType.TIME_MICROS}}.
> Assume there are two records with the values 2147483647 and 2147483648,
> respectively, in that column (i.e. the times 00:35:47.483647 and
> 00:35:47.483648).
> Executing the query
> {code:java}
> SELECT timeCol FROM dfs.Test;{code}
> produces the result
> {code:java}
> timeCol
> -------
> 00:35:47.483
> 23:24:12.517{code}
> i.e. the microsecond value of Integer.MAX_VALUE + 1 has wrapped around when
> read from the parquet file (it is displayed as the same number of
> milliseconds before midnight as the time represented by Integer.MAX_VALUE is
> after midnight).

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
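
The wrap-around described in the issue can be reproduced outside Drill with a few lines of Java. The following is only a minimal sketch of the arithmetic, not Drill's reader code: the class {{TimeMicrosWrapDemo}} and all of its methods are hypothetical names, the value is encoded as 8 little-endian bytes (as Parquet's PLAIN encoding does for INT64), and the {{render}} helper assumes that a negative millisecond count is shown as that many milliseconds before midnight, which is the behaviour the issue reports.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.time.LocalTime;

// Hypothetical demo class; it only mimics the read pattern described in the issue.
public class TimeMicrosWrapDemo {
    static final int MILLIS_PER_DAY = 86_400_000;

    // Encode a TIME_MICROS value as 8 little-endian bytes (Parquet PLAIN INT64 layout).
    static ByteBuffer encode(long micros) {
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES).order(ByteOrder.LITTLE_ENDIAN);
        buf.putLong(0, micros);
        return buf;
    }

    // Buggy pattern: only the low 32 bits are read before converting to milliseconds.
    static int millisVia32BitRead(long micros) {
        return encode(micros).getInt(0) / 1000;
    }

    // Intended pattern: read all 64 bits, then convert to milliseconds.
    static int millisVia64BitRead(long micros) {
        return (int) (encode(micros).getLong(0) / 1000);
    }

    // Assumption for illustration: negative millis wrap backwards from midnight.
    static String render(int millis) {
        long nanoOfDay = Math.floorMod(millis, MILLIS_PER_DAY) * 1_000_000L;
        return LocalTime.ofNanoOfDay(nanoOfDay).toString();
    }

    public static void main(String[] args) {
        long[] values = {2147483647L, 2147483648L}; // Integer.MAX_VALUE and MAX_VALUE + 1
        for (long micros : values) {
            System.out.println(micros
                + " 32-bit read: " + render(millisVia32BitRead(micros))
                + " 64-bit read: " + render(millisVia64BitRead(micros)));
        }
    }
}
```

With the 32-bit read, 2147483648 becomes -2147483648 microseconds, i.e. -2147483 milliseconds, which renders as 23:24:12.517; with the 64-bit read both values render as 00:35:47.483.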