[Impala-ASF-CR] IMPALA-5051: Add INT64 timestamp write support in Parquet
Zoltan Ivanfi has posted comments on this change. ( http://gerrit.cloudera.org:8080/12247 ) Change subject: IMPALA-5051: Add INT64 timestamp write support in Parquet .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/12247/8/be/src/exec/parquet/parquet-metadata-utils.cc File be/src/exec/parquet/parquet-metadata-utils.cc: http://gerrit.cloudera.org:8080/#/c/12247/8/be/src/exec/parquet/parquet-metadata-utils.cc@142 PS8, Line 142: /// converted_type is not set because Impala always writes timestamps without UTC > As far as I know Parquet-MR does not do any timezone conversion and leaves That's true, but it has a metadata field to tell the caller the desired semantics, which can be either UTC-normalized or not. -- To view, visit http://gerrit.cloudera.org:8080/12247 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib41ad532ec902ed5a9a1528513726eac1c11441f Gerrit-Change-Number: 12247 Gerrit-PatchSet: 8 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: Zoltan Ivanfi Gerrit-Comment-Date: Fri, 01 Mar 2019 14:11:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5051: Add INT64 timestamp write support in Parquet
Zoltan Ivanfi has posted comments on this change. ( http://gerrit.cloudera.org:8080/12247 ) Change subject: IMPALA-5051: Add INT64 timestamp write support in Parquet .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/12247/8/be/src/exec/parquet/parquet-metadata-utils.cc File be/src/exec/parquet/parquet-metadata-utils.cc: http://gerrit.cloudera.org:8080/#/c/12247/8/be/src/exec/parquet/parquet-metadata-utils.cc@142 PS8, Line 142: /// converted_type is not set because Impala always writes timestamps without UTC > Does Parquet-MR also write INT64 timestamps un-normalized? Parquet supports both UTC-normalized and timezone-agnostic timestamps, aka Instant and LocalDateTime semantics. -- To view, visit http://gerrit.cloudera.org:8080/12247 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib41ad532ec902ed5a9a1528513726eac1c11441f Gerrit-Change-Number: 12247 Gerrit-PatchSet: 8 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: Zoltan Ivanfi Gerrit-Comment-Date: Thu, 28 Feb 2019 17:17:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5051: Add INT64 timestamp write support in Parquet
Zoltan Ivanfi has posted comments on this change. ( http://gerrit.cloudera.org:8080/12247 ) Change subject: IMPALA-5051: Add INT64 timestamp write support in Parquet .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/12247/8//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12247/8//COMMIT_MSG@39 PS8, Line 39: without conversion to UTC > Whouldn't it be better to convert to UTC? You write later that old readers No, "pure" TIMESTAMP and TIMESTAMP WITHOUT TIME ZONE shall not be normalized to UTC, only the TIMESTAMP WITH LOCAL TIME ZONE type shall be. Please see this document: https://goo.gl/VV88c5 -- To view, visit http://gerrit.cloudera.org:8080/12247 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib41ad532ec902ed5a9a1528513726eac1c11441f Gerrit-Change-Number: 12247 Gerrit-PatchSet: 8 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: Zoltan Ivanfi Gerrit-Comment-Date: Thu, 28 Feb 2019 17:16:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7725: [DOCS] Support for Parquet INT64 Timestamp in Impala
Zoltan Ivanfi has posted comments on this change. ( http://gerrit.cloudera.org:8080/12559 ) Change subject: IMPALA-7725: [DOCS] Support for Parquet INT64 Timestamp in Impala .. Patch Set 4: > Thank you, Zoltan! > How do I get +2? Gerrit only allows me to give a +1, one of the other reviewers can give you a +2. (I'm not an Impala committer myself, I just coordinate cross-component timestamp efforts.) -- To view, visit http://gerrit.cloudera.org:8080/12559 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id134036026876238622cb182f790ac0f46654654 Gerrit-Change-Number: 12559 Gerrit-PatchSet: 4 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: Zoltan Ivanfi Gerrit-Comment-Date: Thu, 28 Feb 2019 06:50:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7725: [DOCS] Support for Parquet INT64 Timestamp in Impala
Zoltan Ivanfi has posted comments on this change. ( http://gerrit.cloudera.org:8080/12559 ) Change subject: IMPALA-7725: [DOCS] Support for Parquet INT64 Timestamp in Impala .. Patch Set 4: Code-Review+1 LGTM, thanks! -- To view, visit http://gerrit.cloudera.org:8080/12559 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id134036026876238622cb182f790ac0f46654654 Gerrit-Change-Number: 12559 Gerrit-PatchSet: 4 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: Zoltan Ivanfi Gerrit-Comment-Date: Thu, 28 Feb 2019 06:31:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7725: [DOCS] Support for Parquet INT64 Timestamp in Impala
Zoltan Ivanfi has posted comments on this change. ( http://gerrit.cloudera.org:8080/12559 ) Change subject: IMPALA-7725: [DOCS] Support for Parquet INT64 Timestamp in Impala .. Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/12559/3/docs/topics/impala_parquet.xml File docs/topics/impala_parquet.xml: http://gerrit.cloudera.org:8080/#/c/12559/3/docs/topics/impala_parquet.xml@1270 PS3, Line 1270: BINAR annotated with the UTF8 OriginalType A "Y" is missing from the end of "BINAR[Y]" http://gerrit.cloudera.org:8080/#/c/12559/3/docs/topics/impala_parquet.xml@1314 PS3, Line 1314: INT64 annotated with the TIMESTAMP_MICROS "OriginalType" missing from "INT64 annotated with the TIMESTAMP_MICROS [OriginalType]" http://gerrit.cloudera.org:8080/#/c/12559/3/docs/topics/impala_timestamp.xml File docs/topics/impala_timestamp.xml: http://gerrit.cloudera.org:8080/#/c/12559/3/docs/topics/impala_timestamp.xml@221 PS3, Line 221: values with : the OriginalType annotation Change values with the OriginalType annotation to values annotated with the TIMESTAMP_MILLIS or TIMESTAMP_MICROS OriginalType http://gerrit.cloudera.org:8080/#/c/12559/3/docs/topics/impala_timestamp.xml@224 PS3, Line 224: with the LogicalType annotation specifies whether UTC to local Change with the LogicalType annotation to annotated with the TIMESTAMP LogicalType http://gerrit.cloudera.org:8080/#/c/12559/3/docs/topics/impala_timestamp.xml@262 PS3, Line 262: default for a performance reason, to avoid unexpected incompatibility problems This sentence is hard to understand, I would break into two sentences like this: ... is turned off by default for a performance reason. In order to avoid unexpected incompatibility problems, ... -- To view, visit http://gerrit.cloudera.org:8080/12559 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id134036026876238622cb182f790ac0f46654654 Gerrit-Change-Number: 12559 Gerrit-PatchSet: 3 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: Zoltan Ivanfi Gerrit-Comment-Date: Wed, 27 Feb 2019 13:40:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5051: Add INT64 timestamp write support in Parquet
Zoltan Ivanfi has posted comments on this change. ( http://gerrit.cloudera.org:8080/12247 ) Change subject: IMPALA-5051: Add INT64 timestamp write support in Parquet .. Patch Set 5: > About 'parquet-tools dump': it would be great to use it during > Impala EE tests, but adding it as a dependency would also mean +1 > way Impala builds can break, so I am not sure at the moment. I am very confident that we should have this extra check. Adding parquet-tools to the build toolchain is not that difficult as it does not have to be installed. I created a small Maven project that runs it without installing it (in the conventional sense at least). The whole project consists of a single POM file and nothing else. Feel free to add it to Implala and modify it to fit your needs: https://github.com/zivanfi/parquet-tools-executor -- To view, visit http://gerrit.cloudera.org:8080/12247 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib41ad532ec902ed5a9a1528513726eac1c11441f Gerrit-Change-Number: 12247 Gerrit-PatchSet: 5 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: Zoltan Ivanfi Gerrit-Comment-Date: Thu, 24 Jan 2019 14:04:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5051: Add INT64 timestamp write support in Parquet
Zoltan Ivanfi has posted comments on this change. ( http://gerrit.cloudera.org:8080/12247 ) Change subject: IMPALA-5051: Add INT64 timestamp write support in Parquet .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/exec/parquet/parquet-metadata-utils.cc File be/src/exec/parquet/parquet-metadata-utils.cc: http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/exec/parquet/parquet-metadata-utils.cc@158 PS3, Line 158: // converted_type is not set because older readers that do not use logical types > I mean I don't see that this function sets the converted type anywhere. Ah, yes, good point, I haven't noticed we are only dealing with LocalDateTime semantics here and regardless of the precision, there is no converted type for that. -- To view, visit http://gerrit.cloudera.org:8080/12247 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib41ad532ec902ed5a9a1528513726eac1c11441f Gerrit-Change-Number: 12247 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: Zoltan Ivanfi Gerrit-Comment-Date: Tue, 22 Jan 2019 16:01:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5051: Add INT64 timestamp write support in Parquet
Zoltan Ivanfi has posted comments on this change. ( http://gerrit.cloudera.org:8080/12247 ) Change subject: IMPALA-5051: Add INT64 timestamp write support in Parquet .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/exec/parquet/parquet-metadata-utils.cc File be/src/exec/parquet/parquet-metadata-utils.cc: http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/exec/parquet/parquet-metadata-utils.cc@158 PS3, Line 158: // converted_type is not set because older readers that do not use logical types > Then I think you should change the name of the function. Personally I'm fine with the current name, it sets the converted type when it makes sense. I don't know how this could be described without making the function name overly long. http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/12247/3/be/src/service/query-options.cc@728 PS3, Line 728: if (iequals(value, "INT96_NANO")) { > It should be "INT96_NANOS" Shouldn't this typo have been caught by a test? -- To view, visit http://gerrit.cloudera.org:8080/12247 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib41ad532ec902ed5a9a1528513726eac1c11441f Gerrit-Change-Number: 12247 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: Zoltan Ivanfi Gerrit-Comment-Date: Tue, 22 Jan 2019 15:44:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5051: Add INT64 timestamp write support in Parquet
Zoltan Ivanfi has posted comments on this change. ( http://gerrit.cloudera.org:8080/12247 ) Change subject: IMPALA-5051: Add INT64 timestamp write support in Parquet .. Patch Set 3: Would it be possible to run `parquet-tools dump` as a part of the tests to check whether the timestamp values can be read back correctly by parquet-mr? -- To view, visit http://gerrit.cloudera.org:8080/12247 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib41ad532ec902ed5a9a1528513726eac1c11441f Gerrit-Change-Number: 12247 Gerrit-PatchSet: 3 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Ivanfi Gerrit-Comment-Date: Tue, 22 Jan 2019 13:00:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet
Zoltan Ivanfi has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 ) Change subject: IMPALA-7889: Write new logical types in Parquet .. Patch Set 10: Code-Review+1 LGTM, thanks! -- To view, visit http://gerrit.cloudera.org:8080/12004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91 Gerrit-Change-Number: 12004 Gerrit-PatchSet: 10 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Anonymous Coward (359) Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Zoltan Ivanfi Gerrit-Comment-Date: Wed, 19 Dec 2018 16:18:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7889: Write new logical types in Parquet
Zoltan Ivanfi has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 ) Change subject: IMPALA-7889: Write new logical types in Parquet .. Patch Set 8: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/12004/8/be/src/exec/parquet/parquet-metadata-utils.cc File be/src/exec/parquet/parquet-metadata-utils.cc: http://gerrit.cloudera.org:8080/#/c/12004/8/be/src/exec/parquet/parquet-metadata-utils.cc@303 PS8, Line 303: if (col_type.type == TYPE_DECIMAL) { (optional) I would prefer a switch-case here. -- To view, visit http://gerrit.cloudera.org:8080/12004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91 Gerrit-Change-Number: 12004 Gerrit-PatchSet: 8 Gerrit-Owner: Csaba Ringhofer Gerrit-Reviewer: Anonymous Coward (359) Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Zoltan Ivanfi Gerrit-Comment-Date: Wed, 19 Dec 2018 14:51:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7233: [DOCS] Support for IANA timezone database
Zoltan Ivanfi has posted comments on this change. ( http://gerrit.cloudera.org:8080/11946 ) Change subject: IMPALA-7233: [DOCS] Support for IANA timezone database .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/11946/1/docs/topics/impala_timestamp.xml File docs/topics/impala_timestamp.xml: http://gerrit.cloudera.org:8080/#/c/11946/1/docs/topics/impala_timestamp.xml@333 PS1, Line 333: values the same way it stores : without any adjustment. > This paragraph is about Text file, and the next paragraph is about Parquet Sorry, my bad. I didn't notice the "in text tables" part. Should we make the table format references bold maybe or create separate subsections for textfile and Parquet to prevent confusion? -- To view, visit http://gerrit.cloudera.org:8080/11946 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id400cda5a1be321063d17e0ee6337e92a5da732a Gerrit-Change-Number: 11946 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Ivanfi Gerrit-Comment-Date: Wed, 21 Nov 2018 08:31:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7233: [DOCS] Support for IANA timezone database
Zoltan Ivanfi has posted comments on this change. ( http://gerrit.cloudera.org:8080/11946 ) Change subject: IMPALA-7233: [DOCS] Support for IANA timezone database .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/11946/1/docs/topics/impala_timestamp.xml File docs/topics/impala_timestamp.xml: http://gerrit.cloudera.org:8080/#/c/11946/1/docs/topics/impala_timestamp.xml@367 PS1, Line 367: turned off by : default to avoid performance overhead > I still think that the improvement should mentioned, e.g "Before 3.1, this I would rephrase "eliminated most of the overhead" as "scales well to multiple threads" or similar. The overhead is still there, it just does not effectively make Impala run in a single thread when doing the adjustments. -- To view, visit http://gerrit.cloudera.org:8080/11946 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id400cda5a1be321063d17e0ee6337e92a5da732a Gerrit-Change-Number: 11946 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Ivanfi Gerrit-Comment-Date: Tue, 20 Nov 2018 16:08:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7233: [DOCS] Support for IANA timezone database
Zoltan Ivanfi has posted comments on this change. ( http://gerrit.cloudera.org:8080/11946 ) Change subject: IMPALA-7233: [DOCS] Support for IANA timezone database .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/11946/1/docs/topics/impala_timestamp.xml File docs/topics/impala_timestamp.xml: http://gerrit.cloudera.org:8080/#/c/11946/1/docs/topics/impala_timestamp.xml@333 PS1, Line 333: values the same way it stores : without any adjustment. > Hive reads and writes TIMESTAMP values without converting with respect to t This sentence seems to contradict the description below. Hive 2.x _does_ adjust timestamps with respect to time zones. That is the cause of the incompatibility with Impala, which does not. -- To view, visit http://gerrit.cloudera.org:8080/11946 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id400cda5a1be321063d17e0ee6337e92a5da732a Gerrit-Change-Number: 11946 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoltan Ivanfi Gerrit-Comment-Date: Tue, 20 Nov 2018 15:45:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6538: Fix read path when Parquet min/max statistics contain NaN
Zoltan Ivanfi has posted comments on this change. ( http://gerrit.cloudera.org:8080/9358 ) Change subject: IMPALA-6538: Fix read path when Parquet min/max statistics contain NaN .. Patch Set 5: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/9358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If3897fc1426541239223670812f59e2bed32f455 Gerrit-Change-Number: 9358 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: Zoltan Ivanfi Gerrit-Comment-Date: Wed, 21 Feb 2018 15:12:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6538: Fix read path when Parquet min/max statistics contain NaN
Zoltan Ivanfi has posted comments on this change. ( http://gerrit.cloudera.org:8080/9358 ) Change subject: IMPALA-6538: Fix read path when Parquet min/max statistics contain NaN .. Patch Set 4: Code-Review+1 (1 comment) This workaround in the read path seems to be a good quick-fix, but I think the write path should also have a quick fix to make the written stats independent of the data order, i.e. it should not matter whether a NaN is the first value or not. http://gerrit.cloudera.org:8080/#/c/9358/4/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test File testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test: http://gerrit.cloudera.org:8080/#/c/9358/4/testdata/workloads/functional-query/queries/QueryTest/parquet-stats.test@494 PS4, Line 494: QUERY This is a good test in general, but it does not specifically test for the fix. Both the read path and the write path can be modified to make this test pass and once we will have fixes in both paths, this test won't notice if one of them has a regression. -- To view, visit http://gerrit.cloudera.org:8080/9358 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If3897fc1426541239223670812f59e2bed32f455 Gerrit-Change-Number: 9358 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Reviewer: Zoltan Ivanfi Gerrit-Comment-Date: Tue, 20 Feb 2018 15:41:42 + Gerrit-HasComments: Yes