[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support

2018-02-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9241 )

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1920/


--
To view, visit http://gerrit.cloudera.org:8080/9241
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Feb 2018 18:05:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support

2018-02-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9241 )

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support
..


Patch Set 6: Code-Review+2


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 12 Feb 2018 18:05:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support

2018-02-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9241 )

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support
..


Patch Set 5:

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1918/


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 10 Feb 2018 08:04:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support

2018-02-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9241 )

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1918/


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 10 Feb 2018 04:30:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support

2018-02-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9241 )

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support
..


Patch Set 5: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1915/


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 10 Feb 2018 01:44:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support

2018-02-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9241 )

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1915/


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 09 Feb 2018 21:51:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support

2018-02-09 Thread Tim Armstrong (Code Review)
Hello Lars Volker, Csaba Ringhofer, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9241

to look at the new patch set (#4).

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support
..

IMPALA-6077: remove Parquet BIT_PACKED def level support

The encoding was added in an early version of the Parquet
spec and deprecated even in the Parquet 1.0 spec.

Parquet-MR switched to generating RLE at the same time as
the spec changed in mid-2013. Impala always wrote RLE:
see commit 6e293090e60aea300f9e83db67f56a5efd07c35c.

The Impala implementation of BIT_PACKED was never correct
because it implemented little endian bit unpacking instead of
the big endian unpacking required by the spec for levels.
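
To illustrate the bit-order mismatch described above, here is a minimal
sketch in Python. It is not Impala's code: the function names
unpack_lsb_first and unpack_msb_first are hypothetical, and the sample bytes
are the 3-bit encodings of the values 0..7 used as the worked example in the
Parquet encoding documentation. It shows that reading MSB-first ("big
endian") packed levels with an LSB-first ("little endian") reader yields
different values.

```python
from typing import List


def unpack_lsb_first(data: bytes, bit_width: int, count: int) -> List[int]:
    """Unpack values packed starting at the least significant bit of each byte.

    This is the bit order used by the RLE/bit-packing hybrid encoding.
    """
    values = []
    bit_pos = 0
    for _ in range(count):
        value = 0
        for i in range(bit_width):
            byte = data[(bit_pos + i) // 8]
            bit = (byte >> ((bit_pos + i) % 8)) & 1
            value |= bit << i  # first bit read is the lowest bit of the value
        values.append(value)
        bit_pos += bit_width
    return values


def unpack_msb_first(data: bytes, bit_width: int, count: int) -> List[int]:
    """Unpack values packed starting at the most significant bit of each byte.

    This is the bit order the spec requires for the deprecated BIT_PACKED
    level encoding.
    """
    values = []
    bit_pos = 0
    for _ in range(count):
        value = 0
        for i in range(bit_width):
            byte = data[(bit_pos + i) // 8]
            bit = (byte >> (7 - (bit_pos + i) % 8)) & 1
            value = (value << 1) | bit  # first bit read is the highest bit
        values.append(value)
        bit_pos += bit_width
    return values


# The values 0..7 at bit width 3, packed in each bit order.
MSB_PACKED = bytes([0b00000101, 0b00111001, 0b01110111])
LSB_PACKED = bytes([0b10001000, 0b11000110, 0b11111010])

correct = unpack_msb_first(MSB_PACKED, 3, 8)
mismatched = unpack_lsb_first(MSB_PACKED, 3, 8)
print(correct)     # the original levels
print(mismatched)  # what an LSB-first reader sees for the same bytes
```

Each reader recovers 0..7 only from data packed in its own bit order, which
is why a reader with the wrong order cannot decode such level data correctly.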

Testing:
Updated tests to reflect expected behaviour for supported
and unsupported def level encodings.

Cherry-picks: not for 2.x.

Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M common/thrift/generate_error_codes.py
M testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test
M tests/query_test/test_scanners.py
5 files changed, 53 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/9241/4
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support

2018-02-09 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9241 )

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support
..


Patch Set 3: Code-Review+2

I assume this is not going to the 2.x branch?


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 09 Feb 2018 21:38:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support

2018-02-09 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9241 )

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support
..


Patch Set 3: Code-Review+1


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 09 Feb 2018 19:42:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support

2018-02-09 Thread Tim Armstrong (Code Review)
Hello Lars Volker, Csaba Ringhofer,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9241

to look at the new patch set (#3).

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support
..

IMPALA-6077: remove Parquet BIT_PACKED def level support

The encoding was added in an early version of the Parquet
spec and deprecated even in the Parquet 1.0 spec.

Parquet-MR switched to generating RLE at the same time as
the spec changed in mid-2013. Impala always wrote RLE:
see commit 6e293090e60aea300f9e83db67f56a5efd07c35c.

The Impala implementation of BIT_PACKED was never correct
because it implemented little endian bit unpacking instead of
the big endian unpacking required by the spec for levels.

Testing:
Updated tests to reflect expected behaviour for supported
and unsupported def level encodings.

Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M common/thrift/generate_error_codes.py
M testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test
M tests/query_test/test_scanners.py
5 files changed, 53 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/9241/3
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support

2018-02-09 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9241 )

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9241/2/testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test
File testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test:

http://gerrit.cloudera.org:8080/#/c/9241/2/testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test@55
PS2, Line 55: # IMPALA-6077: unsupported BIT_PACKED encoding fails when materializing columns.
: select count(id), count(tinyint_col), count(smallint_col), count(int_col),
:   count(bigint_col), count(float_col), count(double_col), count(date_string_col),
:   count(string_col), count(timestamp_col), count(year), count(month), count(day)
: from alltypesagg_bitpacked
This query seems to be the same as the next query, and should not materialize 
columns.


http://gerrit.cloudera.org:8080/#/c/9241/2/testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test@64
PS2, Line 64: materializing
I am not 100% sure about this, but I think that if a column is not complex, and 
the stats are filled, then count can be served from column chunk stats without 
reading any data page, so this error will not be returned.

This may not be a problem for this specific parquet file, but I would mention 
it in a comment, or replace the query with something that has to read the data 
pages.
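
The idea in the comment above can be sketched as follows. This is only an
illustration of the reviewer's hypothesis, not a description of what Impala
actually does: ColumnChunkStats and count_from_stats are hypothetical names,
and the sketch assumes a flat (non-nested) column whose chunk metadata records
both the number of values and the null count.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ColumnChunkStats:
    """Hypothetical stand-in for one column chunk's footer metadata."""
    num_values: int            # values in the chunk, including nulls
    null_count: Optional[int]  # None when the writer recorded no null count


def count_from_stats(chunks: Iterable[ColumnChunkStats]) -> Optional[int]:
    """Compute count(col) from metadata alone.

    Returns None if any chunk lacks a null count, in which case the data
    pages (and therefore the def levels) would have to be read.
    """
    total = 0
    for chunk in chunks:
        if chunk.null_count is None:
            return None
        total += chunk.num_values - chunk.null_count
    return total


# Two chunks with complete stats: no data page needs to be decoded.
with_stats = [ColumnChunkStats(100, 10), ColumnChunkStats(50, 0)]
# One chunk without a null count: the metadata shortcut is unavailable.
without_stats = [ColumnChunkStats(100, None)]

print(count_from_stats(with_stats))
print(count_from_stats(without_stats))
```

If a reader took such a shortcut, a query consisting only of count(col) would
never decode the def levels, so the unsupported-encoding error would not be
raised. That is the reason for suggesting a query that must read data pages.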


http://gerrit.cloudera.org:8080/#/c/9241/2/testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test@65
PS2, Line 65: select count(id), count(tinyint_col), count(smallint_col), count(int_col),
:   count(bigint_col), count(float_col), count(double_col), count(date_string_col),
:   count(string_col), count(timestamp_col), count(year), count(month), count(day)
Is it necessary to list every column here? If one column is enough for the 
test, then I would prefer if it were shorter (for the sake of readability).



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Fri, 09 Feb 2018 14:58:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support

2018-02-07 Thread Tim Armstrong (Code Review)
Hello Lars Volker,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9241

to look at the new patch set (#2).

Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support
..

IMPALA-6077: remove Parquet BIT_PACKED def level support

The encoding was added in an early version of the Parquet
spec and deprecated even in the Parquet 1.0 spec.

Parquet-MR switched to generating RLE at the same time as
the spec changed in mid-2013. Impala always wrote RLE:
see commit 6e293090e60aea300f9e83db67f56a5efd07c35c.

The Impala implementation of BIT_PACKED was never correct
because it implemented little endian bit unpacking instead of
the big endian unpacking required by the spec for levels.

Testing:
Updated tests to reflect expected behaviour for supported
and unsupported def level encodings.

Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M common/thrift/generate_error_codes.py
M testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test
M tests/query_test/test_scanners.py
5 files changed, 37 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/9241/2
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT_PACKED def level support

2018-02-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9241 )


Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support
..

IMPALA-6077: remove Parquet BIT_PACKED def level support

The encoding was added in an early version of the Parquet
spec and deprecated even in the Parquet 1.0 spec.

Parquet-MR switched to generating RLE at the same time as
the spec changed in mid-2013. Impala always wrote RLE:
see commit 6e293090e60aea300f9e83db67f56a5efd07c35c.

The Impala implementation of BIT_PACKED was never correct
because it implemented little endian bit unpacking instead of
the big endian unpacking required by the spec for levels.

Testing:
Updated tests to reflect expected behaviour for supported
and unsupported def level encodings.

Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M common/thrift/generate_error_codes.py
M testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test
M tests/query_test/test_scanners.py
5 files changed, 37 insertions(+), 32 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/41/9241/1
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae
Gerrit-Change-Number: 9241
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong