[Impala-ASF-CR] IMPALA-6592: add test for invalid parquet codecs

2018-03-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9500 )

Change subject: IMPALA-6592: add test for invalid parquet codecs
..

IMPALA-6592: add test for invalid parquet codecs

IMPALA-6592 revealed a gap in test coverage for files with
invalid/unsupported Parquet codecs. This adds a test that reproduces the
bug that was present in my IMPALA-4835 patch. master is unaffected by
this bug.

I also hid the conversion tables and made the conversion go through
functions that validate the enum values, to make it easier to track down
problems like this in the future.

Testing:
Ran exhaustive tests.

Change-Id: I1502ea7b7f39aa09f0ed2677e84219b37c64c416
Reviewed-on: http://gerrit.cloudera.org:8080/9500
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/exec/CMakeLists.txt
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/parquet-column-readers.h
A be/src/exec/parquet-common.cc
M be/src/exec/parquet-common.h
M be/src/util/parquet-reader.cc
M testdata/data/README
A testdata/data/bad_codec.parquet
A testdata/workloads/functional-query/queries/QueryTest/parquet-bad-codec.test
M tests/query_test/test_scanners.py
10 files changed, 136 insertions(+), 46 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9500
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I1502ea7b7f39aa09f0ed2677e84219b37c64c416
Gerrit-Change-Number: 9500
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6592: add test for invalid parquet codecs

2018-03-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9500 )

Change subject: IMPALA-6592: add test for invalid parquet codecs
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9500
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1502ea7b7f39aa09f0ed2677e84219b37c64c416
Gerrit-Change-Number: 9500
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 08 Mar 2018 04:48:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6592: add test for invalid parquet codecs

2018-03-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9500 )

Change subject: IMPALA-6592: add test for invalid parquet codecs
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2059/


--
To view, visit http://gerrit.cloudera.org:8080/9500
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1502ea7b7f39aa09f0ed2677e84219b37c64c416
Gerrit-Change-Number: 9500
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 08 Mar 2018 01:05:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6592: add test for invalid parquet codecs

2018-03-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9500 )

Change subject: IMPALA-6592: add test for invalid parquet codecs
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9500/3/be/src/exec/parquet-common.h
File be/src/exec/parquet-common.h:

http://gerrit.cloudera.org:8080/#/c/9500/3/be/src/exec/parquet-common.h@39
PS3, Line 39: vlaidate
> validate
Done



--
To view, visit http://gerrit.cloudera.org:8080/9500
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1502ea7b7f39aa09f0ed2677e84219b37c64c416
Gerrit-Change-Number: 9500
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 08 Mar 2018 01:05:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6592: add test for invalid parquet codecs

2018-03-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9500 )

Change subject: IMPALA-6592: add test for invalid parquet codecs
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9500
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1502ea7b7f39aa09f0ed2677e84219b37c64c416
Gerrit-Change-Number: 9500
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 08 Mar 2018 01:05:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6592: add test for invalid parquet codecs

2018-03-07 Thread Tim Armstrong (Code Review)
Hello Zoltan Borok-Nagy, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9500

to look at the new patch set (#4).

Change subject: IMPALA-6592: add test for invalid parquet codecs
..

IMPALA-6592: add test for invalid parquet codecs

IMPALA-6592 revealed a gap in test coverage for files with
invalid/unsupported Parquet codecs. This adds a test that reproduces the
bug that was present in my IMPALA-4835 patch. master is unaffected by
this bug.

I also hid the conversion tables and made the conversion go through
functions that validate the enum values, to make it easier to track down
problems like this in the future.

Testing:
Ran exhaustive tests.

Change-Id: I1502ea7b7f39aa09f0ed2677e84219b37c64c416
---
M be/src/exec/CMakeLists.txt
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/parquet-column-readers.h
A be/src/exec/parquet-common.cc
M be/src/exec/parquet-common.h
M be/src/util/parquet-reader.cc
M testdata/data/README
A testdata/data/bad_codec.parquet
A testdata/workloads/functional-query/queries/QueryTest/parquet-bad-codec.test
M tests/query_test/test_scanners.py
10 files changed, 136 insertions(+), 46 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/9500/4
--
To view, visit http://gerrit.cloudera.org:8080/9500
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1502ea7b7f39aa09f0ed2677e84219b37c64c416
Gerrit-Change-Number: 9500
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6592: add test for invalid parquet codecs

2018-03-07 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9500 )

Change subject: IMPALA-6592: add test for invalid parquet codecs
..


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9500/3/be/src/exec/parquet-common.h
File be/src/exec/parquet-common.h:

http://gerrit.cloudera.org:8080/#/c/9500/3/be/src/exec/parquet-common.h@39
PS3, Line 39: vlaidate
validate



--
To view, visit http://gerrit.cloudera.org:8080/9500
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1502ea7b7f39aa09f0ed2677e84219b37c64c416
Gerrit-Change-Number: 9500
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 08 Mar 2018 01:01:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6592: add test for invalid parquet codecs

2018-03-07 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9500 )

Change subject: IMPALA-6592: add test for invalid parquet codecs
..


Patch Set 3: Code-Review+1

LGTM, again :)


--
To view, visit http://gerrit.cloudera.org:8080/9500
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1502ea7b7f39aa09f0ed2677e84219b37c64c416
Gerrit-Change-Number: 9500
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 08 Mar 2018 00:31:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6592: add test for invalid parquet codecs

2018-03-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9500 )

Change subject: IMPALA-6592: add test for invalid parquet codecs
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9500/2/be/src/exec/parquet-common.cc
File be/src/exec/parquet-common.cc:

http://gerrit.cloudera.org:8080/#/c/9500/2/be/src/exec/parquet-common.cc@24
PS2, Line 24: static
> Nit: namespace-scoped const variables are implicitly have internal linkage
Done


http://gerrit.cloudera.org:8080/#/c/9500/2/be/src/exec/parquet-common.cc@45
PS2, Line 45: sizeof(INTERNAL_TO_PARQUET_TYPES) / 
sizeof(INTERNAL_TO_PARQUET_TYPES[0])
> Nit: maybe you could use std::array instead.
>From what I can tell, it's not currently possible to create a std::array 
>without specifying the size directly. There's a proposal for make_array but it 
>doesn't look like it's standardised. I could hard-code the size, but that 
>overall doesn't seem to result in a simpler solution.



--
To view, visit http://gerrit.cloudera.org:8080/9500
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1502ea7b7f39aa09f0ed2677e84219b37c64c416
Gerrit-Change-Number: 9500
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 08 Mar 2018 00:24:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6592: add test for invalid parquet codecs

2018-03-07 Thread Tim Armstrong (Code Review)
Hello Zoltan Borok-Nagy,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9500

to look at the new patch set (#3).

Change subject: IMPALA-6592: add test for invalid parquet codecs
..

IMPALA-6592: add test for invalid parquet codecs

IMPALA-6592 revealed a gap in test coverage for files with
invalid/unsupported Parquet codecs. This adds a test that reproduces the
bug that was present in my IMPALA-4835 patch. master is unaffected by
this bug.

I also hid the conversion tables and made the conversion go through
functions that validate the enum values, to make it easier to track down
problems like this in the future.

Testing:
Ran exhaustive tests.

Change-Id: I1502ea7b7f39aa09f0ed2677e84219b37c64c416
---
M be/src/exec/CMakeLists.txt
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/parquet-column-readers.h
A be/src/exec/parquet-common.cc
M be/src/exec/parquet-common.h
M be/src/util/parquet-reader.cc
M testdata/data/README
A testdata/data/bad_codec.parquet
A testdata/workloads/functional-query/queries/QueryTest/parquet-bad-codec.test
M tests/query_test/test_scanners.py
10 files changed, 136 insertions(+), 46 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/9500/3
--
To view, visit http://gerrit.cloudera.org:8080/9500
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1502ea7b7f39aa09f0ed2677e84219b37c64c416
Gerrit-Change-Number: 9500
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6592: add test for invalid parquet codecs

2018-03-07 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9500 )

Change subject: IMPALA-6592: add test for invalid parquet codecs
..


Patch Set 2: Code-Review+1

(2 comments)

LGTM

http://gerrit.cloudera.org:8080/#/c/9500/2/be/src/exec/parquet-common.cc
File be/src/exec/parquet-common.cc:

http://gerrit.cloudera.org:8080/#/c/9500/2/be/src/exec/parquet-common.cc@24
PS2, Line 24: static
Nit: namespace-scoped const variables are implicitly have internal linkage


http://gerrit.cloudera.org:8080/#/c/9500/2/be/src/exec/parquet-common.cc@45
PS2, Line 45: sizeof(INTERNAL_TO_PARQUET_TYPES) / 
sizeof(INTERNAL_TO_PARQUET_TYPES[0])
Nit: maybe you could use std::array instead.



--
To view, visit http://gerrit.cloudera.org:8080/9500
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1502ea7b7f39aa09f0ed2677e84219b37c64c416
Gerrit-Change-Number: 9500
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 07 Mar 2018 23:56:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6592: add test for invalid parquet codecs

2018-03-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9500 )

Change subject: IMPALA-6592: add test for invalid parquet codecs
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9500/2/be/src/exec/parquet-common.h
File be/src/exec/parquet-common.h:

http://gerrit.cloudera.org:8080/#/c/9500/2/be/src/exec/parquet-common.h@39
PS2, Line 39: vlaidate
Typo.



--
To view, visit http://gerrit.cloudera.org:8080/9500
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1502ea7b7f39aa09f0ed2677e84219b37c64c416
Gerrit-Change-Number: 9500
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 06 Mar 2018 17:59:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6592: add test for invalid parquet codecs

2018-03-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/9500 )

Change subject: IMPALA-6592: add test for invalid parquet codecs
..

IMPALA-6592: add test for invalid parquet codecs

IMPALA-6592 revealed a gap in test coverage for files with
invalid/unsupported Parquet codecs. This adds a test that reproduces the
bug that was present in my IMPALA-4835 patch. master is unaffected by
this bug.

I also hid the conversion tables and made the conversion go through
functions that validate the enum values, to make it easier to track down
problems like this in the future.

Testing:
Ran exhaustive tests.

Change-Id: I1502ea7b7f39aa09f0ed2677e84219b37c64c416
---
M be/src/exec/CMakeLists.txt
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/parquet-column-readers.h
A be/src/exec/parquet-common.cc
M be/src/exec/parquet-common.h
M be/src/util/parquet-reader.cc
M testdata/data/README
A testdata/data/bad_codec.parquet
A testdata/workloads/functional-query/queries/QueryTest/parquet-bad-codec.test
M tests/query_test/test_scanners.py
10 files changed, 136 insertions(+), 46 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/9500/2
--
To view, visit http://gerrit.cloudera.org:8080/9500
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1502ea7b7f39aa09f0ed2677e84219b37c64c416
Gerrit-Change-Number: 9500
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong