[Impala-ASF-CR] IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness

2017-11-07 Thread John Russell (Code Review)
Hello Greg Rahn, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8418

to look at the new patch set (#3).

Change subject: IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet 
correctness
..

IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness

Change-Id: I731eb0e029dc3cc251f4df0c5a8ad281c81595cb
---
M docs/topics/impala_known_issues.xml
1 file changed, 45 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/8418/3
--
To view, visit http://gerrit.cloudera.org:8080/8418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I731eb0e029dc3cc251f4df0c5a8ad281c81595cb
Gerrit-Change-Number: 8418
Gerrit-PatchSet: 3
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness

2017-11-07 Thread John Russell (Code Review)
John Russell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8418 )

Change subject: IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet 
correctness
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml@951
PS2, Line 951: Medium
> I think TSBs probably use a different scale though. Most of the medium seve
Sure! Changed.


http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml@951
PS2, Line 951: Medium
> I agree it's a little strange. But Cloudera issued a technical service bull
Done


http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml@951
PS2, Line 951: Medium
> Missed this on the first pass - shouldn't this be high?
Done



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I731eb0e029dc3cc251f4df0c5a8ad281c81595cb
Gerrit-Change-Number: 8418
Gerrit-PatchSet: 2
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 22:21:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness

2017-11-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8418 )

Change subject: IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet 
correctness
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml@951
PS2, Line 951: Medium
> I agree it's a little strange. But Cloudera issued a technical service bull
I think TSBs probably use a different scale though. Most of the medium severity 
issues here wouldn't warrant a TSB. This bug is definitely worse than the ABS 
bug below.



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I731eb0e029dc3cc251f4df0c5a8ad281c81595cb
Gerrit-Change-Number: 8418
Gerrit-PatchSet: 2
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 06 Nov 2017 21:18:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness

2017-11-06 Thread John Russell (Code Review)
John Russell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8418 )

Change subject: IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet 
correctness
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml@951
PS2, Line 951: Medium
> Missed this on the first pass - shouldn't this be high?
I agree it's a little strange. But Cloudera issued a technical service bulletin 
(TSB) about the issue and it was only rated "medium" there. We are in somewhat 
new territory with how the TSB info relates to the upstream docs. My impulse 
was not to deviate greatly from whatever info came from the support group.



-- 

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I731eb0e029dc3cc251f4df0c5a8ad281c81595cb
Gerrit-Change-Number: 8418
Gerrit-PatchSet: 2
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 06 Nov 2017 20:35:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness

2017-11-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8418 )

Change subject: IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet 
correctness
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml@951
PS2, Line 951: Medium
Missed this on the first pass - shouldn't this be high?



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I731eb0e029dc3cc251f4df0c5a8ad281c81595cb
Gerrit-Change-Number: 8418
Gerrit-PatchSet: 2
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 03 Nov 2017 22:48:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness

2017-11-03 Thread John Russell (Code Review)
John Russell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8418 )

Change subject: IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet 
correctness
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8418/1/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

http://gerrit.cloudera.org:8080/#/c/8418/1/docs/topics/impala_known_issues.xml@936
PS1, Line 936: Examine the HDFS_SCAN_NODE portion 
of a query profile that scans the
> This unfortunately won't give accurate info for all queries: if the query i
Done


http://gerrit.cloudera.org:8080/#/c/8418/1/docs/topics/impala_known_issues.xml@937
PS1, Line 937: suspected table. Use a query that performs a full 
table scan, and materializes the column
> It might be helpful to note common cases where uncompressed Parquet is/isn'
Done



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I731eb0e029dc3cc251f4df0c5a8ad281c81595cb
Gerrit-Change-Number: 8418
Gerrit-PatchSet: 2
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 03 Nov 2017 20:37:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness

2017-11-03 Thread John Russell (Code Review)
Hello Greg Rahn, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8418

to look at the new patch set (#2).

Change subject: IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet 
correctness
..

IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness

Change-Id: I731eb0e029dc3cc251f4df0c5a8ad281c81595cb
---
M docs/topics/impala_known_issues.xml
1 file changed, 45 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/8418/2
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I731eb0e029dc3cc251f4df0c5a8ad281c81595cb
Gerrit-Change-Number: 8418
Gerrit-PatchSet: 2
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness

2017-10-30 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8418 )

Change subject: IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet 
correctness
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8418/1/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

http://gerrit.cloudera.org:8080/#/c/8418/1/docs/topics/impala_known_issues.xml@936
PS1, Line 936: Examine the HDFS_SCAN_NODE portion 
of a query profile that scans the
This unfortunately won't give accurate info for all queries: if the query isn't 
materialising any columns (e.g. count(*)) or the file is filtered out by 
runtime filters, the file compression reported in the profile was inaccurate in 
previous versions - see IMPALA-5311 and IMPALA-4863 respectively.

One way to tell for sure is to run something like "select * from table" and 
then look at the profile. Or, say, "select min(string_col) from table".
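
As a rough illustration of the check described above, here is a minimal Python 
sketch that scans saved profile text for uncompressed Parquet files. It assumes 
the scan node reports a line of the form "File Formats: PARQUET/NONE:12 
PARQUET/SNAPPY:3" (format/codec followed by a file count); that exact layout and 
the function name are assumptions for illustration, not something confirmed in 
this review. The profile should come from a query that materializes columns, as 
noted above.

```python
import re

# Assumed shape of the profile line (not confirmed by this thread), e.g.
#   "- File Formats: PARQUET/NONE:12 PARQUET/SNAPPY:3"
FILE_FORMATS_RE = re.compile(r"File Formats:\s*(.+)")


def uncompressed_parquet_counts(profile_text):
    """Return the PARQUET/NONE file count from each 'File Formats' line."""
    counts = []
    for match in FILE_FORMATS_RE.finditer(profile_text):
        # Each whitespace-separated entry is assumed to look like FORMAT/CODEC:COUNT.
        for entry in match.group(1).split():
            fmt, _, count = entry.rpartition(":")
            if fmt == "PARQUET/NONE" and count.isdigit():
                counts.append(int(count))
    return counts
```

An empty result only means no PARQUET/NONE entry was found in the given text; 
it is not proof that the table is unaffected.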


http://gerrit.cloudera.org:8080/#/c/8418/1/docs/topics/impala_known_issues.xml@937
PS1, Line 937: suspected table. Look for File Formats. A 
value containing PARQUET/NONE
It might be helpful to note common cases where uncompressed Parquet is/isn't 
created. Impala generates Snappy-compressed Parquet by default unless 
compression_codec is changed. Most uncompressed Parquet we see in the wild is 
generated by Hive or other non-Impala tools.



--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I731eb0e029dc3cc251f4df0c5a8ad281c81595cb
Gerrit-Change-Number: 8418
Gerrit-PatchSet: 1
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 30 Oct 2017 21:35:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness

2017-10-30 Thread John Russell (Code Review)
John Russell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8418 )

Change subject: IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet 
correctness
..


Patch Set 1:

This was announced by Cloudera as TSB-225. Traditionally, TSBs get an 
equivalent Known Issue or similar in release notes. I think the best approach 
is to make a version-agnostic issue in upstream docs and be more explicit about 
fixed CDH maintenance releases in the equivalent downstream note.


--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I731eb0e029dc3cc251f4df0c5a8ad281c81595cb
Gerrit-Change-Number: 8418
Gerrit-PatchSet: 1
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 30 Oct 2017 21:20:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness

2017-10-30 Thread John Russell (Code Review)
John Russell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8418 )


Change subject: IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet 
correctness
..

IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness

Change-Id: I731eb0e029dc3cc251f4df0c5a8ad281c81595cb
---
M docs/topics/impala_known_issues.xml
1 file changed, 43 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/8418/1
--

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I731eb0e029dc3cc251f4df0c5a8ad281c81595cb
Gerrit-Change-Number: 8418
Gerrit-PatchSet: 1
Gerrit-Owner: John Russell