[Impala-ASF-CR] IMPALA-6290: limit ScannerContext to 1 buffer at a time

2018-01-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8814 )

Change subject: IMPALA-6290: limit ScannerContext to 1 buffer at a time
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.cc
File be/src/exec/scanner-context.cc:

http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.cc@99
PS6, Line 99: done ||
> Actually, doesn't line 93 always set io_buffer_pos_ to 0 when done is true
Oops, missed that. Done.



--
To view, visit http://gerrit.cloudera.org:8080/8814
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
Gerrit-Change-Number: 8814
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Wed, 10 Jan 2018 07:38:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6290: limit ScannerContext to 1 buffer at a time

2018-01-09 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Lars Volker, Zoram Thanga,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8814

to look at the new patch set (#9).

Change subject: IMPALA-6290: limit ScannerContext to 1 buffer at a time
..

IMPALA-6290: limit ScannerContext to 1 buffer at a time

This is a prerequisite for constraining the number of buffers per scan
range. Before this patch, calling ReadBytes(), SkipBytes(), etc could
cause an arbitrary number of I/O buffers to accumulate in
'completed_io_buffers_'. E.g. if we allocated 3 * 8MB I/O buffers for
a range and then called ReadBytes(30MB), we would hit resource
exhaustion as soon as 3 buffers were accumulated in
'completed_io_buffers_'.

The fix is to avoid accumulating any buffers in 'completed_io_buffers_'.
Instead of adding them to 'completed_io_buffers_', completed buffers
are just returned to the I/O manager. It turned out that this did not
weaken the ScannerContext's guarantees about memory lifetime, because
ScannerContext::GetBytesInternal() cleared 'boundary_buffer_' each
time it was called regardless. I checked that this behaviour wasn't
a bug by inspecting the scanner code. I could not find any cases
where scanners depended on returned memory remaining valid beyond
the next Read*()/Get*()/Skip*() call on the stream.

This change makes that lifetime explicit in the comments. A
side-effect of this fix is that scanners do not need to call
ReleaseCompletedResources() in CommitRows() and means that the
ScannerContext only ever needs to hold one I/O buffer at a time.

This change also reimplements SkipBytes() to avoid it accumulating
memory in the boundary buffer for large skip sizes.

Also clarifies some of the invariants in ScannerContext. E.g. some
places assumed io_buffer_ != NULL, but that is no longer needed.

Testing:
Ran core tests with ASAN and exhaustive tests with DEBUG.

Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/exec/scanner-context.inline.h
M be/src/runtime/io/disk-io-mgr.cc
M 
testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-scan-node-errors.test
11 files changed, 321 insertions(+), 210 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8814/9
--
To view, visit http://gerrit.cloudera.org:8080/8814
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
Gerrit-Change-Number: 8814
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoram Thanga 


[Impala-ASF-CR] IMPALA-1767: [DOCS] Document new Boolean operators

2018-01-09 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8942 )

Change subject: IMPALA-1767: [DOCS] Document new Boolean operators
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8942/3/docs/topics/impala_operators.xml
File docs/topics/impala_operators.xml:

http://gerrit.cloudera.org:8080/#/c/8942/3/docs/topics/impala_operators.xml@1245
PS3, Line 1245: In  and higher, you can use
  : the operators IS [NOT] TRUE and
  : IS [NOT] FALSE to perform 
null-safe
  : tests against Boolean expressions. These operators 
always
  : return TRUE or FALSE,
  : even if the other side of the expression evaluates to
  : NULL. These operators let you simplify
  : Boolean comparisons that must also check for 
NULL,
  : for example X != 10 AND X IS NOT NULL 
is equivalent
  : to (X != 10) IS TRUE.
  :   
why is this explained here for IS UNKNOWN ? its a repeat of the content for the 
IS TRUE section.


http://gerrit.cloudera.org:8080/#/c/8942/3/docs/topics/impala_operators.xml@1260
PS3, Line 1260: IS [NOT] NULL
same thing twice. perhaps the first one should be IS [NOT] UNKNOWN ?


http://gerrit.cloudera.org:8080/#/c/8942/3/docs/topics/impala_operators.xml@1332
PS3, Line 1332: UNKNOWN
TRUE ? (UNKNOWN is handled in the prev section)



--
To view, visit http://gerrit.cloudera.org:8080/8942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iefebf210418ec2d47b154bd37166b76720f085bb
Gerrit-Change-Number: 8942
Gerrit-PatchSet: 3
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 10 Jan 2018 06:49:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5528: Upgrade GPerfTools to 2.6.3 and tune TCMalloc for KRPC

2018-01-09 Thread Michael Ho (Code Review)
Hello Sailesh Mukil, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8991

to look at the new patch set (#2).

Change subject: IMPALA-5528: Upgrade GPerfTools to 2.6.3 and tune TCMalloc for 
KRPC
..

IMPALA-5528: Upgrade GPerfTools to 2.6.3 and tune TCMalloc for KRPC

KRPC in general tends to put more pressure on the thread
caches due to allocations of more small objects (i.e. <1MB).
While some of them are being addressed in KUDU-1865, we
notice that the following TCMalloc workarounds will provide
reasonable performance with KRPC:

- TCMALLOC_TRANSFER_NUM_OBJ:
  - maximum number of object per classe type to transfer between
thread and central caches.
  - the default value of 512 in 2.5.2 seems to cause the spin lock
in the central cache to be held for too long with KRPC. 2.6.0
and latter reverts this value to 32 by default.

- TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES
  - total amount of memory allocated to all thread caches in bytes
  - the default value is 32MB. We need to bump it to 1GB which is the
internal cap in TCMalloc.

This change upgrades GPerfTools/TCMalloc to 2.6.3 to pick up the
change of the default value of TCMALLOC_TRANSFER_NUM_OBJ.

In addition, when KRPC is enabled and 
FLAGS_TCMALLOC_MAX_TOTAL_THREAD_CAHCE_BYTES
has the default value of 0, we will automatically bump the thread cache sizes
to 1GB. Without these workarounds, stress test with KRPC will grind to a halt
due to contention for the spinlock in TCMalloc's central cache. With these
workarounds, the stress test completes within the same ballpark as thrift.

Also did a perf run with Thrift. The regression in TPCH-Q2 is mostly due to 
sensitivity in
runtime filter timing and the avg can be dragged up due to a bad run when 
filters arrive late.
No regression as measured in targeted-perf.

++---+-++++
| Workload   | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |
++---+-++++
| TPCH(_300) | parquet / none / none | 18.93   | -0.84% | 10.08  | 
+1.45% |
++---+-++++

++--+---++-++++-+---+
| Workload   | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%)  | Base StdDev(%) | Num Clients | Iters |
++--+---++-++++-+---+
| TPCH(_300) | TPCH-Q2  | parquet / none / none | 6.28   | 3.25| R 
+93.41%  | * 49.77% * | * 12.47% * | 1   | 3 |
| TPCH(_300) | TPCH-Q4  | parquet / none / none | 5.00   | 4.77|   
+4.83%   |   0.41%|   0.03%| 1   | 3 |
| TPCH(_300) | TPCH-Q13 | parquet / none / none | 21.29  | 20.69   |   
+2.90%   |   0.55%|   0.37%| 1   | 3 |
| TPCH(_300) | TPCH-Q11 | parquet / none / none | 1.73   | 1.71|   
+0.94%   |   1.69%|   2.85%| 1   | 3 |
| TPCH(_300) | TPCH-Q14 | parquet / none / none | 6.03   | 5.99|   
+0.76%   |   0.00%|   0.95%| 1   | 3 |
| TPCH(_300) | TPCH-Q16 | parquet / none / none | 6.97   | 6.93|   
+0.58%   |   0.74%|   0.73%| 1   | 3 |
| TPCH(_300) | TPCH-Q3  | parquet / none / none | 29.15  | 29.03   |   
+0.40%   |   1.63%|   1.39%| 1   | 3 |
| TPCH(_300) | TPCH-Q1  | parquet / none / none | 14.01  | 13.96   |   
+0.34%   |   1.28%|   0.51%| 1   | 3 |
| TPCH(_300) | TPCH-Q6  | parquet / none / none | 1.27   | 1.27|   
-0.03%   |   3.69%|   0.07%| 1   | 3 |
| TPCH(_300) | TPCH-Q9  | parquet / none / none | 30.99  | 31.13   |   
-0.45%   |   0.54%|   0.19%| 1   | 3 |
| TPCH(_300) | TPCH-Q5  | parquet / none / none | 48.03  | 48.33   |   
-0.63%   |   4.72%|   0.11%| 1   | 3 |
| TPCH(_300) | TPCH-Q7  | parquet / none / none | 46.85  | 47.41   |   
-1.18%   |   1.59%|   0.46%| 1   | 3 |
| TPCH(_300) | TPCH-Q8  | parquet / none / none | 7.92   | 8.03|   
-1.39%   |   3.67%|   5.63%| 1   | 3 |
| TPCH(_300) | TPCH-Q19 | parquet / none / none | 30.98  | 31.51   |   
-1.67%   |   1.33%|   0.82%| 1   | 3 |
| TPCH(_300) | TPCH-Q18 | parquet / none / none | 33.55  | 34.13   |   
-1.71%   |   1.15%|   1.46%| 1   | 3 |
| TPCH(_300) | TPCH-Q10 | parquet / none / none | 9.46   | 9.64|   
-1.82%   |   0.63%|   0.75%   

[Impala-ASF-CR] IMPALA-6371: Additional check for delimiters

2018-01-09 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8959 )

Change subject: IMPALA-6371: Additional check for delimiters
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8959/3/fe/src/main/java/org/apache/impala/catalog/HdfsStorageDescriptor.java
File fe/src/main/java/org/apache/impala/catalog/HdfsStorageDescriptor.java:

http://gerrit.cloudera.org:8080/#/c/8959/3/fe/src/main/java/org/apache/impala/catalog/HdfsStorageDescriptor.java@145
PS3, Line 145: // e.g. \u as delimVal will return a valid byte '11'
whitespace


http://gerrit.cloudera.org:8080/#/c/8959/3/fe/src/main/java/org/apache/impala/catalog/HdfsStorageDescriptor.java@147
PS3, Line 147: if (cp<0 || cp>255) return null;
style: if (cp < 0 || cp > 255) return null;


http://gerrit.cloudera.org:8080/#/c/8959/3/fe/src/main/java/org/apache/impala/catalog/HdfsStorageDescriptor.java@148
PS3, Line 148: return (byte) delimVal.charAt(0);
return (byte) cp;


http://gerrit.cloudera.org:8080/#/c/8959/3/fe/src/test/java/org/apache/impala/catalog/HdfsStorageDescriptorTest.java
File fe/src/test/java/org/apache/impala/catalog/HdfsStorageDescriptorTest.java:

http://gerrit.cloudera.org:8080/#/c/8959/3/fe/src/test/java/org/apache/impala/catalog/HdfsStorageDescriptorTest.java@161
PS3, Line 161: sd.getSerdeInfo().setParameters(new 
HashMap());
brief comment what this is testing


http://gerrit.cloudera.org:8080/#/c/8959/3/fe/src/test/java/org/apache/impala/catalog/HdfsStorageDescriptorTest.java@172
PS3, Line 172: sd.getSerdeInfo().setParameters(new 
HashMap());
brief comment what this is testing



--
To view, visit http://gerrit.cloudera.org:8080/8959
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8dc335d39dd02f602cf93682bccf84b2c099dde
Gerrit-Change-Number: 8959
Gerrit-PatchSet: 3
Gerrit-Owner: Adam Holley 
Gerrit-Reviewer: Adam Holley 
Gerrit-Reviewer: Alex Behm 
Gerrit-Comment-Date: Wed, 10 Jan 2018 05:51:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2181: [DOCS] Document changes to SET output

2018-01-09 Thread John Russell (Code Review)
John Russell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8865 )

Change subject: IMPALA-2181: [DOCS] Document changes to SET output
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8865/1/docs/topics/impala_set.xml
File docs/topics/impala_set.xml:

http://gerrit.cloudera.org:8080/#/c/8865/1/docs/topics/impala_set.xml@60
PS1, Line 60: Regular Query
> You can call these "Regular query options"
Done


http://gerrit.cloudera.org:8080/#/c/8865/1/docs/topics/impala_set.xml@90
PS1, Line 90: JDBC or ODBC
> On these interfaces besides adding the 'level' column, this "SET" vs "SET A
Done



--
To view, visit http://gerrit.cloudera.org:8080/8865
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iade7cb326715ebbb8518230d518d05601d615f61
Gerrit-Change-Number: 8865
Gerrit-PatchSet: 2
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 10 Jan 2018 05:48:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2181: [DOCS] Document changes to SET output

2018-01-09 Thread John Russell (Code Review)
Hello Gabor Kaszab, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8865

to look at the new patch set (#2).

Change subject: IMPALA-2181: [DOCS] Document changes to SET output
..

IMPALA-2181: [DOCS] Document changes to SET output

Change-Id: Iade7cb326715ebbb8518230d518d05601d615f61
---
M docs/topics/impala_set.xml
1 file changed, 75 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/8865/2
--
To view, visit http://gerrit.cloudera.org:8080/8865
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iade7cb326715ebbb8518230d518d05601d615f61
Gerrit-Change-Number: 8865
Gerrit-PatchSet: 2
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5014: Part 2: Round when casting decimal to timestamp

2018-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8969 )

Change subject: IMPALA-5014: Part 2: Round when casting decimal to timestamp
..

IMPALA-5014: Part 2: Round when casting decimal to timestamp

When there are too many digits to the right of the dot in a decimal, we
would always truncate when casting to timestamp. In this patch we change
the behavior to round instead of truncating when decimal_v2 is enabled.

Testing:
- Added some EE tests, ran BE tests on my machine.

Change-Id: I8fb3a7d976ab980b8572d7e9524850572bad57da
Reviewed-on: http://gerrit.cloudera.org:8080/8969
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/exprs/decimal-operators-ir.cc
M be/src/exprs/decimal-operators.h
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test
5 files changed, 83 insertions(+), 15 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8969
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I8fb3a7d976ab980b8572d7e9524850572bad57da
Gerrit-Change-Number: 8969
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5014: Part 2: Round when casting decimal to timestamp

2018-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8969 )

Change subject: IMPALA-5014: Part 2: Round when casting decimal to timestamp
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8969
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8fb3a7d976ab980b8572d7e9524850572bad57da
Gerrit-Change-Number: 8969
Gerrit-PatchSet: 2
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 10 Jan 2018 05:47:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1767: [DOCS] Document new Boolean operators

2018-01-09 Thread John Russell (Code Review)
John Russell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8942 )

Change subject: IMPALA-1767: [DOCS] Document new Boolean operators
..


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8942/2/docs/shared/impala_common.xml
File docs/shared/impala_common.xml:

http://gerrit.cloudera.org:8080/#/c/8942/2/docs/shared/impala_common.xml@779
PS2, Line 779: IS [NOT] FALSE as equivalents for the 
functions
> "builtin functions" instead of "functions" to match the blurb in the detail
Done


http://gerrit.cloudera.org:8080/#/c/8942/2/docs/topics/impala_operators.xml
File docs/topics/impala_operators.xml:

http://gerrit.cloudera.org:8080/#/c/8942/2/docs/topics/impala_operators.xml@1251
PS2, Line 1251: lets
> nit: let
Done


http://gerrit.cloudera.org:8080/#/c/8942/2/docs/topics/impala_operators.xml@1320
PS2, Line 1320: lets
> nit: let
Done


http://gerrit.cloudera.org:8080/#/c/8942/2/docs/topics/impala_operators.xml@1327
PS2, Line 1327: These operators are equivalent to the built-in conditional 
functions
> sync this blurb with the one in the index (see comment there).
Done. I'll reuse the wording verbatim with a conref= attribute.


http://gerrit.cloudera.org:8080/#/c/8942/2/docs/topics/impala_operators.xml@1337
PS2, Line 1337: query error
> must be same error as for "IS [NOT] NULL" for complex types? perhaps its le
Done


http://gerrit.cloudera.org:8080/#/c/8942/2/docs/topics/impala_operators.xml@1345
PS2, Line 1345: select assertion, b, b is true, b is false, b is unknown
  :   from boolean_test;
> just for my own info, but why is this lower-case whereas sql inlined in tex
I usually save the strict uppercase-keyword style for the detailed syntax 
blocks. Especially for built-in functions, I use mostly lowercase to be 
modern-looking like C code.



--
To view, visit http://gerrit.cloudera.org:8080/8942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iefebf210418ec2d47b154bd37166b76720f085bb
Gerrit-Change-Number: 8942
Gerrit-PatchSet: 2
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 10 Jan 2018 05:42:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1767: [DOCS] Document new Boolean operators

2018-01-09 Thread John Russell (Code Review)
Hello Greg Rahn, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8942

to look at the new patch set (#3).

Change subject: IMPALA-1767: [DOCS] Document new Boolean operators
..

IMPALA-1767: [DOCS] Document new Boolean operators

In a new subtopic:

IS [NOT] TRUE
IS [NOT] FALSE

Folded into IS [NOT] NULL:

IS [NOT] UNKNOWN
Change-Id: Iefebf210418ec2d47b154bd37166b76720f085bb
---
M docs/impala_keydefs.ditamap
M docs/shared/impala_common.xml
M docs/topics/impala_conditional_functions.xml
M docs/topics/impala_operators.xml
4 files changed, 129 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/8942/3
--
To view, visit http://gerrit.cloudera.org:8080/8942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iefebf210418ec2d47b154bd37166b76720f085bb
Gerrit-Change-Number: 8942
Gerrit-PatchSet: 3
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-6128: Add support for AES-CTR encryption when spilling to disk

2018-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8861 )

Change subject: IMPALA-6128: Add support for AES-CTR encryption when spilling 
to disk
..


Patch Set 7: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8861
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9debc240615dd8cdbf00ec8730cff62ffef52aff
Gerrit-Change-Number: 8861
Gerrit-PatchSet: 7
Gerrit-Owner: Xianda Ke 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Xianda Ke 
Gerrit-Comment-Date: Wed, 10 Jan 2018 05:39:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6128: Add support for AES-CTR encryption when spilling to disk

2018-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8861 )

Change subject: IMPALA-6128: Add support for AES-CTR encryption when spilling 
to disk
..

IMPALA-6128: Add support for AES-CTR encryption when spilling to disk

CFB mode is a stream cipher and is secure when used with a different nonce/IV
for every message. However it can be a performance bottleneck.
CTR mode is also stream cipher and is secure, 4~6x faster than CFB mode in
OpenSSL. AES-CTR+SHA256 is about 40~70% faster than AES-CFB+SHA256.

CTR mode is used if OpenSSL version>=1.0.1 at runtime, otherwise
fall back to using CFB mode.

Testing:
run runtime tmp-file-mgr-test, openssl-util-test, buffer-pool-test and
buffered-tuple-stream-test
The ut case openssl-util-test.EncryptInPlace tests encryption in both modes.

Change-Id: I9debc240615dd8cdbf00ec8730cff62ffef52aff
Reviewed-on: http://gerrit.cloudera.org:8080/8861
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/runtime/tmp-file-mgr.cc
M be/src/util/openssl-util-test.cc
M be/src/util/openssl-util.cc
M be/src/util/openssl-util.h
4 files changed, 94 insertions(+), 44 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8861
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I9debc240615dd8cdbf00ec8730cff62ffef52aff
Gerrit-Change-Number: 8861
Gerrit-PatchSet: 8
Gerrit-Owner: Xianda Ke 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Xianda Ke 


[Impala-ASF-CR] IMPALA-5990: End-to-end compression of metadata

2018-01-09 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8825 )

Change subject: IMPALA-5990: End-to-end compression of metadata
..


Patch Set 4:

(34 comments)

General approach and flow looks good to me.

I'm still thinking about details and testing.

http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/catalog/CMakeLists.txt
File be/src/catalog/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/catalog/CMakeLists.txt@29
PS4, Line 29: ADD_BE_TEST(catalog-util-test)
I think you forgot to add the .cc file


http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/catalog/catalog-server.h
File be/src/catalog/catalog-server.h:

http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/catalog/catalog-server.h@78
PS4, Line 78:   void AddTopicUpdate(std::string key, const uint8_t* buf, 
uint32_t size, bool deleted);
* AddPendingTopicItem()
(and corresponding change to comment, see a comment elsewhere explaining our 
use of "topic" and "item")
* Pass the key as reference? Seems clearer what is happening. Also comment that 
the function takes ownership of the key memory
* 'buf' needs a better name


http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/catalog/catalog-server.cc@245
PS4, Line 245:   update.topic_entries.emplace_back();
Why not this instead of the loop:

update.topic_entries = std::move(pending_topic_updates_);


http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/catalog/catalog-util.h
File be/src/catalog/catalog-util.h:

http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/catalog/catalog-util.h@44
PS4, Line 44: Status CompressCatalogObject(const uint8_t* buf, uint32_t size,
Use src/dst as arg names. Same for decompress below.


http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/catalog/catalog-util.cc
File be/src/catalog/catalog-util.cc:

http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/catalog/catalog-util.cc@145
PS4, Line 145:   output->resize(static_cast(output_buffer_len));
why static_cast?


http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/catalog/catalog-util.cc@162
PS4, Line 162:   output->resize(static_cast(decompressed_len));
why static_cast?


http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/catalog/catalog.h
File be/src/catalog/catalog.h:

http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/catalog/catalog.h@64
PS4, Line 64:   Status GetCatalogDelta(void* server_ptr, int64_t from_version,
"CatalogServer* caller" instead of "void* server_ptr", seems clearer, you can 
forward declare class CatalogServer to avoid the include


http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/service/fe-support.cc
File be/src/service/fe-support.cc:

http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/service/fe-support.cc@429
PS4, Line 429: jclass caller_class, jlong callback_ctx, jbyteArray key,
Currently, we get the key bytes via String.getBytes() which I believe must 
create a new byte[].

If we pass the key as a jstring we can avoid a copy in most cases.


http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/service/fe-support.cc@432
PS4, Line 432:   signed char* key_buf = env->GetByteArrayElements(key, nullptr);
I think we should consider using the "critical" version of this function. See 
JniScopedArrayCritical from:
https://gerrit.cloudera.org/#/c/8150/6/be/src/util/jni-util.h

That patch never made it in, you could just copy the code.


http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/service/fe-support.cc@446
PS4, Line 446: 
Java_org_apache_impala_service_FeSupport_NativeGetCatalogUpdate(JNIEnv* env,
NativeGetNextCatalogTopicItem() or similar

the name should convey that this is a "get next" type of call and what exactly 
is being iterated over


http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/service/fe-support.cc@454
PS4, Line 454: jclass caller_class, jbyteArray hdfs_location) {
why not "jstring hdfs_location"


http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/service/fe-support.cc@465
PS4, Line 465: jclass caller_class, jbyteArray hdfs_lib_file) {
why not "jstring hdfs_lib_file"


http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/service/fe-support.cc@551
PS4, Line 551:   (char*)"NativeAddCatalogUpdate", (char*)"(J[B[BZ)V",
make indentation consistent


http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/service/frontend.h
File be/src/service/frontend.h:

http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/service/frontend.h@41
PS4, Line 41:   /// Request to update the Impalad catalog cache. The req 
argument contains a vector of
Update comment on the 'req' arg


http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/service/frontend.h@214
PS4, Line 214: // A helper class used to pass catalog object updates to the 
frontend.
/// comment style here and below


http://gerrit.cloudera.org:8080/#/c/8825/4/be/src/service/frontend.h@217
PS4, Line 217:   // Return the next catalog object 

[Impala-ASF-CR] [DOCS] Recommend using Kudu Java API for rapid DMLs

2018-01-09 Thread John Russell (Code Review)
John Russell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8976 )

Change subject: [DOCS] Recommend using Kudu Java API for rapid DMLs
..


Patch Set 1:

JD are you an Impala committer or should I hit up someone else for a +2?


--
To view, visit http://gerrit.cloudera.org:8080/8976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0098f0c3d5d07c89e6bb589c4c04edce300c1ad3
Gerrit-Change-Number: 8976
Gerrit-PatchSet: 1
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Wed, 10 Jan 2018 05:19:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5736: [DOCS] Document --query option for impala-shell

2018-01-09 Thread John Russell (Code Review)
Hello Lars Volker, David Knupp, Csaba Ringhofer,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8771

to look at the new patch set (#3).

Change subject: IMPALA-5736: [DOCS] Document --query_option for impala-shell
..

IMPALA-5736: [DOCS] Document --query_option for impala-shell

Change-Id: I5fa4fc27d6566e87fdabe57edc176133d586a84b
---
M docs/impala_keydefs.ditamap
M docs/topics/impala_query_options.xml
M docs/topics/impala_set.xml
M docs/topics/impala_shell_options.xml
4 files changed, 39 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/8771/3
--
To view, visit http://gerrit.cloudera.org:8080/8771
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5fa4fc27d6566e87fdabe57edc176133d586a84b
Gerrit-Change-Number: 8771
Gerrit-PatchSet: 3
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-5736: [DOCS] Document --query option for impala-shell

2018-01-09 Thread John Russell (Code Review)
John Russell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8771 )

Change subject: IMPALA-5736: [DOCS] Document --query_option for impala-shell
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8771/2/docs/topics/impala_shell_options.xml
File docs/topics/impala_shell_options.xml:

http://gerrit.cloudera.org:8080/#/c/8771/2/docs/topics/impala_shell_options.xml@284
PS2, Line 284:   
--query_option="option=value"
> Maybe the short form -Q should be also mentioned here - as I see, both form
Done



--
To view, visit http://gerrit.cloudera.org:8080/8771
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5fa4fc27d6566e87fdabe57edc176133d586a84b
Gerrit-Change-Number: 8771
Gerrit-PatchSet: 3
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Wed, 10 Jan 2018 05:14:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5317: [DOCS] Doc for DATE TRUNC() function

2018-01-09 Thread John Russell (Code Review)
John Russell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8768 )

Change subject: IMPALA-5317: [DOCS] Doc for DATE_TRUNC() function
..


Patch Set 5:

Tagging in Tim who reviewed the code. Some other committers not available to +2.


--
To view, visit http://gerrit.cloudera.org:8080/8768
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifcf38903bb10db12cbb8d73a2dc875aef29cd359
Gerrit-Change-Number: 8768
Gerrit-PatchSet: 5
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: sandeep akinapelli 
Gerrit-Comment-Date: Wed, 10 Jan 2018 04:47:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6278: [DOCS] Add release note subtopics

2018-01-09 Thread John Russell (Code Review)
John Russell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8992 )

Change subject: IMPALA-6278: [DOCS] Add release note subtopics
..


Patch Set 1:

I would tag in Greg but he's away until the 15th.


--
To view, visit http://gerrit.cloudera.org:8080/8992
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I968f53c6652197774cdec364c47bc10277e6877a
Gerrit-Change-Number: 8992
Gerrit-PatchSet: 1
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Comment-Date: Wed, 10 Jan 2018 04:39:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6278: [DOCS] Add release note subtopics

2018-01-09 Thread John Russell (Code Review)
John Russell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8992


Change subject: IMPALA-6278: [DOCS] Add release note subtopics
..

IMPALA-6278: [DOCS] Add release note subtopics

Primarily placeholders that link to the 2.11
CHANGELOG file on the web.

Change-Id: I968f53c6652197774cdec364c47bc10277e6877a
---
M docs/impala_keydefs.ditamap
M docs/topics/impala_fixed_issues.xml
M docs/topics/impala_incompatible_changes.xml
M docs/topics/impala_new_features.xml
4 files changed, 50 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/8992/1
--
To view, visit http://gerrit.cloudera.org:8080/8992
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I968f53c6652197774cdec364c47bc10277e6877a
Gerrit-Change-Number: 8992
Gerrit-PatchSet: 1
Gerrit-Owner: John Russell 


[Impala-ASF-CR] [DOCS] Include the current Impala version number on PDF title page

2018-01-09 Thread John Russell (Code Review)
John Russell has abandoned this change. ( http://gerrit.cloudera.org:8080/8648 )

Change subject: [DOCS] Include the current Impala version number on PDF title 
page
..


Abandoned
--
To view, visit http://gerrit.cloudera.org:8080/8648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I66d6669769d2aa22602057bfcf5b29307e2a0fa2
Gerrit-Change-Number: 8648
Gerrit-PatchSet: 1
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Laurel Hale 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-4993: extend dictionary filtering to collections

2018-01-09 Thread Vuk Ercegovac (Code Review)
Hello Lars Volker, Tim Armstrong, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8775

to look at the new patch set (#8).

Change subject: IMPALA-4993: extend dictionary filtering to collections
..

IMPALA-4993: extend dictionary filtering to collections

Currently, top-level scalar columns in parquet files can
be used at runtime to prune row-groups by evaluating certain
conjuncts over the column's dictionary (if available).

This change extends such pruning to scalar values that are
stored in collection type columns. Currently, dictionary
pruning works by finding eligible conjuncts for top-level
slots. Since only top-level slots are supported, the slots
are implicitly part of the scan node's tuple descriptor.
With this change, we track eligible conjuncts by slot as well
as the tuple that contains the slot (either top-level or
nested collection). Since collection conjuncts are already
managed by a map that associates tuple descriptors to a list
of their conjuncts, this extension follows the existing
representation.

The frontend builds the mapping of  to
conjuncts that are dictionary filterable. The backend is
adjusted to use the same representation. In addition, collection
readers are decomposed into scalar filterable columns and
other, non-dictionary filterable readers. When filtering
a row group using a conjunct associated to a (possibly)
nested collection type, an additional tuple buffer is
allocated per tuple descriptor.

Testing:
- e2e test extended to illustrate row-groups that are pruned
  by nested collection dictionary filters.

Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/parquet-column-readers.cc
M be/src/runtime/collection-value-builder.h
M be/src/runtime/scoped-buffer.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test
15 files changed, 670 insertions(+), 218 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/8775/8
--
To view, visit http://gerrit.cloudera.org:8080/8775
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd
Gerrit-Change-Number: 8775
Gerrit-PatchSet: 8
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-4993: extend dictionary filtering to collections

2018-01-09 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8775 )

Change subject: IMPALA-4993: extend dictionary filtering to collections
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8775/6/be/src/exec/hdfs-parquet-scanner.h
File be/src/exec/hdfs-parquet-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8775/6/be/src/exec/hdfs-parquet-scanner.h@462
PS6, Line 462:   /// Memory used to store the tuples used for dictionary 
filtering. Tuples owned by
> A comment in ScopedBuffer would be helpful. The only time I think it is rea
Done



--
To view, visit http://gerrit.cloudera.org:8080/8775
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd
Gerrit-Change-Number: 8775
Gerrit-PatchSet: 7
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 10 Jan 2018 03:58:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5528: Tune TCMalloc for KRPC

2018-01-09 Thread Michael Ho (Code Review)
Michael Ho has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8991


Change subject: IMPALA-5528: Tune TCMalloc for KRPC
..

IMPALA-5528: Tune TCMalloc for KRPC

KRPC in general tends to put more pressure on the thread
caches due to allocations of more small objects (i.e. <1MB).
While some of them are being addressed in KUDU-1865, we
notice that the following TCMalloc workarounds will provide
reasonable performance with KRPC:

- TCMALLOC_TRANSFER_NUM_OBJ:
  - maximum number of object per classe type to transfer between
thread and central caches.
  - the default value of 512 in 2.5.2 seems to cause the spin lock
in the central cache to be held for too long with KRPC. 2.6.0
and latter reverts this value to 32 by default.

- TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES
  - total amount of memory allocated to all thread caches in bytes
  - the default value is 32MB. We need to bump it to 1GB which is the
internal cap in TCMalloc.

This change upgrades GPerfTools/TCMalloc to 2.6.3 to pick up the
change of the default value of TCMALLOC_TRANSFER_NUM_OBJ.

In addition, when KRPC is enabled and 
FLAGS_TCMALLOC_MAX_TOTAL_THREAD_CAHCE_BYTES
has the default value of 0, we will automatically bump the thread cache sizes
to 1GB. Without these workarounds, stress test with KRPC will grind to a halt
due to unfriendly behavior with thread caches. With these workarounds, the 
stress
test completes within the same ballpark as thrift.

Also did a perf run with Thrift. The regression in TPCH-Q2 is mostly due to 
sensitivity in
runtime filter timing and the avg can be dragged up due to a bad run when 
filters arrive late.
No regression as measured in targeted-perf.

++---+-++++
| Workload   | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |
++---+-++++
| TPCH(_300) | parquet / none / none | 18.93   | -0.84% | 10.08  | 
+1.45% |
++---+-++++

++--+---++-++++-+---+
| Workload   | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%)  | Base StdDev(%) | Num Clients | Iters |
++--+---++-++++-+---+
| TPCH(_300) | TPCH-Q2  | parquet / none / none | 6.28   | 3.25| R 
+93.41%  | * 49.77% * | * 12.47% * | 1   | 3 |
| TPCH(_300) | TPCH-Q4  | parquet / none / none | 5.00   | 4.77|   
+4.83%   |   0.41%|   0.03%| 1   | 3 |
| TPCH(_300) | TPCH-Q13 | parquet / none / none | 21.29  | 20.69   |   
+2.90%   |   0.55%|   0.37%| 1   | 3 |
| TPCH(_300) | TPCH-Q11 | parquet / none / none | 1.73   | 1.71|   
+0.94%   |   1.69%|   2.85%| 1   | 3 |
| TPCH(_300) | TPCH-Q14 | parquet / none / none | 6.03   | 5.99|   
+0.76%   |   0.00%|   0.95%| 1   | 3 |
| TPCH(_300) | TPCH-Q16 | parquet / none / none | 6.97   | 6.93|   
+0.58%   |   0.74%|   0.73%| 1   | 3 |
| TPCH(_300) | TPCH-Q3  | parquet / none / none | 29.15  | 29.03   |   
+0.40%   |   1.63%|   1.39%| 1   | 3 |
| TPCH(_300) | TPCH-Q1  | parquet / none / none | 14.01  | 13.96   |   
+0.34%   |   1.28%|   0.51%| 1   | 3 |
| TPCH(_300) | TPCH-Q6  | parquet / none / none | 1.27   | 1.27|   
-0.03%   |   3.69%|   0.07%| 1   | 3 |
| TPCH(_300) | TPCH-Q9  | parquet / none / none | 30.99  | 31.13   |   
-0.45%   |   0.54%|   0.19%| 1   | 3 |
| TPCH(_300) | TPCH-Q5  | parquet / none / none | 48.03  | 48.33   |   
-0.63%   |   4.72%|   0.11%| 1   | 3 |
| TPCH(_300) | TPCH-Q7  | parquet / none / none | 46.85  | 47.41   |   
-1.18%   |   1.59%|   0.46%| 1   | 3 |
| TPCH(_300) | TPCH-Q8  | parquet / none / none | 7.92   | 8.03|   
-1.39%   |   3.67%|   5.63%| 1   | 3 |
| TPCH(_300) | TPCH-Q19 | parquet / none / none | 30.98  | 31.51   |   
-1.67%   |   1.33%|   0.82%| 1   | 3 |
| TPCH(_300) | TPCH-Q18 | parquet / none / none | 33.55  | 34.13   |   
-1.71%   |   1.15%|   1.46%| 1   | 3 |
| TPCH(_300) | TPCH-Q10 | parquet / none / none | 9.46   | 9.64|   
-1.82%   |   0.63%|   0.75%| 1   | 3 |
| TPCH(_300) | TPCH-Q22 | parquet / none / none | 6.00   | 6.16|   
-2.58%   |   0.08%|   5.12%| 1 

[Impala-ASF-CR] IMPALA-6231: Implement decimal v2 fuzz test

2018-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8898 )

Change subject: IMPALA-6231: Implement decimal_v2 fuzz test
..

IMPALA-6231: Implement decimal_v2 fuzz test

Implement a test that generates random decimal numbers in the pytest
framework, performs a random mathemtaical operation in Impala and
verifies that the result is correct by doing the same operating using
the Python decimal module. We try to generate not only completely random
decimal numbers, but also numbers that have interesting properties, such
as the number being a power of two.

Change-Id: I4328125de5c583ec8ead1f78d9a08703b18b2d85
Reviewed-on: http://gerrit.cloudera.org:8080/8898
Reviewed-by: Michael Brown 
Reviewed-by: Zach Amsden 
Tested-by: Impala Public Jenkins
---
A tests/query_test/test_decimal_fuzz.py
1 file changed, 248 insertions(+), 0 deletions(-)

Approvals:
  Michael Brown: Looks good to me, but someone else must approve
  Zach Amsden: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I4328125de5c583ec8ead1f78d9a08703b18b2d85
Gerrit-Change-Number: 8898
Gerrit-PatchSet: 4
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-4168: Adds Oracle-style hint placement for INSERT/UPSERT

2018-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8676 )

Change subject: IMPALA-4168: Adds Oracle-style hint placement for INSERT/UPSERT
..


Patch Set 11: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/8676
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ied7629d70197a0270cdc0853e00cc021fdb4dc20
Gerrit-Change-Number: 8676
Gerrit-PatchSet: 11
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 10 Jan 2018 03:03:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5014: Part 2: Round when casting decimal to timestamp

2018-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8969 )

Change subject: IMPALA-5014: Part 2: Round when casting decimal to timestamp
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1697/


--
To view, visit http://gerrit.cloudera.org:8080/8969
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8fb3a7d976ab980b8572d7e9524850572bad57da
Gerrit-Change-Number: 8969
Gerrit-PatchSet: 2
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 10 Jan 2018 02:02:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4252: [DOCS] Document min/max filters for Kudu tables

2018-01-09 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8986 )

Change subject: IMPALA-4252: [DOCS] Document min/max filters for Kudu tables
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/8986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15d8c952ab5b90e89fdd57640dfb4da882f7ecb2
Gerrit-Change-Number: 8986
Gerrit-PatchSet: 1
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 10 Jan 2018 02:00:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6128: Add support for AES-CTR encryption when spilling to disk

2018-01-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8861 )

Change subject: IMPALA-6128: Add support for AES-CTR encryption when spilling 
to disk
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8861
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9debc240615dd8cdbf00ec8730cff62ffef52aff
Gerrit-Change-Number: 8861
Gerrit-PatchSet: 7
Gerrit-Owner: Xianda Ke 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Xianda Ke 
Gerrit-Comment-Date: Wed, 10 Jan 2018 01:56:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5014: Part 2: Round when casting decimal to timestamp

2018-01-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8969 )

Change subject: IMPALA-5014: Part 2: Round when casting decimal to timestamp
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8969
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8fb3a7d976ab980b8572d7e9524850572bad57da
Gerrit-Change-Number: 8969
Gerrit-PatchSet: 2
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 10 Jan 2018 01:57:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6128: Add support for AES-CTR encryption when spilling to disk

2018-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8861 )

Change subject: IMPALA-6128: Add support for AES-CTR encryption when spilling 
to disk
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1696/


--
To view, visit http://gerrit.cloudera.org:8080/8861
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9debc240615dd8cdbf00ec8730cff62ffef52aff
Gerrit-Change-Number: 8861
Gerrit-PatchSet: 7
Gerrit-Owner: Xianda Ke 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Xianda Ke 
Gerrit-Comment-Date: Wed, 10 Jan 2018 01:56:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6128: Add support for AES-CTR encryption when spilling to disk

2018-01-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#7) to the change originally 
created by Xianda Ke. ( http://gerrit.cloudera.org:8080/8861 )

Change subject: IMPALA-6128: Add support for AES-CTR encryption when spilling 
to disk
..

IMPALA-6128: Add support for AES-CTR encryption when spilling to disk

CFB mode is a stream cipher and is secure when used with a different nonce/IV
for every message. However it can be a performance bottleneck.
CTR mode is also stream cipher and is secure, 4~6x faster than CFB mode in
OpenSSL. AES-CTR+SHA256 is about 40~70% faster than AES-CFB+SHA256.

CTR mode is used if OpenSSL version>=1.0.1 at runtime, otherwise
fall back to using CFB mode.

Testing:
run runtime tmp-file-mgr-test, openssl-util-test, buffer-pool-test and
buffered-tuple-stream-test
The ut case openssl-util-test.EncryptInPlace tests encryption in both modes.

Change-Id: I9debc240615dd8cdbf00ec8730cff62ffef52aff
---
M be/src/runtime/tmp-file-mgr.cc
M be/src/util/openssl-util-test.cc
M be/src/util/openssl-util.cc
M be/src/util/openssl-util.h
4 files changed, 94 insertions(+), 44 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/8861/7
--
To view, visit http://gerrit.cloudera.org:8080/8861
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9debc240615dd8cdbf00ec8730cff62ffef52aff
Gerrit-Change-Number: 8861
Gerrit-PatchSet: 7
Gerrit-Owner: Xianda Ke 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Xianda Ke 


[Impala-ASF-CR] IMPALA-5014: Part 2: Round when casting decimal to timestamp

2018-01-09 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/8969 )

Change subject: IMPALA-5014: Part 2: Round when casting decimal to timestamp
..

IMPALA-5014: Part 2: Round when casting decimal to timestamp

When there are too many digits to the right of the dot in a decimal, we
would always truncate when casting to timestamp. In this patch we change
the behavior to round instead of truncating when decimal_v2 is enabled.

Testing:
- Added some EE tests, ran BE tests on my machine.

Change-Id: I8fb3a7d976ab980b8572d7e9524850572bad57da
---
M be/src/exprs/decimal-operators-ir.cc
M be/src/exprs/decimal-operators.h
M be/src/runtime/timestamp-test.cc
M be/src/runtime/timestamp-value.h
M testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test
5 files changed, 83 insertions(+), 15 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/69/8969/2
--
To view, visit http://gerrit.cloudera.org:8080/8969
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8fb3a7d976ab980b8572d7e9524850572bad57da
Gerrit-Change-Number: 8969
Gerrit-PatchSet: 2
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-6290: limit ScannerContext to 1 buffer at a time

2018-01-09 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8814 )

Change subject: IMPALA-6290: limit ScannerContext to 1 buffer at a time
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.cc
File be/src/exec/scanner-context.cc:

http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.cc@99
PS6, Line 99: done ||
> That's only true if the caller read all the way to the end of the stream. T
Actually, doesn't line 93 always set io_buffer_pos_ to 0 when done is true ?

So, I don't understand why this cannot be simplified to if (io_buffer_ != 
nullptr && io_buffer_bytes_left_ == 0) ?



--
To view, visit http://gerrit.cloudera.org:8080/8814
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
Gerrit-Change-Number: 8814
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Wed, 10 Jan 2018 01:24:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4323: "SET ROW FORMAT" option added to "ALTER TABLE" command

2018-01-09 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8928 )

Change subject: IMPALA-4323: "SET ROW FORMAT" option added to "ALTER TABLE" 
command
..


Patch Set 4:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/8928/3/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/8928/3/fe/src/main/cup/sql-parser.cup@1007
PS3, Line 1007: // row_format_val cannot be used as it conflicts with 
cache_op_val.
> When I dump the grammar, the following two lines seem to be the problem:
I think they conflict because cache_opt_val and row_format_val both accept 
empty.

Take a look at the hack in L1015. Maybe we can address this issue by 
distinguishing between optional and non-optional cache val and row format 
productions.


http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/analysis/AlterTableSetRowFormatStmt.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableSetRowFormatStmt.java:

http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/analysis/AlterTableSetRowFormatStmt.java@72
PS4, Line 72:   affectedPartitions = ((HdfsTable) tbl).getPartitions();
I think we should allow changing the table-level row format even if one of the 
partitions has an incompatible file format.

My understanding is that partitions generally adopt the file format (and row 
format) specified in at the table level unless explicitly overridden. The 
table-level row format should only be relevant for those child partitions that 
have a compatible format.

Allowing that alteration seems more flexible and consistent with allowing 
mixed-format partitions in the first place.

For example, the user could get into the same state by altering the table-level 
row format first and then altering the partition formats. It's not clear why 
altering the partition formats first and then the table-level row format should 
not work.


http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@460
PS4, Line 460:   reloadFileMetadata = alterTableSetRowFormat(tbl,
I wonder why we need to reload the file metadata for this operation and SET 
FILEFORMAT. Maybe this flag is poorly named or it's use is ill-defined in 
HdfsTable.load().

Do you know why we need to set reloadFileMetadata to true sometimes?


http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2266
PS4, Line 2266:* changing the row format the table metadata is marked as 
invalid and will be
I don't think this part is correct. If we return true the file metadata is 
reloaded as part of this alteration operation, and not when the table is 
accessed next.

The comment in alterTableSetFileFormat() is also wrong.


http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2267
PS4, Line 2267:* reloaded on the next access.  The return value is true if 
the fileMetadata needs
Why would we even need to reload the file metadata?


http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2285
PS4, Line 2285: if (tbl instanceof HdfsTable)
style nit: we always use { } for multi-line if


http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2290
PS4, Line 2290: Preconditions.checkArgument(tbl instanceof HdfsTable);
This can go at the top if this function.


http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@471
PS4, Line 471: AnalyzesOk("alter table functional_seq.alltypes set row 
format delimited " +
also tests for the PARTITION clause


http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@473
PS4, Line 473: AnalysisError("alter table functional_kudu.alltypes set row 
format delimited " +
test the PARTITION clause on an unpartitioned HDFS table


http://gerrit.cloudera.org:8080/#/c/8928/4/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@476
PS4, Line 476: AnalysisError("alter table functional_parquet.alltypes set 
row format delimited " +
write these negative tests in a loop over a list of databases


http://gerrit.cloudera.org:8080/#/c/8928/4/testdata/workloads/functional-query/queries/QueryTest/alter-table.test
File testdata/workloads/functional-query/queries/QueryTest/alter-table.test:


[Impala-ASF-CR] IMPALA-6330, IMPALA-5702: Avoid boost's trim() to workaround crash after dynamic linking.

2018-01-09 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/ )

Change subject: IMPALA-6330, IMPALA-5702: Avoid boost's trim() to workaround 
crash after dynamic linking.
..


Patch Set 2:

(1 comment)

The change LGTM. Please confirm the answer for the question below and I can +2 
it.

http://gerrit.cloudera.org:8080/#/c//2/be/src/util/mem-info.cc
File be/src/util/mem-info.cc:

http://gerrit.cloudera.org:8080/#/c//2/be/src/util/mem-info.cc@130
PS2, Line 130: " "
Sorry to keep picking on this but does boost::trim() also consider "tab" as 
space or is it purely space character ? It's probably worth double checking 
just in case.



--
To view, visit http://gerrit.cloudera.org:8080/
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8dd807f869a9359d991ba515177fb2298054520e
Gerrit-Change-Number: 
Gerrit-PatchSet: 2
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 10 Jan 2018 01:01:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4993: extend dictionary filtering to collections

2018-01-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8775 )

Change subject: IMPALA-4993: extend dictionary filtering to collections
..


Patch Set 7: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8775/6/be/src/exec/hdfs-parquet-scanner.h
File be/src/exec/hdfs-parquet-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8775/6/be/src/exec/hdfs-parquet-scanner.h@462
PS6, Line 462:   /// Memory used to store the tuples used for dictionary 
filtering. Tuples owned by
> Replaced with MemPool.
A comment in ScopedBuffer would be helpful. The only time I think it is really 
the right thing to use is when it's a temporary buffer that does not share a 
lifetime with any other allocations - if there are multiple allocations with 
the same lifetime, MemPool is preferred.

I think there was a bit of scope creep (no pun intended) where the abstraction 
was added for a fairly narrow purpose but it's expanded into neighbouring code.

2) would be good unless there's some complication with doing so that I'm not 
sure (entirely possible).



--
To view, visit http://gerrit.cloudera.org:8080/8775
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd
Gerrit-Change-Number: 8775
Gerrit-PatchSet: 7
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 10 Jan 2018 00:54:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6059: Enhance ltrim()/rtrim() functions to trim any set of characters.

2018-01-09 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8349 )

Change subject: IMPALA-6059: Enhance ltrim()/rtrim() functions to trim any set 
of characters.
..


Patch Set 7:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8349/7/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8349/7/be/src/exprs/string-functions-ir.cc@441
PS7, Line 441: IS_CONSTANT
We can possibly get rid of this after IMPALA-6380 is fixed.


http://gerrit.cloudera.org:8080/#/c/8349/7/be/src/exprs/string-functions-ir.cc@443
PS7, Line 443: chars_to_trim.is_null
As discussed offline, we cannot make any assumption about the length of 
StringVal which is null.


http://gerrit.cloudera.org:8080/#/c/8349/7/be/src/exprs/string-functions.h
File be/src/exprs/string-functions.h:

http://gerrit.cloudera.org:8080/#/c/8349/7/be/src/exprs/string-functions.h@87
PS7, Line 87:   static void TrimPrepare(FunctionContext*, 
FunctionContext::FunctionStateScope);
A quick comment on what this Prepare() function do would be useful.


http://gerrit.cloudera.org:8080/#/c/8349/7/be/src/exprs/string-functions.h@149
PS7, Line 149: template 
Please add a comment on what these template parameters stand for.



--
To view, visit http://gerrit.cloudera.org:8080/8349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8a5ae3f59762e70c3268a01e14ed57a9e36b8d79
Gerrit-Change-Number: 8349
Gerrit-PatchSet: 7
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Wed, 10 Jan 2018 00:48:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5014: Part 2: Round when casting decimal to timestamp

2018-01-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8969 )

Change subject: IMPALA-5014: Part 2: Round when casting decimal to timestamp
..


Patch Set 1: Code-Review+1

(1 comment)

One minor suggestion.

http://gerrit.cloudera.org:8080/#/c/8969/1/be/src/exprs/decimal-operators-ir.cc
File be/src/exprs/decimal-operators-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8969/1/be/src/exprs/decimal-operators-ir.cc@633
PS1, Line 633: FromUnixTimeNanos
FromUnixTimeNanos doesn't explicitly document where it supports nanoseconds >= 
1e9. Seems worth extending its comment and adding a simple test to 
timestamp-test.



--
To view, visit http://gerrit.cloudera.org:8080/8969
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8fb3a7d976ab980b8572d7e9524850572bad57da
Gerrit-Change-Number: 8969
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 10 Jan 2018 00:30:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4993: extend dictionary filtering to collections

2018-01-09 Thread Vuk Ercegovac (Code Review)
Hello Lars Volker, Tim Armstrong, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8775

to look at the new patch set (#7).

Change subject: IMPALA-4993: extend dictionary filtering to collections
..

IMPALA-4993: extend dictionary filtering to collections

Currently, top-level scalar columns in parquet files can
be used at runtime to prune row-groups by evaluating certain
conjuncts over the column's dictionary (if available).

This change extends such pruning to scalar values that are
stored in collection type columns. Currently, dictionary
pruning works by finding eligible conjuncts for top-level
slots. Since only top-level slots are supported, the slots
are implicitly part of the scan node's tuple descriptor.
With this change, we track eligible conjuncts by slot as well
as the tuple that contains the slot (either top-level or
nested collection). Since collection conjuncts are already
managed by a map that associates tuple descriptors to a list
of their conjuncts, this extension follows the existing
representation.

The frontend builds the mapping of  to
conjuncts that are dictionary filterable. The backend is
adjusted to use the same representation. In addition, collection
readers are decomposed into scalar filterable columns and
other, non-dictionary filterable readers. When filtering
a row group using a conjunct associated to a (possibly)
nested collection type, an additional tuple buffer is
allocated per tuple descriptor.

Testing:
- e2e test extended to illustrate row-groups that are pruned
  by nested collection dictionary filters.

Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/parquet-column-readers.cc
M be/src/runtime/collection-value-builder.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M testdata/workloads/functional-query/queries/QueryTest/parquet-filtering.test
14 files changed, 659 insertions(+), 210 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/8775/7
--
To view, visit http://gerrit.cloudera.org:8080/8775
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd
Gerrit-Change-Number: 8775
Gerrit-PatchSet: 7
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-4993: extend dictionary filtering to collections

2018-01-09 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8775 )

Change subject: IMPALA-4993: extend dictionary filtering to collections
..


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8775/6/be/src/exec/hdfs-parquet-scanner.h
File be/src/exec/hdfs-parquet-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8775/6/be/src/exec/hdfs-parquet-scanner.h@462
PS6, Line 462:   std::unordered_map>
> It would make more sense at this point to use a MemPool instead of a prolif
Replaced with MemPool.

It was unclear to me whether to use MemPool or ScopedBuffer for this case (I 
didn't see anything in ScopedBuffer to point me in this direction). Let me know 
if you'd like:
1) a blurb in ScopedBuffer that refers to MemPool for the right use-cases
2) min_max_tuple_buffer also replaced to use MemPool.


http://gerrit.cloudera.org:8080/#/c/8775/6/be/src/exec/hdfs-parquet-scanner.h@649
PS6, Line 649:   /// Gets the TupleDescriptor of slot_desc.
> Mention that 'slot_desc' can belong to the top-level tuple or a tuple neste
Done


http://gerrit.cloudera.org:8080/#/c/8775/6/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8775/6/be/src/exec/hdfs-parquet-scanner.cc@771
PS6, Line 771:   if (!col_reader->IsCollectionReader()) {
> nit: I think this would be easier to follow if we reversed the branches and
Done


http://gerrit.cloudera.org:8080/#/c/8775/6/be/src/exec/hdfs-parquet-scanner.cc@815
PS6, Line 815:   for (auto* col_reader : column_readers_) {
> nit: could fit on one line now
Done


http://gerrit.cloudera.org:8080/#/c/8775/6/be/src/exec/hdfs-parquet-scanner.cc@1657
PS6, Line 1657:   if (column_readers.empty()) return Status::OK();
> Is the early exit necessary for correctness? Might be worth mentioning if i
removed.



--
To view, visit http://gerrit.cloudera.org:8080/8775
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd
Gerrit-Change-Number: 8775
Gerrit-PatchSet: 6
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 09 Jan 2018 23:58:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6330, IMPALA-5702: Avoid boost's trim() to workaround crash after dynamic linking.

2018-01-09 Thread Philip Zeyliger (Code Review)
Hello Michael Ho, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/

to look at the new patch set (#2).

Change subject: IMPALA-6330, IMPALA-5702: Avoid boost's trim() to workaround 
crash after dynamic linking.
..

IMPALA-6330, IMPALA-5702: Avoid boost's trim() to workaround crash after 
dynamic linking.

Replaces boost::algorithm::trim() with std::string methods when parsing
/proc/self/smaps and adds a trivial unit test for MemInfo::ParseSmaps().

I did *not* replace other uses of trim() with equivalents from
be/src/gutil/strings/strip.h at this moment.

The backstory here is that
TestAdmissionControllerStress::test_admission_controller_with_flags
fails occasionally on dynamically linked builds of Impala. I was able
to reproduce the failure reliably (within 3 tries) with the following:

  $ ./buildall.sh -notests -so -noclean
  $ bin/start-impala-cluster.py  
--impalad_args="--memory_maintenance_sleep_time_ms=1"
  $ impala-shell.sh --query 'select max(t.c1), avg(t.c2), min(t.c3), avg(c4), 
avg(c5), avg(c6) from (select max(tinyint_col) over (order by int_col) c1, 
avg(tinyint_col) over (order by smallint_col) c2, min(tinyint_col) over (order 
by smallint_col desc) c3, rank() over (order by int_col desc) c4, dense_rank() 
over (order by bigint_col) c5, first_value(tinyint_col) over (order by 
bigint_col desc) c6 from functional.alltypes) t;'

The stack trace looks like:

  (gdb) bt
  #0  0x7fe230df2428 in __GI_raise (sig=sig@entry=6) at 
../sysdeps/unix/sysv/linux/raise.c:54
  #1  0x7fe230df402a in __GI_abort () at abort.c:89
  #2  0x7fe23312026d in __gnu_cxx::__verbose_terminate_handler() () at 
../../../../gcc-4.9.2/libstdc++-v3/libsupc++/vterminate.cc:95
  #3  0x7fe2330d8b66 in __cxxabiv1::__terminate(void (*)()) 
(handler=) at 
../../../../gcc-4.9.2/libstdc++-v3/libsupc++/eh_terminate.cc:47
  #4  0x7fe2330d8bb1 in std::terminate() () at 
../../../../gcc-4.9.2/libstdc++-v3/libsupc++/eh_terminate.cc:57
  #5  0x7fe2330d8cb8 in __cxxabiv1::__cxa_throw(void*, std::type_info*, 
void (*)(void*)) (obj=0x8e54080, tinfo=0x7fe233356210 , dest=0x7fe23311ea70 ) at 
../../../../gcc-4.9.2/libstdc++-v3/libsupc++/eh_throw.cc:87
  #6  0x7fe233110332 in std::__throw_bad_cast() () at 
../../../../../gcc-4.9.2/libstdc++-v3/src/c++11/functexcept.cc:63
  #7  0x7fe2330e8ad7 in std::use_facet(std::locale 
const&) (__loc=...) at 
/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/gcc/build-4.9.2/x86_64-unknown-linux-gnu/libstdc++-v3/include/bits/locale_classes.tcc:137
  #8  0x008d2cdf in void 
boost::algorithm::trim(std::string&, std::locale const&) ()
  #9  0x7fe2396d5057 in impala::MemInfo::ParseSmaps() () at 
/home/philip/src/Impala/be/src/util/mem-info.cc:132
  ...

My best theory is that there's a race/bug, wherein the std::locale* static 
initialization
work is getting somehow 'reset' by the dynamic linker, when more libraries are 
linked
in as a result of the query. My evidence to support this theory is scant, but
I do notice that LD_DEBUG=all prints the following when the query is executed
(but not right at startup):

  binding file /home/philip/src/Impala/toolchain/gcc-4.9.2/lib64/libstdc++.so.6 
[0] to
  /home/philip/src/Impala/toolchain/gflags-2.2.0-p1/lib/libgflags.so.2.2 [0]:
  normal symbol `std::locale::facet::_S_destroy_c_locale(__locale_struct*&)'

Note that there are BSS segments for some of std::locale::facet::* inside
of libgflags.so.

  $nm toolchain/gflags-2.2.0-p1/lib/libgflags.so | c++filt | grep facet | grep 
' B '
  002e2d10 B std::locale::facet::_S_c_locale
  002e2d0c B std::locale::facet::_S_once

I'm not the first to run into variants of these issues, though the results
are fairly unhelpful:

  http://www.boost.org/doc/libs/1_58_0/libs/locale/doc/html/faq.html
  
https://stackoverflow.com/questions/26990412/c-boost-crashes-while-using-locale
  https://svn.boost.org/trac10/ticket/4671
  
http://clang-developers.42468.n3.nabble.com/std-use-facet-lt-std-ctype-lt-char-gt-gt-crashes-on-linux-td4033967.html
  
https://unix.stackexchange.com/questions/719/can-we-get-compiler-information-from-an-elf-binary
  
https://stackoverflow.com/questions/42376100/linking-with-library-causes-collate-facet-to-be-missing-from-char
  http://lists.llvm.org/pipermail/cfe-dev/2012-July/023289.html
  https://gcc.gnu.org/ml/libstdc++/2014-11/msg00122.html

Change-Id: I8dd807f869a9359d991ba515177fb2298054520e
---
M be/src/util/mem-info.cc
M be/src/util/proc-info-test.cc
2 files changed, 14 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88//2
--
To view, visit http://gerrit.cloudera.org:8080/
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: 

[Impala-ASF-CR] IMPALA-6330, IMPALA-5702: Avoid boost's trim() to workaround crash after dynamic linking.

2018-01-09 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/ )

Change subject: IMPALA-6330, IMPALA-5702: Avoid boost's trim() to workaround 
crash after dynamic linking.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c//1/be/src/util/mem-info.cc
File be/src/util/mem-info.cc:

http://gerrit.cloudera.org:8080/#/c//1/be/src/util/mem-info.cc@131
PS1, Line 131: line.find_first_not_of(" ", colon_pos + 1)
> Can this return string::npos ? Will that be a reasonable input to line.subs
It can. I wrote a quick test program and it crashed.

Good catch.



--
To view, visit http://gerrit.cloudera.org:8080/
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8dd807f869a9359d991ba515177fb2298054520e
Gerrit-Change-Number: 
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Tue, 09 Jan 2018 23:52:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5014: Part 2: Round when casting decimal to timestamp

2018-01-09 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8969 )

Change subject: IMPALA-5014: Part 2: Round when casting decimal to timestamp
..


Patch Set 1: Code-Review+2

Took a look at the tests, everything looks good.  Feel free to wait if you want 
another pair of eyes but I can't find anything that needs changes or think of 
more useful tests.


--
To view, visit http://gerrit.cloudera.org:8080/8969
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8fb3a7d976ab980b8572d7e9524850572bad57da
Gerrit-Change-Number: 8969
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 09 Jan 2018 23:49:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4252: [DOCS] Document min/max filters for Kudu tables

2018-01-09 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8986 )

Change subject: IMPALA-4252: [DOCS] Document min/max filters for Kudu tables
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8986/1/docs/topics/impala_runtime_filtering.xml
File docs/topics/impala_runtime_filtering.xml:

http://gerrit.cloudera.org:8080/#/c/8986/1/docs/topics/impala_runtime_filtering.xml@173
PS1, Line 173: . (The probability-based aspects means that the
 : filter
Maybe note here that bloom filters are only for HDFS target scans, and that we 
do min-max for Kudu scans.

Of course, either type might include some non-matching values.


http://gerrit.cloudera.org:8080/#/c/8986/1/docs/topics/impala_runtime_filtering.xml@335
PS1, Line 335:
Note here: setting EXPLAIN_LEVEL=2 will display the type of filter in the 
format:
filter_id[type] -> table
filter_id[type] <- table
where type is either bloom or min_max



--
To view, visit http://gerrit.cloudera.org:8080/8986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15d8c952ab5b90e89fdd57640dfb4da882f7ecb2
Gerrit-Change-Number: 8986
Gerrit-PatchSet: 1
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Tue, 09 Jan 2018 23:44:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5014: Part 2: Round when casting decimal to timestamp

2018-01-09 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8969 )

Change subject: IMPALA-5014: Part 2: Round when casting decimal to timestamp
..


Patch Set 1: Code-Review+1

I'm still going through the test cases.

For a timestamp related diff, this looks remarkably clean and simple.


--
To view, visit http://gerrit.cloudera.org:8080/8969
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8fb3a7d976ab980b8572d7e9524850572bad57da
Gerrit-Change-Number: 8969
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 09 Jan 2018 23:42:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4168: Adds Oracle-style hint placement for INSERT/UPSERT

2018-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8676 )

Change subject: IMPALA-4168: Adds Oracle-style hint placement for INSERT/UPSERT
..


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1694/


--
To view, visit http://gerrit.cloudera.org:8080/8676
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ied7629d70197a0270cdc0853e00cc021fdb4dc20
Gerrit-Change-Number: 8676
Gerrit-PatchSet: 11
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 09 Jan 2018 23:18:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6231: Implement decimal v2 fuzz test

2018-01-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8898 )

Change subject: IMPALA-6231: Implement decimal_v2 fuzz test
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1693/


--
To view, visit http://gerrit.cloudera.org:8080/8898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4328125de5c583ec8ead1f78d9a08703b18b2d85
Gerrit-Change-Number: 8898
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 09 Jan 2018 23:17:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4168: Adds Oracle-style hint placement for INSERT/UPSERT

2018-01-09 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8676 )

Change subject: IMPALA-4168: Adds Oracle-style hint placement for INSERT/UPSERT
..


Patch Set 11: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8676
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ied7629d70197a0270cdc0853e00cc021fdb4dc20
Gerrit-Change-Number: 8676
Gerrit-PatchSet: 11
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 09 Jan 2018 23:17:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4168: Adds Oracle-style hint placement for INSERT/UPSERT

2018-01-09 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8676 )

Change subject: IMPALA-4168: Adds Oracle-style hint placement for INSERT/UPSERT
..


Patch Set 10: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8676
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ied7629d70197a0270cdc0853e00cc021fdb4dc20
Gerrit-Change-Number: 8676
Gerrit-PatchSet: 10
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 09 Jan 2018 23:17:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5478: Run TPCDS queries with decimal v2 enabled

2018-01-09 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/8985 )

Change subject: IMPALA-5478: Run TPCDS queries with decimal_v2 enabled
..

IMPALA-5478: Run TPCDS queries with decimal_v2 enabled

We add new TPCDS .test files that are expected to be run with decimal_v2
enabled. The new expected results were generated using Impala and I
inspected them manually.

Change-Id: Ib867c51a521ec4a087bc127d99aee4b95ba97733
---
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q1.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q10a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q11.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q12.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q13.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q15.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q16.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q17.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q18a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q19.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q2.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q20.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q21.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q22a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q25.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q29.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q3.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q32.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q33.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q34.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q37.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q39-1.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q39-2.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q4.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q40.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q41.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q42.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q43.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q46.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q50.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q51.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q51a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q52.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q53.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q54.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q55.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q56.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q6.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q60.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q61.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q62.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q64.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q65.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q67a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q68.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q69.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q7.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q70a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q71.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q72.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q73.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q74.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q75.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q76.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q77a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q78.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q79.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q8.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q80a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q81.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q82.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q84.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q86a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q88.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q91.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q92.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q94.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q95.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q96.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q97.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q98.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q99.test
M tests/common/test_vector.py
M tests/query_test/test_tpcds_queries.py
74 files 

[Impala-ASF-CR] IMPALA-6307: CTAS statement fails with duplicate column exception.

2018-01-09 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8930 )

Change subject: IMPALA-6307: CTAS statement fails with duplicate column 
exception.
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8930/3/fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java
File fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java:

http://gerrit.cloudera.org:8080/#/c/8930/3/fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java@58
PS3, Line 58: kuduPartitionParams_.clear();
This is new. Do we have any existing tests with a CTAS on Kudu where the select 
gets rewritten? You may want to verify that nothing breaks.



--
To view, visit http://gerrit.cloudera.org:8080/8930
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee053abecd4384e15eec8db10cb06f5ace159da2
Gerrit-Change-Number: 8930
Gerrit-PatchSet: 3
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Tue, 09 Jan 2018 23:16:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5654: Disallow setting Kudu table name in CREATE TABLE

2018-01-09 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8820 )

Change subject: IMPALA-5654: Disallow setting Kudu table name in CREATE TABLE
..


Patch Set 10:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8820/10/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java:

http://gerrit.cloudera.org:8080/#/c/8820/10/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@347
PS10, Line 347: Puts the Kudu table name to the generatedKuduTableNameProperty_ 
using the
  :* current database name and the provided table name
Maybe "Generates a Kudu table name based on the target database and table and 
stores it in TableDef.generatedKuduTableName_."


http://gerrit.cloudera.org:8080/#/c/8820/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/8820/10/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@2043
PS10, Line 2043: AnalysisError
Do you have a positive test where you set both the table name and EXTERNAL=true 
through an ALTER TABLE statement?


http://gerrit.cloudera.org:8080/#/c/8820/10/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test:

http://gerrit.cloudera.org:8080/#/c/8820/10/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test@347
PS10, Line 347: # Rename the underlying Kudu table is not allowed for managed 
Kudu tables
  : alter table tbl_to_alter set 
tblproperties('kudu.table_name'='kudu_tbl_to_alter')
  :  CATCH
  : AnalysisException: Not allowed to set 'kudu.table_name' 
manually for managed Kudu tables
This is kind of redundant I think with the other analysis tests, remove?


http://gerrit.cloudera.org:8080/#/c/8820/9/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

http://gerrit.cloudera.org:8080/#/c/8820/9/tests/query_test/test_kudu.py@327
PS9, Line 327: try:
 :   cursor.execute("ALTER TABLE %s.foo SET 
TBLPROPERTIES('kudu.table_name'='blah')"
 :   % (unique_database))
 : except Exception as e:
 :   expected_error = "AnalysisException: Not allowed to set 
'kudu.table_name' " + \
 :   "manually for managed Kudu tables"
 :   assert expected_error in str(e)
> There is also a test for this in AnalyzeDDLTest.java. Do you think this one
Yes, if it is testing the same thing.



--
To view, visit http://gerrit.cloudera.org:8080/8820
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieca037498abf8f5fde67b77e824b720482cdbe6f
Gerrit-Change-Number: 8820
Gerrit-PatchSet: 10
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 09 Jan 2018 23:12:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6128: Add support for AES-CTR encryption when spilling to disk CFB mode is a stream cipher and is secure when used with a different nonce/IV for every message. However it can be

2018-01-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8861 )

Change subject: IMPALA-6128: Add support for AES-CTR encryption when spilling 
to disk CFB mode is a stream cipher and is secure when used with a different 
nonce/IV for every message. However it can be a performance bottleneck. CTR 
mode is also stream cipher and is secure
..


Patch Set 6:

The merge failed with an Impala daemon crashing. I wasn't able to get a stack 
from the job so rerunning tests elsewhere to see if I can repro the crash. If 
that's clean I'll rerun the merge.


--
To view, visit http://gerrit.cloudera.org:8080/8861
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9debc240615dd8cdbf00ec8730cff62ffef52aff
Gerrit-Change-Number: 8861
Gerrit-PatchSet: 6
Gerrit-Owner: Xianda Ke 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Xianda Ke 
Gerrit-Comment-Date: Tue, 09 Jan 2018 23:10:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6231: Implement decimal v2 fuzz test

2018-01-09 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8898 )

Change subject: IMPALA-6231: Implement decimal_v2 fuzz test
..


Patch Set 3: Code-Review+2

Looks great!


--
To view, visit http://gerrit.cloudera.org:8080/8898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4328125de5c583ec8ead1f78d9a08703b18b2d85
Gerrit-Change-Number: 8898
Gerrit-PatchSet: 3
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 09 Jan 2018 23:11:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DOCS] Recommend using Kudu Java API for rapid DMLs

2018-01-09 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8976 )

Change subject: [DOCS] Recommend using Kudu Java API for rapid DMLs
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/8976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0098f0c3d5d07c89e6bb589c4c04edce300c1ad3
Gerrit-Change-Number: 8976
Gerrit-PatchSet: 1
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Jean-Daniel Cryans 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 09 Jan 2018 23:09:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6290: limit ScannerContext to 1 buffer at a time

2018-01-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8814 )

Change subject: IMPALA-6290: limit ScannerContext to 1 buffer at a time
..


Patch Set 6:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.h
File be/src/exec/scanner-context.h:

http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.h@278
PS6, Line 278: buffer
> nit: buffers
Done


http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.h@286
PS6, Line 286: /// to point at the boundary buffer variables. Returns an 
error if the boundary
> This function also advances io_buffer_pos_ and io_buffer_bytes_left_, right
Done


http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.cc
File be/src/exec/scanner-context.cc:

http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.cc@99
PS6, Line 99: done ||
> It appears that done == true implies io_buffer_bytes_left_ == 0 in the new
That's only true if the caller read all the way to the end of the stream. 
That's not true if it hit an error or was cancelled early.


http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.cc@239
PS6, Line 239: if (boundary_buffer_bytes_left_ >= requested_len) break;
> Not sure I understand why this is needed in the new change ? Wouldn't the w
CopyIoToBoundary() copies over some or all of the I/O buffer increments 
boundary_buffer_bytes_left_. We don't want to get a new I/O buffer in the case 
where we copied over only part of the I/O buffer.


http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.inline.h
File be/src/exec/scanner-context.inline.h:

http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.inline.h@70
PS6, Line 70: SkipBytes(
> This implementation of SkipBytes is more efficient than GetBytes() as it av
Yeah, exactly. It also seemed dangerous that skipping N bytes had the 
unexpected side-effect of allocating N bytes of data - callers of the API 
wouldn't expect that.


http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.inline.h@84
PS6, Line 84: if (bytes_left == 0) return true;
> So, if boundary_buffer_bytes_left_ == 0 && io_buffer_bytes_left_ == 0 at th
Done


http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.inline.h@187
PS6, Line 187:   *buffer_pos += bytes;
> DCHECK_LE(bytes, *buffer_bytes_left);
Done



--
To view, visit http://gerrit.cloudera.org:8080/8814
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
Gerrit-Change-Number: 8814
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 09 Jan 2018 23:06:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6075: Add Impala daemon metric for catalog version.

2018-01-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8949 )

Change subject: IMPALA-6075: Add Impala daemon metric for catalog version.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8949/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8949/1/be/src/service/impala-server.cc@1514
PS1, Line 1514:   
ImpaladMetrics::CATALOG_CURR_VER->set_value(min_req_catalog_version);
I think this may not be necessary?



--
To view, visit http://gerrit.cloudera.org:8080/8949
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id2307eb434561ed74ff058106541c0ebda017d97
Gerrit-Change-Number: 8949
Gerrit-PatchSet: 1
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 09 Jan 2018 23:05:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6290: limit ScannerContext to 1 buffer at a time

2018-01-09 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Lars Volker,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8814

to look at the new patch set (#8).

Change subject: IMPALA-6290: limit ScannerContext to 1 buffer at a time
..

IMPALA-6290: limit ScannerContext to 1 buffer at a time

This is a prerequisite for constraining the number of buffers per scan
range. Before this patch, calling ReadBytes(), SkipBytes(), etc could
cause an arbitrary number of I/O buffers to accumulate in
'completed_io_buffers_'. E.g. if we allocated 3 * 8MB I/O buffers for
a range and then called ReadBytes(30MB), we would hit resource
exhaustion as soon as 3 buffers were accumulated in
'completed_io_buffers_'.

The fix is to avoid accumulating any buffers in 'completed_io_buffers_'.
Instead of adding them to 'completed_io_buffers_', completed buffers
are just returned to the I/O manager. It turned out that this did not
weaken the ScannerContext's guarantees about memory lifetime, because
ScannerContext::GetBytesInternal() cleared 'boundary_buffer_' each
time it was called regardless. I checked that this behaviour wasn't
a bug by inspecting the scanner code. I could not find any cases
where scanners depended on returned memory remaining valid beyond
the next Read*()/Get*()/Skip*() call on the stream.

This change makes that lifetime explicit in the comments. A
side-effect of this fix is that scanners do not need to call
ReleaseCompletedResources() in CommitRows() and means that the
ScannerContext only ever needs to hold one I/O buffer at a time.

This change also reimplements SkipBytes() to avoid it accumulating
memory in the boundary buffer for large skip sizes.

Also clarifies some of the invariants in ScannerContext. E.g. some
places assumed io_buffer_ != NULL, but that is no longer needed.

Testing:
Ran core tests with ASAN and exhaustive tests with DEBUG.

Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/exec/scanner-context.inline.h
M be/src/runtime/io/disk-io-mgr.cc
M 
testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-scan-node-errors.test
11 files changed, 321 insertions(+), 210 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8814/8
--
To view, visit http://gerrit.cloudera.org:8080/8814
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
Gerrit-Change-Number: 8814
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6290: limit ScannerContext to 1 buffer at a time

2018-01-09 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8814 )

Change subject: IMPALA-6290: limit ScannerContext to 1 buffer at a time
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.cc
File be/src/exec/scanner-context.cc:

http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.cc@99
PS6, Line 99: done ||
It appears that done == true implies io_buffer_bytes_left_ == 0 in the new 
code. May be we can skip checking for done ?



--
To view, visit http://gerrit.cloudera.org:8080/8814
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
Gerrit-Change-Number: 8814
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 09 Jan 2018 22:34:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6290: limit ScannerContext to 1 buffer at a time

2018-01-09 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8814 )

Change subject: IMPALA-6290: limit ScannerContext to 1 buffer at a time
..


Patch Set 6:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.h
File be/src/exec/scanner-context.h:

http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.h@278
PS6, Line 278: buffer
nit: buffers


http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.h@286
PS6, Line 286: /// to point at the boundary buffer variables. Returns an 
error if the boundary
This function also advances io_buffer_pos_ and io_buffer_bytes_left_, right ?


http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.cc
File be/src/exec/scanner-context.cc:

http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.cc@239
PS6, Line 239: if (boundary_buffer_bytes_left_ >= requested_len) break;
Not sure I understand why this is needed in the new change ? Wouldn't the while 
loop condition imply that this is not true ?


http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.inline.h
File be/src/exec/scanner-context.inline.h:

http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.inline.h@70
PS6, Line 70: SkipBytes(
This implementation of SkipBytes is more efficient than GetBytes() as it avoids 
both the memcpy overhead into boundary_buffer and the memory allocation needed 
to hold the skipped bytes, right ?


http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.inline.h@84
PS6, Line 84: if (bytes_left == 0) return true;
So, if boundary_buffer_bytes_left_ == 0 && io_buffer_bytes_left_ == 0 at this 
point, I assume output_buffer_bytes_left_ has to be 0 as 
output_buffer_bytes_left_ should be pointing to boundary_buffer_bytes_left_, 
right ? Can we add the following at line 74 DCHECK(output_buffer_pos_ == 
boundary_buffer_pos_); DCHECK(output_buffer_bytes_left_ == 
boundary_buffer_bytes_left_) ?


http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.inline.h@187
PS6, Line 187:   *buffer_pos += bytes;
DCHECK_LE(bytes, *buffer_bytes_left);



--
To view, visit http://gerrit.cloudera.org:8080/8814
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
Gerrit-Change-Number: 8814
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 09 Jan 2018 22:20:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6307: CTAS statement fails with duplicate column exception.

2018-01-09 Thread Zoram Thanga (Code Review)
Hello Michael Ho, Dimitris Tsirogiannis, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8930

to look at the new patch set (#3).

Change subject: IMPALA-6307: CTAS statement fails with duplicate column 
exception.
..

IMPALA-6307: CTAS statement fails with duplicate column exception.

A CTAS statement with a 'partition by' clause causes the statement
to fail with a duplicate column name exception. This is happening
because on expression rewrite, the partition defs state is not reset.

IMPALA-5796 added TableDef::reset(). This patch expands the method by
adding calls to reset ColumnDefs and PartitionColumnDefs.

Testing:
  * Regression test added to AnalyzeDDLTest.
  * Exhaustive Jenkins build and test.

Change-Id: Iee053abecd4384e15eec8db10cb06f5ace159da2
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
5 files changed, 21 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/8930/3
--
To view, visit http://gerrit.cloudera.org:8080/8930
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iee053abecd4384e15eec8db10cb06f5ace159da2
Gerrit-Change-Number: 8930
Gerrit-PatchSet: 3
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 


[Impala-ASF-CR] IMPALA-6307: CTAS statement fails with duplicate column exception.

2018-01-09 Thread Zoram Thanga (Code Review)
Zoram Thanga has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8930 )

Change subject: IMPALA-6307: CTAS statement fails with duplicate column 
exception.
..


Patch Set 2:

(3 comments)

Thanks! Uploading patch set #3.

http://gerrit.cloudera.org:8080/#/c/8930/2/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java:

http://gerrit.cloudera.org:8080/#/c/8930/2/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java@408
PS2, Line 408: // of the table that will be created.
> including the  partition columns, if any.
Done


http://gerrit.cloudera.org:8080/#/c/8930/2/fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java
File fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java:

http://gerrit.cloudera.org:8080/#/c/8930/2/fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java@56
PS2, Line 56: partitionColDefs_.clear(); kuduPartitionParams_.clear();
> formatting nit: one statement per line
Fixed.


http://gerrit.cloudera.org:8080/#/c/8930/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/8930/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1510
PS2, Line 1510: IMPALA-6307: A CTAS query fails with error: AnalysisException:
  : // Duplicate column name: 
> I think it's best to outline what this query is testing, i.e. CTAS with a s
Done



--
To view, visit http://gerrit.cloudera.org:8080/8930
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee053abecd4384e15eec8db10cb06f5ace159da2
Gerrit-Change-Number: 8930
Gerrit-PatchSet: 2
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Tue, 09 Jan 2018 22:04:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4252: [DOCS] Document min/max filters for Kudu tables

2018-01-09 Thread John Russell (Code Review)
John Russell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8986


Change subject: IMPALA-4252: [DOCS] Document min/max filters for Kudu tables
..

IMPALA-4252: [DOCS] Document min/max filters for Kudu tables

Change-Id: I15d8c952ab5b90e89fdd57640dfb4da882f7ecb2
---
M docs/shared/impala_common.xml
M docs/topics/impala_disable_row_runtime_filtering.xml
M docs/topics/impala_kudu.xml
M docs/topics/impala_max_num_runtime_filters.xml
M docs/topics/impala_runtime_bloom_filter_size.xml
M docs/topics/impala_runtime_filter_max_size.xml
M docs/topics/impala_runtime_filter_min_size.xml
M docs/topics/impala_runtime_filtering.xml
8 files changed, 55 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/8986/1
--
To view, visit http://gerrit.cloudera.org:8080/8986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I15d8c952ab5b90e89fdd57640dfb4da882f7ecb2
Gerrit-Change-Number: 8986
Gerrit-PatchSet: 1
Gerrit-Owner: John Russell 


[Impala-ASF-CR] IMPALA-6290: limit ScannerContext to 1 buffer at a time

2018-01-09 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Lars Volker,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8814

to look at the new patch set (#7).

Change subject: IMPALA-6290: limit ScannerContext to 1 buffer at a time
..

IMPALA-6290: limit ScannerContext to 1 buffer at a time

This is a prerequisite for constraining the number of buffers per scan
range. Before this patch, calling ReadBytes(), SkipBytes(), etc could
cause an arbitrary number of I/O buffers to accumulate in
'completed_io_buffers_'. E.g. if we allocated 3 * 8MB I/O buffers for
a range and then called ReadBytes(30MB), we would hit resource
exhaustion as soon as 3 buffers were accumulated in
'completed_io_buffers_'.

The fix is to avoid accumulating any buffers in 'completed_io_buffers_'.
Instead of adding them to 'completed_io_buffers_', completed buffers
are just returned to the I/O manager. It turned out that this did not
weaken the ScannerContext's guarantees about memory lifetime, because
ScannerContext::GetBytesInternal() cleared 'boundary_buffer_' each
time it was called regardless. I checked that this behaviour wasn't
a bug by inspecting the scanner code. I could not find any cases
where scanners depended on returned memory remaining valid beyond
the next Read*()/Get*()/Skip*() call on the stream.

This change makes that lifetime explicit in the comments. A
side-effect of this fix is that scanners do not need to call
ReleaseCompletedResources() in CommitRows() and means that the
ScannerContext only ever needs to hold one I/O buffer at a time.

This change also reimplements SkipBytes() to avoid it accumulating
memory in the boundary buffer for large skip sizes.

Also clarifies some of the invariants in ScannerContext. E.g. some
places assumed io_buffer_ != NULL, but that is no longer needed.

Testing:
Ran core tests with ASAN and exhaustive tests with DEBUG.

Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/exec/scanner-context.inline.h
M be/src/runtime/io/disk-io-mgr.cc
M 
testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-scan-node-errors.test
11 files changed, 317 insertions(+), 210 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8814/7
--
To view, visit http://gerrit.cloudera.org:8080/8814
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
Gerrit-Change-Number: 8814
Gerrit-PatchSet: 7
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6290: limit ScannerContext to 1 buffer at a time

2018-01-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8814 )

Change subject: IMPALA-6290: limit ScannerContext to 1 buffer at a time
..


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.h
File be/src/exec/scanner-context.h:

http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.h@162
PS6, Line 162: /// Equivalent to java.io.DataInput.readBoolean()
> May help to document what returning false means for the Read*() functions.
Done


http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.cc
File be/src/exec/scanner-context.cc:

http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.cc@94
PS6, Line 94: boundary_buffer_pos_ = nullptr;
: boundary_buffer_bytes_left_ = 0;
: boundary_buffer_->Reset();
> Why weren't we doing it before ?
I don't think it's strictly necessary, since we shouldn't be doing anything 
with the stream once this is called with done=true and the memory gets freed 
with the MemPool, but it seemed worth doing defensively.


http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.cc@99
PS6, Line 99: ReturnIoBuffer()
> I wonder if it warrants poisoning the IOBuffer returned to catch any cases
I added poison of cached I/O buffers, since it's only a few lines of change. 
We'll get it for free in follow-on patches once it's switched to the buffer 
pool but nice to get the test coverage sooner.



--
To view, visit http://gerrit.cloudera.org:8080/8814
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
Gerrit-Change-Number: 8814
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 09 Jan 2018 21:51:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6307: CTAS statement fails with duplicate column exception.

2018-01-09 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8930 )

Change subject: IMPALA-6307: CTAS statement fails with duplicate column 
exception.
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8930/2/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java:

http://gerrit.cloudera.org:8080/#/c/8930/2/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java@408
PS2, Line 408: // of the table that will be created.
including the  partition columns, if any.


http://gerrit.cloudera.org:8080/#/c/8930/2/fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java
File fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java:

http://gerrit.cloudera.org:8080/#/c/8930/2/fe/src/main/java/org/apache/impala/analysis/TableDataLayout.java@56
PS2, Line 56: partitionColDefs_.clear(); kuduPartitionParams_.clear();
formatting nit: one statement per line


http://gerrit.cloudera.org:8080/#/c/8930/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/8930/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1510
PS2, Line 1510: IMPALA-6307: A CTAS query fails with error: AnalysisException:
  : // Duplicate column name: 
I think it's best to outline what this query is testing, i.e. CTAS with a 
select query that requires a rewrite (IMPALA-6307).



--
To view, visit http://gerrit.cloudera.org:8080/8930
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iee053abecd4384e15eec8db10cb06f5ace159da2
Gerrit-Change-Number: 8930
Gerrit-PatchSet: 2
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Tue, 09 Jan 2018 21:28:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5152: Introduce metadata loading phase

2018-01-09 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8958 )

Change subject: IMPALA-5152: Introduce metadata loading phase
..


Patch Set 1:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@2383
PS1, Line 2383: wit
nit: with


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/DropStatsStmt.java
File fe/src/main/java/org/apache/impala/analysis/DropStatsStmt.java:

http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/DropStatsStmt.java@82
PS1, Line 82: tableName_
other statements assert that tableName is not null on construction. this 
statement doesn't, pls add this assertion or check the condition here.


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java
File fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java:

http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java@85
PS1, Line 85: tableName_
can tableName_ be null?


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
File fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java:

http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java@108
PS1, Line 108: sq:
nit: other loops in this file add a space after the iter variable.


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/Path.java
File fe/src/main/java/org/apache/impala/analysis/Path.java:

http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/Path.java@275
PS1, Line 275: list of table names
pls state any restrictions on the path-- I found this unclear to read. 
for example,
  path: <"tableName"> should return <(defaultDb, tableName)>
  path: <"dbName", "tableName"> will return <(defaultDb, dbName), (dbName, 
tableName)> ?


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
File fe/src/main/java/org/apache/impala/analysis/QueryStmt.java:

http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java@105
PS1, Line 105: }
why aren't baseTblResultExprs_ included? the comment there states that these 
will be resolved to "base tbl refs". I assume the child class does this?


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
File fe/src/main/java/org/apache/impala/analysis/SelectStmt.java:

http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/analysis/SelectStmt.java@1033
PS1, Line 1033: }
I was expecting this class to look at baseTblResultExprs_


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/MetadataOp.java
File fe/src/main/java/org/apache/impala/service/MetadataOp.java:

http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/main/java/org/apache/impala/service/MetadataOp.java@281
PS1, Line 281: ;
nit: extra ";"


http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java:

http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java@104
PS1, Line 104: Table does
I think the previous error was more informative (and it implies that the the 
table does not exist).



--
To view, visit http://gerrit.cloudera.org:8080/8958
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I68d32d5acd4a6f6bc6cedb05e6cc5cf604d24a55
Gerrit-Change-Number: 8958
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 09 Jan 2018 21:25:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5478: Run TPCDS queries with decimal v2 enabled

2018-01-09 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8985


Change subject: IMPALA-5478: Run TPCDS queries with decimal_v2 enabled
..

IMPALA-5478: Run TPCDS queries with decimal_v2 enabled

We add new TPCDS .test files that are expected to be run with decimal_v2
enabled. The new expected results were generated using Impala and I
inspected them manually.

Change-Id: Ib867c51a521ec4a087bc127d99aee4b95ba97733
---
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q1.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q10a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q11.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q12.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q13.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q15.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q16.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q17.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q18a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q19.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q2.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q20.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q21.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q22a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q25.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q29.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q3.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q32.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q33.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q34.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q37.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q39-1.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q39-2.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q4.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q40.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q41.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q42.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q43.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q46.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q50.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q51.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q51a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q52.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q53.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q54.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q55.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q56.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q6.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q60.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q61.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q62.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q64.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q65.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q67a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q68.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q69.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q7.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q70a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q71.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q72.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q73.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q74.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q75.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q76.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q77a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q78.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q79.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q8.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q80a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q81.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q82.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q84.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q86a.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q88.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q91.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q92.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q94.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q95.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q96.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q97.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q98.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q99.test
A tests/query_test/test_tpcds_decimal_v2_queries.py
73 files changed, 14,428 

[Impala-ASF-CR] IMPALA-6290: limit ScannerContext to 1 buffer at a time

2018-01-09 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8814 )

Change subject: IMPALA-6290: limit ScannerContext to 1 buffer at a time
..


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.h
File be/src/exec/scanner-context.h:

http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.h@162
PS6, Line 162: /// Equivalent to java.io.DataInput.readBoolean()
May help to document what returning false means for the Read*() functions.


http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.cc
File be/src/exec/scanner-context.cc:

http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.cc@94
PS6, Line 94: boundary_buffer_pos_ = nullptr;
: boundary_buffer_bytes_left_ = 0;
: boundary_buffer_->Reset();
Why weren't we doing it before ?


http://gerrit.cloudera.org:8080/#/c/8814/6/be/src/exec/scanner-context.cc@99
PS6, Line 99: ReturnIoBuffer()
I wonder if it warrants poisoning the IOBuffer returned to catch any cases in 
which the buffer is referenced after return to IOMgr.



--
To view, visit http://gerrit.cloudera.org:8080/8814
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
Gerrit-Change-Number: 8814
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 09 Jan 2018 20:30:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6353: Fix crash in snappy decompressor

2018-01-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8977 )

Change subject: IMPALA-6353: Fix crash in snappy decompressor
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8977/1/testdata/datasets/functional/functional_schema_template.sql
File testdata/datasets/functional/functional_schema_template.sql:

http://gerrit.cloudera.org:8080/#/c/8977/1/testdata/datasets/functional/functional_schema_template.sql@1650
PS1, Line 1650: -- Parquet file with invalid (0) compressed_page_size in a dict 
page.
We've mostly been loading files like this dynamically, e.g. see 
test_corrupt_rle_counts(). This avoids the need to reload data for just this 
one test.



--
To view, visit http://gerrit.cloudera.org:8080/8977
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0d42937aab92a74f8e104d2f7fcd64dc24f6a500
Gerrit-Change-Number: 8977
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 09 Jan 2018 20:15:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5317: [DOCS] Doc for DATE TRUNC() function

2018-01-09 Thread John Russell (Code Review)
Hello Greg Rahn, sandeep akinapelli, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8768

to look at the new patch set (#5).

Change subject: IMPALA-5317: [DOCS] Doc for DATE_TRUNC() function
..

IMPALA-5317: [DOCS] Doc for DATE_TRUNC() function

Change-Id: Ifcf38903bb10db12cbb8d73a2dc875aef29cd359
---
M docs/impala_keydefs.ditamap
M docs/shared/impala_common.xml
M docs/topics/impala_datetime_functions.xml
3 files changed, 104 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/8768/5
--
To view, visit http://gerrit.cloudera.org:8080/8768
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifcf38903bb10db12cbb8d73a2dc875aef29cd359
Gerrit-Change-Number: 8768
Gerrit-PatchSet: 5
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: sandeep akinapelli 


[Impala-ASF-CR] IMPALA-5317: [DOCS] Doc for DATE TRUNC() function

2018-01-09 Thread John Russell (Code Review)
John Russell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8768 )

Change subject: IMPALA-5317: [DOCS] Doc for DATE_TRUNC() function
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8768/4/docs/topics/impala_datetime_functions.xml
File docs/topics/impala_datetime_functions.xml:

http://gerrit.cloudera.org:8080/#/c/8768/4/docs/topics/impala_datetime_functions.xml@403
PS4, Line 403: values
> TIMESTAMP value or values?
'values' is grammatically correct here.


http://gerrit.cloudera.org:8080/#/c/8768/4/docs/topics/impala_datetime_functions.xml@422
PS4, Line 422: TS
> caps on TS and small TS mentioned earlier??
Sure. I follow a consistent pattern of capitalization for discussions in a pure 
SQL context - identifiers are lowercase in a statement like 'CREATE TABLE foo', 
but if I'm referring to the identifier itself I'll capitalize it, as in 'the 
table named FOO'. But in the discussions of built-in functions I use lowercase 
more frequently, more of a C-like style.


http://gerrit.cloudera.org:8080/#/c/8768/4/docs/topics/impala_datetime_functions.xml@430
PS4, Line 430: one secon
> what do you think about "one second" vs "second"
Grammatically there would need to be one other word there, either "a second" or 
"one second". Let's leave it as-is.



--
To view, visit http://gerrit.cloudera.org:8080/8768
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifcf38903bb10db12cbb8d73a2dc875aef29cd359
Gerrit-Change-Number: 8768
Gerrit-PatchSet: 4
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: sandeep akinapelli 
Gerrit-Comment-Date: Tue, 09 Jan 2018 20:04:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [DOCS] Recommend using Kudu Java API for rapid DMLs

2018-01-09 Thread John Russell (Code Review)
John Russell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8976


Change subject: [DOCS] Recommend using Kudu Java API for rapid DMLs
..

[DOCS] Recommend using Kudu Java API for rapid DMLs

Change-Id: I0098f0c3d5d07c89e6bb589c4c04edce300c1ad3
---
M docs/topics/impala_jdbc.xml
1 file changed, 14 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/8976/1
--
To view, visit http://gerrit.cloudera.org:8080/8976
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I0098f0c3d5d07c89e6bb589c4c04edce300c1ad3
Gerrit-Change-Number: 8976
Gerrit-PatchSet: 1
Gerrit-Owner: John Russell 


[Impala-ASF-CR] IMPALA-4993: extend dictionary filtering to collections

2018-01-09 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8775 )

Change subject: IMPALA-4993: extend dictionary filtering to collections
..


Patch Set 6:

ping.


--
To view, visit http://gerrit.cloudera.org:8080/8775
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If3a2abcfc3d0f7d18756816659fed77ce12668dd
Gerrit-Change-Number: 8775
Gerrit-PatchSet: 6
Gerrit-Owner: Vuk Ercegovac 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 09 Jan 2018 17:58:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5191: Standardize column alias behavior

2018-01-09 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8801 )

Change subject: IMPALA-5191: Standardize column alias behavior
..


Patch Set 9:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/8801/8/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
File fe/src/main/java/org/apache/impala/analysis/QueryStmt.java:

http://gerrit.cloudera.org:8080/#/c/8801/8/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java@273
PS8, Line 273:* also refers to a select-list expression.
> Let's inline this check into the single caller.
Done


http://gerrit.cloudera.org:8080/#/c/8801/8/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java@283
PS8, Line 283: Returns clone
> "corresponding" is not really explained, I think we should explain ordinal
I rewrote the whole comment.


http://gerrit.cloudera.org:8080/#/c/8801/8/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java@285
PS8, Line 285:*/
> The returned expr is analyzed regardless of whether substitution was perfor
Done


http://gerrit.cloudera.org:8080/#/c/8801/8/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java@288
PS8, Line 288: Expr substituteExpr = trySubstituteOrdinal(expr, 
errorPrefix, analyzer);
> substituteOrdinalOrAlias()
Done


http://gerrit.cloudera.org:8080/#/c/8801/8/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java@290
PS8, Line 290: if (ambiguousAliasList_.contains(expr)) {
> Let's move this after attempting to substitute ordinals to make the flow cl
Done


http://gerrit.cloudera.org:8080/#/c/8801/8/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java@291
PS8, Line 291:   throw new AnalysisException("Column '" + expr.toSql() +
> Integrate this into the function comment.
Done


http://gerrit.cloudera.org:8080/#/c/8801/8/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java@296
PS8, Line 296: } else {
> Slightly easier for me to read/reason with inverted logic, i.e.
Done


http://gerrit.cloudera.org:8080/#/c/8801/8/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java@314
PS8, Line 314: }
> substituteOrdinalsAndAliases()
Done


http://gerrit.cloudera.org:8080/#/c/8801/8/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java@316
PS8, Line 316: 
> Less code to use a loop over indexes:
Done


http://gerrit.cloudera.org:8080/#/c/8801/8/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/8801/8/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@987
PS8, Line 987: // Ambiguous alias in HAVING
> Let's move this complex test to the end. Easier to follow the tests if we s
Done


http://gerrit.cloudera.org:8080/#/c/8801/8/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@1041
PS8, Line 1041: "GROUP BY expression must not contain analytic 
expressions: " +
> Weird error. GROUPBY+HAVING come before analytics so the error should be "H
Done



--
To view, visit http://gerrit.cloudera.org:8080/8801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0f82483b486acf6953876cfa672b0d034f3709a8
Gerrit-Change-Number: 8801
Gerrit-PatchSet: 9
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 09 Jan 2018 17:52:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5191: Standardize column alias behavior

2018-01-09 Thread Zoltan Borok-Nagy (Code Review)
Hello Taras Bobrovytsky, Tim Armstrong, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8801

to look at the new patch set (#9).

Change subject: IMPALA-5191: Standardize column alias behavior
..

IMPALA-5191: Standardize column alias behavior

We should not perform alias substitution in the
subexpressions of GROUP BY, HAVING, and ORDER BY
to be more standard conformant.

=== Allowed ===
SELECT int_col / 2 AS x
FROM functional.alltypes
GROUP BY x;

SELECT int_col / 2 AS x
FROM functional.alltypes
ORDER BY x;

SELECT NOT bool_col AS nb
FROM functional.alltypes
GROUP BY nb
HAVING nb;

=== Not allowed ===
SELECT int_col / 2 AS x
FROM functional.alltypes
GROUP BY x / 2;

SELECT int_col / 2 AS x
FROM functional.alltypes
ORDER BY -x;

SELECT int_col / 2 AS x
FROM functional.alltypes
GROUP BY x
HAVING x > 3;

Some extra checks were added to AnalyzeExprsTest.java.

I had to update other tests to make them pass
since the new behavior is more restrictive.

I added alias.test to the end-to-end tests.

Change-Id: I0f82483b486acf6953876cfa672b0d034f3709a8
---
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
A testdata/workloads/functional-query/queries/QueryTest/alias.test
M tests/query_test/test_queries.py
8 files changed, 206 insertions(+), 45 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/8801/9
--
To view, visit http://gerrit.cloudera.org:8080/8801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0f82483b486acf6953876cfa672b0d034f3709a8
Gerrit-Change-Number: 8801
Gerrit-PatchSet: 9
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-3942: Fix wronly escaped string literal in front-end

2018-01-09 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8818 )

Change subject: IMPALA-3942: Fix wronly escaped string literal in front-end
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8818/5/fe/src/main/java/org/apache/impala/analysis/StringLiteral.java
File fe/src/main/java/org/apache/impala/analysis/StringLiteral.java:

http://gerrit.cloudera.org:8080/#/c/8818/5/fe/src/main/java/org/apache/impala/analysis/StringLiteral.java@120
PS5, Line 120: sb.append("\"");
Just asking: Wouldn't this be more readable to say sb.append('"')
(double quote surrounded by single quotes) same as the row above?
Or can't you append nextChar here?

If the answer is yes for the last question then it might seem reasonable to 
rework this if-else:
if (nextchar is not doublequote) {
  append(currentChar)
}
append(nextchar)


http://gerrit.cloudera.org:8080/#/c/8818/5/tests/query_test/test_queries.py
File tests/query_test/test_queries.py:

http://gerrit.cloudera.org:8080/#/c/8818/5/tests/query_test/test_queries.py@272
PS5, Line 272:   def prepare(self, unique_database):
nit: as the only usage of prepare() is in the last test of this file, I'd move 
this function closer to it.


http://gerrit.cloudera.org:8080/#/c/8818/5/tests/query_test/test_queries.py@297
PS5, Line 297: result = self.execute_query("select '\\\''")
I'm a bit confused of all these backslashes and single-double quotes to be 
honest :), but shouldn't this query return:
"\'" instead of "'"?



--
To view, visit http://gerrit.cloudera.org:8080/8818
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibc4b5f5d8ffaa8feb96a466959427a04b3b06fec
Gerrit-Change-Number: 8818
Gerrit-PatchSet: 5
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 09 Jan 2018 10:24:11 +
Gerrit-HasComments: Yes