[Impala-ASF-CR] IMPALA-6187: Fix missing conjuncts evaluation with empty projection

2017-11-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8623 )

Change subject: IMPALA-6187: Fix missing conjuncts evaluation with empty 
projection
..


Patch Set 1: Code-Review+1

(1 comment)

I'm comfortable with this. Not sure if anyone else wants to take a look.

http://gerrit.cloudera.org:8080/#/c/8623/1/testdata/workloads/functional-query/queries/QueryTest/scanners.test
File testdata/workloads/functional-query/queries/QueryTest/scanners.test:

http://gerrit.cloudera.org:8080/#/c/8623/1/testdata/workloads/functional-query/queries/QueryTest/scanners.test@80
PS1, Line 80: select count(*) from alltypes where rand() * 10 > 0;
I think rand() can return 0.0, so maybe it would be better to make this >= so 
it's guaranteed to be always true.



--
To view, visit http://gerrit.cloudera.org:8080/8623
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib530f1fdcd2c6de699977db163b3f6eb38481517
Gerrit-Change-Number: 8623
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 27 Nov 2017 17:35:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5936: operator '%' overflows on large decimals

2017-11-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8574 )

Change subject: IMPALA-5936: operator '%' overflows on large decimals
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8574
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2b06c8acd5aa490943e84013faf2eaac7c26ceb4
Gerrit-Change-Number: 8574
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 28 Nov 2017 17:35:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6241: timeout in admission control test under ASAN

2017-11-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8652


Change subject: IMPALA-6241: timeout in admission control test under ASAN
..

IMPALA-6241: timeout in admission control test under ASAN

The fix for IMPALA-6241 is to increase the timeout for all slow builds.

While testing that fix, I discovered that the ASAN build detection logic
was failing silently, resulting in it assuming that it was testing a
DEBUG build. The error was:

  Unexpected DW_AT_name in first CU:
  
/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/llvm/llvm-3.9.1.src/projects/compiler-rt/lib/asan/asan_preinit.cc;
  choosing DEBUG

The fix for that issue is to remove the build type detection heuristic
and instead just write a file with the build type as part of the build process.

Testing:
Before this change I was able to reproduce locally every 5-10 test
iterations. After this change I haven't seen it reproduce.

Change-Id: Ia4ed949cac99b9925f72e19e4adaa2ead370b536
---
M .gitignore
M CMakeLists.txt
M bin/clean-cmake.sh
M infra/python/deps/requirements.txt
M tests/common/environ.py
M tests/custom_cluster/test_admission_controller.py
6 files changed, 54 insertions(+), 140 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/8652/1
--
To view, visit http://gerrit.cloudera.org:8080/8652
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia4ed949cac99b9925f72e19e4adaa2ead370b536
Gerrit-Change-Number: 8652
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6187: Fix missing conjuncts evaluation with empty projection

2017-11-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8623 )

Change subject: IMPALA-6187: Fix missing conjuncts evaluation with empty 
projection
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8623/2/be/src/exec/hdfs-scanner.cc
File be/src/exec/hdfs-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8623/2/be/src/exec/hdfs-scanner.cc@226
PS2, Line 226: DCHECK_EQ(scan_node_->tuple_desc()->byte_size(), 0);
This seems to violate the invariant that the scan node only produces non-NULL 
tuples (the previous version could too). I'm not sure if the planner can 
produce plans that have a TupleIsNull() predicate on a scan producing zero 
slots, because as far as I understand, TupleIsNull() predicates are only 
emitted when a slot from the scan is referenced, but I could be wrong - I feel 
like there could be some edge case if you get creative with nested loop joins, 
etc (which pass the tuple pointers through unmodified).

I think we should either write a dummy non-NULL pointer or have a comment 
explaining why that's not necessary.


http://gerrit.cloudera.org:8080/#/c/8623/2/be/src/runtime/row-batch.cc
File be/src/runtime/row-batch.cc:

http://gerrit.cloudera.org:8080/#/c/8623/2/be/src/runtime/row-batch.cc@60
PS2, Line 60:   tuple_ptrs_ = reinterpret_cast(calloc(1, 
tuple_ptrs_size_));
I don't think we should zero-initialise this. We can't depend on the 
tuple_ptrs_ being zero-initialised since Reset() doesn't zero-initialise them, 
so it's just unnecessary overhead.



--
To view, visit http://gerrit.cloudera.org:8080/8623
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib530f1fdcd2c6de699977db163b3f6eb38481517
Gerrit-Change-Number: 8623
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 28 Nov 2017 19:52:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8631 )

Change subject: IMPALA-6227: deflake admission stress tests
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8631/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8631/1//COMMIT_MSG@11
PS1, Line 11: All of the accounting in the test implicitly relies on queries 
not being
: dequeued until queries are later explicitly ended, so if this 
happened,
: the test broke in multiple subtle ways.
> and this assumption is no longer true because of the final change for IMPAL
I don't think it was ever true for the mem_limit tests. Even before IMPALA-1575 
it depended implicitly on the memory staying reserved on all backends.

>From what I can tell, the bug was always there, it was a tweak to the 
>statestore frequency that threw it out of balance.


http://gerrit.cloudera.org:8080/#/c/8631/1/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

http://gerrit.cloudera.org:8080/#/c/8631/1/tests/custom_cluster/test_admission_controller.py@705
PS1, Line 705: b4
> where is that used?
Leftover debugging code that I missed removing.


http://gerrit.cloudera.org:8080/#/c/8631/1/tests/custom_cluster/test_admission_controller.py@792
PS1, Line 792: amount of time
> is this actually timing sensitive, or is it that as long as we don't fetch
It shouldn't be timing sensitive. Updated the comment.



--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 22 Nov 2017 22:45:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8631 )

Change subject: IMPALA-6227: deflake admission stress tests
..


Patch Set 5: Code-Review+2

Carry


--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 22 Nov 2017 23:17:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8631

to look at the new patch set (#2).

Change subject: IMPALA-6227: deflake admission stress tests
..

IMPALA-6227: deflake admission stress tests

The problem was that, during the initial admission decision phase, some
queries were initially queued then dequeued once memory came available.
All of the accounting in the test implicitly relies on queries not being
dequeued until queries are later explicitly ended, so if this happened,
the test broke in multiple subtle ways.

This happened because the query only scanned a small number of
rows, which could be all buffered on the receiver side of the
exchange even before the client fetched any rows from the coordinator.
This means that the reserved memory on some backends could increase
then decrease during the initial admission phase, resulting in a
query being queued then dequeued.

The fix is to increase the number of rows returned by the query so that
all fragments remain active during the initial admission phase.
This increased test execution time somewhat, so I also had to bump the
queue wait timeout for the admission stress tests (they assume that
queries don't time out in the queue).

Testing:
Ran the test under debug, release and ASAN builds, i.e.

  impala-py.test tests/custom_cluster/test_admission_controller.py \
--workload_exploration_strategy="functional-query:exhaustive"

I looped the mem_limit test for a while to confirm it didn't reproduce
(it reproduced reliably every 2-3 iterations before this fix).

I will try looping it a bit more under different build types
concurrently with the review.

Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
---
M fe/src/test/resources/llama-site-test2.xml
M tests/custom_cluster/test_admission_controller.py
2 files changed, 89 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/8631/2
-- 
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8631 )

Change subject: IMPALA-6227: deflake admission stress tests
..


Patch Set 3:

(1 comment)

Just to summarise where this is at, it's a lot less flaky but I'm still seeing 
occasional flakes that I think are the result of the test seeing inconsistent 
metrics (i.e. admitted, queued, dequeued, etc are out of sync).

http://gerrit.cloudera.org:8080/#/c/8631/3/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

http://gerrit.cloudera.org:8080/#/c/8631/3/tests/custom_cluster/test_admission_controller.py@706
PS3, Line 706: sleep(0.2)
> what's that about? does the QUERY_TIMEOUT comment above need updating?
I was experimenting with ways to reduce test runtime and reduce the chances of 
it hitting various timeouts.

Factored out a constant to make the relationship clearer.



--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 22 Nov 2017 23:03:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8631

to look at the new patch set (#4).

Change subject: IMPALA-6227: deflake admission stress tests
..

IMPALA-6227: deflake admission stress tests

The problem was that, during the initial admission decision phase, some
queries were initially queued then dequeued once memory came available.
All of the accounting in the test implicitly relies on queries not being
dequeued until queries are later explicitly ended, so if this happened,
the test broke in multiple subtle ways.

This happened because the query only scanned a small number of
rows, which could be all buffered on the receiver side of the
exchange even before the client fetched any rows from the coordinator.
This means that the reserved memory on some backends could increase
then decrease during the initial admission phase, resulting in a
query being queued then dequeued.

The fix is to increase the number of rows returned by the query so that
all fragments remain active during the initial admission phase.
This increased test execution time somewhat, so I also had to bump the
queue wait timeout for the admission stress tests (they assume that
queries don't time out in the queue).

Testing:
Ran the test under debug, release and ASAN builds, i.e.

  impala-py.test tests/custom_cluster/test_admission_controller.py \
--workload_exploration_strategy="functional-query:exhaustive"

I looped the mem_limit test for a while to confirm it didn't reproduce
(it reproduced reliably every 2-3 iterations before this fix).

It still reproduces every 5-10 runs with exhaustive+release, so I
need to do further work to make it more robust.

Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
---
M fe/src/test/resources/llama-site-test2.xml
M tests/custom_cluster/test_admission_controller.py
2 files changed, 114 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/8631/4
--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Tim Armstrong (Code Review)
Hello Bikramjeet Vig, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8631

to look at the new patch set (#5).

Change subject: IMPALA-6227: deflake admission stress tests
..

IMPALA-6227: deflake admission stress tests

The problem was that, during the initial admission decision phase, some
queries were initially queued then dequeued once memory came available.
All of the accounting in the test implicitly relies on queries not being
dequeued until queries are later explicitly ended, so if this happened,
the test broke in multiple subtle ways.

This happened because the query only scanned a small number of
rows, which could be all buffered on the receiver side of the
exchange even before the client fetched any rows from the coordinator.
This means that the reserved memory on some backends could increase
then decrease during the initial admission phase, resulting in a
query being queued then dequeued.

The fix is to increase the number of rows returned by the query so that
all fragments remain active during the initial admission phase.
This increased test execution time somewhat, so I also had to bump the
queue wait timeout for the admission stress tests (they assume that
queries don't time out in the queue).

Testing:
Ran the test under debug, release and ASAN builds, i.e.

  impala-py.test tests/custom_cluster/test_admission_controller.py \
--workload_exploration_strategy="functional-query:exhaustive"

I looped the mem_limit test for a while to confirm it didn't reproduce
(it reproduced reliably every 2-3 iterations before this fix).

It still reproduces every 5-10 runs with exhaustive+release, so I
need to do further work to make it more robust.

Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
---
M fe/src/test/resources/llama-site-test2.xml
M tests/custom_cluster/test_admission_controller.py
2 files changed, 116 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/8631/5
--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6227: deflake admission stress tests

2017-11-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8631 )

Change subject: IMPALA-6227: deflake admission stress tests
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8631/4/tests/custom_cluster/test_admission_controller.py
File tests/custom_cluster/test_admission_controller.py:

http://gerrit.cloudera.org:8080/#/c/8631/4/tests/custom_cluster/test_admission_controller.py@698
PS4, Line 698: every
 :   # second
> shouldn't that say "at least every second" or something?
Done



--
To view, visit http://gerrit.cloudera.org:8080/8631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iafb3af0ce68f96e5d713dbb3b37dd0b50ea66bb4
Gerrit-Change-Number: 8631
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 22 Nov 2017 23:16:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3804: Re-enable per-scan filtering for sequence-based scanners

2017-11-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8684 )

Change subject: IMPALA-3804: Re-enable per-scan filtering for sequence-based 
scanners
..


Patch Set 1:

(2 comments)

The code looks correct to me, but I had some ideas about how a cleaner solution 
- let me know if they make sense or if I'm missing something.

http://gerrit.cloudera.org:8080/#/c/8684/1/be/src/exec/base-sequence-scanner.cc
File be/src/exec/base-sequence-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8684/1/be/src/exec/base-sequence-scanner.cc@174
PS1, Line 174:   if 
(!scan_node_->PartitionPassesFilters(context_->partition_descriptor()->id(),
This is a bit inconsistent with the other file types because it checks once per 
1024-row batch (GetNextInternal() is called many times per file) instead of 
once per scan range. Not the biggest deal but it adds another special case to 
this code (which unfortunately already has a lot of them).


http://gerrit.cloudera.org:8080/#/c/8684/1/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/8684/1/be/src/exec/hdfs-scan-node.cc@a497
PS1, Line 497:
 :
After looking at the code I'm wondering if this idea would lead to a more 
elegant solution. If I understand correctly, the idea is that, instead of 
marking the scan range done by incrementing progress_, for sequence file header 
ranges, we should instead mark all the ranges in the file as done by calling 
RangeComplete() with all of the ranges in the file.

I think then we could maybe remove the special casing for 
FileFormatIsSequenceBased() and instead check something like

  if (metadata->is_header_range())

Which seems to express the logic more directly.



--
To view, visit http://gerrit.cloudera.org:8080/8684
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4b38c26bcbe67f83efcc65a1723d766626ae3d3e
Gerrit-Change-Number: 8684
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 29 Nov 2017 17:20:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Update incubator-impala -> impala URLs

2017-11-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8685


Change subject: Update incubator-impala -> impala URLs
..

Update incubator-impala -> impala URLs

This fixes push_to_asf.py and various other scripts that had the Apache
repo location hard-coded. Also fixed the location of the github mirror
and mailing list archives.

Testing:
Ran push_to_asf.py to check I got the URL right. Checked a couple of the
github and mailing list URLs to make sure the new URL is valid.

Change-Id: Ie49221300340ef34bdd7c01670c35bdbbce3e84f
---
M bin/bootstrap_system.sh
M bin/push_to_asf.py
M docs/README.md
M docs/impala_keydefs.ditamap
M docs/topics/impala_install.xml
M docs/topics/impala_reserved_words.xml
M tests/benchmark/report_benchmark_results.py
7 files changed, 11 insertions(+), 11 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/8685/1
--
To view, visit http://gerrit.cloudera.org:8080/8685
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie49221300340ef34bdd7c01670c35bdbbce3e84f
Gerrit-Change-Number: 8685
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6187: Fix missing conjuncts evaluation with empty projection

2017-11-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8623 )

Change subject: IMPALA-6187: Fix missing conjuncts evaluation with empty 
projection
..


Patch Set 4: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/8623
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib530f1fdcd2c6de699977db163b3f6eb38481517
Gerrit-Change-Number: 8623
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 29 Nov 2017 01:24:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6255: Add device names to DiskIoMgr thread names

2017-11-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8669 )

Change subject: IMPALA-6255: Add device names to DiskIoMgr thread names
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8669
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30faeda6db8846e4aad64ce29ca811366d84910b
Gerrit-Change-Number: 8669
Gerrit-PatchSet: 2
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 29 Nov 2017 22:33:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] Remove "incubator-" from URLs.

2017-11-30 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8698 )

Change subject: Remove "incubator-" from URLs.
..


Patch Set 1:

Heh :)


--
To view, visit http://gerrit.cloudera.org:8080/8698
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6a8e10418f75c89a1f0b2111f860417a80dc10c5
Gerrit-Change-Number: 8698
Gerrit-PatchSet: 1
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 30 Nov 2017 16:32:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2248: Make idle session timeout a query option

2017-11-30 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8490 )

Change subject: IMPALA-2248: Make idle_session_timeout a query option
..


Patch Set 10:

(5 comments)

Did an initial pass. Still need to look at the tests in detail.

http://gerrit.cloudera.org:8080/#/c/8490/10//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8490/10//COMMIT_MSG@7
PS10, Line 7: IMPALA-2248: Make idle_session_timeout a query option
I did an initial pass over the code and wanted to clarify my understanding of 
the current behaviour:

* Does "set idle_session_timeout" in impala-shell have any effect? I wouldn't 
think so since it just updates a client-side map, right?
* The request pool query options overlay doesn't have an effect on this option, 
does it? Since a session doesn't belong to a request pool.

I think unfortunately this will end up being a bit of a special case in terms 
of behaviour, but I think this is expected.


http://gerrit.cloudera.org:8080/#/c/8490/10/be/src/service/client-request-state.h
File be/src/service/client-request-state.h:

http://gerrit.cloudera.org:8080/#/c/8490/10/be/src/service/client-request-state.h@423
PS10, Line 423:   void UpdateSessionTimeout(int32_t requested_timeout);
I think this assumes that session_->lock_ is held. We should be careful to 
document assumptions about which locks are and aren't held in this part of the 
code (there are unfortunately a lot of locks).


http://gerrit.cloudera.org:8080/#/c/8490/10/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/8490/10/be/src/service/client-request-state.cc@207
PS10, Line 207:   UpdateSessionTimeout(atoi(value.c_str()));
It feels a bit weird that we're doing string parsing of a query option here. 
Maybe we should set the query option first and then get the parsed value?


http://gerrit.cloudera.org:8080/#/c/8490/10/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/8490/10/be/src/service/impala-server.h@436
PS10, Line 436: void SetTimeout(int32_t requested_timeout);
The caller must hold 'lock', right?


http://gerrit.cloudera.org:8080/#/c/8490/7/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/8490/7/be/src/service/impala-server.h@423
PS7, Line 423: /// session may be correctly expired after a timeout (when 
ref_count == 0). Typically
Need to think about whether this is necessary - should we just use 
set_query_options?



--
To view, visit http://gerrit.cloudera.org:8080/8490
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32e2775f80da387b0df4195fe2c5435b3f8e585e
Gerrit-Change-Number: 8490
Gerrit-PatchSet: 10
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 01 Dec 2017 02:13:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6201: Fix test basic filters on ASAN

2017-11-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8646 )

Change subject: IMPALA-6201: Fix test_basic_filters on ASAN
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8646/1/tests/query_test/test_runtime_filters.py
File tests/query_test/test_runtime_filters.py:

http://gerrit.cloudera.org:8080/#/c/8646/1/tests/query_test/test_runtime_filters.py@27
PS1, Line 27: WAIT_TIME_MS = specific_build_type_timeout(6, 
slow_build_timeout=10)
Just curious - did the longer timeout take effect on ASAN for you? I was seeing 
some cases locally where the ASAN build type detection logic flaked out on me 
and fell back to the DEBUG timing.



--
To view, visit http://gerrit.cloudera.org:8080/8646
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c20cbb75a9b6da73137f220657aa75dea9dfdce
Gerrit-Change-Number: 8646
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 27 Nov 2017 22:46:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6241: timeout in admission control test under ASAN

2017-11-27 Thread Tim Armstrong (Code Review)
Hello Michael Brown,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8652

to look at the new patch set (#2).

Change subject: IMPALA-6241: timeout in admission control test under ASAN
..

IMPALA-6241: timeout in admission control test under ASAN

The fix for IMPALA-6241 is to increase the timeout for all slow builds.

While testing that fix, I discovered that the ASAN build detection logic
was failing silently, resulting in it assuming that it was testing a
DEBUG build. The error was:

  Unexpected DW_AT_name in first CU:
  
/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/llvm/llvm-3.9.1.src/projects/compiler-rt/lib/asan/asan_preinit.cc;
  choosing DEBUG

The fix for that issue is to remove the build type detection heuristic
and instead just write a file with the build type as part of the build process.

Testing:
Before this change I was able to reproduce locally every 5-10 test
iterations. After this change I haven't seen it reproduce.

Change-Id: Ia4ed949cac99b9925f72e19e4adaa2ead370b536
---
M .gitignore
M CMakeLists.txt
M bin/clean-cmake.sh
M infra/python/deps/requirements.txt
M tests/common/environ.py
M tests/custom_cluster/test_admission_controller.py
6 files changed, 61 insertions(+), 143 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/8652/2
--
To view, visit http://gerrit.cloudera.org:8080/8652
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia4ed949cac99b9925f72e19e4adaa2ead370b536
Gerrit-Change-Number: 8652
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-6241: timeout in admission control test under ASAN

2017-11-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8652 )

Change subject: IMPALA-6241: timeout in admission control test under ASAN
..


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8652/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8652/1//COMMIT_MSG@10
PS1, Line 10:
: While testing that fix, I discovered that the ASAN build 
detection logic
: was failing silently
> Sorry to hear this wasn't working. Thanks for making an improvement.
It looks like it was my bug that caused it to fail actually :) (it's 
unfortunate that the silent nature of the failure caused us to go this long 
without noticing it though).


http://gerrit.cloudera.org:8080/#/c/8652/1/infra/python/deps/requirements.txt
File infra/python/deps/requirements.txt:

http://gerrit.cloudera.org:8080/#/c/8652/1/infra/python/deps/requirements.txt@45
PS1, Line 45: monkeypatch == 0.1rc3
> I don't think anything else is using this, so it can be removed.
Done


http://gerrit.cloudera.org:8080/#/c/8652/1/tests/common/environ.py
File tests/common/environ.py:

http://gerrit.cloudera.org:8080/#/c/8652/1/tests/common/environ.py@a198
PS1, Line 198:
> I guess the bug is that this should have been elif.
Oops, that is my bug then - I didn't see this when looking at the code. It 
doesn't seem like the changed approach really directly addresses the bug then.


http://gerrit.cloudera.org:8080/#/c/8652/1/tests/common/environ.py@19
PS1, Line 19: import pytest
> Unused; remove.
Done


http://gerrit.cloudera.org:8080/#/c/8652/1/tests/common/environ.py@112
PS1, Line 112: is_dev
> Do you want to add ubsan to this method?
Done


http://gerrit.cloudera.org:8080/#/c/8652/1/tests/common/environ.py@122
PS1, Line 122: runs_slowly
> Do you want to add ubsan to this method?
Done



--
To view, visit http://gerrit.cloudera.org:8080/8652
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4ed949cac99b9925f72e19e4adaa2ead370b536
Gerrit-Change-Number: 8652
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 28 Nov 2017 01:20:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2181: Add query option levels for display

2017-11-22 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8447 )

Change subject: IMPALA-2181: Add query option levels for display
..


Patch Set 14:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8447/18/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/8447/18/be/src/service/query-options.h@113
PS18, Line 113:
per discussion on the JIRA, I think this should be under DEVELOPMENT until it's 
documented.


http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_client.py@100
PS18, Line 100: _indent_level, output):
This check shouldn't be necessary, right? Unless you're doing what I did and 
running impala-shell against the wrong thrift output because you're too lazy to 
rebuild Impala :).


http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_shell.py@83
PS18, Line 83:
correspond


http://gerrit.cloudera.org:8080/#/c/8447/18/shell/impala_shell.py@242
PS18, Line 242:   self._print_option_group(advanced_options)
This is subtle - can you add a comment? If I understand correctly, this is 
detecting that the impala server didn't send back any advanced query options, 
so we're assuming it's an old one.


http://gerrit.cloudera.org:8080/#/c/8447/14/tests/hs2/test_hs2.py
File tests/hs2/test_hs2.py:

http://gerrit.cloudera.org:8080/#/c/8447/14/tests/hs2/test_hs2.py@53
PS14, Line 53: fetch_results_req.operationHandle = 
execute_statement_resp.operationHandle
> This function is actually a copy paste: I moved it out from test_session_op
Seems ok then.



--
To view, visit http://gerrit.cloudera.org:8080/8447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1
Gerrit-Change-Number: 8447
Gerrit-PatchSet: 14
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 22 Nov 2017 19:06:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6241: timeout in admission control test under ASAN

2017-11-28 Thread Tim Armstrong (Code Review)
Hello Michael Brown, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8652

to look at the new patch set (#4).

Change subject: IMPALA-6241: timeout in admission control test under ASAN
..

IMPALA-6241: timeout in admission control test under ASAN

The fix for IMPALA-6241 is to increase the timeout for all slow builds.

While testing that fix, I discovered that the ASAN build detection logic
was failing silently, resulting in it assuming that it was testing a
DEBUG build. The error was:

  Unexpected DW_AT_name in first CU:
  
/data/jenkins/workspace/verify-impala-toolchain-package-build/label/ec2-package-ubuntu-16-04/toolchain/source/llvm/llvm-3.9.1.src/projects/compiler-rt/lib/asan/asan_preinit.cc;
  choosing DEBUG

The fix for that issue is to remove the build type detection heuristic
and instead just write a file with the build type as part of the build process.

Testing:
Before this change I was able to reproduce locally every 5-10 test
iterations. After this change I haven't seen it reproduce.

Change-Id: Ia4ed949cac99b9925f72e19e4adaa2ead370b536
---
M .gitignore
M CMakeLists.txt
M bin/clean-cmake.sh
M infra/python/deps/requirements.txt
M tests/common/environ.py
M tests/custom_cluster/test_admission_controller.py
6 files changed, 62 insertions(+), 133 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/8652/4
--
To view, visit http://gerrit.cloudera.org:8080/8652
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia4ed949cac99b9925f72e19e4adaa2ead370b536
Gerrit-Change-Number: 8652
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6241: timeout in admission control test under ASAN

2017-11-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8652 )

Change subject: IMPALA-6241: timeout in admission control test under ASAN
..


Patch Set 4: Code-Review+2

I discovered that some of the the code I thought was dead was actually imported 
by another test. Reinstated it.


--
To view, visit http://gerrit.cloudera.org:8080/8652
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4ed949cac99b9925f72e19e4adaa2ead370b536
Gerrit-Change-Number: 8652
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 28 Nov 2017 23:56:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1144: Fix exception when cancelling query in Impala-shell with CTRL-C

2017-11-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8549 )

Change subject: IMPALA-1144: Fix exception when cancelling query in 
Impala-shell with CTRL-C
..


Patch Set 8: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6cefaf1dae78baae238289816a7cb9d210fb38e2
Gerrit-Change-Number: 8549
Gerrit-PatchSet: 8
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 29 Nov 2017 00:12:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1144: Fix exception when cancelling query in Impala-shell with CTRL-C

2017-11-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8549 )

Change subject: IMPALA-1144: Fix exception when cancelling query in 
Impala-shell with CTRL-C
..


Patch Set 9: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6cefaf1dae78baae238289816a7cb9d210fb38e2
Gerrit-Change-Number: 8549
Gerrit-PatchSet: 9
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 29 Nov 2017 00:12:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6187: Fix missing conjuncts evaluation with empty projection

2017-11-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8623 )

Change subject: IMPALA-6187: Fix missing conjuncts evaluation with empty 
projection
..


Patch Set 2:

(1 comment)

Would it make sense to leave the existing behaviour and file the potential bug 
with NULL tuples as a separate JIRA? It sounds like the problem and solution 
for this bug are clearer but the other problem is very unclear.

http://gerrit.cloudera.org:8080/#/c/8623/2/be/src/exec/hdfs-scanner.cc
File be/src/exec/hdfs-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8623/2/be/src/exec/hdfs-scanner.cc@226
PS2, Line 226: DCHECK_EQ(scan_node_->tuple_desc()->byte_size(), 0);
> I agree writing a dummy non-NULL pointer seems most correct, but have we de
I was able to construct a query (based on an existing test) that had a 
TupleIsNull() predicate referencing the tuple from scan that doesn't 
materialise anything:

  SELECT /* +straight_join */ COUNT(t1.id)
  FROM functional.alltypessmall t1
  LEFT OUTER JOIN (
SELECT /* +straight_join */ IFNULL(t2.int_col, 1) AS c
FROM functional.alltypessmall t2
LEFT OUTER JOIN functional.alltypestiny t3 ON t2.id < 1000
  ) v ON t1.int_col = v.c;

The relevant part of the plan is:

| 04:HASH JOIN [LEFT OUTER JOIN, PARTITIONED]   
  |
| |  hash predicates: t1.int_col = if(TupleIsNull(1, 2), NULL, 
ifnull(t2.int_col, 1)) |
| |  fk/pk conjuncts: assumed fk/pk 
  |
| |  mem-estimate=1.94MB mem-reservation=1.94MB spill-buffer=64.00KB
  |
| |  tuple-ids=0,1N,2N row-size=16B cardinality=100 
  |
| | 
  |
| |--08:EXCHANGE [HASH(if(TupleIsNull(1, 2), NULL, ifnull(t2.int_col, 1)))] 
  |
| |  |  mem-estimate=0B mem-reservation=0B  
  |
| |  |  tuple-ids=1,2N row-size=8B cardinality=100  
  |
| |  |  
  |
| |  F01:PLAN FRAGMENT [RANDOM] hosts=3 instances=3 
  |
| |  Per-Host Resources: mem-estimate=32.00MB mem-reservation=0B
  |
| |  03:NESTED LOOP JOIN [LEFT OUTER JOIN, BROADCAST]   
  |
| |  |  join predicates: t2.id < 1000   
  |
| |  |  mem-estimate=0B mem-reservation=0B  
  |
| |  |  tuple-ids=1,2N row-size=8B cardinality=100  
  |
| |  |  
  |
| |  |--06:EXCHANGE [BROADCAST] 
  |
| |  |  |  mem-estimate=0B mem-reservation=0B   
  |
| |  |  |  tuple-ids=2 row-size=0B cardinality=8
  |
| |  |  |   
  |
| |  |  F02:PLAN FRAGMENT [RANDOM] hosts=3 instances=3  
  |
| |  |  Per-Host Resources: mem-estimate=32.00MB mem-reservation=0B 
  |
| |  |  02:SCAN HDFS [functional.alltypestiny t3, RANDOM]   
  |
| |  | partitions=4/4 files=4 size=460B 
  |
| |  | stats-rows=8 extrapolated-rows=disabled  
  |
| |  | table stats: rows=8 size=unavailable 
  |
| |  | column stats: all
  |
| |  | mem-estimate=32.00MB mem-reservation=0B  
  |
| |  | tuple-ids=2 row-size=0B cardinality=8
  |
| |  |  
  |
| |  01:SCAN HDFS [functional.alltypessmall t2, RANDOM] 
  |
| | partitions=4/4 files=4 size=6.32KB  
  |
| | stats-rows=100 extrapolated-rows=disabled   
  |
| | table stats: rows=100 size=unavailable  
  |
| | column stats: all   
  |
| | mem-estimate=32.00MB mem-reservation=0B 
  |
| | tuple-ids=1 row-size=8B cardinality=100 
  |


So it does look like the scenario is somehow possible. I don't really 
understand exactly why it is needed here.



--
To view, visit 

[Impala-ASF-CR] IMPALA-4132: Use -fno-omit-frame-pointer

2017-11-28 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8612 )

Change subject: IMPALA-4132: Use -fno-omit-frame-pointer
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8612/1/be/CMakeLists.txt
File be/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/8612/1/be/CMakeLists.txt@42
PS1, Line 42: #  -fno-omit-frame-pointers: Keep frame pointer for functions. 
Produces better
> From my side no such consideration was made.
I didn't do any deep research into it. It's unclear to me what tooling supports 
that, whether it gets stripped out when binaries are stripped, etc. It may also 
be a good option, but I'm in favour of moving forward with the frame pointer 
approach since we know that it is widely supported and doesn't appear to have a 
significant perf impact.



--
To view, visit http://gerrit.cloudera.org:8080/8612
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib7d0d88ba015a847356ed0274fe91017b98cb941
Gerrit-Change-Number: 8612
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 29 Nov 2017 00:08:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1474: Add a metric for running queries

2017-12-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has abandoned this change. ( http://gerrit.cloudera.org:8080/7228 
)

Change subject: IMPALA-1474: Add a metric for running queries
..


Abandoned

Seems like this is probably redundant now. Can always reopen.
--
To view, visit http://gerrit.cloudera.org:8080/7228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I41bc88d0e91427ca2fd70136dc6e06fa7a2663da
Gerrit-Change-Number: 7228
Gerrit-PatchSet: 1
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-5987: LZ4 Codec silently produces bogus compressed data for large inputs

2017-12-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8748 )

Change subject: IMPALA-5987: LZ4 Codec silently produces bogus compressed data 
for large inputs
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8748
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb0bc4ed98c5d7b628b791aa90ead36347b9fbb8
Gerrit-Change-Number: 8748
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 04 Dec 2017 18:01:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6232: Disable file handle cache by default

2017-12-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8750 )

Change subject: IMPALA-6232: Disable file handle cache by default
..


Patch Set 1: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8750/1/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8750/1/be/src/runtime/io/disk-io-mgr.cc@96
PS1, Line 96: // or smaller, depending on the replication factor for this file 
or the path name.
Maybe add a TODO to re-enable it?


http://gerrit.cloudera.org:8080/#/c/8750/1/be/src/runtime/io/disk-io-mgr.cc@109
PS1, Line 109: DEFINE_uint64(unused_file_handle_timeout_sec, 270, "Maximum 
time, in seconds, that an "
Maybe also mention here why you chose 270 (I see there's an explanation in the 
commit message already)



--
To view, visit http://gerrit.cloudera.org:8080/8750
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iea7f943f63b72b42286a9e8b9987308baa79d7b0
Gerrit-Change-Number: 8750
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 04 Dec 2017 19:22:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3804: Re-enable per-scan filtering for sequence-based scanners

2017-12-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8684 )

Change subject: IMPALA-3804: Re-enable per-scan filtering for sequence-based 
scanners
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8684/3/be/src/exec/base-sequence-scanner.cc
File be/src/exec/base-sequence-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8684/3/be/src/exec/base-sequence-scanner.cc@56
PS3, Line 56: ScanRangeMetadata* header_metadata = new 
ScanRangeMetadata(*metadata);
We need to add this to an ObjectPool so it's freed when the query is torn down 
(see what is done for the other place where these are allocated).



--
To view, visit http://gerrit.cloudera.org:8080/8684
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4b38c26bcbe67f83efcc65a1723d766626ae3d3e
Gerrit-Change-Number: 8684
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 04 Dec 2017 19:28:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2248: Make idle session timeout a query option

2017-12-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8490 )

Change subject: IMPALA-2248: Make idle_session_timeout a query option
..


Patch Set 10:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/8490/10//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8490/10//COMMIT_MSG@7
PS10, Line 7: IMPALA-2248: Make idle_session_timeout a query option
> * Yeah, currently it doesn't have any effect for the shell. As you said, it
Sounds reasonable. We'll have to make sure to document the limitations clearly 
- we can file a Doc JIRA as a follow-up.


http://gerrit.cloudera.org:8080/#/c/8490/10/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/8490/10/be/src/service/client-request-state.cc@207
PS10, Line 207:   UpdateSessionTimeout(atoi(value.c_str()));
> At this point the value can be invalid, e.g. the client wants to set the ti
Right, but if the client provides an invalid value, I think it will just get 
swallowed up here. E.g. I tried set idle_session_timeout="foo"; and it seems to 
fail silently.

I'd expect it return the "Invalid idle session timeout: " error to the client. 
I think we should add tests to set.test  for validation of the option.


http://gerrit.cloudera.org:8080/#/c/8490/11/fe/src/test/java/org/apache/impala/service/JdbcTest.java
File fe/src/test/java/org/apache/impala/service/JdbcTest.java:

http://gerrit.cloudera.org:8080/#/c/8490/11/fe/src/test/java/org/apache/impala/service/JdbcTest.java@616
PS11, Line 616: ew Long(
nit: is the new Long() necessary? Seems like it should be automatically cast


http://gerrit.cloudera.org:8080/#/c/8490/11/fe/src/test/java/org/apache/impala/service/JdbcTest.java@626
PS11, Line 626:   int sleepPeriod = (int)(timeout * timeoutTolerance * 
1000) + 500;
I think the wakeup timing could do with a brief explanation. Is the idea is 
that each iteration might let one session expire and then renews the timeout 
for the remaining sessions?


http://gerrit.cloudera.org:8080/#/c/8490/11/fe/src/test/java/org/apache/impala/service/JdbcTest.java@634
PS11, Line 634: new Long
same here


http://gerrit.cloudera.org:8080/#/c/8490/11/fe/src/test/java/org/apache/impala/service/JdbcTest.java@636
PS11, Line 636: Boolean
bool? Not sure why we need to use the boxed type


http://gerrit.cloudera.org:8080/#/c/8490/11/fe/src/test/java/org/apache/impala/service/JdbcTest.java@639
PS11, Line 639:   try(ResultSet rs = 
connection.createStatement().executeQuery("SELECT 1+2")) {
nit: whitespace


http://gerrit.cloudera.org:8080/#/c/8490/11/fe/src/test/java/org/apache/impala/util/Metrics.java
File fe/src/test/java/org/apache/impala/util/Metrics.java:

http://gerrit.cloudera.org:8080/#/c/8490/11/fe/src/test/java/org/apache/impala/util/Metrics.java@43
PS11, Line 43:   public Object getMetric(String metric) throws Exception {
A brief comment would be helpful, mainly to explain what it returns - when 
reading the test I was initially confused about the casting to (Long).



--
To view, visit http://gerrit.cloudera.org:8080/8490
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32e2775f80da387b0df4195fe2c5435b3f8e585e
Gerrit-Change-Number: 8490
Gerrit-PatchSet: 10
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 05 Dec 2017 01:25:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6265 Query cancellation test enhancements

2017-12-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8713 )

Change subject: IMPALA-6265 Query cancellation test enhancements
..


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8713/2/tests/shell/test_shell_commandline.py
File tests/shell/test_shell_commandline.py:

http://gerrit.cloudera.org:8080/#/c/8713/2/tests/shell/test_shell_commandline.py@349
PS2, Line 349:   def wait_query_until_desired_state(self, stmt, state, 
max_retry=10):
Thanks, this looks good. The name is a little non-idiomatic. Maybe something 
like wait_for_query_state()?


http://gerrit.cloudera.org:8080/#/c/8713/2/tests/shell/test_shell_commandline.py@362
PS2, Line 362: return "The found in flight query is not the one under 
test: " + \
Maybe just raise an exception? Makes it hard for tests to accidentally forget 
to check the return value


http://gerrit.cloudera.org:8080/#/c/8713/2/tests/shell/test_shell_commandline.py@366
PS2, Line 366:   ++retry_count
I don't think ++ works in python. It is syntactically valid but doesn't 
increment the variable. I think "retry_count += 1" is the most concise way of 
doing this.


http://gerrit.cloudera.org:8080/#/c/8713/2/tests/shell/test_shell_commandline.py@368
PS2, Line 368: return "Query didn't reach desired state: " + state
Same here - maybe raise an exception or just assert?


http://gerrit.cloudera.org:8080/#/c/8713/2/tests/shell/test_shell_commandline.py@385
PS2, Line 385: get_statement_from_args
It looks like we have the original statement in the tests so it seems better to 
pass the statement through  wait_query_until_desired_state() instead of trying 
to extract it from the command line.


http://gerrit.cloudera.org:8080/#/c/8713/2/tests/shell/test_shell_commandline.py@386
PS2, Line 386: ImapalShell
ImpalaShell



--
To view, visit http://gerrit.cloudera.org:8080/8713
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie0bff485a872df7be8efd784314a6ca91aaadd11
Gerrit-Change-Number: 8713
Gerrit-PatchSet: 2
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 04 Dec 2017 17:39:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

2017-12-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#14). ( 
http://gerrit.cloudera.org:8080/8414 )

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
..

IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

In preparation for switching the I/O mgr to the buffer pool, this
removes and cleans up a lot of code so that the switchover patch starts
from a cleaner slate.

* Remove the free buffer cache (which will be replaced by buffer pool's
  own caching).
* Make memory limit exceeded error checking synchronous (in anticipation
  of having to propagate buffer pool errors synchronously).
* Simplify error propagation - remove the (ineffectual) code that
  enqueued BufferDescriptors containing error statuses.
* Document locking scheme better in a few places, make it part of the
  function signature when it seemed reasonable.
* Move ReturnBuffer() to ScanRange, because it is intrinsically
  connected with the lifecycle of a scan range.
* Separate external ReturnBuffer() and internal CleanUpBuffer()
  interfaces - previously callers of ReturnBuffer() were fudging
  the num_buffers_in_reader accounting to make the external interface work.
* Eliminate redundant state in ScanRange: 'eosr_returned_' and
  'is_cancelled_'.
* Clarify the logic around calling Close() for the last
  BufferDescriptor.
  -> There appeared to be an implicit assumption that buffers would be
 freed in the order they were returned from the scan range, so that
 the "eos" buffer was returned last. Instead just count the number
 of outstanding buffers to detect the last one.
  -> Touching the is_cancelled_ field without holding a lock was hard to
 reason about - violated locking rules and it was unclear that it
 was race-free.
* Remove DiskIoMgr::Read() to simplify interface. It is trivial to
  inline at the callsites.

This will probably regress performance somewhat because of the cache
removal, so my plan is to merge it around the same time as switching
the I/O mgr to allocate from the buffer pool. I'm keeping the patches
separate to make reviewing easier.

Testing:
* Ran exhaustive tests
* Ran the disk-io-mgr-stress-test overnight

Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/test-env.cc
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
19 files changed, 575 insertions(+), 904 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8414/14
--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6290: limit ScannerContext to 1 buffer at a time.

2017-12-12 Thread Tim Armstrong (Code Review)
Hello Lars Volker,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8814

to look at the new patch set (#3).

Change subject: IMPALA-6290: limit ScannerContext to 1 buffer at a time.
..

IMPALA-6290: limit ScannerContext to 1 buffer at a time.

This is a prerequisite for constraining the number of buffers per scan
range. Before this patch, calling ReadBytes(), SkipBytes(), etc could
cause an arbitrary number of I/O buffers to accumulate in
'completed_io_buffers_'. E.g. if we allocated 3 * 8MB I/O buffers for
a range and then called ReadBytes(30MB), we would hit resource
exhaustion as soon as 3 buffers were accumulated in
'completed_io_buffers_'.

The fix is to avoid accumulating any buffers in 'completed_io_buffers_'.
Instead of adding them to 'completed_io_buffers_', completed buffers
are just returned to the I/O manager. It turned out that this did not
weaken the ScannerContext's guarantees about memory lifetime, because
ScannerContext::GetBytesInternal() cleared 'boundary_buffer_' each
time it was called regardless. I checked that this behaviour wasn't
a bug by inspecting the scanner code. I could not find any cases
where scanners depended on returned memory remaining valid beyond
the next Read*()/Get*()/Skip*() call on the stream.

This change makes that lifetime explicit in the comments. A
side-effect of this fix is that scanners do not need to call
ReleaseCompletedResources() in CommitRows() and means that the
ScannerContext only ever needs to hold one I/O buffer at a time.

This change also reimplements SkipBytes() to avoid it accumulating
memory in the boundary buffer for large skip sizes.

Also clarifies some of the invariants in ScannerContext. E.g. some
places assumed io_buffer_ != NULL, but that is no longer needed.

Testing:
Ran core tests with ASAN and exhaustive tests with DEBUG.

Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/exec/scanner-context.inline.h
M 
testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-scan-node-errors.test
10 files changed, 241 insertions(+), 174 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8814/3
--
To view, visit http://gerrit.cloudera.org:8080/8814
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
Gerrit-Change-Number: 8814
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-5654: Disallow setting Kudu table name in CREATE TABLE

2017-12-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8820 )

Change subject: IMPALA-5654: Disallow setting Kudu table name in CREATE TABLE
..


Patch Set 3:

(3 comments)

Took a quick look - I had some concerns with the solution at a high level

http://gerrit.cloudera.org:8080/#/c/8820/3/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java:

http://gerrit.cloudera.org:8080/#/c/8820/3/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@299
PS3, Line 299: if (!isReAnalyzed_) setManagedKuduTableName();
Alex and Dimitris are more authoritative on this than me, but adding a new 
re-analyzed sub-state to all statements seems like too global a solution to 
solve this local problem.

I think we could solve this locally by either keeping track of whether 
KEY_TABLE_NAME was present in the original table properties, or by checking 
whether KEY_TABLE_NAME is the same as the managed table name (which is arguably 
less good because it allows the property to be specified if it's exactly the 
right value).


http://gerrit.cloudera.org:8080/#/c/8820/3/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

http://gerrit.cloudera.org:8080/#/c/8820/3/tests/query_test/test_kudu.py@a861
PS3, Line 861:
I think we still want the test coverage from the test. It seems like the first 
part needs to use an external table instead.


http://gerrit.cloudera.org:8080/#/c/8820/3/tests/query_test/test_kudu.py@a881
PS3, Line 881:
I this second part of the test is still valid - we want to make sure that "show 
create table" doesn't show the table name.



--
To view, visit http://gerrit.cloudera.org:8080/8820
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieca037498abf8f5fde67b77e824b720482cdbe6f
Gerrit-Change-Number: 8820
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 12 Dec 2017 16:34:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6290: limit ScannerContext to 1 buffer at a time.

2017-12-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8814 )

Change subject: IMPALA-6290: limit ScannerContext to 1 buffer at a time.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8814/2/be/src/exec/scanner-context.h
File be/src/exec/scanner-context.h:

http://gerrit.cloudera.org:8080/#/c/8814/2/be/src/exec/scanner-context.h@65
PS2, Line 65: buffer
need to clarify "buffer". Should probably stick to "memory returned" like the 
first sentence.



--
To view, visit http://gerrit.cloudera.org:8080/8814
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
Gerrit-Change-Number: 8814
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 12 Dec 2017 18:06:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6284: Mark the intermediate decimal avg struct as packed

2017-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8836 )

Change subject: IMPALA-6284: Mark the intermediate decimal avg struct as packed
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8836/1/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8836/1/be/src/exprs/aggregate-functions-ir.cc@388
PS1, Line 388:   DCHECK_EQ(dst->len, sizeof(DecimalAvgState));
Did you test on a debug build? I think there's a constant in the FE that needs 
to be updated too (DECIMAL_AVG_INTERMEDIATE_SIZE). This DCHECK is meant to 
assert that the fe and be sizes are the same.



--
To view, visit http://gerrit.cloudera.org:8080/8836
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id25ec6e20dde3f50fb37a22135b355ad251809e0
Gerrit-Change-Number: 8836
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 14 Dec 2017 17:50:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6284: Mark the intermediate decimal avg struct as packed

2017-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8836 )

Change subject: IMPALA-6284: Mark the intermediate decimal avg struct as packed
..


Patch Set 1:

UBSAN might also be able to detect problems like this (although I haven't 
investigated)


--
To view, visit http://gerrit.cloudera.org:8080/8836
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id25ec6e20dde3f50fb37a22135b355ad251809e0
Gerrit-Change-Number: 8836
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 14 Dec 2017 17:54:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5315: Cast to timestamp fails for YYYY-M-D format

2017-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7009 )

Change subject: IMPALA-5315: Cast to timestamp fails for -M-D format
..


Patch Set 10:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7009/10//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/7009/10//COMMIT_MSG@39
PS10, Line 39: Change-Id: Ib9a184a09d7e7783f04d47588537612c2ecec28f
I think we need some additional tests for bad data and some query tests that 
scan tables with mixed-format timestamps to test that it correctly switches to 
a different context.

I was able to crash things with the current patch fairly easily. Repro:

drop table if exists timestamp_strs;
create table timestamp_strs as select '2001-1-2' as str;
insert into timestamp_strs select '2001-10-2';
insert into timestamp_strs select * from timestamp_strs;
insert into timestamp_strs select * from timestamp_strs;
insert into timestamp_strs select * from timestamp_strs;
insert into timestamp_strs select * from timestamp_strs;
insert into timestamp_strs select '2001-foo-bar';
select str, cast(str as timestamp) from timestamp_strs;


http://gerrit.cloudera.org:8080/#/c/7009/10/be/src/runtime/timestamp-parse-util.h
File be/src/runtime/timestamp-parse-util.h:

http://gerrit.cloudera.org:8080/#/c/7009/10/be/src/runtime/timestamp-parse-util.h@267
PS10, Line 267:   static DateTimeFormatContext LAZY_CTX;
Having this as a static variable doesn't seem right, since it will be shared 
between multiple threads, which will updated it. I don't see a particularly 
easy place to put the last context so that it's local to the thread/column 
that's executing Parse().

I'm wondering if it's just easiest to have lazy_ctx be a local variable that we 
create on the fly - we'd take a 50% perf hit for the lazy formats according to 
your benchmark but it's not clear to me that the optimisation is worth the 
extra complexity of having to manage LAZY_CTX.


http://gerrit.cloudera.org:8080/#/c/7009/10/be/src/runtime/timestamp-parse-util.cc
File be/src/runtime/timestamp-parse-util.cc:

http://gerrit.cloudera.org:8080/#/c/7009/10/be/src/runtime/timestamp-parse-util.cc@378
PS10, Line 378: LIKELY
LIKELY doesn't seem right here - I think we can just remove it.


http://gerrit.cloudera.org:8080/#/c/7009/10/be/src/runtime/timestamp-parse-util.cc@381
PS10, Line 381: if (LAZY_CTX.fmt != NULL) dt_ctx = _CTX;
nit: we use braces on both branches if there is an else.


http://gerrit.cloudera.org:8080/#/c/7009/10/be/src/runtime/timestamp-parse-util.cc@383
PS10, Line 383: LAZY_CTX.Reset(str, len);
nit: missing indentation.



--
To view, visit http://gerrit.cloudera.org:8080/7009
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib9a184a09d7e7783f04d47588537612c2ecec28f
Gerrit-Change-Number: 7009
Gerrit-PatchSet: 10
Gerrit-Owner: Vincent Tran 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vincent Tran 
Gerrit-Comment-Date: Thu, 14 Dec 2017 23:21:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6284: Mark the intermediate decimal avg struct as packed

2017-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8836 )

Change subject: IMPALA-6284: Mark the intermediate decimal avg struct as packed
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8836
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id25ec6e20dde3f50fb37a22135b355ad251809e0
Gerrit-Change-Number: 8836
Gerrit-PatchSet: 2
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 14 Dec 2017 23:21:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6300: Fix decimal modulo overflow

2017-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8833 )

Change subject: IMPALA-6300: Fix decimal modulo overflow
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8833/1/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/8833/1/be/src/runtime/decimal-value.inline.h@261
PS1, Line 261: #ifdef DEBUG
Is there some way we can avoid having different control flow on release and 
debug builds. E.g. a SafeMultiply overload that takes a bool may_overflow param?



--
To view, visit http://gerrit.cloudera.org:8080/8833
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27c7f25f68353c19c315e1639311ec06b2dea686
Gerrit-Change-Number: 8833
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Thu, 14 Dec 2017 23:38:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6290: limit ScannerContext to 1 buffer at a time.

2017-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8814 )

Change subject: IMPALA-6290: limit ScannerContext to 1 buffer at a time.
..


Patch Set 3:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/8814/3/be/src/exec/hdfs-scanner.h
File be/src/exec/hdfs-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8814/3/be/src/exec/hdfs-scanner.h@97
PS3, Line 97: /// This class also encapsulates row batch management.  
Subclasses should call CommitRows()
> nit: long line (and the new Gerrit UI breaks it now, making it look bad).
Done


http://gerrit.cloudera.org:8080/#/c/8814/3/be/src/exec/scanner-context.h
File be/src/exec/scanner-context.h:

http://gerrit.cloudera.org:8080/#/c/8814/3/be/src/exec/scanner-context.h@143
PS3, Line 143: /// (the scan range could be longer than the file).
> Can we extend this comment with what the implication are? Is the stream sti
Done


http://gerrit.cloudera.org:8080/#/c/8814/3/be/src/exec/scanner-context.h@146
PS3, Line 146: /// If true, the stream has reached the end of the file.
> Can we extend this comment with what the implication are? Is the stream sti
Done


http://gerrit.cloudera.org:8080/#/c/8814/3/be/src/exec/scanner-context.h@161
PS3, Line 161: bool ReadBoolean(bool* boolean, Status* status);
> Should we add WARN_UNUSED_RESULT while you're here?
Done. The calling convention of these functions is a bit weir d to avoid 
overhead of status, because returning false implies !status->ok() and 
vice-versa, so some caller were ignoring the return value. I kept the calling 
convention and adjusted the callers to use a pattern that won't trip the 
warning.


http://gerrit.cloudera.org:8080/#/c/8814/3/be/src/exec/scanner-context.h@193
PS3, Line 193: /// Release resources from previous reads in this stream. If 
'done' is true all
> This only releases buffers that are completely read, right? In that case, w
Updated this and the other ReleaseCompleteResources() comment.


http://gerrit.cloudera.org:8080/#/c/8814/3/be/src/exec/scanner-context.h@224
PS3, Line 224: initial
> "initial" implies that there are more, no?
We create extra scan ranges to read past the end of the stream. Agree the 
adjective is confusing so reworded ot be less ambiguous.


http://gerrit.cloudera.org:8080/#/c/8814/3/be/src/exec/scanner-context.h@254
PS3, Line 254: // We always want output_buffer_bytes_left_ to be non-NULL, 
so we can avoid a NULL check
> nit: long line
Done


http://gerrit.cloudera.org:8080/#/c/8814/3/be/src/exec/scanner-context.h@298
PS3, Line 298: 2
> Huh?
Fat finger error.


http://gerrit.cloudera.org:8080/#/c/8814/3/be/src/exec/scanner-context.h@320
PS3, Line 320: Always
 :   /// returns the current I/O buffer to the I/O manager.
> This only seems true when the buffer has been read/skipped completely.
Done


http://gerrit.cloudera.org:8080/#/c/8814/3/be/src/exec/scanner-context.cc
File be/src/exec/scanner-context.cc:

http://gerrit.cloudera.org:8080/#/c/8814/3/be/src/exec/scanner-context.cc@101
PS3, Line 101: buffer_range->ReturnBuffer(move(io_buffer_));
> Can we reset io_buffer_pos_ here, too? It looks to me below like we're usin
I don't think we should be inferring eos from the state of 'io_buffer_' - where 
did you see that? I think we should only be checking it immediately after the 
GetNext() calls.

I factored out this logic into a ReturnIoBuffer() that resets the related 
members though.


http://gerrit.cloudera.org:8080/#/c/8814/3/be/src/exec/scanner-context.cc@182
PS3, Line 182:   if (eosr()) return Status::OK();
> Can you explain here (or in the header at either this function or scan_rang
eosr() mean that all the data in the range was returned to the client. 
scan_range_eosr_ is true and eosr() is false when the last buffer is in the 
scan range but hasn't been fully returned to the client.


http://gerrit.cloudera.org:8080/#/c/8814/3/be/src/exec/scanner-context.cc@226
PS3, Line 226: Status ScannerContext::Stream::GetBytesInternal(int64_t 
requested_len,
> This method still looks very confusing to me. Can you think of ways to make
I added some comments to try and make it clearer. I think the current code 
structure is reasonable with more explanation - it first ensures that it has 
all the data, then figures out how to return it.

I don't think separating the case 1/2 branches earlier results in clearer code, 
because if we don't have a current io_buffer_, we need to fetch the next I/O 
buffer to figure out if it has all the required bytes.


http://gerrit.cloudera.org:8080/#/c/8814/3/be/src/exec/scanner-context.cc@228
PS3, Line 228:   DCHECK_GT(requested_len, boundary_buffer_bytes_left_);
> Can you also add a DCHECK to document and assert requested_len > io_buffer_
Done. That precondition isn't exactly right but I added one.


http://gerrit.cloudera.org:8080/#/c/8814/3/be/src/exec/scanner-context.cc@260
PS3, Line 260: num_bytes
> This is the 

[Impala-ASF-CR] IMPALA-6290: limit ScannerContext to 1 buffer at a time.

2017-12-14 Thread Tim Armstrong (Code Review)
Hello Lars Volker,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8814

to look at the new patch set (#4).

Change subject: IMPALA-6290: limit ScannerContext to 1 buffer at a time.
..

IMPALA-6290: limit ScannerContext to 1 buffer at a time.

This is a prerequisite for constraining the number of buffers per scan
range. Before this patch, calling ReadBytes(), SkipBytes(), etc could
cause an arbitrary number of I/O buffers to accumulate in
'completed_io_buffers_'. E.g. if we allocated 3 * 8MB I/O buffers for
a range and then called ReadBytes(30MB), we would hit resource
exhaustion as soon as 3 buffers were accumulated in
'completed_io_buffers_'.

The fix is to avoid accumulating any buffers in 'completed_io_buffers_'.
Instead of adding them to 'completed_io_buffers_', completed buffers
are just returned to the I/O manager. It turned out that this did not
weaken the ScannerContext's guarantees about memory lifetime, because
ScannerContext::GetBytesInternal() cleared 'boundary_buffer_' each
time it was called regardless. I checked that this behaviour wasn't
a bug by inspecting the scanner code. I could not find any cases
where scanners depended on returned memory remaining valid beyond
the next Read*()/Get*()/Skip*() call on the stream.

This change makes that lifetime explicit in the comments. A
side-effect of this fix is that scanners do not need to call
ReleaseCompletedResources() in CommitRows() and means that the
ScannerContext only ever needs to hold one I/O buffer at a time.

This change also reimplements SkipBytes() to avoid it accumulating
memory in the boundary buffer for large skip sizes.

Also clarifies some of the invariants in ScannerContext. E.g. some
places assumed io_buffer_ != NULL, but that is no longer needed.

Testing:
Ran core tests with ASAN and exhaustive tests with DEBUG.

Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/exec/scanner-context.inline.h
M 
testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-scan-node-errors.test
10 files changed, 295 insertions(+), 201 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8814/4
--
To view, visit http://gerrit.cloudera.org:8080/8814
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
Gerrit-Change-Number: 8814
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4664: Unexpected string conversion in Shell

2017-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8762 )

Change subject: IMPALA-4664: Unexpected string conversion in Shell
..


Patch Set 11: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8762
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifdce9781d1d97596c188691b62a141b9bd137610
Gerrit-Change-Number: 8762
Gerrit-PatchSet: 11
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Andre Araujo 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 14 Dec 2017 22:39:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6177: Cleanup incomplete handcrafted IRs before finalizing module

2017-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8541 )

Change subject: IMPALA-6177: Cleanup incomplete handcrafted IRs before 
finalizing module
..


Patch Set 10: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/8541
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If975cfb3906482b36dd6ede32ca81de6fcee1d7f
Gerrit-Change-Number: 8541
Gerrit-PatchSet: 10
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 14 Dec 2017 22:38:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

2017-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8414 )

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
..


Patch Set 15: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 15
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 14 Dec 2017 22:53:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6290: limit ScannerContext to 1 buffer at a time

2017-12-15 Thread Tim Armstrong (Code Review)
Hello Lars Volker,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8814

to look at the new patch set (#5).

Change subject: IMPALA-6290: limit ScannerContext to 1 buffer at a time
..

IMPALA-6290: limit ScannerContext to 1 buffer at a time

This is a prerequisite for constraining the number of buffers per scan
range. Before this patch, calling ReadBytes(), SkipBytes(), etc could
cause an arbitrary number of I/O buffers to accumulate in
'completed_io_buffers_'. E.g. if we allocated 3 * 8MB I/O buffers for
a range and then called ReadBytes(30MB), we would hit resource
exhaustion as soon as 3 buffers were accumulated in
'completed_io_buffers_'.

The fix is to avoid accumulating any buffers in 'completed_io_buffers_'.
Instead of adding them to 'completed_io_buffers_', completed buffers
are just returned to the I/O manager. It turned out that this did not
weaken the ScannerContext's guarantees about memory lifetime, because
ScannerContext::GetBytesInternal() cleared 'boundary_buffer_' each
time it was called regardless. I checked that this behaviour wasn't
a bug by inspecting the scanner code. I could not find any cases
where scanners depended on returned memory remaining valid beyond
the next Read*()/Get*()/Skip*() call on the stream.

This change makes that lifetime explicit in the comments. A
side-effect of this fix is that scanners do not need to call
ReleaseCompletedResources() in CommitRows() and means that the
ScannerContext only ever needs to hold one I/O buffer at a time.

This change also reimplements SkipBytes() to avoid it accumulating
memory in the boundary buffer for large skip sizes.

Also clarifies some of the invariants in ScannerContext. E.g. some
places assumed io_buffer_ != NULL, but that is no longer needed.

Testing:
Ran core tests with ASAN and exhaustive tests with DEBUG.

Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/exec/scanner-context.inline.h
M 
testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-scan-node-errors.test
10 files changed, 295 insertions(+), 202 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8814/5
--
To view, visit http://gerrit.cloudera.org:8080/8814
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
Gerrit-Change-Number: 8814
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6290: limit ScannerContext to 1 buffer at a time.

2017-12-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8814 )

Change subject: IMPALA-6290: limit ScannerContext to 1 buffer at a time.
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8814/4/be/src/exec/scanner-context.h
File be/src/exec/scanner-context.h:

http://gerrit.cloudera.org:8080/#/c/8814/4/be/src/exec/scanner-context.h@231
PS4, Line 231: ScannerContext
> scan range?
Done


http://gerrit.cloudera.org:8080/#/c/8814/4/be/src/exec/scanner-context.cc
File be/src/exec/scanner-context.cc:

http://gerrit.cloudera.org:8080/#/c/8814/4/be/src/exec/scanner-context.cc@238
PS4, Line 238:   // First this loop ensures that we've got all of the requested 
bytes in either
> nit: Assuming that this comment covers the while loop to the empty line aft
Wording was confusing, I meant what you said. Reworded to hopefully be less 
confusing.

Also moved the case1/2 comment below since I think this comment is sufficient 
to explain the loop now.



--
To view, visit http://gerrit.cloudera.org:8080/8814
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
Gerrit-Change-Number: 8814
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 15 Dec 2017 16:40:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3703: Store query context in thread-local variables

2017-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8621 )

Change subject: IMPALA-3703: Store query context in thread-local variables
..


Patch Set 12: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8621/12/be/src/common/thread-debug-info-test.cc
File be/src/common/thread-debug-info-test.cc:

http://gerrit.cloudera.org:8080/#/c/8621/12/be/src/common/thread-debug-info-test.cc@30
PS12, Line 30:
Not a big deal, but there's a bit more vertical whitespace in these tests than 
we usually prefer.



--
To view, visit http://gerrit.cloudera.org:8080/8621
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I566f7f1db5117c498e86e0bd05b33bdcff624609
Gerrit-Change-Number: 8621
Gerrit-PatchSet: 12
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 15 Dec 2017 00:54:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5654: Disallow setting Kudu table name in CREATE TABLE

2017-12-14 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8820 )

Change subject: IMPALA-5654: Disallow setting Kudu table name in CREATE TABLE
..


Patch Set 4:

(1 comment)

The code and tests look good to me, had one outstanding question that came out 
of a discussion.

http://gerrit.cloudera.org:8080/#/c/8820/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8820/4//COMMIT_MSG@7
PS4, Line 7: IMPALA-5654: Disallow setting Kudu table name in CREATE TABLE
Gabor mentioned to me offline that we might want to catch it in alter table too 
(unsure if that is already disallowed).



--
To view, visit http://gerrit.cloudera.org:8080/8820
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ieca037498abf8f5fde67b77e824b720482cdbe6f
Gerrit-Change-Number: 8820
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 15 Dec 2017 01:00:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6290: limit ScannerContext to 1 buffer at a time

2017-12-15 Thread Tim Armstrong (Code Review)
Hello Michael Ho, Lars Volker,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8814

to look at the new patch set (#6).

Change subject: IMPALA-6290: limit ScannerContext to 1 buffer at a time
..

IMPALA-6290: limit ScannerContext to 1 buffer at a time

This is a prerequisite for constraining the number of buffers per scan
range. Before this patch, calling ReadBytes(), SkipBytes(), etc could
cause an arbitrary number of I/O buffers to accumulate in
'completed_io_buffers_'. E.g. if we allocated 3 * 8MB I/O buffers for
a range and then called ReadBytes(30MB), we would hit resource
exhaustion as soon as 3 buffers were accumulated in
'completed_io_buffers_'.

The fix is to avoid accumulating any buffers in 'completed_io_buffers_'.
Instead of adding them to 'completed_io_buffers_', completed buffers
are just returned to the I/O manager. It turned out that this did not
weaken the ScannerContext's guarantees about memory lifetime, because
ScannerContext::GetBytesInternal() cleared 'boundary_buffer_' each
time it was called regardless. I checked that this behaviour wasn't
a bug by inspecting the scanner code. I could not find any cases
where scanners depended on returned memory remaining valid beyond
the next Read*()/Get*()/Skip*() call on the stream.

This change makes that lifetime explicit in the comments. A
side-effect of this fix is that scanners do not need to call
ReleaseCompletedResources() in CommitRows() and means that the
ScannerContext only ever needs to hold one I/O buffer at a time.

This change also reimplements SkipBytes() to avoid it accumulating
memory in the boundary buffer for large skip sizes.

Also clarifies some of the invariants in ScannerContext. E.g. some
places assumed io_buffer_ != NULL, but that is no longer needed.

Testing:
Ran core tests with ASAN and exhaustive tests with DEBUG.

Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/parquet-column-readers.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/exec/scanner-context.inline.h
M 
testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-scan-node-errors.test
10 files changed, 294 insertions(+), 201 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8814/6
--
To view, visit http://gerrit.cloudera.org:8080/8814
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
Gerrit-Change-Number: 8814
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6290: limit ScannerContext to 1 buffer at a time

2017-12-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8814 )

Change subject: IMPALA-6290: limit ScannerContext to 1 buffer at a time
..


Patch Set 6: Code-Review+1

Rebased onto master. There wasn't really a dependency between this and the 
patch it was on top of.


--
To view, visit http://gerrit.cloudera.org:8080/8814
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
Gerrit-Change-Number: 8814
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 15 Dec 2017 21:44:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6318: Adjustment for hanging query cancellation test

2017-12-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8852 )

Change subject: IMPALA-6318: Adjustment for hanging query cancellation test
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8852/3/tests/shell/util.py
File tests/shell/util.py:

http://gerrit.cloudera.org:8080/#/c/8852/3/tests/shell/util.py@168
PS3, Line 168: stdout = None if omit_stdout else PIPE
Looking at the Popen docs, I think this means that stdout gets sent to the 
stdout of the current process, which I think means that we're capturing it as 
part of the test output. Is this what we want?



--
To view, visit http://gerrit.cloudera.org:8080/8852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I082c83b91b6d0c527de92c7992f0dc9d1b290433
Gerrit-Change-Number: 8852
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 15 Dec 2017 16:28:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6290: limit ScannerContext to 1 buffer at a time.

2017-12-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8814 )

Change subject: IMPALA-6290: limit ScannerContext to 1 buffer at a time.
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8814/4/be/src/exec/scanner-context.cc
File be/src/exec/scanner-context.cc:

http://gerrit.cloudera.org:8080/#/c/8814/4/be/src/exec/scanner-context.cc@232
PS4, Line 232:   // 1. if the read starts at an I/O buffer boundary and 
'requested_len' is smaller than
This was not quite right - we also return a pointer into the I/O buffer if the 
previous read was from a boundary buffer, ending partway into an I/O buffer



--
To view, visit http://gerrit.cloudera.org:8080/8814
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74c5960a75f7d88b0e1de4199af731fb13e592f0
Gerrit-Change-Number: 8814
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 15 Dec 2017 16:43:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4664: Unexpected string conversion in Shell

2017-12-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8762 )

Change subject: IMPALA-4664: Unexpected string conversion in Shell
..


Patch Set 15: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8762
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifdce9781d1d97596c188691b62a141b9bd137610
Gerrit-Change-Number: 8762
Gerrit-PatchSet: 15
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Andre Araujo 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 15 Dec 2017 17:56:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2640: Make a given command case-sensitive

2017-12-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8762 )

Change subject: IMPALA-2640: Make a given command case-sensitive
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8762/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8762/6//COMMIT_MSG@7
PS6, Line 7: IMPALA-2640: Make a given command case-sensitive
> In the fix of IMPALA-4664(https://gerrit.cloudera.org/#/c/8639/4/shell/impa
I think the lower-case column description is intended.

The tests you added in 
https://gerrit.cloudera.org/#/c/8639/4/tests/shell/test_shell_interactive.py 
all appear to be fixed by this patch set, so I think we should add those tests 
to this patch.


http://gerrit.cloudera.org:8080/#/c/8762/9/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/8762/9/shell/impala_shell.py@555
PS9, Line 555:
> I added some comment for the coping. The PSF license is already included in
We still need to add a line in LICENSE.txt like:

  Parts of shell/impala_shell.py: Python Software License V2



--
To view, visit http://gerrit.cloudera.org:8080/8762
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifdce9781d1d97596c188691b62a141b9bd137610
Gerrit-Change-Number: 8762
Gerrit-PatchSet: 6
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Andre Araujo 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 13 Dec 2017 23:09:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2640: Make a given command case-sensitive

2017-12-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8762 )

Change subject: IMPALA-2640: Make a given command case-sensitive
..


Patch Set 6:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8762/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8762/6//COMMIT_MSG@7
PS6, Line 7: IMPALA-2640: Make a given command case-sensitive
Unless I'm missing something, doesn't this solve IMPALA-4664 too? I checked out 
this commit and played around and I get:

[localhost:21000] > sElect'FOO';
Query: sElect 'FOO'
Query submitted at: 2017-12-12 15:30:44 (Coordinator: 
http://tarmstrong-box:25000)
Query progress can be monitored at: 
http://tarmstrong-box:25000/query_plan?query_id=c049cc68f0842616:4419587c
+---+
| 'foo' |
+---+
| FOO   |
+---+
Fetched 1 row(s) in 0.00s


http://gerrit.cloudera.org:8080/#/c/8762/6/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/8762/6/shell/impala_shell.py@572
PS6, Line 572: return func(arg, cmd_)
> Done.
It's a matter of taste but I think passing in cmd as an argument is clearer - 
fewer member variables and implicit argument parsing seems easier to follow. I 
agree it still feels a bit weird in some ways. Anyway, I think I'm ok with the 
current version.


http://gerrit.cloudera.org:8080/#/c/8762/9/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/8762/9/shell/impala_shell.py@305
PS9, Line 305:  
self.set_query_options)
Long line


http://gerrit.cloudera.org:8080/#/c/8762/9/shell/impala_shell.py@310
PS9, Line 310:! 
Long line.


http://gerrit.cloudera.org:8080/#/c/8762/9/shell/impala_shell.py@555
PS9, Line 555:
Can you include attribution for where the copied code came from (e.g. see 
be/src/runtime/string-search.h). It also needs to be added to LICENSE.txt. It 
looks like it's licensed with the PSF license V2, which we already include in 
LICENSE.txt.


http://gerrit.cloudera.org:8080/#/c/8762/9/shell/impala_shell.py@571
PS9, Line 571: etur
_ on the end of a local variable is a bit weird to me - can we call it command 
instead?



--
To view, visit http://gerrit.cloudera.org:8080/8762
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifdce9781d1d97596c188691b62a141b9bd137610
Gerrit-Change-Number: 8762
Gerrit-PatchSet: 6
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Andre Araujo 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 12 Dec 2017 23:31:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6177: Cleanup incomplete handcrafted IRs before finalizing module

2017-12-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8541 )

Change subject: IMPALA-6177: Cleanup incomplete handcrafted IRs before 
finalizing module
..


Patch Set 9: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8541/8/be/src/codegen/llvm-codegen-test.cc
File be/src/codegen/llvm-codegen-test.cc:

http://gerrit.cloudera.org:8080/#/c/8541/8/be/src/codegen/llvm-codegen-test.cc@167
PS8, Line 167: }
> that would crash llvm(and hence the process) during FinalizeModule
We could use the magic IMPALA_ASSERT_DEBUG_DEATH() macro if we want to test 
that code path. It might not be worth testing if it's an invalid use of the API.



--
To view, visit http://gerrit.cloudera.org:8080/8541
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If975cfb3906482b36dd6ede32ca81de6fcee1d7f
Gerrit-Change-Number: 8541
Gerrit-PatchSet: 9
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 13 Dec 2017 00:01:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5191: Standardize column alias behavior

2017-12-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8801 )

Change subject: IMPALA-5191: Standardize column alias behavior
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8801/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8801/5//COMMIT_MSG@14
PS5, Line 14: SELECT int_col / 2 AS x
Do we have existing end-to-end tests for queries of these forms? Would be good 
to confirm that the frontend produces plans that are executable and correct.


http://gerrit.cloudera.org:8080/#/c/8801/5/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/8801/5/fe/src/main/java/org/apache/impala/analysis/Expr.java@817
PS5, Line 817:  boolean preserveRootType, boolean substituteChildren
I find it a bit hard to read code calling methods with multiple boolean 
arguments since it's easy to get the flags mixed up at the callsite.

I'm ok with this as-is, but we could consider adding public methods instead of 
adding flags. E.g. trySubsitute() vs trySubsituteRoot() or 
trySubstituteRecursive() vs trySubstituteRoot().



--
To view, visit http://gerrit.cloudera.org:8080/8801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0f82483b486acf6953876cfa672b0d034f3709a8
Gerrit-Change-Number: 8801
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 13 Dec 2017 21:50:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6222: Add details to error msg on failure to get min reservation

2017-12-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8781 )

Change subject: IMPALA-6222: Add details to error msg on failure to get min 
reservation
..


Patch Set 3: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8781/3/be/src/runtime/mem-tracker.cc
File be/src/runtime/mem-tracker.cc:

http://gerrit.cloudera.org:8080/#/c/8781/3/be/src/runtime/mem-tracker.cc@324
PS3, Line 324:   std::priority_queue, 
vector>,
nit: add a using up the top of the file or in common/names.h instead of using 
the std:: prefixes.


http://gerrit.cloudera.org:8080/#/c/8781/3/be/src/runtime/mem-tracker.cc@340
PS3, Line 340: int limit) {
nit: weird wrapping.



--
To view, visit http://gerrit.cloudera.org:8080/8781
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic4675fe923b33fdc4ddefd1872e6d6b803993d74
Gerrit-Change-Number: 8781
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 13 Dec 2017 22:51:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6184: Clean up aftr ScalarExprEvaluator::Clone() fails

2017-11-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8572 )

Change subject: IMPALA-6184: Clean up aftr ScalarExprEvaluator::Clone() fails
..


Patch Set 1:

(2 comments)

Just a couple of questions.

http://gerrit.cloudera.org:8080/#/c/8572/1/be/src/exprs/scalar-expr-evaluator.cc
File be/src/exprs/scalar-expr-evaluator.cc:

http://gerrit.cloudera.org:8080/#/c/8572/1/be/src/exprs/scalar-expr-evaluator.cc@193
PS1, Line 193: Status status =
Kinda subtle, can you add a comment like the one above?

// Always add the evaluator to the vector so it can be cleaned up.


http://gerrit.cloudera.org:8080/#/c/8572/1/testdata/workloads/functional-query/queries/QueryTest/udf-errors.test
File testdata/workloads/functional-query/queries/QueryTest/udf-errors.test:

http://gerrit.cloudera.org:8080/#/c/8572/1/testdata/workloads/functional-query/queries/QueryTest/udf-errors.test@85
PS1, Line 85: on (bad_expr2(rand()) = (t2.bool_col && t1.bool_col));
Is this one guaranteed to fail? It seems like if we're unlucky, rand() could 
not return one of the values that BadExpr fails on.



--
To view, visit http://gerrit.cloudera.org:8080/8572
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I45ffd722d0a69ad05ae3c748cf504c7f1a959a1d
Gerrit-Change-Number: 8572
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 16 Nov 2017 15:50:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4835 (prep only): create io subfolder and namespace

2017-11-17 Thread Tim Armstrong (Code Review)
Hello Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8424

to look at the new patch set (#9).

Change subject: IMPALA-4835 (prep only): create io subfolder and namespace
..

IMPALA-4835 (prep only): create io subfolder and namespace

Instead of using the DiskIoMgr class as a namespace, which prevents
forward-declaration of inner classes, create an impala::io namespace
and unnested the inner class.

This is done in anticipation of DiskIoMgr depending on BufferPool. This
helps avoid a circular dependency between DiskIoMgr, TmpFileMgr and
BufferPool headers that could not be broken with forward declarations.

Testing:
Ran core tests.

Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b
---
M be/CMakeLists.txt
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/kudu-scan-node.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/runtime/CMakeLists.txt
D be/src/runtime/disk-io-mgr.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
A be/src/runtime/io/CMakeLists.txt
R be/src/runtime/io/disk-io-mgr-internal.h
R be/src/runtime/io/disk-io-mgr-stress-test.cc
R be/src/runtime/io/disk-io-mgr-stress.cc
R be/src/runtime/io/disk-io-mgr-stress.h
R be/src/runtime/io/disk-io-mgr-test.cc
R be/src/runtime/io/disk-io-mgr.cc
A be/src/runtime/io/disk-io-mgr.h
R be/src/runtime/io/handle-cache.h
R be/src/runtime/io/handle-cache.inline.h
R be/src/runtime/io/request-context.cc
R be/src/runtime/io/request-context.h
A be/src/runtime/io/request-ranges.h
R be/src/runtime/io/scan-range.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
38 files changed, 1,417 insertions(+), 1,310 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/8424/9
-- 
To view, visit http://gerrit.cloudera.org:8080/8424
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b
Gerrit-Change-Number: 8424
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

2017-11-17 Thread Tim Armstrong (Code Review)
Hello Tianyi Wang, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8414

to look at the new patch set (#10).

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
..

IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

In preparation for switching the I/O mgr to the buffer pool, this
removes and cleans up a lot of code so that the switchover patch starts
from a cleaner slate.

* Remove the free buffer cache (which will be replaced by buffer pool's
  own caching).
* Make memory limit exceeded error checking synchronous (in anticipation
  of having to propagate buffer pool errors synchronously).
* Simplify error propagation - remove the (ineffectual) code that
  enqueued BufferDescriptors containing error statuses.
* Document locking scheme better in a few places, make it part of the
  function signature when it seemed reasonable.
* Move ReturnBuffer() to ScanRange, because it is intrinsically
  connected with the lifecycle of a scan range.
* Separate external ReturnBuffer() and internal CleanUpBuffer()
  interfaces - previously callers of ReturnBuffer() were fudging
  the num_buffers_in_reader accounting to make the external interface work.
* Eliminate redundant state in ScanRange: 'eosr_returned_' and
  'is_cancelled_'.
* Clarify the logic around calling Close() for the last
  BufferDescriptor.
  -> There appeared to be an implicit assumption that buffers would be
 freed in the order they were returned from the scan range, so that
 the "eos" buffer was returned last. Instead just count the number
 of outstanding buffers to detect the last one.
  -> Touching the is_cancelled_ field without holding a lock was hard to
 reason about - violated locking rules and it was unclear that it
 was race-free.

This will probably regress performance somewhat because of the cache
removal, so my plan is to merge it around the same time as switching
the I/O mgr to allocate from the buffer pool. I'm keeping the patches
separate to make reviewing easier.

Testing:
* Ran exhaustive tests
* Ran the disk-io-mgr-stress-test overnight

Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/test-env.cc
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
19 files changed, 552 insertions(+), 861 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8414/10
--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 10
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6206: Fix data load failure with -notests

2017-11-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8580 )

Change subject: IMPALA-6206: Fix data load failure with -notests
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8580/2/be/src/udf_samples/CMakeLists.txt
File be/src/udf_samples/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/8580/2/be/src/udf_samples/CMakeLists.txt@44
PS2, Line 44: # Data loading depends on building these targets
This feels a little weird since those targets aren't really part of the 
ImpalaUdf library (or in that directory). How about added a UdfSamples target 
and adding it to MAKE_TARGETS in bin/make_impala.sh.



--
To view, visit http://gerrit.cloudera.org:8080/8580
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idaa193989d77d56b72db05ad54e1fb0a345938fb
Gerrit-Change-Number: 8580
Gerrit-PatchSet: 2
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 17 Nov 2017 17:15:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4835 (prep only): create io subfolder and namespace

2017-11-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8424 )

Change subject: IMPALA-4835 (prep only): create io subfolder and namespace
..


Patch Set 9:

rebased and tweaked the #define guard to include _IO so that text scanner 
plugins like Impala-lzo can detect whether it is pre/post refactor.


--
To view, visit http://gerrit.cloudera.org:8080/8424
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b
Gerrit-Change-Number: 8424
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 17 Nov 2017 17:47:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835 (prep only): create io subfolder and namespace

2017-11-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8424 )

Change subject: IMPALA-4835 (prep only): create io subfolder and namespace
..


Patch Set 9: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8424
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b
Gerrit-Change-Number: 8424
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 17 Nov 2017 17:47:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6128: Spill-to-disk Encryption(AES-CFB + SHA256) is slow

2017-11-20 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8510 )

Change subject: IMPALA-6128: Spill-to-disk Encryption(AES-CFB + SHA256) is slow
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8510/7/be/src/util/openssl-util.cc
File be/src/util/openssl-util.cc:

http://gerrit.cloudera.org:8080/#/c/8510/7/be/src/util/openssl-util.cc@147
PS7, Line 147:   return (SSLeay() >= OPENSSL_VERSION_1_0_1);
When we re-do this patch we should also add a test option to force use of CFB 
mode and make sure that the unit test tests both encryption modes.



--
To view, visit http://gerrit.cloudera.org:8080/8510
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib97939f2334838263364b53ef3413871638bf53e
Gerrit-Change-Number: 8510
Gerrit-PatchSet: 7
Gerrit-Owner: Xianda Ke 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mike Yoder 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Xianda Ke 
Gerrit-Comment-Date: Mon, 20 Nov 2017 17:57:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6220: Revert "IMPALA-6128: Spill-to-disk Encryption(AES-CFB + SHA256) is slow"

2017-11-20 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8597 )

Change subject: IMPALA-6220: Revert "IMPALA-6128: Spill-to-disk 
Encryption(AES-CFB + SHA256) is slow"
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8597
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id31d5fcfec5c6d777d4acee5c1be2d4fc4605efb
Gerrit-Change-Number: 8597
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Xianda Ke 
Gerrit-Comment-Date: Mon, 20 Nov 2017 17:00:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1144: Fix exception when cancelling query in Impala-shell with CTRL-C

2017-11-20 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8549 )

Change subject: IMPALA-1144: Fix exception when cancelling query in 
Impala-shell with CTRL-C
..


Patch Set 5:

(10 comments)

Overall approach seems good to me, but had quite a few comments about comments 
and naming.

http://gerrit.cloudera.org:8080/#/c/8549/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8549/2//COMMIT_MSG@25
PS2, Line 25: Change-Id: I6cefaf1dae78baae238289816a7cb9d210fb38e2
> Thanks for the nice example, Phil!
It seems ok that it doesn't reproduce it every time - it's inherently racy. We 
could probably have a deterministic unit test by mocking out enough things, but 
I think I prefer the end-to-end shell test.

It would be good to also test on release and ASAN builds to make sure that it 
doesn't immediately fail when the daemon is slow or fast.


http://gerrit.cloudera.org:8080/#/c/8549/5/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/8549/5/shell/impala_client.py@58
PS5, Line 58: QueryCancelledException
Can we call this QueryCancelledByShellException or 
ExpectedQueryCancellationException or something like that to distinguish from 
the case when the query was cancelled unexpectedly (e.g. by an admin from the 
impala debug ui).


http://gerrit.cloudera.org:8080/#/c/8549/5/shell/impala_client.py@60
PS5, Line 60:   self.value = value
The value might not be necessary if we're always going to pass in the same 
string. The existing exceptions do this but seems like an odd pattern. We could 
just print that string instead of stringifying the exception.


http://gerrit.cloudera.org:8080/#/c/8549/5/shell/impala_client.py@83
PS5, Line 83: self.is_query_cancelled = False
Can you add a brief comment explaining how this works. I.e. that it's set to 
true by the cancellation signal handler and suppresses any errors after 
cancellation.


http://gerrit.cloudera.org:8080/#/c/8549/5/shell/impala_client.py@435
PS5, Line 435: except BeeswaxService.QueryNotFoundException:
It's weird that we don't need the logic for QueryNotFoundException. Does Impala 
even throw this? I'm ok with leaving this in since it's part of the beeswax 
protocol but we should also check for cancellation for consistency.


http://gerrit.cloudera.org:8080/#/c/8549/5/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/8549/5/shell/impala_shell.py@1026
PS5, Line 1026:   print_to_stderr(str(e))
It might be to just not print anything - we don't always throw this exception 
when the query is cancelled so the message only gets printed sometimes.


http://gerrit.cloudera.org:8080/#/c/8549/5/tests/shell/test_shell_commandline.py
File tests/shell/test_shell_commandline.py:

http://gerrit.cloudera.org:8080/#/c/8549/5/tests/shell/test_shell_commandline.py@333
PS5, Line 333: 'select * from v, v v2, v v3, v v4, v v5, v v6, v v7, v 
v8, v v9;"'
Maybe add a few more tables to the cross join to make sure that it runs for a 
very long time. We don't want a flaky test if it sometimes finishes quicker.


http://gerrit.cloudera.org:8080/#/c/8549/5/tests/shell/test_shell_commandline.py@342
PS5, Line 342:'tpch.customer c3 order by c1.c_name"'
You could also do something like:
  order by c1.c_name, sleep(1)

That would make it pause 1ms per row and slow it down further.


http://gerrit.cloudera.org:8080/#/c/8549/5/tests/shell/test_shell_commandline.py@346
PS5, Line 346: query_txt
Isn't this the command-line arguments?


http://gerrit.cloudera.org:8080/#/c/8549/5/tests/shell/test_shell_commandline.py@349
PS5, Line 349: sleep(3)
Can you add a comment to the function explaining that it waits 3 seconds before 
cancelling and expects the query to be cancelled.

Is there any reason why 3 seconds is the right amount of time? Currently it 
just seems like a magic number. Would be good to reduce it if possible, since 
it will make the tests very slow (other shell tests have this problem already).



--
To view, visit http://gerrit.cloudera.org:8080/8549
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6cefaf1dae78baae238289816a7cb9d210fb38e2
Gerrit-Change-Number: 8549
Gerrit-PatchSet: 5
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 

[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

2017-11-20 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8414 )

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
..


Patch Set 9:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/disk-io-mgr.h
File be/src/runtime/io/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/disk-io-mgr.h@a222
PS11, Line 222:
  :
  :
> did you mean to just delete that?
Done


http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/disk-io-mgr.h@457
PS11, Line 457: XCEED
> queued?
Done


http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/disk-io-mgr.cc@333
PS11, Line 333: CANCELLED
> seems like the right thing to do but are you sure we weren't depending on p
Before this patch the only place when RequestContext::Cancel() was called with 
a non-CANCELLED status was with MEM_LIMIT_EXCEEDED. This patch removes that 
call and the need for any asynchronous error propagation via RequestContext, so 
can't introduce any new bugs in this area.

Error propagation bugs are still possible in client code if they cancel the 
context before propagating the original error, but I believe the HDFS scan node 
code is written to avoid this.


http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/disk-io-mgr.cc@581
PS11, Line 581:   DCHECK(!buffer->is_cached()) << "HDFS cache reads don't go 
through this code path.";
> maybe put that dcheck upstream too (like ReadRange() or even earlier) to he
There is already a DCHECK in ReadRange() that catches this, but its 
non-obvious. I added a string to it to make it clearer what it's checking.


http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/disk-io-mgr.cc@665
PS11, Line 665: DCHECK
> nit: _EQ
The strong typed enums prevent that unfortunately. I did change this to print 
out the integer value on failure.


http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.h
File be/src/runtime/io/request-context.h:

http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.h@448
PS11, Line 448: again.
> disk threads?
Done


http://gerrit.cloudera.org:8080/#/c/8414/9/be/src/runtime/io/request-context.h
File be/src/runtime/io/request-context.h:

http://gerrit.cloudera.org:8080/#/c/8414/9/be/src/runtime/io/request-context.h@27
PS9, Line 27: For most I/O manager clients it is an
: /// opaque pointer, but some clients may need to include this 
heade
> is that still true?
Done


http://gerrit.cloudera.org:8080/#/c/8414/9/be/src/runtime/io/request-context.h@93
PS9, Line 93:   void Cancel();
> how did you decide to make this a first class thing in the RequestContext,
Yeah, I'd probably call it DiskIoMgrClient or something like that if I was 
writing this from scratch.

The ones I moved were the trivial methods that were only called during 
RequestContext set-up and tear-down. The other DiskIoMgr methods are called 
from more places and are more about the I/O operations than setting up the 
context.

It isn't really principled now that I think about it.


http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc
File be/src/runtime/io/request-context.cc:

http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc@42
PS11, Line 42: << static_cast(range->external_buffer_tag_);
> oh, I guess DCHECK_EQ doesn't work with enum or something?
Yeah, not with the strongly-typed "enum class" enums. It works with weakly 
typed enums because they get implicitly converted to integers.


http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc@75
PS11, Line 75: reader
> what about the writing case? is that relevant?
Yep, this needed updating.


http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc@76
PS11, Line 76: GetNext
> GetNext()
Done


http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc@79
PS11, Line 79: it
> does this referring to the context or something else?
Done


http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc@90
PS11, Line 90: cancelled Status
> does that mean the CANCELLED status or the status that lead to cancellation
The RequestContext can't be cancelled with an arbitrary status. If there was an 
earlier error for a ScanRange, the cancelled status doesn't overwrite the prior 
status.


http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc@91
PS11, Line 91:  the cancelled state.
> what does that mean?
reworded.


http://gerrit.cloudera.org:8080/#/c/8414/11/be/src/runtime/io/request-context.cc@95
PS11, Line 95: decrements the number of disk queues the context
 : // is on.
> which code is this talking about? Oh, I guess that's 

[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

2017-11-20 Thread Tim Armstrong (Code Review)
Hello Tianyi Wang, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8414

to look at the new patch set (#12).

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
..

IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

In preparation for switching the I/O mgr to the buffer pool, this
removes and cleans up a lot of code so that the switchover patch starts
from a cleaner slate.

* Remove the free buffer cache (which will be replaced by buffer pool's
  own caching).
* Make memory limit exceeded error checking synchronous (in anticipation
  of having to propagate buffer pool errors synchronously).
* Simplify error propagation - remove the (ineffectual) code that
  enqueued BufferDescriptors containing error statuses.
* Document locking scheme better in a few places, make it part of the
  function signature when it seemed reasonable.
* Move ReturnBuffer() to ScanRange, because it is intrinsically
  connected with the lifecycle of a scan range.
* Separate external ReturnBuffer() and internal CleanUpBuffer()
  interfaces - previously callers of ReturnBuffer() were fudging
  the num_buffers_in_reader accounting to make the external interface work.
* Eliminate redundant state in ScanRange: 'eosr_returned_' and
  'is_cancelled_'.
* Clarify the logic around calling Close() for the last
  BufferDescriptor.
  -> There appeared to be an implicit assumption that buffers would be
 freed in the order they were returned from the scan range, so that
 the "eos" buffer was returned last. Instead just count the number
 of outstanding buffers to detect the last one.
  -> Touching the is_cancelled_ field without holding a lock was hard to
 reason about - violated locking rules and it was unclear that it
 was race-free.
* Remove DiskIoMgr::Read() to simplify interface. It is trivial to
  inline at the callsites.

This will probably regress performance somewhat because of the cache
removal, so my plan is to merge it around the same time as switching
the I/O mgr to allocate from the buffer pool. I'm keeping the patches
separate to make reviewing easier.

Testing:
* Ran exhaustive tests
* Ran the disk-io-mgr-stress-test overnight

Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/test-env.cc
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
19 files changed, 575 insertions(+), 904 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8414/12
--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2181: Add query option levels for display

2017-11-20 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8447 )

Change subject: IMPALA-2181: Add query option levels for display
..


Patch Set 15:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/8447/14//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8447/14//COMMIT_MSG@10
PS14, Line 10: diplayed
displayed


http://gerrit.cloudera.org:8080/#/c/8447/14//COMMIT_MSG@12
PS14, Line 12: called SET ALL shows all the options grouped by their option 
levels.
Mention that there are also server-side versions of these commands?


http://gerrit.cloudera.org:8080/#/c/8447/14/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/8447/14/be/src/service/impala-server.cc@1236
PS14, Line 1236:   else {
Nit:
  } else {


http://gerrit.cloudera.org:8080/#/c/8447/15/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/8447/15/be/src/service/query-options.h@45
PS15, Line 45:   QUERY_OPT_FN(disable_cached_reads, DISABLE_CACHED_READS, 
TQueryOptionLevel::REGULAR)\
Looks like disable_cached_reads is deprecated: see IMPALA-2963 . I missed that 
on an earlier pass


http://gerrit.cloudera.org:8080/#/c/8447/15/be/src/service/query-options.h@89
PS15, Line 89:   QUERY_OPT_FN(enable_expr_rewrites, ENABLE_EXPR_REWRITES, 
TQueryOptionLevel::REGULAR)\
I think enable_expr_rewrites should be advanced. It's only disabled as a 
workaround to planner bugs.


http://gerrit.cloudera.org:8080/#/c/8447/15/be/src/service/query-options.h@91
PS15, Line 91:   QUERY_OPT_FN(parquet_dictionary_filtering, 
PARQUET_DICTIONARY_FILTERING, TQueryOptionLevel::REGULAR)\
parquet_dictionary_filtering should probably be advanced, there's typically not 
much reason to disable it unless working around a bug.


http://gerrit.cloudera.org:8080/#/c/8447/15/be/src/service/query-options.h@93
PS15, Line 93:   QUERY_OPT_FN(parquet_read_statistics, PARQUET_READ_STATISTICS, 
TQueryOptionLevel::REGULAR)\
Same for parquet_read_statistics.


http://gerrit.cloudera.org:8080/#/c/8447/14/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/8447/14/be/src/service/query-options.h@39
PS14, Line 39:   QUERY_OPT_FN(abort_on_default_limit_exceeded, 
ABORT_ON_DEFAULT_LIMIT_EXCEEDED, TQueryOptionLevel::DEPRECATED)\
nit: can we wrap these to 90 chars?


http://gerrit.cloudera.org:8080/#/c/8447/14/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/8447/14/be/src/service/query-options.cc@41
PS14, Line 41: using namespace beeswax;
nit: probably best to individually import the names so it's easier to figure 
out what got imported.


http://gerrit.cloudera.org:8080/#/c/8447/14/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/8447/14/common/thrift/ImpalaService.thrift@294
PS14, Line 294: typedef map 
TQueryOptionLevels
Does this need to be a typedef here? I don't see any references in any thrift 
files. Maybe we can just use the map type directly where its needed in the 
backend


http://gerrit.cloudera.org:8080/#/c/8447/14/common/thrift/beeswax.thrift
File common/thrift/beeswax.thrift:

http://gerrit.cloudera.org:8080/#/c/8447/14/common/thrift/beeswax.thrift@111
PS14, Line 111:   4: optional TQueryOptionLevel level
Comment that this was added for impala-shell?


http://gerrit.cloudera.org:8080/#/c/8447/14/fe/src/main/java/org/apache/impala/analysis/SetStmt.java
File fe/src/main/java/org/apache/impala/analysis/SetStmt.java:

http://gerrit.cloudera.org:8080/#/c/8447/14/fe/src/main/java/org/apache/impala/analysis/SetStmt.java@55
PS14, Line 55:   if (isSetAll_) {
Nit: could be one line
 if (isSetAll_) return ...;


http://gerrit.cloudera.org:8080/#/c/8447/14/fe/src/main/java/org/apache/impala/analysis/SetStmt.java@76
PS14, Line 76:   request.setIs_set_all(true);
nit: could be one line


http://gerrit.cloudera.org:8080/#/c/8447/14/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/8447/14/shell/impala_client.py@97
PS14, Line 97:   self.query_option_levels[option.key.upper()] = option.level
Does this still work if we're talking to an older Impala server that doesn't 
return the level?

Update: looks like it doesn't:
  Traceback (most recent call last):
  File "/home/tarmstrong/Impala/incubator-impala/shell/impala_shell.py", line 15
27, in 
shell = ImpalaShell(options, query_options)
  File "/home/tarmstrong/Impala/incubator-impala/shell/impala_shell.py", line 20
5, in __init__
self.do_connect(options.impalad)
  File "/home/tarmstrong/Impala/incubator-impala/shell/impala_shell.py", line 73
1, in do_connect
self.imp_client.build_default_query_options_dict()
  File 

[Impala-ASF-CR] IMPALA-6217: fix DCHECK in Parquet fuzz test

2017-11-18 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/8594 )

Change subject: IMPALA-6217: fix DCHECK in Parquet fuzz test
..

IMPALA-6217: fix DCHECK in Parquet fuzz test

The IMPALA-4177 change accidentally removed a Status check that could be
hit with a corrupt parquet file.

Change-Id: I6ceca7de31f602b75d744dacbdf37afa75983344
---
M be/src/exec/parquet-column-readers.cc
1 file changed, 1 insertion(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/8594/2
--
To view, visit http://gerrit.cloudera.org:8080/8594
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6ceca7de31f602b75d744dacbdf37afa75983344
Gerrit-Change-Number: 8594
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6217: fix DCHECK in Parquet fuzz test

2017-11-19 Thread Tim Armstrong (Code Review)
Hello Lars Volker,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8594

to look at the new patch set (#3).

Change subject: IMPALA-6217: fix DCHECK in Parquet fuzz test
..

IMPALA-6217: fix DCHECK in Parquet fuzz test

The IMPALA-4177 change accidentally removed a Status check that could be
hit with a corrupt parquet file.

Testing:
Ran TestScannersFuzzing in a loop for 2 days.

Change-Id: I6ceca7de31f602b75d744dacbdf37afa75983344
---
M be/src/exec/parquet-column-readers.cc
1 file changed, 1 insertion(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/8594/3
--
To view, visit http://gerrit.cloudera.org:8080/8594
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6ceca7de31f602b75d744dacbdf37afa75983344
Gerrit-Change-Number: 8594
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4132: Use -fno-omit-frame-pointer

2017-11-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8612 )

Change subject: IMPALA-4132: Use -fno-omit-frame-pointer
..


Patch Set 1: Code-Review+1

(1 comment)

The perf results looked acceptable to me - there didn't seem to be any 
significant regressions.

http://gerrit.cloudera.org:8080/#/c/8612/1/be/CMakeLists.txt
File be/CMakeLists.txt:

http://gerrit.cloudera.org:8080/#/c/8612/1/be/CMakeLists.txt@42
PS1, Line 42: pointers
nit: extra s



--
To view, visit http://gerrit.cloudera.org:8080/8612
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib7d0d88ba015a847356ed0274fe91017b98cb941
Gerrit-Change-Number: 8612
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 21 Nov 2017 18:38:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] [DOCS] Correct bit patterns in comments for shiftright() examples

2017-11-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8624 )

Change subject: [DOCS] Correct bit patterns in comments for shiftright() 
examples
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8624
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0d1bc00e4c8d1ae3352e18a8df95745d9c9b5ea6
Gerrit-Change-Number: 8624
Gerrit-PatchSet: 1
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 21 Nov 2017 20:02:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4964: Fix Decimal modulo overflow

2017-11-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8329 )

Change subject: IMPALA-4964: Fix Decimal modulo overflow
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8329
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5420201d4440d421e33e443df005cdcc16b8a6cd
Gerrit-Change-Number: 8329
Gerrit-PatchSet: 4
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Wed, 22 Nov 2017 00:44:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835 (prep only): create io subfolder and namespace

2017-11-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8424 )

Change subject: IMPALA-4835 (prep only): create io subfolder and namespace
..


Patch Set 8:

rebased and resolved conflicts with other patches


--
To view, visit http://gerrit.cloudera.org:8080/8424
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b
Gerrit-Change-Number: 8424
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 16 Nov 2017 18:46:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

2017-11-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8414 )

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
..


Patch Set 8:

Ended up doing some further changes to make the DecrementRequestThread*(), 
locking and error propagation a bit more comprehensible.


--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 16 Nov 2017 19:08:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6080: clean up table descriptor handling

2017-11-16 Thread Tim Armstrong (Code Review)
Hello Tianyi Wang, anujphadke, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8330

to look at the new patch set (#3).

Change subject: IMPALA-6080: clean up table descriptor handling
..

IMPALA-6080: clean up table descriptor handling

* Add DescriptorTbl::CreateHdfsTableDescriptor to avoid having to
   create an entire DescriptorTbl during INSERT finalization (when only
   a descriptor for the output table is needed)
* Remove TQueryExecRequest.desc_tbl, there's already a home for it in
   TQueryContext.desc_tbl

This required fixing a problem in the planner test infrastructure
where the TQueryCtx was reused for planning multiple times despite
being modified during planning.

This is based on Marcel Kornacker's coordinator cleanup
patch.

Testing:
Ran core tests.

Change-Id: Id427dab0c196b556bd8b2d64ec618403d5cbd4d6
---
M be/src/runtime/coordinator.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
6 files changed, 116 insertions(+), 110 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/8330/3
--
To view, visit http://gerrit.cloudera.org:8080/8330
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id427dab0c196b556bd8b2d64ec618403d5cbd4d6
Gerrit-Change-Number: 8330
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-6080: clean up table descriptor handling

2017-11-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8330 )

Change subject: IMPALA-6080: clean up table descriptor handling
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8330/2/be/src/benchmarks/expr-benchmark.cc
File be/src/benchmarks/expr-benchmark.cc:

http://gerrit.cloudera.org:8080/#/c/8330/2/be/src/benchmarks/expr-benchmark.cc@57
PS2, Line 57: #include "service/frontend.h"
> I don't think this include is needed.
Done


http://gerrit.cloudera.org:8080/#/c/8330/2/be/src/runtime/descriptors.h
File be/src/runtime/descriptors.h:

http://gerrit.cloudera.org:8080/#/c/8330/2/be/src/runtime/descriptors.h@79
PS2, Line 79: struct LlvmTupleStruct {
> Is this struct used?
Done



--
To view, visit http://gerrit.cloudera.org:8080/8330
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id427dab0c196b556bd8b2d64ec618403d5cbd4d6
Gerrit-Change-Number: 8330
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-Comment-Date: Thu, 16 Nov 2017 18:20:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding

2017-11-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8267 )

Change subject: IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding
..


Patch Set 15: Code-Review+2

Rebased


--
To view, visit http://gerrit.cloudera.org:8080/8267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6
Gerrit-Change-Number: 8267
Gerrit-PatchSet: 15
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 16 Nov 2017 17:52:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835 (prep only): create io subfolder and namespace

2017-11-16 Thread Tim Armstrong (Code Review)
Hello Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8424

to look at the new patch set (#8).

Change subject: IMPALA-4835 (prep only): create io subfolder and namespace
..

IMPALA-4835 (prep only): create io subfolder and namespace

Instead of using the DiskIoMgr class as a namespace, which prevents
forward-declaration of inner classes, create an impala::io namespace
and unnested the inner class.

This is done in anticipation of DiskIoMgr depending on BufferPool. This
helps avoid a circular dependency between DiskIoMgr, TmpFileMgr and
BufferPool headers that could not be broken with forward declarations.

Testing:
Ran core tests.

Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b
---
M be/CMakeLists.txt
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/kudu-scan-node.cc
M be/src/exec/scanner-context.cc
M be/src/exec/scanner-context.h
M be/src/runtime/CMakeLists.txt
D be/src/runtime/disk-io-mgr.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
A be/src/runtime/io/CMakeLists.txt
R be/src/runtime/io/disk-io-mgr-internal.h
R be/src/runtime/io/disk-io-mgr-stress-test.cc
R be/src/runtime/io/disk-io-mgr-stress.cc
R be/src/runtime/io/disk-io-mgr-stress.h
R be/src/runtime/io/disk-io-mgr-test.cc
R be/src/runtime/io/disk-io-mgr.cc
A be/src/runtime/io/disk-io-mgr.h
R be/src/runtime/io/handle-cache.h
R be/src/runtime/io/handle-cache.inline.h
R be/src/runtime/io/request-context.cc
R be/src/runtime/io/request-context.h
A be/src/runtime/io/request-ranges.h
R be/src/runtime/io/scan-range.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
38 files changed, 1,416 insertions(+), 1,310 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/8424/8
-- 
To view, visit http://gerrit.cloudera.org:8080/8424
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b
Gerrit-Change-Number: 8424
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

2017-11-16 Thread Tim Armstrong (Code Review)
Hello Tianyi Wang, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8414

to look at the new patch set (#8).

Change subject: IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation
..

IMPALA-4835: Part 1: simplify I/O mgr mem mgmt and cancellation

In preparation for switching the I/O mgr to the buffer pool, this
removes and cleans up a lot of code so that the switchover patch starts
from a cleaner slate.

* Remove the free buffer cache (which will be replaced by buffer pool's
  own caching).
* Make memory limit exceeded error checking synchronous (in anticipation
  of having to propagate buffer pool errors synchronously).
* Simplify error propagation - remove the (ineffectual) code that
  enqueued BufferDescriptors containing error statuses.
* Document locking scheme better in a few places, make it part of the
  function signature when it seemed reasonable.
* Move ReturnBuffer() to ScanRange, because it is intrinsically
  connected with the lifecycle of a scan range.
* Separate external ReturnBuffer() and internal CleanUpBuffer()
  interfaces - previously callers of ReturnBuffer() were fudging
  the num_buffers_in_reader accounting to make the external interface work.
* Eliminate redundant state in ScanRange: 'eosr_returned_' and
  'is_cancelled_'.
* Clarify the logic around calling Close() for the last
  BufferDescriptor.
  -> There appeared to be an implicit assumption that buffers would be
 freed in the order they were returned from the scan range, so that
 the "eos" buffer was returned last. Instead just count the number
 of outstanding buffers to detect the last one.
  -> Touching the is_cancelled_ field without holding a lock was hard to
 reason about - violated locking rules and it was unclear that it
 was race-free.

This will probably regress performance somewhat because of the cache
removal, so my plan is to merge it around the same time as switching
the I/O mgr to allocate from the buffer pool. I'm keeping the patches
separate to make reviewing easier.

Testing:
* Ran exhaustive tests
* Ran the disk-io-mgr-stress-test overnight

Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/test-env.cc
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
19 files changed, 537 insertions(+), 847 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/8414/8
--
To view, visit http://gerrit.cloudera.org:8080/8414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If5cb42437d11c13bc4a55c3ab426b66777332bd1
Gerrit-Change-Number: 8414
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

2017-11-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
..


Patch Set 4: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/8309/3/be/src/runtime/decimal-value.inline.h@243
PS3, Line 243: // of the numbers can be zero and one must be positive and the 
other one negative.
> The way it is currently implemented, I don't see an advantage in doing that
I looked at it again and it seems fine as is.


http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h
File be/src/runtime/decimal-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/8309/2/be/src/runtime/decimal-value.inline.h@310
PS2, Line 310: DCHECK(!*overflow) << "Cannot overflow unless result is 
Decimal16Value";
> Modified the comment of AdjustToSameScale() slightly. I was  debating chang
wfm



-- 
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 4
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Thu, 16 Nov 2017 23:39:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5019: Decimal V2 addition

2017-11-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8309 )

Change subject: IMPALA-5019: Decimal V2 addition
..


Patch Set 4:

I'm happy with this - did anyone else want to take a pass over it?


--
To view, visit http://gerrit.cloudera.org:8080/8309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I401049c56d910eb1546a178c909c923b01239336
Gerrit-Change-Number: 8309
Gerrit-PatchSet: 4
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Thu, 16 Nov 2017 23:40:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1474: Add a metric for running queries

2017-11-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/7228 )

Change subject: IMPALA-1474: Add a metric for running queries
..


Patch Set 1:

I think it probably does. There's some miscellaneous cleanup in this patch but 
I think they implement the same functionality, more or less.


--
To view, visit http://gerrit.cloudera.org:8080/7228
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41bc88d0e91427ca2fd70136dc6e06fa7a2663da
Gerrit-Change-Number: 7228
Gerrit-PatchSet: 1
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 16 Nov 2017 21:33:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6184: Clean up after ScalarExprEvaluator::Clone() fails

2017-11-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8572 )

Change subject: IMPALA-6184: Clean up after ScalarExprEvaluator::Clone() fails
..


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8572/1/testdata/workloads/functional-query/queries/QueryTest/udf-errors.test
File testdata/workloads/functional-query/queries/QueryTest/udf-errors.test:

http://gerrit.cloudera.org:8080/#/c/8572/1/testdata/workloads/functional-query/queries/QueryTest/udf-errors.test@85
PS1, Line 85: on (bad_expr2(rand()) = (t2.bool_col && t1.bool_col));
> PS2 has been updated to fail more deterministically. rand() with the same s
Yeah I wasn't sure if it could behave differently if the scan ranges were 
scheduled differently or something.



--
To view, visit http://gerrit.cloudera.org:8080/8572
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I45ffd722d0a69ad05ae3c748cf504c7f1a959a1d
Gerrit-Change-Number: 8572
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 16 Nov 2017 21:19:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resources

2017-11-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8581 )

Change subject: IMPALA-1575: part 2: yield admission control resources
..


Patch Set 1:

This is essentially the same patch as before, except with the statestore update 
frequency in the test tweaked to avoid the flaky timeout problem we were 
hitting on release builds.

The problem is that queries were sometimes in the queue long enough (60s) to 
hit the default admission control timeout. The reason it was so slow is that 
the test frequently waits for 10 statestore updates or 5 seconds.


--
To view, visit http://gerrit.cloudera.org:8080/8581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib1fae8dc1c4b0eca7bfa8fadae4a56ef2b37947a
Gerrit-Change-Number: 8581
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 17 Nov 2017 00:57:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6128: Spill-to-disk Encryption(AES-CFB + SHA256) is slow

2017-11-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8510 )

Change subject: IMPALA-6128: Spill-to-disk Encryption(AES-CFB + SHA256) is slow
..


Patch Set 4: Code-Review+1

Change makes sense to me. I don't know the story with what OpenSSL versions we 
have to interoperate with, so I'll let Sailesh weigh in on that.


--
To view, visit http://gerrit.cloudera.org:8080/8510
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib97939f2334838263364b53ef3413871638bf53e
Gerrit-Change-Number: 8510
Gerrit-PatchSet: 4
Gerrit-Owner: Xianda Ke 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Mike Yoder 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Xianda Ke 
Gerrit-Comment-Date: Thu, 16 Nov 2017 23:55:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6084: Avoid using of global namespace for llvm

2017-11-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8489 )

Change subject: IMPALA-6084: Avoid using of global namespace for llvm
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8489/5/be/src/exprs/scalar-expr.cc
File be/src/exprs/scalar-expr.cc:

http://gerrit.cloudera.org:8080/#/c/8489/5/be/src/exprs/scalar-expr.cc@260
PS5, Line 260: // Compute the alignment of this value. Values should be 
self-aligned for
> Thanks for the fix. The weird formatting comes from my vi setting. Anyway,
No problem, it's easy to miss in a large patch.



--
To view, visit http://gerrit.cloudera.org:8080/8489
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7098baff5f746bcc2965583ca99f6188997b733a
Gerrit-Change-Number: 8489
Gerrit-PatchSet: 5
Gerrit-Owner: Kim Jin Chul 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Kim Jin Chul 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 16 Nov 2017 00:19:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6206: Fix data load failure with -notests

2017-11-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8580 )

Change subject: IMPALA-6206: Fix data load failure with -notests
..


Patch Set 3: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/8580
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idaa193989d77d56b72db05ad54e1fb0a345938fb
Gerrit-Change-Number: 8580
Gerrit-PatchSet: 3
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Fri, 17 Nov 2017 23:46:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6292: Fix incorrect DCHECK in decimal subtraction

2017-12-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8796 )

Change subject: IMPALA-6292: Fix incorrect DCHECK in decimal subtraction
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I42d42ad85efe32b7a0db0d2353385939fee09934
Gerrit-Change-Number: 8796
Gerrit-PatchSet: 1
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 08 Dec 2017 01:23:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6285: Don't print stack trace on RPC errors.

2017-12-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8788 )

Change subject: IMPALA-6285: Don't print stack trace on RPC errors.
..


Patch Set 2: Code-Review+2

(2 comments)

I'm assuming Sailesh is happy with this based on his previous comment.

http://gerrit.cloudera.org:8080/#/c/8788/2/be/src/runtime/client-cache.h
File be/src/runtime/client-cache.h:

http://gerrit.cloudera.org:8080/#/c/8788/2/be/src/runtime/client-cache.h@314
PS2, Line 314: return Status::Expected(msg);
This idiom is a bit inefficient - i think it results in three copies of the 
string - first to construct it, then to copy it into the ErrorMsg, then to copy 
it into a new ErrorMsg in the status. It probably doesn't really matter though.


http://gerrit.cloudera.org:8080/#/c/8788/2/be/src/runtime/client-cache.h@329
PS2, Line 329:   VLOG_QUERY << msg.msg() << " rpc: " << 
typeid(*response).name();
Should we also include the typeid() in the Status? Or is thta overkill?



--
To view, visit http://gerrit.cloudera.org:8080/8788
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia83294494442ef21f7934f92ba9112e80d81fa58
Gerrit-Change-Number: 8788
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 07 Dec 2017 17:59:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder

2017-12-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8034 )

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
..


Patch Set 18: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-Change-Number: 8034
Gerrit-PatchSet: 18
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 08 Dec 2017 00:57:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6265 Query cancellation test enhancements

2017-12-05 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8713 )

Change subject: IMPALA-6265 Query cancellation test enhancements
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8713
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie0bff485a872df7be8efd784314a6ca91aaadd11
Gerrit-Change-Number: 8713
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 05 Dec 2017 18:08:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6265 Query cancellation test enhancements

2017-12-05 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8713 )

Change subject: IMPALA-6265 Query cancellation test enhancements
..


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8713/2/tests/shell/test_shell_commandline.py
File tests/shell/test_shell_commandline.py:

http://gerrit.cloudera.org:8080/#/c/8713/2/tests/shell/test_shell_commandline.py@385
PS2, Line 385: test_get_log_once(self,
> I'm not sure I get this comment :)
Yep, that looks better.



--
To view, visit http://gerrit.cloudera.org:8080/8713
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie0bff485a872df7be8efd784314a6ca91aaadd11
Gerrit-Change-Number: 8713
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 05 Dec 2017 18:08:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6268: KerberosOnAndOff/RpcMgrKerberizedTest.MultipleServices failing

2017-12-05 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8766 )

Change subject: IMPALA-6268: 
KerberosOnAndOff/RpcMgrKerberizedTest.MultipleServices failing
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8766
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6c750850ff916617a06e3cfac330072d8e2179e8
Gerrit-Change-Number: 8766
Gerrit-PatchSet: 1
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 05 Dec 2017 19:25:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3804: Re-enable per-scan filtering for sequence-based scanners

2017-12-05 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8684 )

Change subject: IMPALA-3804: Re-enable per-scan filtering for sequence-based 
scanners
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8684
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4b38c26bcbe67f83efcc65a1723d766626ae3d3e
Gerrit-Change-Number: 8684
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Tue, 05 Dec 2017 18:13:46 +
Gerrit-HasComments: No


  1   2   3   4   5   6   7   8   9   10   >