[native-toolchain-CR] Remove CYRUS SASL

2020-03-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15466 )

Change subject: Remove CYRUS_SASL
..

Remove CYRUS_SASL

CYRUS_SASL requires libdb4 which is not available in rhel8.

Change-Id: I7f043c0c32dd26f3b4b7d7b16749ce310860d9c2
Reviewed-on: http://gerrit.cloudera.org:8080/15466
Reviewed-by: Tim Armstrong 
Tested-by: Tim Armstrong 
---
M buildall.sh
1 file changed, 0 insertions(+), 9 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/15466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I7f043c0c32dd26f3b4b7d7b16749ce310860d9c2
Gerrit-Change-Number: 15466
Gerrit-PatchSet: 3
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Tim Armstrong 


[native-toolchain-CR] Fix bison compilation with glibc 2.28

2020-03-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15467 )

Change subject: Fix bison compilation with glibc 2.28
..

Fix bison compilation with glibc 2.28

Change-Id: Ie07da9fcebde4ae5003885f442d8856537f96f3a
Reviewed-on: http://gerrit.cloudera.org:8080/15467
Reviewed-by: Tim Armstrong 
Tested-by: Tim Armstrong 
---
M .gitignore
M buildall.sh
A source/bison/bison-3.0.4-patches/110-glibc-change-work-around.patch
3 files changed, 35 insertions(+), 3 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/15467
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ie07da9fcebde4ae5003885f442d8856537f96f3a
Gerrit-Change-Number: 15467
Gerrit-PatchSet: 3
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Tim Armstrong 


[native-toolchain-CR] Add support for centos8

2020-03-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15465 )

Change subject: Add support for centos8
..

Add support for centos8

This commit adds a new centos8 docker image. Most of this is pretty 
straightforward with
the exception of having to explicitly set our default python. This needs to 
happen
early in the postinstall process since other tools (aws cli) depend on it.

Change-Id: Idc15c202f61e251761fd0b1dc9aa0b15c27b3363
Reviewed-on: http://gerrit.cloudera.org:8080/15465
Reviewed-by: Tim Armstrong 
Reviewed-by: Laszlo Gaal 
Tested-by: Tim Armstrong 
---
M docker/all/assert-dependencies-present.py
M docker/all/postinstall.sh
A docker/redhat8.df
M in-docker.py
4 files changed, 66 insertions(+), 3 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, but someone else must approve; Verified
  Laszlo Gaal: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/15465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Idc15c202f61e251761fd0b1dc9aa0b15c27b3363
Gerrit-Change-Number: 15465
Gerrit-PatchSet: 3
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Hector Acosta 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Tim Armstrong 


[native-toolchain-CR] Remove centos6, ubuntu12, ubuntu14 platforms

2020-03-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15464 )

Change subject: Remove centos6, ubuntu12, ubuntu14 platforms
..

Remove centos6, ubuntu12, ubuntu14 platforms

This commit removes unused platform redhat6, and EOL platforms ubuntu12 and 
ubuntu14.

Change-Id: Icef9293fc528bce3d60956cf3b879cf71e933403
Reviewed-on: http://gerrit.cloudera.org:8080/15464
Reviewed-by: Tim Armstrong 
Tested-by: Tim Armstrong 
---
M Makefile
D docker/redhat6.df
D docker/ubuntu1404.df
M in-docker.py
4 files changed, 0 insertions(+), 111 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/15464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Icef9293fc528bce3d60956cf3b879cf71e933403
Gerrit-Change-Number: 15464
Gerrit-PatchSet: 3
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Tim Armstrong 


[native-toolchain-CR] Compile thrift using the python in our toolchain.

2020-03-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15468 )

Change subject: Compile thrift using the python in our toolchain.
..

Compile thrift using the python in our toolchain.

Change-Id: Iec332462ba7f9eaa699247f546d2b6ba1faabd60
Reviewed-on: http://gerrit.cloudera.org:8080/15468
Reviewed-by: Tim Armstrong 
Tested-by: Tim Armstrong 
---
M buildall.sh
M source/python/build.sh
M source/thrift/build.sh
3 files changed, 10 insertions(+), 3 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/15468
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iec332462ba7f9eaa699247f546d2b6ba1faabd60
Gerrit-Change-Number: 15468
Gerrit-PatchSet: 4
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Tim Armstrong 


[native-toolchain-CR] Remove centos6, ubuntu12, ubuntu14 platforms

2020-03-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15464 )

Change subject: Remove centos6, ubuntu12, ubuntu14 platforms
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/15464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icef9293fc528bce3d60956cf3b879cf71e933403
Gerrit-Change-Number: 15464
Gerrit-PatchSet: 2
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Mar 2020 05:22:43 +
Gerrit-HasComments: No


[native-toolchain-CR] Compile thrift using the python in our toolchain.

2020-03-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15468 )

Change subject: Compile thrift using the python in our toolchain.
..


Patch Set 3: Verified+1

I did a successful build for all non-RHEL8 platforms with build ID 
178-a93f184dd5 (I ran into some infra issues for RHEL8). So this shouldn't 
regress things at the least. We can get a RHEL8 build later and confirm.


--
To view, visit http://gerrit.cloudera.org:8080/15468
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec332462ba7f9eaa699247f546d2b6ba1faabd60
Gerrit-Change-Number: 15468
Gerrit-PatchSet: 3
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Mar 2020 05:22:30 +
Gerrit-HasComments: No


[native-toolchain-CR] Remove CYRUS SASL

2020-03-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15466 )

Change subject: Remove CYRUS_SASL
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/15466
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f043c0c32dd26f3b4b7d7b16749ce310860d9c2
Gerrit-Change-Number: 15466
Gerrit-PatchSet: 2
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Mar 2020 05:22:39 +
Gerrit-HasComments: No


[native-toolchain-CR] Add support for centos8

2020-03-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15465 )

Change subject: Add support for centos8
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/15465
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idc15c202f61e251761fd0b1dc9aa0b15c27b3363
Gerrit-Change-Number: 15465
Gerrit-PatchSet: 2
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Hector Acosta 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Mar 2020 05:22:41 +
Gerrit-HasComments: No


[native-toolchain-CR] Fix bison compilation with glibc 2.28

2020-03-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15467 )

Change subject: Fix bison compilation with glibc 2.28
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/15467
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie07da9fcebde4ae5003885f442d8856537f96f3a
Gerrit-Change-Number: 15467
Gerrit-PatchSet: 2
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Mar 2020 05:22:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9549: Handle catalogd startup delays when using local catalog

2020-03-25 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15561 )

Change subject: IMPALA-9549: Handle catalogd startup delays when using local 
catalog
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/15561/1/tests/custom_cluster/test_catalog_wait.py
File tests/custom_cluster/test_catalog_wait.py:

http://gerrit.cloudera.org:8080/#/c/15561/1/tests/custom_cluster/test_catalog_wait.py@86
PS1, Line 86: @SkipIfBuildType.not_dev_build
> flake8: E302 expected 2 blank lines, found 1
Done


http://gerrit.cloudera.org:8080/#/c/15561/1/tests/custom_cluster/test_catalog_wait.py@106
PS1, Line 106: ;
> flake8: E703 statement ends with a semicolon
Done


http://gerrit.cloudera.org:8080/#/c/15561/1/tests/custom_cluster/test_catalog_wait.py@116
PS1, Line 116: ;
> flake8: E703 statement ends with a semicolon
Done


http://gerrit.cloudera.org:8080/#/c/15561/1/tests/custom_cluster/test_catalog_wait.py@128
PS1, Line 128: ;
> flake8: E703 statement ends with a semicolon
Done



--
To view, visit http://gerrit.cloudera.org:8080/15561
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1b5a94c59f25927a169dcb58f310ce6b1044
Gerrit-Change-Number: 15561
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Thu, 26 Mar 2020 04:02:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9549: Handle catalogd startup delays when using local catalog

2020-03-25 Thread Joe McDonnell (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15561

to look at the new patch set (#2).

Change subject: IMPALA-9549: Handle catalogd startup delays when using local 
catalog
..

IMPALA-9549: Handle catalogd startup delays when using local catalog

Impalads should be tolerant of delays in catalogd startup.
Currently, when running with the local catalog
(use_local_catalog=true), impalad startup can fail when catalogd
startup is delayed. What happens is that ImpalaServer's constructor
calls ImpalaServer::UpdateCatalogMetrics(), which maintains two
metrics counting the number of tables and databases. This is before
the code in ImpalaServer::Start() that waits for the catalogd to
start (added by IMPALA-4704), so there is no guarantee that catalogd
is running. The UpdateCatalogMetrics() call ends up calling getDbs()
in the frontend catalog. LocalCatalog::getDbs() tries to load the
databases (and thus contact catalogd), and this call will fail if
catalogd is not running. This fails startup.

use_local_catalog=false is immune to this only because it does not
contact catalogd in Catalog::getDbs().

This moves the UpdateCatalogMetrics() call from the ImpalaServer
constructor to ImpalaServer::Start() after the impalad has already
waited for the catalogd to start up. It also limits the call to
run only in coordinators.

Testing:
 - Added a test to custom_cluster.test_catalog_wait to delay catalogd
   start up by 60 seconds and verify that the impalads successfully
   start up. This test fails prior to this change.
 - Hand tested to verify that the metrics that are maintained by
   UpdateCatalogMetrics() are not meaningfully changed. They are not.
   Running coordinators and executors have the right count after they
   successfully start up.

Change-Id: I1b5a94c59f25927a169dcb58f310ce6b1044
---
M be/src/catalog/catalog-server.cc
M be/src/common/global-flags.cc
M be/src/service/impala-server.cc
M tests/custom_cluster/test_catalog_wait.py
4 files changed, 65 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/15561/2
--
To view, visit http://gerrit.cloudera.org:8080/15561
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1b5a94c59f25927a169dcb58f310ce6b1044
Gerrit-Change-Number: 15561
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9549: Handle catalogd startup delays when using local catalog

2020-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15561 )

Change subject: IMPALA-9549: Handle catalogd startup delays when using local 
catalog
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/15561/1/tests/custom_cluster/test_catalog_wait.py
File tests/custom_cluster/test_catalog_wait.py:

http://gerrit.cloudera.org:8080/#/c/15561/1/tests/custom_cluster/test_catalog_wait.py@86
PS1, Line 86: @SkipIfBuildType.not_dev_build
flake8: E302 expected 2 blank lines, found 1


http://gerrit.cloudera.org:8080/#/c/15561/1/tests/custom_cluster/test_catalog_wait.py@106
PS1, Line 106: ;
flake8: E703 statement ends with a semicolon


http://gerrit.cloudera.org:8080/#/c/15561/1/tests/custom_cluster/test_catalog_wait.py@116
PS1, Line 116: ;
flake8: E703 statement ends with a semicolon


http://gerrit.cloudera.org:8080/#/c/15561/1/tests/custom_cluster/test_catalog_wait.py@128
PS1, Line 128: ;
flake8: E703 statement ends with a semicolon



--
To view, visit http://gerrit.cloudera.org:8080/15561
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1b5a94c59f25927a169dcb58f310ce6b1044
Gerrit-Change-Number: 15561
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 26 Mar 2020 03:59:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9549: Handle catalogd startup delays when using local catalog

2020-03-25 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/15561


Change subject: IMPALA-9549: Handle catalogd startup delays when using local 
catalog
..

IMPALA-9549: Handle catalogd startup delays when using local catalog

Impalads should be tolerant of delays in catalogd startup.
Currently, when running with the local catalog
(use_local_catalog=true), impalad startup can fail when catalogd
startup is delayed. What happens is that ImpalaServer's constructor
calls ImpalaServer::UpdateCatalogMetrics(), which maintains two
metrics counting the number of tables and databases. This is before
the code in ImpalaServer::Start() that waits for the catalogd to
start (added by IMPALA-4704), so there is no guarantee that catalogd
is running. The UpdateCatalogMetrics() call ends up calling getDbs()
in the frontend catalog. LocalCatalog::getDbs() tries to load the
databases (and thus contact catalogd), and this call will fail if
catalogd is not running. This fails startup.

use_local_catalog=false is immune to this only because it does not
contact catalogd in Catalog::getDbs().

This moves the UpdateCatalogMetrics() call from the ImpalaServer
constructor to ImpalaServer::Start() after the impalad has already
waited for the catalogd to start up. It also limits the call to
run only in coordinators.

Testing:
 - Added a test to custom_cluster.test_catalog_wait to delay catalogd
   start up by 60 seconds and verify that the impalads successfully
   start up. This test fails prior to this change.
 - Hand tested to verify that the metrics that are maintained by
   UpdateCatalogMetrics() are not meaningfully changed. They are not.
   Running coordinators and executors have the right count after they
   successfully start up.

Change-Id: I1b5a94c59f25927a169dcb58f310ce6b1044
---
M be/src/catalog/catalog-server.cc
M be/src/common/global-flags.cc
M be/src/service/impala-server.cc
M tests/custom_cluster/test_catalog_wait.py
4 files changed, 64 insertions(+), 5 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/15561/1
--
To view, visit http://gerrit.cloudera.org:8080/15561
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I1b5a94c59f25927a169dcb58f310ce6b1044
Gerrit-Change-Number: 15561
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-3766: optionally compress spilled data

2020-03-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15454 )

Change subject: IMPALA-3766: optionally compress spilled data
..


Patch Set 14:

(8 comments)

Thanks for the feedback, sorry to push out a rebase, I hope it isn't too bad 
looking at the PS13->14 diff

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/bufferpool/buffer-pool-counters.h
File be/src/runtime/bufferpool/buffer-pool-counters.h:

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/bufferpool/buffer-pool-counters.h@55
PS11, Line 55: g for writes to di
> nit: Total bytes written to disk. (May be compressed)
thanks, that's less ambiguous


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/bufferpool/buffer-pool.cc
File be/src/runtime/bufferpool/buffer-pool.cc:

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/bufferpool/buffer-pool.cc@736
PS11, Line 736: [this, page](
> if the write op fails then write_io_ops will not be incremented whereas pre
It wasn't an intentional change, but I think it is an edge case that doesn't 
really matter (if the query failed, you probably don't care about about the 
exact number of writes).


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h
File be/src/runtime/tmp-file-mgr.h:

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h@161
PS11, Line 161: using 4kb or smaller
> nit: HOLE_PUNCH_BLOCK_SIZE_BYTES
Done


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h@180
PS11, Line 180: /// compression is used.
> nit: although its obvious but maybe mention that -1 means compression_level
Done


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h@291
PS11, Line 291: // called once. Must be called with 'lock_' held.
  :   Status CreateFiles() WARN_UNUSED
> update comment
thanks for catching


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h@482
PS11, Line 482: return be
> is this before or after compression?
Done


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@65
PS11, Line 65: "(Advanced) Limit on the total bytes of compression buffers 
that will be used for "
> nit:maybe mention that this limit is shared across all queries.
Good point, updated.


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@454
PS11, Line 454: int64_t num_bytes, TmpFile** tmp_file, int64_t* 
file_offset) {
  :   lock_guard lock(lock_);
  :   int64_t scratch_range_bytes =
  :
> shouldn't this check be done after the free_ranges_ recycle underneath? if
Yes, i'm not sure what I was thinking when I changed this.



--
To view, visit http://gerrit.cloudera.org:8080/15454
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659
Gerrit-Change-Number: 15454
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Mar 2020 03:35:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3766: optionally compress spilled data

2020-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15454 )

Change subject: IMPALA-3766: optionally compress spilled data
..


Patch Set 14:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5548/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/15454
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659
Gerrit-Change-Number: 15454
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Mar 2020 03:32:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3766: optionally compress spilled data

2020-03-25 Thread Tim Armstrong (Code Review)
Hello Sahil Takiar, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15454

to look at the new patch set (#14).

Change subject: IMPALA-3766: optionally compress spilled data
..

IMPALA-3766: optionally compress spilled data

Enabled via --disk_spill_compression_codec, which uses
the same syntax as the compression_codec query option.
Recommended codecs are LZ4 and ZSTD. ZSTD supports
specifying a compression level.

The compression is done in TmpFileMgr using a temporary
buffer. Allocation of disk space is reworked slightly
so that the allocation can happen after compression.

The default power-of-two disk block sizes would lead
to a lot of internal fragmentation, so a new strategy
for free space management, similar to that used in
the data cache, can be used with
--disk_spill_punch_holes=true. TmpFileMgr will allocate
a range of the actual compressed size and punch holes
in the file for each range that is no longer needed.

UncompressedWriteIoBytes is added to the buffer pool
profiles, so that you can see what degree of compression
is achieved. Typically I saw ratios of 2-3x for LZ4 and
ZSTD (with LZ4 toward the lower end and ZSTD toward
the higher end).

Limitations:
The management of the compression buffer memory could
be improved. Ideally it would be integrated with the
buffer pool and use the buffer pool allocator instead
of being done "on the side". We would probably want to
do this before making this the default, for resource
management and performance reasons (doing a malloc()
directly does not use the caching supported by the
buffer pool).

Testing:
* Run buffer pool spilling tests with different combinations of
  the new options.
* Extend existing TmpFileMgr tests for file space allocation to
  run with hole punching enabled.
* Switch a couple of spilling tests to use the new option.
* Add a metrics test to check for scratch leaks.
* Enable the new options by default for end-to-end dockerized
  tests to get additional coverage.
* Add a unit test where allocating compression memory fails,
  both on the read and write path.
* Ran a single-node stress test on TPC-DS SF 1 and TPC-H SF 10
  The peak compression buffer usage was ~40MB.

Perf:
I ran this spilling query using an SSD as the scratch disk:

  set mem_limit=200m;
  select count(distinct l_partkey) from
  tpch30_parquet.lineitem;

The time taken for the second run of each query was:
No compression: 19.59s
LZ4: 18.56s
ZSTD: 20.59s

Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659
---
M be/src/runtime/bufferpool/buffer-pool-counters.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-internal.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/service/query-options.cc
M be/src/util/parse-util.cc
M be/src/util/parse-util.h
M bin/jenkins/dockerized-impala-run-tests.sh
M tests/custom_cluster/test_scratch_disk.py
M tests/verifiers/metric_verifier.py
15 files changed, 849 insertions(+), 246 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/15454/14
--
To view, visit http://gerrit.cloudera.org:8080/15454
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659
Gerrit-Change-Number: 15454
Gerrit-PatchSet: 14
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 


[native-toolchain-CR] Compile thrift using the python in our toolchain.

2020-03-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15468 )

Change subject: Compile thrift using the python in our toolchain.
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15468
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec332462ba7f9eaa699247f546d2b6ba1faabd60
Gerrit-Change-Number: 15468
Gerrit-PatchSet: 3
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Mar 2020 02:40:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3766: optionally compress spilled data

2020-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15454 )

Change subject: IMPALA-3766: optionally compress spilled data
..


Patch Set 12:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/5602/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/15454
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659
Gerrit-Change-Number: 15454
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Mar 2020 01:39:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-9434: Implement Robin Hood Hash Table.

2020-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15511 )

Change subject: WIP IMPALA-9434: Implement Robin Hood Hash Table.
..


Patch Set 4:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/5601/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/15511
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986
Gerrit-Change-Number: 15511
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Mar 2020 01:37:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3766: optionally compress spilled data

2020-03-25 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15454 )

Change subject: IMPALA-3766: optionally compress spilled data
..


Patch Set 13:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/bufferpool/buffer-pool-counters.h
File be/src/runtime/bufferpool/buffer-pool-counters.h:

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/bufferpool/buffer-pool-counters.h@55
PS11, Line 55: g for writes to di
nit: Total bytes written to disk. (May be compressed)


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/bufferpool/buffer-pool.cc
File be/src/runtime/bufferpool/buffer-pool.cc:

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/bufferpool/buffer-pool.cc@736
PS11, Line 736: [this, page](
if the write op fails then write_io_ops will not be incremented whereas 
previously it was. Is that the expected behavior? Dont have a preference either 
way, just wanted to point out the diff in behavior.


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h
File be/src/runtime/tmp-file-mgr.h:

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h@161
PS11, Line 161:  using 4kb or smaller
nit: HOLE_PUNCH_BLOCK_SIZE_BYTES


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h@180
PS11, Line 180: THdfsCompression::type compr
nit: although its obvious but maybe mention that -1 means compression_level_ is 
not used.


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h@291
PS11, Line 291: // occurs. Returns an error only if no temporary files are 
usable or the scratch
  :   /// limit is exceeded. Must be c
update comment


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.h@482
PS11, Line 482:
is this before or after compression?


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@65
PS11, Line 65: "(Advanced) Limit on the total bytes of compression buffers 
that will be used for "
nit:maybe mention that this limit is shared across all queries.


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@454
PS11, Line 454: int64_t num_bytes, TmpFile** tmp_file, int64_t* 
file_offset) {
  :   lock_guard lock(lock_);
  :   int64_t scratch_range_bytes =
  :
shouldn't this check be done after the free_ranges_ recycle underneath? if 
punch_holes() is false and we already have an allocated file that we can 
recycle then we dont need to add to current_bytes_allocated_ and we wont hit 
this limit.



--
To view, visit http://gerrit.cloudera.org:8080/15454
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659
Gerrit-Change-Number: 15454
Gerrit-PatchSet: 13
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Mar 2020 01:36:16 +
Gerrit-HasComments: Yes


[native-toolchain-CR] Compile thrift using the python in our toolchain.

2020-03-25 Thread Hector Acosta (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15468

to look at the new patch set (#3).

Change subject: Compile thrift using the python in our toolchain.
..

Compile thrift using the python in our toolchain.

Change-Id: Iec332462ba7f9eaa699247f546d2b6ba1faabd60
---
M buildall.sh
M source/python/build.sh
M source/thrift/build.sh
3 files changed, 10 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/68/15468/3
--
To view, visit http://gerrit.cloudera.org:8080/15468
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iec332462ba7f9eaa699247f546d2b6ba1faabd60
Gerrit-Change-Number: 15468
Gerrit-PatchSet: 3
Gerrit-Owner: Hector Acosta 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] WIP IMPALA-9434: Implement Robin Hood Hash Table.

2020-03-25 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15511 )

Change subject: WIP IMPALA-9434: Implement Robin Hood Hash Table.
..


Patch Set 4:

(6 comments)

Patch set 4 submitted.

This patch address some of comments from patch set 3, and add initial 
microbenchmarking code.
After some exploration, I come to conclusion that quadratic probing does not 
fit well with robin hood hashing. We might be better off by dropping quadratic 
probing entirely and commit on robin hood linear probing only.

http://gerrit.cloudera.org:8080/#/c/15511/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15511/3//COMMIT_MSG@12
PS3, Line 12: bucket rebalancing after every insert.
> You told me before about short-circuiting the lookup but I didn't quite get
I've been looking at this. It looks like short-circuiting lookup only possible 
in linear probing mode because it exploit that linear probing behavior. 
Therefore, we need to drop quadratic probing mode first before we can fully 
implement all improvement that is possible in Robin Hood, linear probing.


http://gerrit.cloudera.org:8080/#/c/15511/3//COMMIT_MSG@14
PS3, Line 14: Instead of proactively swapping elements during insertion, the
> Also I think when probing for lookup we might consider trying the "start in
I'm still thinking how to do single pass insert. From what I understand, we 
should tweak the insertion pattern that is using Iterator. For example, 
FindProbeRow(), FindBuildRowBucket() should commit early whether they indeed 
plan to do insert or just testing existence.


http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h
File be/src/exec/hash-table.inline.h:

http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@115
PS3, Line 115: new_distance <= target_d
> Ah, I see what you mean. Somehow I was under impression that element that i
Done


http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@138
PS3, Line 138: // swapped back into temporary location at index 
curr_idx.
> Sorry, this example is bad, because the final balanced bucked should be [-1
This check, unfortunately, need to stay because it is possible to swap with 
empty bucket in the case of quadratic probing. We should iron this out if we 
decide to drop quadratic probing.


http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@162
PS3, Line 162:   } else {
> Wonder if we should consider storing the bucket distance (calculated during
Patch Set 4 address O(1) distance measurement for linear probing.

For quadratic probing, because the probe steps follow Triangular number 
pattern, we supposedly can do ~O(1)  by doing its inverse function
https://en.wikipedia.org/wiki/Triangular_number#Triangular_roots_and_tests_for_triangular_numbers

However, I'm leaning towards dropping quadratic probing entirely, so I keep it 
as loop for quadratic probing.


http://gerrit.cloudera.org:8080/#/c/15511/3/be/src/exec/hash-table.inline.h@374
PS3, Line 374: DCHECK(node_ != NULL);
> > And hash-table suppose to be thread-safe for read access.
Add rebalancing at the end of Iterator::SetTuple and it pass core tests.



--
To view, visit http://gerrit.cloudera.org:8080/15511
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986
Gerrit-Change-Number: 15511
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Mar 2020 01:25:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE SENTRY to true

2020-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15214 )

Change subject: IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY 
to true
..


Patch Set 13:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5600/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15214
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68
Gerrit-Change-Number: 15214
Gerrit-PatchSet: 13
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 26 Mar 2020 01:10:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3766: optionally compress spilled data

2020-03-25 Thread Tim Armstrong (Code Review)
Hello Sahil Takiar, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15454

to look at the new patch set (#12).

Change subject: IMPALA-3766: optionally compress spilled data
..

IMPALA-3766: optionally compress spilled data

Enabled via --disk_spill_compression_codec, which uses
the same syntax as the compression_codec query option.
Recommended codecs are LZ4 and ZSTD. ZSTD supports
specifying a compression level.

The compression is done in TmpFileMgr using a temporary
buffer. Allocation of disk space is reworked slightly
so that the allocation can happen after compression.

The default power-of-two disk block sizes would lead
to a lot of internal fragmentation, so a new strategy
for free space management, similar to that used in
the data cache, can be used with
--disk_spill_punch_holes=true. TmpFileMgr will allocate
a range of the actual compressed size and punch holes
in the file for each range that is no longer needed.

UncompressedWriteIoBytes is added to the buffer pool
profiles, so that you can see what degree of compression
is achieved. Typically I saw ratios of 2-3x for LZ4 and
ZSTD (with LZ4 toward the lower end and ZSTD toward
the higher end).

Limitations:
The management of the compression buffer memory could
be improved. Ideally it would be integrated with the
buffer pool and use the buffer pool allocator instead
of being done "on the side". We would probably want to
do this before making this the default, for resource
management and performance reasons (doing a malloc()
directly does not use the caching supported by the
buffer pool).

Testing:
* Run buffer pool spilling tests with different combinations of
  the new options.
* Extend existing TmpFileMgr tests for file space allocation to
  run with hole punching enabled.
* Switch a couple of spilling tests to use the new option.
* Add a metrics test to check for scratch leaks.
* Enable the new options by default for end-to-end dockerized
  tests to get additional coverage.
* Add a unit test where allocating compression memory fails,
  both on the read and write path.
* Ran a single-node stress test on TPC-DS SF 1 and TPC-H SF 10
  The peak compression buffer usage was ~40MB.

Perf:
I ran this spilling query using an SSD as the scratch disk:

  set mem_limit=200m;
  select count(distinct l_partkey) from
  tpch30_parquet.lineitem;

The time taken for the second run of each query was:
No compression: 19.59s
LZ4: 18.56s
ZSTD: 20.59s

Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659
---
M be/src/runtime/bufferpool/buffer-pool-counters.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-internal.h
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/runtime/tmp-file-mgr.h
M be/src/service/query-options.cc
M be/src/util/parse-util.cc
M be/src/util/parse-util.h
M bin/jenkins/dockerized-impala-run-tests.sh
M tests/custom_cluster/test_scratch_disk.py
M tests/verifiers/metric_verifier.py
15 files changed, 845 insertions(+), 249 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/15454/12
--
To view, visit http://gerrit.cloudera.org:8080/15454
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659
Gerrit-Change-Number: 15454
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3766: optionally compress spilled data

2020-03-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15454 )

Change subject: IMPALA-3766: optionally compress spilled data
..


Patch Set 9:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/bufferpool/buffer-pool-counters.h
File be/src/runtime/bufferpool/buffer-pool-counters.h:

http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/bufferpool/buffer-pool-counters.h@59
PS9, Line 59:   RuntimeProfile::Counter* uncompressed_bytes_written;
> Do we need some tests for the new counters as well?
I added tests for the bytes written counters. I didn't do that for the timers 
since the SCOPED_TIMER macro makes it fairly trivial - I manually inspected a 
profile.


http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/bufferpool/buffer-pool-counters.h@59
PS9, Line 59:   RuntimeProfile::Counter* uncompressed_bytes_written;
> I think adding the timers makes a lot of sense.
I moved the uncompressed bytes to the backend level and made sure there were 
compression and encryption timers to both. The concern about unaccounted CPU 
time in an operator was the deciding factor for the second one.


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr-internal.h
File be/src/runtime/tmp-file-mgr-internal.h:

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr-internal.h@94
PS11, Line 94: This is also the
 :   /// offset of the next chunk to allocate in the file.
> maybe move this to the beginning of the comment block?
I kinda think now that this sentence doesn't add much, so i deleted it instead 
of moving it.


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@586
PS11, Line 586:  DCHECK(io_mgr_buffer->eosr());
  :   DCHECK_LE(io_mgr_buffer->len(), read_buffer.len());
  :   if (io_mgr_
> at this point, the compressed_.buffer() should be a nullptr, right? should
I'm not sure what you mean, it has to be non-NULL if it's reading a compressed 
range. I guess read_buffer does need to be non-null, so I added a DCHECK for 
that


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@644
PS11, Line 644: return status;
  : }
  :
> not sure I understand what this method is for, but why don't we need to dec
This method is used if we wrote out a page from memory to disk, but didn't 
evict the page, and now are wanting to pin the page again. In that case we 
don't need to read the data from disk,but we *do* need to decrypt it if the 
buffer was encrypted in-place. With compression all the compreession and 
encryption was done in the separate buffer and the original one was unmodified.

Updated the comment to be less vague.


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@779
PS11, Line 779:   write_range_->SetData(buffer_to_write.data(), 
buffer_to_write.len());
> if this fails, do we still need to call FreeCompressedBuffer()?
That's a good point. Adding FreeCompressedbuffer() on all the error paths was 
getting error-prone, so I switched to using a scoped exit trigger.


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@807
PS11, Line 807:
> might be cleaner to have return true/false - true if the compression succee
Done


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@833
PS11, Line 833:   file_ = file;
This was wrong since we are now allocating MaxOutputLen() - in an earlier patch 
it allocated a slightly smaller buffer.


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@835
PS11, Line 835:   Status status = io_ctx->AddWriteRange(write_range_.get());
> is it possible for ProcessBlock to fail, but still write some data to the c
That's a good point. I just switched to calling .Release() to be symmetrical 
with TryAllocate(), I don't think the indirection helped here.

I'm not sure I can test this code path. The only case where I think 
ProcessBlock() could fail on the compression path is if the output buffer is 
too small, which we avoid by allocating MaxOutputLen(). So I'm not sure if this 
is actually reachable unless there's a weird internal error in a compressor



--
To view, visit http://gerrit.cloudera.org:8080/15454
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659
Gerrit-Change-Number: 15454
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Mar 2020 00:56:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP IMPALA-9434: Implement Robin Hood Hash Table.

2020-03-25 Thread Riza Suminto (Code Review)
Hello David Rorke, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15511

to look at the new patch set (#4).

Change subject: WIP IMPALA-9434: Implement Robin Hood Hash Table.
..

WIP IMPALA-9434: Implement Robin Hood Hash Table.

Robin hood hashing reduces the variances of probe lengths by
continually rebalancing elements. This patch is intended to be the
first step towards full robin hood hash table implementation by doing
bucket rebalancing after every insert.

Instead of proactively swapping elements during insertion, the
algorithm does the rebalancing by first letting the new element probe
an empty bucket and claim it through PrepareBucketForInsert(). After
the bucket at index curr_idx claimed, the rebalancing algorithm in
function Rebalance() walk through the probe sequence again to find
richer or empty bucket to swap with bucket[curr_idx]. The search and
swapping continue until it swap with empty bucket or does not find any
richer bucket along the probe sequence. It effectively treat
bucket[curr_idx] as temporary variable during the rebalancing process.
Rebalance() return the new index of the recently claimed bucket after
rebalancing process done.

The decision to not swap elements proactively is because the probe
function is used for both insertion and lookup-only. Keeping probing
routine decoupled from insertion routine allow us to invoke rebalance
only after bucket claim, not on lookup-only. This patch also maintain
compatibility with quadratic probing. However, it block us from
leveraging some robin hood technique that only possible with linear
probing such as lookup short-circuit (fast lookup on nonexistent
keys).

Testing:
- Pass core tests

Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/hash-table-benchmark.cc
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/exprs/scalar-expr.h
6 files changed, 610 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/15511/4
--
To view, visit http://gerrit.cloudera.org:8080/15511
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I28eeccd7f9ccae39e31972391f971901bcbfe986
Gerrit-Change-Number: 15511
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9548: UdfExecutorTest failures after HIVE-22893

2020-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15544 )

Change subject: IMPALA-9548: UdfExecutorTest failures after HIVE-22893
..

IMPALA-9548: UdfExecutorTest failures after HIVE-22893

HIVE-22893: "Enhance data size estimation for fields computed by UDFs"
modified o.a.h.hive.ql.udf.UDFSubstr and added a dependency on a few new
Hive classes located in the hive-exec jar. These classes include:

o.a.h.hive.ql.plan.ColStatistics
o.a.h.hive.ql.stats.estimator.StatEstimator
o.a.h.hive.ql.stats.estimator.StatEstimatorProvider

The test UdfExecutorTest#HiveStringsTest loads the class UDFSubstr and
thus needs to load the aforementioned stats classes as well.

shaded-deps/pom.xml selectively pulls in certain classes from the
hive-exec jar, and excludes all others. This patch simply addes the
necessary stats classes to load UDFSubstr. Thus, fixing
UdfExecutorTest.

Testing:
* Ran core tests with CDP_BUILD_NUMBER=2244454 and USE_CDP_HIVE=true,
  validated that UdfExecutorTest now passes

Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf
Reviewed-on: http://gerrit.cloudera.org:8080/15544
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M shaded-deps/pom.xml
1 file changed, 3 insertions(+), 0 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/15544
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf
Gerrit-Change-Number: 15544
Gerrit-PatchSet: 4
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9548: UdfExecutorTest failures after HIVE-22893

2020-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15544 )

Change subject: IMPALA-9548: UdfExecutorTest failures after HIVE-22893
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/15544
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf
Gerrit-Change-Number: 15544
Gerrit-PatchSet: 3
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Mar 2020 00:48:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE SENTRY to true

2020-03-25 Thread Fang-Yu Rao (Code Review)
Fang-Yu Rao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15214 )

Change subject: IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY 
to true
..


Patch Set 13:

> Patch Set 11:
>
> (1 comment)
>
> There is only one failing test and it is caused by the change.

Thanks Csaba! I have changed the way we generate the hash value of a query 
string as suggested in that link.


--
To view, visit http://gerrit.cloudera.org:8080/15214
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68
Gerrit-Change-Number: 15214
Gerrit-PatchSet: 13
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 26 Mar 2020 00:38:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE SENTRY to true

2020-03-25 Thread Fang-Yu Rao (Code Review)
Fang-Yu Rao has uploaded a new patch set (#13). ( 
http://gerrit.cloudera.org:8080/15214 )

Change subject: IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY 
to true
..

IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY to true

This patch bumps up the version of Guava libraries from 14.0.1 to
28.1-jre. Due to some changes in Guava's API's, we modify the call
sites accordingly.

Moreover, in order to instruct the Java classes under the directory of
$IMPALA_HOME/common/yarn-extras to use the new Guava libraries, we
explicitly added a dependency in the corresponding pom.xml file.

On the other hand, we set DISABLE_SENTRY to true regardless of
$USE_CDP_HIVE since Sentry's Guava version has not been bumped up yet
and thus run-sentry-service.sh cannot be successfully executed. Recall
that by setting DISABLE_SENTRY to true we also disable every
Sentry-related test, which is fine from now on since Impala 3.4 was
recently branched. The plan is to drop support for Sentry in the Impala
4 line.

Change-Id: I9690a926953a8d3c3872277680b4be0551546c68
---
M bin/impala-config.sh
M common/yarn-extras/pom.xml
M 
common/yarn-extras/src/main/java/org/apache/impala/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/AggregateInfoBase.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticInfo.java
M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java
M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/NullLiteral.java
M fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java
M fe/src/main/java/org/apache/impala/analysis/Path.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/StringLiteral.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/catalog/Column.java
M fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
M fe/src/main/java/org/apache/impala/catalog/DataSource.java
M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/TableLoader.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
M fe/src/main/java/org/apache/impala/planner/DataPartition.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/JvmPauseMonitor.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
M impala-parent/pom.xml
48 files changed, 103 insertions(+), 94 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/15214/13
--
To view, visit http://gerrit.cloudera.org:8080/15214
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68
Gerrit-Change-Number: 15214
Gerrit-PatchSet: 13
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-8005: Randomize partitioning exchanges.

2020-03-25 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15497 )

Change subject: IMPALA-8005: Randomize partitioning exchanges.
..


Patch Set 4: Code-Review+1

(1 comment)

LGTM.

http://gerrit.cloudera.org:8080/#/c/15497/3/be/src/runtime/data-stream-test.cc
File be/src/runtime/data-stream-test.cc:

http://gerrit.cloudera.org:8080/#/c/15497/3/be/src/runtime/data-stream-test.cc@742
PS3, Line 742:   ASSERT_EQ(result, true);
> If I'm understanding this correctly, I think the probability of a false mat
Yes, agree the probability of all receivers having a match for all received 
values is pretty small.   I was suggesting to simplify the test but am good 
with the current end-to-end test.



--
To view, visit http://gerrit.cloudera.org:8080/15497
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1936e6cc3e8d66420a5a9301f49221ca38f3e468
Gerrit-Change-Number: 15497
Gerrit-PatchSet: 4
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Thu, 26 Mar 2020 00:30:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.

2020-03-25 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15524 )

Change subject: IMPALA-3343, IMPALA-9489: Make impala-shell compatible with 
python 3.
..


Patch Set 4:

(6 comments)

first pass, i'm not really familiar with our python infra, so it might take me 
a while to go through this properly; left a few preliminary comments

http://gerrit.cloudera.org:8080/#/c/15524/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15524/4//COMMIT_MSG@17
PS4, Line 17: With regard to validating the patch, my assumption is that simply 
passing
:   the existing set of e2e shell tests is sufficient to confirm 
that the shell
:   is functioning properly. No new tests were added.
given that the main pain here was dealing with the encodings, would it make 
sense to add additional tests that deal with this?


http://gerrit.cloudera.org:8080/#/c/15524/4//COMMIT_MSG@119
PS4, Line 119:   No effort has been made at this point to come up with a way to 
integrate
do we have plans to do this?


http://gerrit.cloudera.org:8080/#/c/15524/4/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/15524/4/shell/impala_shell.py@600
PS4, Line 600: try:
 :   args = self.sanitise_input(args.decode('utf-8'))  # python2
 : except AttributeError:
 :   args = self.sanitise_input(args)  # python3
can we use the 'if sys.version_info.major == 2:' check here?


http://gerrit.cloudera.org:8080/#/c/15524/4/shell/impala_shell.py@1254
PS4, Line 1254:   # if filter(self.DML_REGEX.match, tokens): is_dml = True
  :   if any(self.DML_REGEX.match(t) for t in tokens):
was the first line suppose to be deleted?


http://gerrit.cloudera.org:8080/#/c/15524/4/shell/impala_shell.py@1913
PS4, Line 1913: print(shell.CANCELLATION_MESSAGE, file=sys.stderr)
  : shell._reconnect_cancellation()
  :   else:
  : print("Socket error %s: %s" % (e.errno, e))
what is this change for?


http://gerrit.cloudera.org:8080/#/c/15524/4/tests/shell/test_shell_interactive.py
File tests/shell/test_shell_interactive.py:

http://gerrit.cloudera.org:8080/#/c/15524/4/tests/shell/test_shell_interactive.py@803
PS4, Line 803: we expect users to also
 :   # any install 3rd upstream libraries from PyPI.
nit: revise



--
To view, visit http://gerrit.cloudera.org:8080/15524
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb004d352fe230a890a6b6356496ba76c2fab615
Gerrit-Change-Number: 15524
Gerrit-PatchSet: 4
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 25 Mar 2020 21:51:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9466: impala-shell client retry for hs2-http protocol

2020-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15378 )

Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol
..


Patch Set 15:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5599/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15378
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5
Gerrit-Change-Number: 15378
Gerrit-PatchSet: 15
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Wed, 25 Mar 2020 21:38:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9466: impala-shell client retry for hs2-http protocol

2020-03-25 Thread Abhishek Rawat (Code Review)
Abhishek Rawat has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15378 )

Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol
..


Patch Set 15:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/15378/14/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/15378/14/shell/impala_client.py@946
PS14, Line 946:  suppress_error_on_cance
> nit: make this a global string
Done


http://gerrit.cloudera.org:8080/#/c/15378/14/shell/impala_client.py@966
PS14, Line 966:
> nit: same comment as above
Done


http://gerrit.cloudera.org:8080/#/c/15378/14/tests/custom_cluster/test_hs2_fault_injection.py
File tests/custom_cluster/test_hs2_fault_injection.py:

http://gerrit.cloudera.org:8080/#/c/15378/14/tests/custom_cluster/test_hs2_fault_injection.py@21
PS14, Line 21: import requests
 :
> nit: insert newline between these two lines
Done


http://gerrit.cloudera.org:8080/#/c/15378/14/tests/custom_cluster/test_hs2_fault_injection.py@124
PS14, Line 124:
> since this is an internal testing method, it should be prefixed with __
Done



--
To view, visit http://gerrit.cloudera.org:8080/15378
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5
Gerrit-Change-Number: 15378
Gerrit-PatchSet: 15
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Wed, 25 Mar 2020 20:56:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9466: impala-shell client retry for hs2-http protocol

2020-03-25 Thread Abhishek Rawat (Code Review)
Abhishek Rawat has uploaded a new patch set (#15). ( 
http://gerrit.cloudera.org:8080/15378 )

Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol
..

IMPALA-9466: impala-shell client retry for hs2-http protocol

Added retries for idempotent rpcs:
OpenSession, PingImpalaHS2Service, GetResultSetMetadata,
CloseImpalaOperation (non dmls), CancelOperation, GetOperationStatus,
GetRuntimeProfile, GetExecSummary, GetLog

Retries were also added to the 'set all' query execution and subsequent
result fetch in the ImpalaHS2Client._open_session()

The retries are only supported for hs2-http protocol and enabled by
default. At most there are 3 retries for a failed rpc. There is a sleep
duration of 'n' seconds after nth retry.

Only failed rpcs due to an error in the http transport are retried and
if an rpc failed because the server returned an error in the rpc
response then such scenarios are not retriable.

Improved error diagnostics by dumping stack trace when ImpalaShell.
_execute_stmt() gets an 'Unknown Exception'.

Testing:
- Added a custom_cluster test which injects fault into the http transport
and checks expected behavior from the various rpcs. Some of these tests
leave the session in an open state and so these tests are not suitable
for the e2e test framework which have metric verifiers expecting related
metrics to be 0 at the end of the test.
- Manually tested real world scenarios with impala-shell client
communicating with an impala coordinator via a fault injecting istio mesh.
- Manually tested dropping connections on an nginx ingress gateway by sending
SIGTERM to all worker processes.

Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5
---
M shell/impala_client.py
M shell/impala_shell.py
M shell/shell_exceptions.py
A tests/custom_cluster/test_hs2_fault_injection.py
4 files changed, 505 insertions(+), 51 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/15378/15
--
To view, visit http://gerrit.cloudera.org:8080/15378
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5
Gerrit-Change-Number: 15378
Gerrit-PatchSet: 15
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-9466: impala-shell client retry for hs2-http protocol

2020-03-25 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15378 )

Change subject: IMPALA-9466: impala-shell client retry for hs2-http protocol
..


Patch Set 14: Code-Review+1

(4 comments)

A few more nits, otherwise LGTM.

http://gerrit.cloudera.org:8080/#/c/15378/14/shell/impala_client.py
File shell/impala_client.py:

http://gerrit.cloudera.org:8080/#/c/15378/14/shell/impala_client.py@946
PS14, Line 946: "TAPPLICATION_EXCEPTION"
nit: make this a global string


http://gerrit.cloudera.org:8080/#/c/15378/14/shell/impala_client.py@966
PS14, Line 966: "SERVER_ERROR"
nit: same comment as above


http://gerrit.cloudera.org:8080/#/c/15378/14/tests/custom_cluster/test_hs2_fault_injection.py
File tests/custom_cluster/test_hs2_fault_injection.py:

http://gerrit.cloudera.org:8080/#/c/15378/14/tests/custom_cluster/test_hs2_fault_injection.py@21
PS14, Line 21: import requests
 : from shell.ImpalaHttpClient import ImpalaHttpClient
nit: insert newline between these two lines


http://gerrit.cloudera.org:8080/#/c/15378/14/tests/custom_cluster/test_hs2_fault_injection.py@124
PS14, Line 124: expect_msg_retry
since this is an internal testing method, it should be prefixed with __
same with expect_msg_no_retry



--
To view, visit http://gerrit.cloudera.org:8080/15378
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0da9e9e8d34a340eaf763397cc095ff6260d65d5
Gerrit-Change-Number: 15378
Gerrit-PatchSet: 14
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Wed, 25 Mar 2020 20:12:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8005: Randomize partitioning exchanges.

2020-03-25 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15497 )

Change subject: IMPALA-8005: Randomize partitioning exchanges.
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15497/3/be/src/runtime/data-stream-test.cc
File be/src/runtime/data-stream-test.cc:

http://gerrit.cloudera.org:8080/#/c/15497/3/be/src/runtime/data-stream-test.cc@742
PS3, Line 742: if (map_query_1[i] != map_query_2[i]) {
> It depends on what the core purpose of the test is.  I am wondering if it i
If I'm understanding this correctly, I think the probability of a false match 
should be very small.

The odds of a value being hashed to the same receiver for both query1 and 
query2 is roughly 1/#receivers = 1/4. A false positive for this test would 
require all the values on all the receivers to be the same from query1 to 
query2. So, the probability of a false match is (1/4)^(#distinct values). It 
looks like the Sender uses a distinct value for each send, so there should be 
about 1024 distinct values (we should double check that). (1/4)^1024 is very 
small.

Doing the check on the sender side would also work. I like the end-to-end 
aspect of this, but a direct comparison on the sender side would also be ok.



--
To view, visit http://gerrit.cloudera.org:8080/15497
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1936e6cc3e8d66420a5a9301f49221ca38f3e468
Gerrit-Change-Number: 15497
Gerrit-PatchSet: 3
Gerrit-Owner: Anurag Mantripragada 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Wed, 25 Mar 2020 19:57:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9548: UdfExecutorTest failures after HIVE-22893

2020-03-25 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15544 )

Change subject: IMPALA-9548: UdfExecutorTest failures after HIVE-22893
..


Patch Set 3:

GVO failed due to an unrelated error. Filed IMPALA-9552


--
To view, visit http://gerrit.cloudera.org:8080/15544
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf
Gerrit-Change-Number: 15544
Gerrit-PatchSet: 3
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 25 Mar 2020 19:52:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9548: UdfExecutorTest failures after HIVE-22893

2020-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15544 )

Change subject: IMPALA-9548: UdfExecutorTest failures after HIVE-22893
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15544
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf
Gerrit-Change-Number: 15544
Gerrit-PatchSet: 3
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 25 Mar 2020 19:49:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9548: UdfExecutorTest failures after HIVE-22893

2020-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15544 )

Change subject: IMPALA-9548: UdfExecutorTest failures after HIVE-22893
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5547/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15544
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf
Gerrit-Change-Number: 15544
Gerrit-PatchSet: 3
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 25 Mar 2020 19:49:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9548: UdfExecutorTest failures after HIVE-22893

2020-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15544 )

Change subject: IMPALA-9548: UdfExecutorTest failures after HIVE-22893
..


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5546/


--
To view, visit http://gerrit.cloudera.org:8080/15544
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf
Gerrit-Change-Number: 15544
Gerrit-PatchSet: 2
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 25 Mar 2020 19:41:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.

2020-03-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15524 )

Change subject: IMPALA-3343, IMPALA-9489: Make impala-shell compatible with 
python 3.
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15524/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15524/4//COMMIT_MSG@121
PS4, Line 121:   processes.
> It's true -- I actually had one more JIRA filed for that:
Maybe it's getting a new sqlparse version from somewhere?


http://gerrit.cloudera.org:8080/#/c/15524/4/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/15524/4/shell/impala_shell.py@1790
PS4, Line 1790: if isinstance(options.output_delimiter, str):
> Ah, interesting. I guess I was just working from our existing test cases --
I'm OK with deferring it, I don't know if it actually works on master. I doubt 
anyone is relying on it.



--
To view, visit http://gerrit.cloudera.org:8080/15524
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb004d352fe230a890a6b6356496ba76c2fab615
Gerrit-Change-Number: 15524
Gerrit-PatchSet: 4
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 25 Mar 2020 18:26:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE SENTRY to true

2020-03-25 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15214 )

Change subject: IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY 
to true
..


Patch Set 11:

(1 comment)

There is only one failing test and it is caused by the change.

http://gerrit.cloudera.org:8080/#/c/15214/11/fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
File fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java:

http://gerrit.cloudera.org:8080/#/c/15214/11/fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java@765
PS11, Line 765: hasher.putString(queryStr, Charset.forName("UTF-8"));
We investigated the failed test with Fang-Yu and it is caused by returning 
different hash values here after the change, see 
https://stackoverflow.com/questions/21287714/hashing-issue-between-guava-versions/21291270#21291270



--
To view, visit http://gerrit.cloudera.org:8080/15214
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68
Gerrit-Change-Number: 15214
Gerrit-PatchSet: 11
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 25 Mar 2020 18:19:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE SENTRY to true

2020-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15214 )

Change subject: IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY 
to true
..


Patch Set 11: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5545/


--
To view, visit http://gerrit.cloudera.org:8080/15214
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68
Gerrit-Change-Number: 15214
Gerrit-PatchSet: 11
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 25 Mar 2020 18:07:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3766: optionally compress spilled data

2020-03-25 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15454 )

Change subject: IMPALA-3766: optionally compress spilled data
..


Patch Set 9:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/bufferpool/buffer-pool-counters.h
File be/src/runtime/bufferpool/buffer-pool-counters.h:

http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/bufferpool/buffer-pool-counters.h@59
PS9, Line 59:   RuntimeProfile::Counter* uncompressed_bytes_written;
> I think adding the timers makes a lot of sense.
Do we need some tests for the new counters as well?


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr-internal.h
File be/src/runtime/tmp-file-mgr-internal.h:

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr-internal.h@94
PS11, Line 94: This is also the
 :   /// offset of the next chunk to allocate in the file.
maybe move this to the beginning of the comment block?


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@586
PS11, Line 586:  DCHECK(io_mgr_buffer->eosr());
  :   DCHECK_LE(io_mgr_buffer->len(), read_buffer.len());
  :   if (io_mgr_
at this point, the compressed_.buffer() should be a nullptr, right? should we 
add a DCHECK for that?


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@644
PS11, Line 644: return status;
  : }
  :
not sure I understand what this method is for, but why don't we need to decrypt 
the data if the data is compressed?


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@779
PS11, Line 779:   write_range_->SetData(buffer_to_write.data(), 
buffer_to_write.len());
if this fails, do we still need to call FreeCompressedBuffer()?


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@807
PS11, Line 807:
might be cleaner to have return true/false - true if the compression succeeded, 
false otherwise. I think it currently relies on checking if 'compressed_len_' 
has been set, which seems more fragile.
i think it simplifies the caller as well, you could re-write it to something 
like this:

 MemRange buffer_to_write = buffer;
 if (parent_->tmp_file_mgr_->compression_enabled()) {
   if (TryCompress(buffer)) {
 buffer_to_write = MemRange(compressed_.buffer(), compressed_len_);
}
 }


http://gerrit.cloudera.org:8080/#/c/15454/11/be/src/runtime/tmp-file-mgr.cc@835
PS11, Line 835:   Status status = io_ctx->AddWriteRange(write_range_.get());
is it possible for ProcessBlock to fail, but still write some data to the 
compressed buffer? if it is, then will this DCHECK? at this point 
compressed_len_ should still be -1 right? which means 'DCHECK(is_compressed())' 
will fail because 'is_compressed()' should return false?



--
To view, visit http://gerrit.cloudera.org:8080/15454
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659
Gerrit-Change-Number: 15454
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 25 Mar 2020 17:34:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9472,IMPALA-9473: Add per-partition metrics for data cache

2020-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15382 )

Change subject: IMPALA-9472,IMPALA-9473: Add per-partition metrics for data 
cache
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5598/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/15382
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56a57d75ff11f00ebc85b85bcaf104fb8108c478
Gerrit-Change-Number: 15382
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Wed, 25 Mar 2020 17:34:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9472,IMPALA-9473: Add per-partition metrics for data cache

2020-03-25 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15382 )

Change subject: IMPALA-9472,IMPALA-9473: Add per-partition metrics for data 
cache
..


Patch Set 3:

Just stacked this on top of the latest LIRS


--
To view, visit http://gerrit.cloudera.org:8080/15382
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I56a57d75ff11f00ebc85b85bcaf104fb8108c478
Gerrit-Change-Number: 15382
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Wed, 25 Mar 2020 16:52:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9472,IMPALA-9473: Add per-partition metrics for data cache

2020-03-25 Thread Joe McDonnell (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/15382

to look at the new patch set (#3).

Change subject: IMPALA-9472,IMPALA-9473: Add per-partition metrics for data 
cache
..

IMPALA-9472,IMPALA-9473: Add per-partition metrics for data cache

This adds two sets of metrics. The first is per-partition metrics
to track the performance of the underlying filesystem for the
data cache. It keeps histograms of read, write, and eviction
latency for each data cache partition along with another metric
recording the path for the partition. These are exposed as the
following metrics:
impala-server.io-mgr.remote-data-cache-partition-$0.path
impala-server.io-mgr.remote-data-cache-partition-$0.read-latency
impala-server.io-mgr.remote-data-cache-partition-$0.write-latency
impala-server.io-mgr.remote-data-cache-partition-$0.eviction-latency

This also adds metrics to keep counts of hits, misses, and entries
in the data cache. Since reducing the latency of IO is an important
feature of the data cache, the absolute count of hits and misses
is as important as the hit bytes and miss bytes. This adds the
following metrics:
impala-server.io-mgr.remote-data-cache-hit-count
impala-server.io-mgr.remote-data-cache-miss-count
impala-server.io-mgr.remote-data-cache-num-entries

Testing:
 - Hand testing to verify the per-partition latency histograms
 - Modified custom_cluster/test_data_cache.py to also test
   the counts.

Change-Id: I56a57d75ff11f00ebc85b85bcaf104fb8108c478
---
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/runtime/io/hdfs-file-reader.cc
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M common/thrift/metrics.json
M tests/custom_cluster/test_data_cache.py
7 files changed, 205 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/15382/3
--
To view, visit http://gerrit.cloudera.org:8080/15382
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I56a57d75ff11f00ebc85b85bcaf104fb8108c478
Gerrit-Change-Number: 15382
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9107: Add timestamp to maven logging options.

2020-03-25 Thread Joe McDonnell (Code Review)
Joe McDonnell has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15537 )

Change subject: IMPALA-9107: Add timestamp to maven logging options.
..

IMPALA-9107: Add timestamp to maven logging options.

We found that using awk to add a timestamp to the maven log can fail
if gawk is not installed. It seems better to configure maven to add
the timestamp itself.

Sample output:


Running mvn -U -Dorg.slf4j.simpleLogger.showDateTime=true 
-Dorg.slf4j.simpleLogger.dateTimeFormat=HH:mm:ss -B install -DskipTests
Directory /home/dknupp/Impala/ext-data-source

16:37:16 [INFO] Scanning for projects...
16:37:16 [INFO] 

16:37:16 [INFO] Reactor Build Order:
16:37:16 [INFO]
16:37:16 [INFO] Apache Impala External Data Source  
   [pom]
16:37:16 [INFO] Apache Impala External Data Source API  
   [jar]
16:37:16 [INFO] Apache Impala External Data Source Sample   
   [jar]
16:37:16 [INFO] Apache Impala External Data Source Test Library 
   [jar]
16:37:17 [INFO]
16:37:17 [INFO] < org.apache.impala:impala-data-source 
>
16:37:17 [INFO] Building Apache Impala External Data Source 1.0-SNAPSHOT
   [1/4]
16:37:17 [INFO] [ pom 
]-
[etc...]

Change-Id: I10fbe9eb76b66e6ba00db9f95c91063410dd1b4e
Reviewed-on: http://gerrit.cloudera.org:8080/15537
Reviewed-by: Laszlo Gaal 
Tested-by: Impala Public Jenkins 
---
M bin/mvn-quiet.sh
1 file changed, 4 insertions(+), 2 deletions(-)

Approvals:
  Laszlo Gaal: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/15537
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I10fbe9eb76b66e6ba00db9f95c91063410dd1b4e
Gerrit-Change-Number: 15537
Gerrit-PatchSet: 4
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 


[Impala-ASF-CR] IMPALA-9538 Bump up linux-syscall-support.h

2020-03-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15510 )

Change subject: IMPALA-9538 Bump up linux-syscall-support.h
..


Patch Set 12:

I cherry-picked this to Kudu here: https://gerrit.cloudera.org/#/c/15557/

It looks like it conflicts with a different, larger, Kudu patch but it probably 
makes sense to merge this ahead.


--
To view, visit http://gerrit.cloudera.org:8080/15510
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6c46acb17f048890a3f93fc6b910b2df3c1a7058
Gerrit-Change-Number: 15510
Gerrit-PatchSet: 12
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 25 Mar 2020 16:35:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9538 Bump up linux-syscall-support.h

2020-03-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/15510 )

Change subject: IMPALA-9538 Bump up linux-syscall-support.h
..

IMPALA-9538 Bump up linux-syscall-support.h

Bump up linux-syscall-support.h to newest version
which support aarch64

Change-Id: I6c46acb17f048890a3f93fc6b910b2df3c1a7058
Reviewed-on: http://gerrit.cloudera.org:8080/15510
Reviewed-by: Tim Armstrong 
Tested-by: Tim Armstrong 
---
M be/src/gutil/linux_syscall_support.h
M be/src/gutil/spinlock_linux-inl.h
M be/src/kudu/util/debug-util.cc
3 files changed, 1,746 insertions(+), 891 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/15510
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6c46acb17f048890a3f93fc6b910b2df3c1a7058
Gerrit-Change-Number: 15510
Gerrit-PatchSet: 12
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9538 Bump up linux-syscall-support.h

2020-03-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15510 )

Change subject: IMPALA-9538 Bump up linux-syscall-support.h
..


Patch Set 11: Verified+1 Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15510
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6c46acb17f048890a3f93fc6b910b2df3c1a7058
Gerrit-Change-Number: 15510
Gerrit-PatchSet: 11
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 25 Mar 2020 16:13:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9538 Bump up linux-syscall-support.h

2020-03-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15510 )

Change subject: IMPALA-9538 Bump up linux-syscall-support.h
..


Patch Set 10: Verified+1

I'll carry the verification since nothing changed between patches and it 
compiled OK


--
To view, visit http://gerrit.cloudera.org:8080/15510
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6c46acb17f048890a3f93fc6b910b2df3c1a7058
Gerrit-Change-Number: 15510
Gerrit-PatchSet: 10
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 25 Mar 2020 16:12:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3766: optionally compress spilled data

2020-03-25 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15454 )

Change subject: IMPALA-3766: optionally compress spilled data
..


Patch Set 11:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/bufferpool/buffer-pool-counters.h
File be/src/runtime/bufferpool/buffer-pool-counters.h:

http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/bufferpool/buffer-pool-counters.h@59
PS9, Line 59:   RuntimeProfile::Counter* uncompressed_bytes_written;
> That's a good point. This is a per-operator counter versus a per-backend co
I think adding the timers makes a lot of sense.
For uncompressed bytes counter, I'm not sure. I could go either way. It looks 
like most "Buffer pool" counters are exposed per-operator, and are then 
aggregated in the per-backend profiles ("Per Node Profiles")?
I agree its probably not super useful to understand the compression ratio 
per-operator.


http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/15454/9/be/src/runtime/tmp-file-mgr.cc@63
PS9, Line 63: d) Limit on the total bytes of compressio
> Updated the comment to explain (I think that's what you were getting at, ri
yeah updated docs for this flag look good, thanks



--
To view, visit http://gerrit.cloudera.org:8080/15454
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c08ff9504097f0fee8c32316c5c150136abe659
Gerrit-Change-Number: 15454
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 25 Mar 2020 16:12:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9538 Bump up linux-syscall-support.h

2020-03-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15510 )

Change subject: IMPALA-9538 Bump up linux-syscall-support.h
..


Patch Set 10: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15510
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6c46acb17f048890a3f93fc6b910b2df3c1a7058
Gerrit-Change-Number: 15510
Gerrit-PatchSet: 10
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 25 Mar 2020 16:12:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9548: UdfExecutorTest failures after HIVE-22893

2020-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15544 )

Change subject: IMPALA-9548: UdfExecutorTest failures after HIVE-22893
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15544
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf
Gerrit-Change-Number: 15544
Gerrit-PatchSet: 2
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 25 Mar 2020 14:44:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9548: UdfExecutorTest failures after HIVE-22893

2020-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15544 )

Change subject: IMPALA-9548: UdfExecutorTest failures after HIVE-22893
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5546/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15544
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5b3d5df08c2d48d21293d5a5308eb453f40184bf
Gerrit-Change-Number: 15544
Gerrit-PatchSet: 2
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 25 Mar 2020 14:44:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE SENTRY to true

2020-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15214 )

Change subject: IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY 
to true
..


Patch Set 11:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5545/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/15214
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68
Gerrit-Change-Number: 15214
Gerrit-PatchSet: 11
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 25 Mar 2020 13:46:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE SENTRY to true

2020-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15214 )

Change subject: IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY 
to true
..


Patch Set 11: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15214
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68
Gerrit-Change-Number: 15214
Gerrit-PatchSet: 11
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 25 Mar 2020 13:46:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE SENTRY to true

2020-03-25 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15214 )

Change subject: IMPALA-8870: Bump up Guava to 28.1-jre and set DISABLE_SENTRY 
to true
..


Patch Set 10: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/15214
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9690a926953a8d3c3872277680b4be0551546c68
Gerrit-Change-Number: 15214
Gerrit-PatchSet: 10
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Wed, 25 Mar 2020 13:44:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9545 Decide cacheline size of aarch64

2020-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/1 )

Change subject: IMPALA-9545 Decide cacheline size of aarch64
..


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5544/


--
To view, visit http://gerrit.cloudera.org:8080/1
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id56bfa63e4b6cd957c4997f10de78a5f4111f61f
Gerrit-Change-Number: 1
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 25 Mar 2020 11:17:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9545 Decide cacheline size of aarch64

2020-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/1 )

Change subject: IMPALA-9545 Decide cacheline size of aarch64
..


Patch Set 2:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/5597/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/1
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id56bfa63e4b6cd957c4997f10de78a5f4111f61f
Gerrit-Change-Number: 1
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 25 Mar 2020 07:03:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3343, IMPALA-9489: Make impala-shell compatible with python 3.

2020-03-25 Thread David Knupp (Code Review)
David Knupp has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15524 )

Change subject: IMPALA-3343, IMPALA-9489: Make impala-shell compatible with 
python 3.
..


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/15524/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15524/4//COMMIT_MSG@121
PS4, Line 121:   processes.
> I'm running into an issue where if I run the packaged shell with python3, s
It's true -- I actually had one more JIRA filed for that:
https://issues.apache.org/jira/browse/IMPALA-9362

It is weird that the pip-installed version doesn't have this issue:

  $ impala-shell
  Starting Impala with no authentication using Python 3.7.6  <-- note version
  Opened TCP connection to localhost:21000
  Connected to localhost:21000
  Server version: impalad version 3.4.0-SNAPSHOT DEBUG (build cc91c...)
  
***
  Welcome to the Impala shell.
  (Impala Shell v3.4.0-SNAPSHOT (cc91c66) built on Tue Mar 24 09:54:50 PDT 2020)

  The SET command shows the current value of all shell and query options.
  
***
  [localhost:21000] default>


http://gerrit.cloudera.org:8080/#/c/15524/4/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/15524/4/shell/impala_shell.py@1790
PS4, Line 1790: if isinstance(options.output_delimiter, str):
> This seems to barf if I pass in a unicode delimiter. I don't think this is
Ah, interesting. I guess I was just working from our existing test cases -- I 
guess we don't have one for this. We can try to resolve it if you think it's 
important.


http://gerrit.cloudera.org:8080/#/c/15524/4/shell/make_shell_tarball.sh
File shell/make_shell_tarball.sh:

http://gerrit.cloudera.org:8080/#/c/15524/4/shell/make_shell_tarball.sh@52
PS4, Line 52: if [ "${USE_THRIFT11_GEN_PY:-}" == "false" ]; then
> Does it make sense to allow overriding this? Would a shell built with old t
Actually, that's a good point. Other tests pass, but unicode-related tests 
don't. We should probably disallow this.


http://gerrit.cloudera.org:8080/#/c/15524/4/shell/make_shell_tarball.sh@53
PS4, Line 53:   # thrift 0.9.3-p7
> Not sure if we need this comment?
Ack



--
To view, visit http://gerrit.cloudera.org:8080/15524
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb004d352fe230a890a6b6356496ba76c2fab615
Gerrit-Change-Number: 15524
Gerrit-PatchSet: 4
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 25 Mar 2020 06:28:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9545 Decide cacheline size of aarch64

2020-03-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/1 )

Change subject: IMPALA-9545 Decide cacheline size of aarch64
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5544/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/1
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id56bfa63e4b6cd957c4997f10de78a5f4111f61f
Gerrit-Change-Number: 1
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Wed, 25 Mar 2020 06:23:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9545 Decide cacheline size of aarch64

2020-03-25 Thread Anonymous Coward (Code Review)
zhaoren...@hotmail.com has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/1


Change subject: IMPALA-9545 Decide cacheline size of aarch64
..

IMPALA-9545 Decide cacheline size of aarch64

ARM64's L3 cacheline size is different according
 to CPU vendor's architecture. If user defined
 CACHELINESIZE_AARCH64 in impala-config-local.sh,
then we will use that value, if user did not
 define it, then we will get the value from OS,
if fail, then we will use the default value 64.

Change-Id: Id56bfa63e4b6cd957c4997f10de78a5f4111f61f
---
M CMakeLists.txt
M be/src/gutil/port.h
M buildall.sh
3 files changed, 24 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/1/2
--
To view, visit http://gerrit.cloudera.org:8080/1
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id56bfa63e4b6cd957c4997f10de78a5f4111f61f
Gerrit-Change-Number: 1
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward