[Impala-ASF-CR] IMPALA-7360: sequence scanners sometimes skip blocks

2018-07-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11062 )

Change subject: IMPALA-7360: sequence scanners sometimes skip blocks
..

IMPALA-7360: sequence scanners sometimes skip blocks

The handling of sync markers after processing a block was broken - eos_
was set if the sync marker straddles the boundary. The expected
behaviour (documented by comments) in this case is that the current
scanner should process the next block, if there is one.

If you look at the logic before the IMPALA-3905 change in commit
931bf49cd90e496df6bf260ae668ec6944f0016c, it split the checking
of eosr() and eof() similar to this patch.

Testing:
Add regression tests that scans a large table with a variety of
different scan range lengths, with some randomisation to exercise
different edge cases. This reliably triggered the bug.

Change-Id: I49a70a4925b0271204b8eea4f980299d7654805a
Reviewed-on: http://gerrit.cloudera.org:8080/11062
Reviewed-by: Michael Ho 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/base-sequence-scanner.cc
A testdata/workloads/tpch/queries/tpch-scan-range-lengths.test
M tests/query_test/test_scanners.py
3 files changed, 86 insertions(+), 20 deletions(-)

Approvals:
  Michael Ho: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/11062
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I49a70a4925b0271204b8eea4f980299d7654805a
Gerrit-Change-Number: 11062
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7360: sequence scanners sometimes skip blocks

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11062 )

Change subject: IMPALA-7360: sequence scanners sometimes skip blocks
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/11062
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I49a70a4925b0271204b8eea4f980299d7654805a
Gerrit-Change-Number: 11062
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Fri, 27 Jul 2018 04:47:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility

2018-07-26 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10998 )

Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10998/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10998/3//COMMIT_MSG@42
PS3, Line 42: These metrics are exposed on the daemon's web UI in /memz page 
under
: section "JVM Garbage Collection metrics". They can also be 
consumed
: by tools from the /metrics endpoint.
> Yea, I was thinking maybe you could import JMXJsonServlet.java wholesale fr
I have a working patch for /jmx end point. Before I send it out for review, I'm 
wondering if we should still expose the GC metrics in the /memz page and 
/metrics as in https://pasteboard.co/Hwc4P5f.png

What do you think?



--
To view, visit http://gerrit.cloudera.org:8080/10998
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341
Gerrit-Change-Number: 10998
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 27 Jul 2018 02:17:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7360: sequence scanners sometimes skip blocks

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11062 )

Change subject: IMPALA-7360: sequence scanners sometimes skip blocks
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2868/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/11062
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I49a70a4925b0271204b8eea4f980299d7654805a
Gerrit-Change-Number: 11062
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Fri, 27 Jul 2018 01:32:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7360: sequence scanners sometimes skip blocks

2018-07-26 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11062 )

Change subject: IMPALA-7360: sequence scanners sometimes skip blocks
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11062
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I49a70a4925b0271204b8eea4f980299d7654805a
Gerrit-Change-Number: 11062
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Fri, 27 Jul 2018 01:32:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7333: remove MarkNeedsDeepCopy() in agg and BTS

2018-07-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11007 )

Change subject: IMPALA-7333: remove MarkNeedsDeepCopy() in agg and BTS
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11007/8/be/src/runtime/buffered-tuple-stream.cc
File be/src/runtime/buffered-tuple-stream.cc:

http://gerrit.cloudera.org:8080/#/c/11007/8/be/src/runtime/buffered-tuple-stream.cc@763
PS8, Line 763: has_read_write_page
> I think I see your point - in that case on the next GetNext() call we go do
I was able to hit a DCHECK in NextReadPage() with a new test, will work on a 
fix.



--
To view, visit http://gerrit.cloudera.org:8080/11007
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91ac53bacc00df4726c015a30ba5a2026aa4b5f5
Gerrit-Change-Number: 11007
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Jul 2018 01:08:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7195 IMPALA-7222: [DOCS] Impala delegation with groups

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11068 )

Change subject: IMPALA-7195 IMPALA-7222: [DOCS] Impala delegation with groups
..


Patch Set 2: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/13/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/11068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6948ab2ef9b82b123f7a1f4fdc83cfb06e9f912f
Gerrit-Change-Number: 11068
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Anonymous Coward #406
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Jul 2018 00:36:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3825: Distribute Runtime Filtering Aggregation

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11055 )

Change subject: IMPALA-3825: Distribute Runtime Filtering Aggregation
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/81/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11055
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I94e183a0353fc46f8d3eccae029d2d52c5cdc40c
Gerrit-Change-Number: 11055
Gerrit-PatchSet: 3
Gerrit-Owner: Rahul Shivu Mahadev 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 27 Jul 2018 00:34:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7195 IMPALA-7222: [DOCS] Impala delegation with groups

2018-07-26 Thread Alex Rodoni (Code Review)
Hello Fredy Wijaya, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11068

to look at the new patch set (#2).

Change subject: IMPALA-7195 IMPALA-7222: [DOCS] Impala delegation with groups
..

IMPALA-7195 IMPALA-7222: [DOCS] Impala delegation with groups

Added clarifications about delegation.

Change-Id: I6948ab2ef9b82b123f7a1f4fdc83cfb06e9f912f
---
M docs/topics/impala_delegation.xml
1 file changed, 181 insertions(+), 45 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/11068/2
--
To view, visit http://gerrit.cloudera.org:8080/11068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6948ab2ef9b82b123f7a1f4fdc83cfb06e9f912f
Gerrit-Change-Number: 11068
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7195 IMPALA-7222: [DOCS] Impala delegation with groups

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11068 )

Change subject: IMPALA-7195 IMPALA-7222: [DOCS] Impala delegation with groups
..


Patch Set 2:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/13/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/11068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6948ab2ef9b82b123f7a1f4fdc83cfb06e9f912f
Gerrit-Change-Number: 11068
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Jul 2018 00:27:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7362: Add query option to set timezone

2018-07-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11064 )

Change subject: IMPALA-7362: Add query option to set timezone
..


Patch Set 2:

(8 comments)

Took

http://gerrit.cloudera.org:8080/#/c/11064/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11064/2//COMMIT_MSG@9
PS2, Line 9: This change adds a new query option "timezone" which
I wonder if something like local_timezone would be clearer? Or is that too 
verbose?


http://gerrit.cloudera.org:8080/#/c/11064/2//COMMIT_MSG@11
PS2, Line 11:
Can you briefly mention that this is meant to be a user-visible option rather 
than a development/debugging option? I think that's the intent right?


http://gerrit.cloudera.org:8080/#/c/11064/2//COMMIT_MSG@24
PS2, Line 24: In the near future Parquet timestamps's
: isAdjustedToUTC property will be supported, which will
: decide whether to do utc->local conversion on a per
: file+column basis.
Is there any potential interaction with table properties or similar? I don't 
know if there's ever a need to specify this per-table rather than per-user.


http://gerrit.cloudera.org:8080/#/c/11064/2/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/11064/2/be/src/service/query-options.cc@687
PS2, Line 687: string timezone = 
StripSuffixString(StripPrefixString(value, "\""), "\"");
The stripping might need a brief explanation


http://gerrit.cloudera.org:8080/#/c/11064/2/be/src/service/query-options.cc@689
PS2, Line 689: NULL
nullptr


http://gerrit.cloudera.org:8080/#/c/11064/2/be/src/service/query-options.cc@689
PS2, Line 689: if(TimezoneDatabase::FindTimezone(timezone) == NULL) {
Missing " " before (


http://gerrit.cloudera.org:8080/#/c/11064/2/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/11064/2/common/thrift/ImpalaService.thrift@331
PS2, Line 331:  The default is UTC,
This is a little unclear because the default value is "". I guess this means 
that if the value is "", then that algorithm is used to decide?


http://gerrit.cloudera.org:8080/#/c/11064/2/tests/custom_cluster/test_local_tz_conversion.py
File tests/custom_cluster/test_local_tz_conversion.py:

http://gerrit.cloudera.org:8080/#/c/11064/2/tests/custom_cluster/test_local_tz_conversion.py@62
PS2, Line 62:   
@CustomClusterTestSuite.with_args("--use_local_tz_for_unix_timestamp_conversions=true")
Can we combine this with test_utc_timestamp_functions() to avoid restarting the 
cluster with the same options?



--
To view, visit http://gerrit.cloudera.org:8080/11064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I73de86eff096e1c581d3b56a0d9330d686f77272
Gerrit-Change-Number: 11064
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Jul 2018 00:12:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num rows

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11056 )

Change subject: IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num 
rows
..

IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num rows

The bug was that HdfsTable#reloadPartition() was not setting the
partition's numRows_ after reloading the partition.

Fix: Move the code to set numRows_ to createPartition() call so that
the callers need not explicitly set it after calling createPartition().
Additionally consolidates other common code in createPartition().

Added a test.

Change-Id: I5886684e26b453c76f331f6e807f813146c6bf3a
Reviewed-on: http://gerrit.cloudera.org:8080/11056
Reviewed-by: Todd Lipcon 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M tests/metadata/test_refresh_partition.py
2 files changed, 42 insertions(+), 25 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/11056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I5886684e26b453c76f331f6e807f813146c6bf3a
Gerrit-Change-Number: 11056
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num rows

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11056 )

Change subject: IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num 
rows
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/11056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5886684e26b453c76f331f6e807f813146c6bf3a
Gerrit-Change-Number: 11056
Gerrit-PatchSet: 2
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 27 Jul 2018 00:11:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7195: [DOCS] Impala delegation with groups

2018-07-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11068 )

Change subject: IMPALA-7195: [DOCS] Impala delegation with groups
..


Patch Set 1: Code-Review+1

(1 comment)

I don't know enough about the technical side of this to verify the content, but 
the text seemed clear and I learned a little bit about a feature I knew very 
little about.

http://gerrit.cloudera.org:8080/#/c/11068/1/docs/topics/impala_delegation.xml
File docs/topics/impala_delegation.xml:

http://gerrit.cloudera.org:8080/#/c/11068/1/docs/topics/impala_delegation.xml@100
PS1, Line 100:
space intended?



--
To view, visit http://gerrit.cloudera.org:8080/11068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6948ab2ef9b82b123f7a1f4fdc83cfb06e9f912f
Gerrit-Change-Number: 11068
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Jul 2018 00:01:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3825: Distribute Runtime Filtering Aggregation

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11055 )

Change subject: IMPALA-3825: Distribute Runtime Filtering Aggregation
..


Patch Set 3:

Build Started https://jenkins.impala.io/job/gerrit-code-review-checks/81/

Running initial code review checks. This is experimental - please report any 
issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/11055
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I94e183a0353fc46f8d3eccae029d2d52c5cdc40c
Gerrit-Change-Number: 11055
Gerrit-PatchSet: 3
Gerrit-Owner: Rahul Shivu Mahadev 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 27 Jul 2018 00:00:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3825: Distribute Runtime Filtering Aggregation

2018-07-26 Thread Rahul Shivu Mahadev (Code Review)
Hello Michael Ho, Sailesh Mukil, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11055

to look at the new patch set (#3).

Change subject: IMPALA-3825: Distribute Runtime Filtering Aggregation
..

IMPALA-3825: Distribute Runtime Filtering Aggregation

This patch moves runtime filter aggregation from the coordinator
to other exec nodes.
* QueryState will now house the Aggregated filters.
* FilterState has been decoupled from coordinator code
* A global flag has been added to switch back to legacy aggregation

Change-Id: I94e183a0353fc46f8d3eccae029d2d52c5cdc40c
---
M be/src/common/global-flags.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
D be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
A be/src/runtime/filter-state.cc
A be/src/runtime/filter-state.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/service/impala-server.cc
M common/thrift/ImpalaInternalService.thrift
15 files changed, 690 insertions(+), 265 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/11055/3
--
To view, visit http://gerrit.cloudera.org:8080/11055
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I94e183a0353fc46f8d3eccae029d2d52c5cdc40c
Gerrit-Change-Number: 11055
Gerrit-PatchSet: 3
Gerrit-Owner: Rahul Shivu Mahadev 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-7362: Add query option to set timezone

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11064 )

Change subject: IMPALA-7362: Add query option to set timezone
..


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2866/


--
To view, visit http://gerrit.cloudera.org:8080/11064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I73de86eff096e1c581d3b56a0d9330d686f77272
Gerrit-Change-Number: 11064
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 26 Jul 2018 23:59:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7195: [DOCS] Impala delegation with groups

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11068 )

Change subject: IMPALA-7195: [DOCS] Impala delegation with groups
..


Patch Set 1: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/12/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/11068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6948ab2ef9b82b123f7a1f4fdc83cfb06e9f912f
Gerrit-Change-Number: 11068
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Jul 2018 23:57:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7324: remove MarkNeedsDeepCopy() from sorter

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11048 )

Change subject: IMPALA-7324: remove MarkNeedsDeepCopy() from sorter
..


Patch Set 2:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/80/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/11048
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If86f038f1f6ca81ad5c9d40af1b7ea6115144ffc
Gerrit-Change-Number: 11048
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Jul 2018 23:55:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7195: [DOCS] Impala delegation with groups

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11068 )

Change subject: IMPALA-7195: [DOCS] Impala delegation with groups
..


Patch Set 1:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/12/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/11068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6948ab2ef9b82b123f7a1f4fdc83cfb06e9f912f
Gerrit-Change-Number: 11068
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 26 Jul 2018 23:47:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7195: [DOCS] Impala delegation with groups

2018-07-26 Thread Alex Rodoni (Code Review)
Alex Rodoni has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11068


Change subject: IMPALA-7195: [DOCS] Impala delegation with groups
..

IMPALA-7195: [DOCS] Impala delegation with groups

Change-Id: I6948ab2ef9b82b123f7a1f4fdc83cfb06e9f912f
---
M docs/topics/impala_delegation.xml
1 file changed, 135 insertions(+), 45 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/11068/1
--
To view, visit http://gerrit.cloudera.org:8080/11068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6948ab2ef9b82b123f7a1f4fdc83cfb06e9f912f
Gerrit-Change-Number: 11068
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 


[Impala-ASF-CR] IMPALA-7333: remove MarkNeedsDeepCopy() in agg and BTS

2018-07-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11007 )

Change subject: IMPALA-7333: remove MarkNeedsDeepCopy() in agg and BTS
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11007/8/be/src/runtime/buffered-tuple-stream.cc
File be/src/runtime/buffered-tuple-stream.cc:

http://gerrit.cloudera.org:8080/#/c/11007/8/be/src/runtime/buffered-tuple-stream.cc@763
PS8, Line 763: has_read_write_page
> Is this always safe? If next row will not fit into the page then the write
I think I see your point - in that case on the next GetNext() call we go down 
the branch below, right?

  if (UNLIKELY(read_page_ == pages_.end()
  || read_page_rows_returned_ == read_page_->num_rows)) {



--
To view, visit http://gerrit.cloudera.org:8080/11007
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91ac53bacc00df4726c015a30ba5a2026aa4b5f5
Gerrit-Change-Number: 11007
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Jul 2018 23:33:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3825: [WIP]Distributed RuntimeFiltering Work in progress patch for Distributing runtime filtering aggregation.

2018-07-26 Thread Rahul Shivu Mahadev (Code Review)
Rahul Shivu Mahadev has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10937 )

Change subject: IMPALA-3825: [WIP]Distributed RuntimeFiltering Work in progress 
patch for Distributing runtime filtering aggregation.
..


Patch Set 1:

(9 comments)

Comments addressed in new change https://gerrit.cloudera.org/#/c/11055/

http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/common/global-flags.cc@132
PS1, Line 132: DEFINE_bool(enable_distributed_filter_aggregation, true,
 : "Enables aggregation of filters"
 : "in a distributed manner, set to false to revert to 
coordinator based aggregation");
> After looking through the code, it seems better to avoid having this flag.
Will keep it until testing


http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/coordinator-backend-state.cc@93
PS1, Line 93: x.second.ToThrift();
> Move inside if(), since it's only needed in that case.
Done


http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/coordinator-backend-state.cc@94
PS1, Line 94: AggregatorRoutingTable::const_iterator it = 
aggregator_routing_table.find(x.first);
: if (it != aggregator_routing_table.end()) {
:   tfs.__set_aggregator_address(it->second);
:   rpc_params->filter_routing_table.insert(
:   std::pair(x.first, tfs));
: }
> Wouldn't all filters that are present in 'filter_routing_table' be present
Done


http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/coordinator-backend-state.cc@101
PS1, Line 101:
> We also don't need to send the filter routing table to the nodes that don't
Done


http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/coordinator.cc@284
PS1, Line 284:
> We can set the aggregator address in the FilterState here, and not have the
Done


http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/query-exec-mgr.h
File be/src/runtime/query-exec-mgr.h:

http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/query-exec-mgr.h@22
PS1, Line 22: #include 
> None of these new #includes should be needed since you haven't added any ne
Done


http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/query-state.cc@559
PS1, Line 559:
> Add a TODO here to call PublishFilterFromAggregator() asynchronously using
Done


http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/runtime-filter-bank.cc
File be/src/runtime/runtime-filter-bank.cc:

http://gerrit.cloudera.org:8080/#/c/10937/1/be/src/runtime/runtime-filter-bank.cc@205
PS1, Line 205:   SendFilterToAggregator(
> What if the aggregator and the producer are on the same node? We can consid
Done


http://gerrit.cloudera.org:8080/#/c/10937/1/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

http://gerrit.cloudera.org:8080/#/c/10937/1/bin/bootstrap_toolchain.py@430
PS1, Line 430: "flatbuffers", "gcc", "gflags", "glog",
> I'm guessing this was a mistake?
Done



--
To view, visit http://gerrit.cloudera.org:8080/10937
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0289ee39787d8e9c13a16c3d7d64f471bc554536
Gerrit-Change-Number: 10937
Gerrit-PatchSet: 1
Gerrit-Owner: Rahul Shivu Mahadev 
Gerrit-Reviewer: Rahul Shivu Mahadev 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 26 Jul 2018 23:22:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7324: remove MarkNeedsDeepCopy() from sorter

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11048 )

Change subject: IMPALA-7324: remove MarkNeedsDeepCopy() from sorter
..


Patch Set 2:

Build Started https://jenkins.impala.io/job/gerrit-code-review-checks/80/

Running initial code review checks. This is experimental - please report any 
issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/11048
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If86f038f1f6ca81ad5c9d40af1b7ea6115144ffc
Gerrit-Change-Number: 11048
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 26 Jul 2018 23:19:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7324: remove MarkNeedsDeepCopy() from sorter

2018-07-26 Thread Tim Armstrong (Code Review)
Hello Gabor Kaszab, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11048

to look at the new patch set (#2).

Change subject: IMPALA-7324: remove MarkNeedsDeepCopy() from sorter
..

IMPALA-7324: remove MarkNeedsDeepCopy() from sorter

Testing:
Ran exhaustive tests.

Change-Id: If86f038f1f6ca81ad5c9d40af1b7ea6115144ffc
---
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/sorted-run-merger.cc
M be/src/runtime/sorted-run-merger.h
M be/src/runtime/sorter.cc
5 files changed, 52 insertions(+), 64 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/11048/2
--
To view, visit http://gerrit.cloudera.org:8080/11048
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If86f038f1f6ca81ad5c9d40af1b7ea6115144ffc
Gerrit-Change-Number: 11048
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-7324: remove MarkNeedsDeepCopy() from sorter

2018-07-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11048 )

Change subject: IMPALA-7324: remove MarkNeedsDeepCopy() from sorter
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11048/1/be/src/runtime/row-batch.h
File be/src/runtime/row-batch.h:

http://gerrit.cloudera.org:8080/#/c/11048/1/be/src/runtime/row-batch.h@299
PS1, Line 299: flush
> I would prefer a different name like flush_mode(). flush() suggests that ca
Good point.


http://gerrit.cloudera.org:8080/#/c/11048/1/be/src/runtime/sorter.cc
File be/src/runtime/sorter.cc:

http://gerrit.cloudera.org:8080/#/c/11048/1/be/src/runtime/sorter.cc@974
PS1, Line 974:   
var_len_pages_[var_len_pages_index_].ExtractBuffer(sorter_->buffer_pool_client_);
> nit: long line
Done



--
To view, visit http://gerrit.cloudera.org:8080/11048
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If86f038f1f6ca81ad5c9d40af1b7ea6115144ffc
Gerrit-Change-Number: 11048
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Jul 2018 23:19:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7324: remove MarkNeedsDeepCopy() from sorter

2018-07-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11048 )

Change subject: IMPALA-7324: remove MarkNeedsDeepCopy() from sorter
..


Patch Set 2: Code-Review+1

carry


--
To view, visit http://gerrit.cloudera.org:8080/11048
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If86f038f1f6ca81ad5c9d40af1b7ea6115144ffc
Gerrit-Change-Number: 11048
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Jul 2018 23:19:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DOCS] Corrected the default inc stats size limit bytes value

2018-07-26 Thread Alex Rodoni (Code Review)
Alex Rodoni has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11050 )

Change subject: [DOCS] Corrected the default inc_stats_size_limit_bytes value
..

[DOCS] Corrected the default inc_stats_size_limit_bytes value

Change-Id: I4cacc6f1b9aaa1e2105e5436451ee906e3111679
Reviewed-on: http://gerrit.cloudera.org:8080/11050
Tested-by: Impala Public Jenkins 
Reviewed-by: Tim Armstrong 
---
M docs/topics/impala_perf_stats.xml
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/11050
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I4cacc6f1b9aaa1e2105e5436451ee906e3111679
Gerrit-Change-Number: 11050
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] [DOCS] Corrected the default inc stats size limit bytes value

2018-07-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11050 )

Change subject: [DOCS] Corrected the default inc_stats_size_limit_bytes value
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11050
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4cacc6f1b9aaa1e2105e5436451ee906e3111679
Gerrit-Change-Number: 11050
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Jul 2018 23:10:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7297: soft memory limit for reservation increases

2018-07-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10988 )

Change subject: IMPALA-7297: soft memory limit for reservation increases
..


Patch Set 5:

> Also, doesn't this somewhat rely on the scanner threads being quicker to 
> claim the memory than the other threads? If the other threads already 
> expanded to their full 80%, this won't help, right? Or is the hope that those 
> other threads are cycling their memory often enough that, as scanner threads 
> encroach on the boundary, they'll have a chance to free some memory and not 
> re-allocate?

There is a follow-on patch I'm working on to make the scanner threads a bit 
smarter about whether they start based on available memory and reduce the 
raciness. It still will be somewhat racy. It is typical though that the scans 
grab memory first - e.g. the scans all start immediately and the joins, 
aggregations and sorts gradually balloon as they accumulate memory from the 
scans.

I think one fundamental design decision we made that impacts this is that 
there's no kind of pre-emption - one operator can't force another operator to 
scale down memory consumption if it consumed too much.


--
To view, visit http://gerrit.cloudera.org:8080/10988
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I39dcf2dd870a59c8b8cda4488fe41ce936d715ac
Gerrit-Change-Number: 10988
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 26 Jul 2018 23:04:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7297: soft memory limit for reservation increases

2018-07-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10988 )

Change subject: IMPALA-7297: soft memory limit for reservation increases
..


Patch Set 5:

I think you have it right. The 80/20 heuristic has been around since Impala 
2.0. I don't think there's anything special about the numbers but it's proven 
difficult to change without negatively impacting some queries (either worse 
performance or higher memory requirements)

One additional bit of context is that long-term we want to move most or all 
execution memory under reservations, so this is a bit of a stopgap.

I also have a follow-on patch that makes HDFS + Kudu scans check the soft limit 
to decide whether to start or stop scanner threads. HDFS scans use a mix of 
reserved and non-reserved memory. Kudu uses all non-reserved memory. So that's 
a case where the alternative of reducing the reserved memory doesn't help us.


--
To view, visit http://gerrit.cloudera.org:8080/10988
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I39dcf2dd870a59c8b8cda4488fe41ce936d715ac
Gerrit-Change-Number: 10988
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 26 Jul 2018 22:59:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7296: bytes limit for row batch queue

2018-07-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10977 )

Change subject: IMPALA-7296: bytes limit for row batch queue
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10977/5/be/src/util/blocking-queue.h
File be/src/util/blocking-queue.h:

http://gerrit.cloudera.org:8080/#/c/10977/5/be/src/util/blocking-queue.h@117
PS5, Line 117:  ElemBytesFn()(*out);
> I'd personally be surprised if these indirect calls are that expensive, any
Yeah that argument makes sense to me. I did run some query benchmarks locally 
earlier and didn't see any perf impact from this change.



--
To view, visit http://gerrit.cloudera.org:8080/10977
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa06d1d8da2a6d101efda08f620c0bf84a71e681
Gerrit-Change-Number: 10977
Gerrit-PatchSet: 5
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 26 Jul 2018 22:49:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7360: sequence scanners sometimes skip blocks

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11062 )

Change subject: IMPALA-7360: sequence scanners sometimes skip blocks
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/79/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11062
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I49a70a4925b0271204b8eea4f980299d7654805a
Gerrit-Change-Number: 11062
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Thu, 26 Jul 2018 22:12:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7360: sequence scanners sometimes skip blocks

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11062 )

Change subject: IMPALA-7360: sequence scanners sometimes skip blocks
..


Patch Set 3:

Build Started https://jenkins.impala.io/job/gerrit-code-review-checks/79/

Running initial code review checks. This is experimental - please report any 
issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/11062
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I49a70a4925b0271204b8eea4f980299d7654805a
Gerrit-Change-Number: 11062
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Thu, 26 Jul 2018 21:33:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7360: sequence scanners sometimes skip blocks

2018-07-26 Thread Tim Armstrong (Code Review)
Hello Pranay Singh, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11062

to look at the new patch set (#3).

Change subject: IMPALA-7360: sequence scanners sometimes skip blocks
..

IMPALA-7360: sequence scanners sometimes skip blocks

The handling of sync markers after processing a block was broken - eos_
was set if the sync marker straddles the boundary. The expected
behaviour (documented by comments) in this case is that the current
scanner should process the next block, if there is one.

If you look at the logic before the IMPALA-3905 change in commit
931bf49cd90e496df6bf260ae668ec6944f0016c, it split the checking
of eosr() and eof() similar to this patch.

Testing:
Add regression tests that scans a large table with a variety of
different scan range lengths, with some randomisation to exercise
different edge cases. This reliably triggered the bug.

Change-Id: I49a70a4925b0271204b8eea4f980299d7654805a
---
M be/src/exec/base-sequence-scanner.cc
A testdata/workloads/tpch/queries/tpch-scan-range-lengths.test
M tests/query_test/test_scanners.py
3 files changed, 86 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/11062/3
--
To view, visit http://gerrit.cloudera.org:8080/11062
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I49a70a4925b0271204b8eea4f980299d7654805a
Gerrit-Change-Number: 11062
Gerrit-PatchSet: 3
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pranay Singh


[Impala-ASF-CR] IMPALA-7360: sequence scanners sometimes skip blocks

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11062 )

Change subject: IMPALA-7360: sequence scanners sometimes skip blocks
..


Patch Set 2:

Build Started https://jenkins.impala.io/job/gerrit-code-review-checks/78/

Running initial code review checks. This is experimental - please report any 
issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/11062
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I49a70a4925b0271204b8eea4f980299d7654805a
Gerrit-Change-Number: 11062
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Thu, 26 Jul 2018 21:30:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7360: sequence scanners sometimes skip blocks

2018-07-26 Thread Tim Armstrong (Code Review)
Hello Pranay Singh,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11062

to look at the new patch set (#2).

Change subject: IMPALA-7360: sequence scanners sometimes skip blocks
..

IMPALA-7360: sequence scanners sometimes skip blocks

The handling of sync markers after processing a block was broken - eos_
was set if the sync marker straddles the boundary. The expected
behaviour (documented by comments) in this case is that the current
scanner should process the next block, if there is one.

If you look at the logic before the IMPALA-3905 change in commit
931bf49cd90e496df6bf260ae668ec6944f0016c, it split the checking
of eosr() and eof() similar to this patch.

Testing:
Add regression tests that scans a large table with a variety of
different scan range lengths, with some randomisation to exercise
different edge cases. This reliably triggered the bug.

Change-Id: I49a70a4925b0271204b8eea4f980299d7654805a
---
M be/src/exec/base-sequence-scanner.cc
A testdata/workloads/tpch/queries/tpch-scan-range-lengths.test
M tests/query_test/test_scanners.py
3 files changed, 86 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/11062/2
--
To view, visit http://gerrit.cloudera.org:8080/11062
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I49a70a4925b0271204b8eea4f980299d7654805a
Gerrit-Change-Number: 11062
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Pranay Singh


[Impala-ASF-CR] IMPALA-7362: Add query option to set timezone

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11064 )

Change subject: IMPALA-7362: Add query option to set timezone
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/77/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I73de86eff096e1c581d3b56a0d9330d686f77272
Gerrit-Change-Number: 11064
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 26 Jul 2018 21:09:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num rows

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11056 )

Change subject: IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num 
rows
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2867/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/11056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5886684e26b453c76f331f6e807f813146c6bf3a
Gerrit-Change-Number: 11056
Gerrit-PatchSet: 2
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 26 Jul 2018 21:06:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7362: Add query option to set timezone

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11064 )

Change subject: IMPALA-7362: Add query option to set timezone
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/76/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I73de86eff096e1c581d3b56a0d9330d686f77272
Gerrit-Change-Number: 11064
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 26 Jul 2018 21:03:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num rows

2018-07-26 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11056 )

Change subject: IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num 
rows
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5886684e26b453c76f331f6e807f813146c6bf3a
Gerrit-Change-Number: 11056
Gerrit-PatchSet: 2
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 26 Jul 2018 20:48:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7362: Add query option to set timezone

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11064 )

Change subject: IMPALA-7362: Add query option to set timezone
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2866/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/11064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I73de86eff096e1c581d3b56a0d9330d686f77272
Gerrit-Change-Number: 11064
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 26 Jul 2018 20:39:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-5050: Add support to read TIMESTAMP MILLIS and TIMESTAMP MICROS from Parquet

2018-07-26 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11057 )

Change subject: WIP IMPALA-5050: Add support to read TIMESTAMP_MILLIS and 
TIMESTAMP_MICROS from Parquet
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11057/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11057/1//COMMIT_MSG@19
PS1, Line 19: - the tests only work if local time is CET - on solution would be
:   to create a custom cluster test which sets the local time zone,
:   but I would prefer to solve this without restarting the cluster
I have started to implement a solution to make the timezone easily settable: 
https://gerrit.cloudera.org/#/c/11064

I plan the fix the tests when it will be finished.



--
To view, visit http://gerrit.cloudera.org:8080/11057
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c7c01fffa31b3d2ca3480adf6ff851137dadac3
Gerrit-Change-Number: 11057
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 26 Jul 2018 20:34:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7362: Add query option to set timezone

2018-07-26 Thread Csaba Ringhofer (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11064

to look at the new patch set (#2).

Change subject: IMPALA-7362: Add query option to set timezone
..

IMPALA-7362: Add query option to set timezone

This change adds a new query option "timezone" which
defines the timezone used for utc<->local conversions.

Examples:
set timezone=UTC;
set timezone="Europe/Budapest"

Leading/trailing " characters are stripped because the
/ character cannot be entered unquoted in some contexts.

Currently the timezone has only effect if one of the
following flags are true:
-use_local_tz_for_unix_timestamp_conversions
-convert_legacy_hive_parquet_utc_timestamps
as there are no functions that do local time conversion
by default. In the near future Parquet timestamps's
isAdjustedToUTC property will be supported, which will
decide whether to do utc->local conversion on a per
file+column basis.

Testing:
- Extended test_local_tz_conversion.py to actually
  test utc<->local conversion. Until now the effect
  of flag use_local_tz_for_unix_timestamp_conversions
  was practically untested.
- Added a shell test to check that the default of the
  query option is the system's timezone.

Change-Id: I73de86eff096e1c581d3b56a0d9330d686f77272
---
M be/src/service/impala-server.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A 
testdata/workloads/functional-query/queries/QueryTest/local-timestamp-functions.test
M tests/custom_cluster/test_local_tz_conversion.py
M tests/shell/test_shell_commandline.py
8 files changed, 88 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/11064/2
--
To view, visit http://gerrit.cloudera.org:8080/11064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I73de86eff096e1c581d3b56a0d9330d686f77272
Gerrit-Change-Number: 11064
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-7362: Add query option to set timezone

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11064 )

Change subject: IMPALA-7362: Add query option to set timezone
..


Patch Set 2:

Build Started https://jenkins.impala.io/job/gerrit-code-review-checks/77/

Running initial code review checks. This is experimental - please report any 
issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/11064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I73de86eff096e1c581d3b56a0d9330d686f77272
Gerrit-Change-Number: 11064
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 26 Jul 2018 20:31:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7362: Add query option to set timezone

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11064 )

Change subject: IMPALA-7362: Add query option to set timezone
..


Patch Set 1:

Build Started https://jenkins.impala.io/job/gerrit-code-review-checks/76/

Running initial code review checks. This is experimental - please report any 
issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/11064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I73de86eff096e1c581d3b56a0d9330d686f77272
Gerrit-Change-Number: 11064
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 26 Jul 2018 20:28:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7362: Add query option to set timezone

2018-07-26 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11064


Change subject: IMPALA-7362: Add query option to set timezone
..

IMPALA-7362: Add query option to set timezone

This change adds a new query option "timezone" which
defines the timezone used for utc<->local conversions.

Examples:
set timezone=UTC;
set timezone="Europe/Budapest"

Leading/trailing " characters are stripped because the
/ character cannot be entered unquoted in some contexts.

Currently the timezone has only effect if one of the
following flags are true:
-use_local_tz_for_unix_timestamp_conversions
-convert_legacy_hive_parquet_utc_timestamps
as there are no functions that do local time conversion
by default. In the near future Parquet timestamps's
isAdjustedToUTC property will be supported, which will
decide whether to do utc->local conversion on a per
file+column basis.

Testing:
- Extended test_local_tz_conversion.py to actually
  test utc<->local conversion. Until now the effect
  of flag use_local_tz_for_unix_timestamp_conversions
  was practically untested.
- Added a shell test to check that the default of the
  query option is the system's timezone.

Change-Id: I73de86eff096e1c581d3b56a0d9330d686f77272
---
M be/src/service/impala-server.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A 
testdata/workloads/functional-query/queries/QueryTest/local-timestamp-functions.test
M tests/custom_cluster/test_local_tz_conversion.py
M tests/shell/test_shell_commandline.py
8 files changed, 90 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/11064/1
--
To view, visit http://gerrit.cloudera.org:8080/11064
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I73de86eff096e1c581d3b56a0d9330d686f77272
Gerrit-Change-Number: 11064
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 


[Impala-ASF-CR] IMPALA-7163: Implement a state machine for the QueryState class

2018-07-26 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10813 )

Change subject: IMPALA-7163: Implement a state machine for the QueryState class
..


Patch Set 7:

(9 comments)

Looking good. Some more comments.

http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/fragment-instance-state.cc@125
PS7, Line 125: DCHECK(
Does DCHECK_EQ() not work ?


http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.h@67
PS7, Line 67: however,
: /// if any fragment instance hits an error or cancellation, then 
we immediately change the
: /// state of the query to the ERROR or CANCELLED state 
accordingly.
This is not true for PREPARING phase, right ? Seems easier to separately 
describe the behavior in PREPARING phase and EXECUTING phase instead of making 
a general statement like this and talking about exception in the next paragraph.


http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.h@70
PS7, Line 70: EXECUTING
PREPARING ?


http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.h@71
PS7, Line 71: all the underlying FISes will be in Prepare()
This is not necessarily true, right ? In PREPARING state, it only means at 
least one fragment instances is running Prepare(), right ? Some fragment 
instances could already be past Prepare() and executing Open(), Exec() etc, 
right ?


http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.h@264
PS7, Line 264: BackendExecState backend_exec_state_ = 
BackendExecState::PREPARING;
May want to document the thread safety of this variable. I suppose it's only 
updated by the "query-state thread" only, right ?


http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.h@287
PS7, Line 287: /// ID of first fragment instance to hit an error.
Please consider documenting the thread safety of this variable too.


http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/10813/6/be/src/runtime/query-state.h@308
PS6, Line 308:
 :   /// Number of active fragment instances and coordinators for 
this query that may consume
 :   /// resources for query execution (i.e. threads, memory) on 
the Impala daemon.
 :   /// Query-wide execution resources for this query are released 
once this goes to zero.
 :   AtomicInt32 exec_resource_refcnt_;
 :
 :   /// Temporary files for this query (owned by obj_pool_). 
Non-null if spilling is
 :   /// enabled. Set in Pr
> I feel that it's not ideal having the fragment instance threads modify the
Now that the patch has sunk in more, I think the state machine the query status 
and the query state has a slightly counter intuitive relationship which exists 
today already.

query_status_ represents the error status which any fragment instances can hit 
either in PREPARING or EXECUTING phase.

The query state means whether at least one fragment instances is still in that 
said state. So, a query state of PREPARING only means at least one fragment 
instance is still preparing but other fragment instances could already be 
executing.

The problem is that each fragment instance will directly report to the 
coordinator on failure today independent of other fragment instances' states.

In the next patch, the status reporting will be driven solely by the query 
state thread so we may actually track the status of PREPARING phase and 
EXECUTING phase separately like you did here. The change in behavior will be 
that we will report any error hit in PREPARING phase first before any error hit 
in EXECUTING phase. Do you think that will make the system easier to reason 
about ?


http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.cc@230
PS7, Line 230: BackendExecState::EXECUTING :
 : BackendExecState::FINISHED;
nit: one line


http://gerrit.cloudera.org:8080/#/c/10813/7/be/src/runtime/query-state.cc@479
PS7, Line 479:   {
 : unique_lock l(status_lock_);
 : query_status_ = Status::CANCELLED;
 :   }
Should this happen after line 483 below ?

Actually, do we need to set the query_status_ here ? Won't the first thread 
which notices the cancellation set it when calling ErrorDuringExec() set it ?

Do we have to worry about overwriting the error status if query_status_ already 
contains some errors ?



--
To view, visit http://gerrit.cloudera.org:8080/10813
To unsubscribe, 

[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility

2018-07-26 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10998 )

Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10998/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10998/3//COMMIT_MSG@42
PS3, Line 42: These metrics are exposed on the daemon's web UI in /memz page 
under
: section "JVM Garbage Collection metrics". They can also be 
consumed
: by tools from the /metrics endpoint.
> Yea, these are the same as JMX MBean.
Yea, I was thinking maybe you could import JMXJsonServlet.java wholesale from 
another project and just change it to return a string which can be passed to 
the backend or somesuch, and then take advantage of the fact that other tools 
know how to parse it



--
To view, visit http://gerrit.cloudera.org:8080/10998
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341
Gerrit-Change-Number: 10998
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 26 Jul 2018 18:20:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility

2018-07-26 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10998 )

Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10998/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10998/3//COMMIT_MSG@42
PS3, Line 42: These metrics are exposed on the daemon's web UI in /memz page 
under
: section "JVM Garbage Collection metrics". They can also be 
consumed
: by tools from the /metrics endpoint.
> Are these redundant with the metrics that are available in the JMX MBean? I
Yea, these are the same as JMX MBean.

I thought about it for a while and settled with this because the JVM memory 
metrics was implemented this way [1](I'm not sure why). Looking at the jmx 
output, it is way more detailed and can be consumed by a variety of tools. 
Lemme try doing that and see how it works. Will get back.

[1] 
https://github.com/apache/impala/commit/daf0626dbd4171dc582160af467cf1638271dc60



--
To view, visit http://gerrit.cloudera.org:8080/10998
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341
Gerrit-Change-Number: 10998
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 26 Jul 2018 18:08:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num rows

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11056 )

Change subject: IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num 
rows
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/75/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5886684e26b453c76f331f6e807f813146c6bf3a
Gerrit-Change-Number: 11056
Gerrit-PatchSet: 2
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 26 Jul 2018 17:50:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num rows

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11056 )

Change subject: IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num 
rows
..


Patch Set 2:

Build Started https://jenkins.impala.io/job/gerrit-code-review-checks/75/

Running initial code review checks. This is experimental - please report any 
issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/11056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5886684e26b453c76f331f6e807f813146c6bf3a
Gerrit-Change-Number: 11056
Gerrit-PatchSet: 2
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 26 Jul 2018 17:16:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num rows

2018-07-26 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11056 )

Change subject: IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num 
rows
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11056/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/11056/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@855
PS1, Line 855: if 
(!TAccessLevelUtil.impliesWriteAccess(partition.getAccessLevel())) {
 :   // TODO: READ_ONLY isn't exactly correct because the 
it's possible the
 :   // partition does not have READ permissions either. 
When we start checking
 :   // whether we can READ from a table, this should be 
updated to set the
 :   // table's access level to the "lowest" effective 
level across all
 :   // partitions. That is, if one partition has READ_ONLY 
and another has
 :   // WRITE_ONLY the table's access level should be NONE.
 :   accessLevel_ = TAccessLevel.READ_ONLY;
 : }
> while we are here, is it possible to consolidate this common code too?
Done



--
To view, visit http://gerrit.cloudera.org:8080/11056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5886684e26b453c76f331f6e807f813146c6bf3a
Gerrit-Change-Number: 11056
Gerrit-PatchSet: 1
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 26 Jul 2018 17:16:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num rows

2018-07-26 Thread Bharath Vissapragada (Code Review)
Hello Tianyi Wang, Todd Lipcon, Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11056

to look at the new patch set (#2).

Change subject: IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num 
rows
..

IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num rows

The bug was that HdfsTable#reloadPartition() was not setting the
partition's numRows_ after reloading the partition.

Fix: Move the code to set numRows_ to createPartition() call so that
the callers need not explicitly set it after calling createPartition().
Additionally consolidates other common code in createPartition().

Added a test.

Change-Id: I5886684e26b453c76f331f6e807f813146c6bf3a
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M tests/metadata/test_refresh_partition.py
2 files changed, 42 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/11056/2
--
To view, visit http://gerrit.cloudera.org:8080/11056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5886684e26b453c76f331f6e807f813146c6bf3a
Gerrit-Change-Number: 11056
Gerrit-PatchSet: 2
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-7213: Port ReportExecStatus() RPC to use KRPC

2018-07-26 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10855 )

Change subject: IMPALA-7213: Port ReportExecStatus() RPC to use KRPC
..


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/10855/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10855/4//COMMIT_MSG@16
PS4, Line 16: This patch also introduces a new service pool for all query 
execution
: control related RPCs in the future so that control commands from
: coordinators aren't blocked by long-running DataStream services' 
RPCs.
> +1 for this.
yep, that's essentially what I meant. Thanks for explaining it better than I 
did :) My thought of "connection class" is a different name for your "proxy 
hash".

Should we consider this a prerequisite for merging this, or a nice-to-have 
follow-up? I'm not clear how latency-sensitive ReportExecStatus is. If a call 
takes 10 seconds to arrive is that a problem?


http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/2/be/src/runtime/query-state.cc@290
PS2, Line 290: query_ctx().coord_address.hostname, );
> I think I'm missing something here. Where is the code that does the retry?
Semi-related: perhaps we should be setting the TCP_USER_TIMEOUT socket option 
to try to improve the hang behavior when a node goes offline? It seems this may 
allow us to get more aggressive connection dropping (and concomitant call 
failure) when a node's network connectivity is lost. See 'man 7 tcp' for a 
description.


http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@283
PS4, Line 283: req.clear_error_log();
isnt the error log already clear because this is a new instance?


http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@284
PS4, Line 284: state->GetUnreportedErrors(req.mutable_error_log());
it's a shame that this mutates the state, making retries a bit more difficult


http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@305
PS4, Line 305: // We can retry the RPC if the payload hasn't been sent yet.
are these reports idempotent? seems like it shoudl be easy to make them 
idempotent by including a sequence number or something if they aren't, but 
idempotency seems a prereq for retrying on conn reset


http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/runtime/query-state.cc@306
PS4, Line 306: RpcMgr::IsServerTooBusy(rpc_controller
I'm a bit fuzzy on the context of this code, but if we get TOO_BUSY, it seems 
like we should retry either inline for a final report, or just on the next 
interval for a profile report? (likely with some other small change to avoid 
clearing the error log in the case that the RPC fails)


http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/service/control-service.cc
File be/src/service/control-service.cc:

http://gerrit.cloudera.org:8080/#/c/10855/4/be/src/service/control-service.cc@72
PS4, Line 72:   // TODO: implement something more efficient here, we're 
currently
:   // acquiring/releasing the map lock and doing a map lookup for
:   // every report (assign each query a local int32_t id and use 
that to index into a
:   // vector of ClientRequestStates, w/o lookup or locking?)
> I'd done an analysis on using a R/W lock for the ClientRequestState map ear
Sounds reasonable to me.



--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 4
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Thu, 26 Jul 2018 16:37:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num rows

2018-07-26 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11056 )

Change subject: IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num 
rows
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11056/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/11056/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@855
PS1, Line 855: if 
(!TAccessLevelUtil.impliesWriteAccess(partition.getAccessLevel())) {
 :   // TODO: READ_ONLY isn't exactly correct because the 
it's possible the
 :   // partition does not have READ permissions either. 
When we start checking
 :   // whether we can READ from a table, this should be 
updated to set the
 :   // table's access level to the "lowest" effective 
level across all
 :   // partitions. That is, if one partition has READ_ONLY 
and another has
 :   // WRITE_ONLY the table's access level should be NONE.
 :   accessLevel_ = TAccessLevel.READ_ONLY;
 : }
while we are here, is it possible to consolidate this common code too?



--
To view, visit http://gerrit.cloudera.org:8080/11056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5886684e26b453c76f331f6e807f813146c6bf3a
Gerrit-Change-Number: 11056
Gerrit-PatchSet: 1
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 26 Jul 2018 16:12:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6857: Add Jvm pause/GC Monitor utility

2018-07-26 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10998 )

Change subject: IMPALA-6857: Add Jvm pause/GC Monitor utility
..


Patch Set 3:

(2 comments)

didn't review the code in detail, just one high-level suggestion here

http://gerrit.cloudera.org:8080/#/c/10998/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10998/3//COMMIT_MSG@42
PS3, Line 42: These metrics are exposed on the daemon's web UI in /memz page 
under
: section "JVM Garbage Collection metrics". They can also be 
consumed
: by tools from the /metrics endpoint.
Are these redundant with the metrics that are available in the JMX MBean? If we 
expose JMXJsonServlet (or its moral equivalent piped through our C++ webserver) 
then we'll fit in with the standard of the other Hadoop daemons for exposing 
the JVM stats. See http://namenode:5070/jmx (eg using the namenode in your 
minicluster) to see what I mean. The metrics in there are a lot more granular, 
including info like the heap size after the last collection (a decent estimate 
of live heap), etc.

If possible, it might be nice to include in these metrics some other info 
that's _not_ available above, eg a histogram of GC times or pause times. Also, 
would be good to expose the host machine pause time which may not be 
GC-related. (eg timestamp of last N pauses, length of last pauses)


http://gerrit.cloudera.org:8080/#/c/10998/3/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

http://gerrit.cloudera.org:8080/#/c/10998/3/common/thrift/Frontend.thrift@830
PS3, Line 830:   3: required i64 collection_time
if you don't take my other suggestion about exposing "/jmx" directly, it would 
be good to expose the info from "memoryUsageBeforeGc" and "memoryUsageAfterGc" 
elements in JMX,as well as the timestamp and duration of the most recent GC.



--
To view, visit http://gerrit.cloudera.org:8080/10998
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I30d897b7e063846ad6d8f88243e2c04264da0341
Gerrit-Change-Number: 10998
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 26 Jul 2018 16:09:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP IMPALA-5050: Add support to read TIMESTAMP MILLIS and TIMESTAMP MICROS from Parquet

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11057 )

Change subject: WIP IMPALA-5050: Add support to read TIMESTAMP_MILLIS and 
TIMESTAMP_MICROS from Parquet
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/74/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11057
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c7c01fffa31b3d2ca3480adf6ff851137dadac3
Gerrit-Change-Number: 11057
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 26 Jul 2018 15:29:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-5050: Add support to read TIMESTAMP MILLIS and TIMESTAMP MICROS from Parquet

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11057 )

Change subject: WIP IMPALA-5050: Add support to read TIMESTAMP_MILLIS and 
TIMESTAMP_MICROS from Parquet
..


Patch Set 1:

Build Started https://jenkins.impala.io/job/gerrit-code-review-checks/74/

Running initial code review checks. This is experimental - please report any 
issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/11057
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c7c01fffa31b3d2ca3480adf6ff851137dadac3
Gerrit-Change-Number: 11057
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 26 Jul 2018 14:53:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-5050: Add support to read TIMESTAMP MILLIS and TIMESTAMP MICROS from Parquet

2018-07-26 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11057


Change subject: WIP IMPALA-5050: Add support to read TIMESTAMP_MILLIS and 
TIMESTAMP_MICROS from Parquet
..

WIP IMPALA-5050: Add support to read TIMESTAMP_MILLIS and TIMESTAMP_MICROS from 
Parquet

Changes:
- parquet.thrift is updated to contain the timestamp logical type.
- INT64 columns with converted types TIMESTAMP_MILLIS and TIMESTAMP_MICROS
  can be read as TIMESTAMP.
- If the logical type is timestamp, then the type will contain the
  information whether the UTC->local conversion is necessary. This
  feature is only supported for the new timestamp types, so INT96
  timestamps must still use flag convert_legacy_hive_parquet_utc_timestamps.

TODOs:
- the tests only work if local time is CET - on solution would be
  to create a custom cluster test which sets the local time zone,
  but I would prefer to solve this without restarting the cluster
- it would be better to test with more parquet files (for example
  test both plain and dictionary encoding).
- CREATE TABLE LIKE PARQUET should be also tested with the new
  types

Change-Id: I4c7c01fffa31b3d2ca3480adf6ff851137dadac3
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/exec/parquet-column-stats.inline.h
M be/src/exec/parquet-common.h
M be/src/exec/parquet-metadata-utils.cc
M common/thrift/parquet.thrift
M fe/src/main/java/org/apache/impala/analysis/ParquetHelper.java
M testdata/data/README
A testdata/data/int64_timestamp_plain.parquet
A 
testdata/workloads/functional-query/queries/QueryTest/parquet-int64-timestamps.test
M tests/query_test/test_scanners.py
11 files changed, 397 insertions(+), 43 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/57/11057/1
--
To view, visit http://gerrit.cloudera.org:8080/11057
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4c7c01fffa31b3d2ca3480adf6ff851137dadac3
Gerrit-Change-Number: 11057
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 


[Impala-ASF-CR] IMPALA-7333: remove MarkNeedsDeepCopy() in agg and BTS

2018-07-26 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11007 )

Change subject: IMPALA-7333: remove MarkNeedsDeepCopy() in agg and BTS
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11007/8/be/src/runtime/buffered-tuple-stream.cc
File be/src/runtime/buffered-tuple-stream.cc:

http://gerrit.cloudera.org:8080/#/c/11007/8/be/src/runtime/buffered-tuple-stream.cc@763
PS8, Line 763: has_read_write_page
Is this always safe? If next row will not fit into the page then the write page 
will be advanced on write. So calling AddRow() and GetNext() will decrease the 
pin count of the current page to 0. ReadWrite streams are only used by 
AnalyticEvalNode() - I looked at it a bit but I am not sure yet whether this 
can cause a problem or not.

My idea for a safe solution is that if attach_on_read_ is false, then  
MarkFlushResources() has to be called for read_write pages, while if 
attach_on_read_ is true, then the page has to be attached to the RowBatch in 
the next GetNext() call before calling NextReadPage().



--
To view, visit http://gerrit.cloudera.org:8080/11007
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91ac53bacc00df4726c015a30ba5a2026aa4b5f5
Gerrit-Change-Number: 11007
Gerrit-PatchSet: 8
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 26 Jul 2018 13:52:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7324: remove MarkNeedsDeepCopy() from sorter

2018-07-26 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11048 )

Change subject: IMPALA-7324: remove MarkNeedsDeepCopy() from sorter
..


Patch Set 1: Code-Review+1

(2 comments)

comments about nit + naming, lgtm otherwise

http://gerrit.cloudera.org:8080/#/c/11048/1/be/src/runtime/row-batch.h
File be/src/runtime/row-batch.h:

http://gerrit.cloudera.org:8080/#/c/11048/1/be/src/runtime/row-batch.h@299
PS1, Line 299: flush
I would prefer a different name like flush_mode(). flush() suggests that 
calling the function actually flushes something.


http://gerrit.cloudera.org:8080/#/c/11048/1/be/src/runtime/sorter.cc
File be/src/runtime/sorter.cc:

http://gerrit.cloudera.org:8080/#/c/11048/1/be/src/runtime/sorter.cc@974
PS1, Line 974:   
var_len_pages_[var_len_pages_index_].ExtractBuffer(sorter_->buffer_pool_client_);
nit: long line



--
To view, visit http://gerrit.cloudera.org:8080/11048
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If86f038f1f6ca81ad5c9d40af1b7ea6115144ffc
Gerrit-Change-Number: 11048
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 26 Jul 2018 13:53:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num rows

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11056 )

Change subject: IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num 
rows
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/11056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5886684e26b453c76f331f6e807f813146c6bf3a
Gerrit-Change-Number: 11056
Gerrit-PatchSet: 1
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 26 Jul 2018 11:05:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num rows

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11056 )

Change subject: IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num 
rows
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/73/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5886684e26b453c76f331f6e807f813146c6bf3a
Gerrit-Change-Number: 11056
Gerrit-PatchSet: 1
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 26 Jul 2018 08:25:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3825: Distribute Runtime Filtering Aggregation

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11055 )

Change subject: IMPALA-3825: Distribute Runtime Filtering Aggregation
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/72/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/11055
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I94e183a0353fc46f8d3eccae029d2d52c5cdc40c
Gerrit-Change-Number: 11055
Gerrit-PatchSet: 1
Gerrit-Owner: Rahul Shivu Mahadev 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 26 Jul 2018 08:14:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num rows

2018-07-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11056 )

Change subject: IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num 
rows
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2865/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/11056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5886684e26b453c76f331f6e807f813146c6bf3a
Gerrit-Change-Number: 11056
Gerrit-PatchSet: 1
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 26 Jul 2018 07:54:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num rows

2018-07-26 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11056


Change subject: IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num 
rows
..

IMPALA-7225: REFRESH..PARTITION shoud not reset partition's num rows

The bug was that HdfsTable#reloadPartition() was not setting the
partition's numRows_ after reloading the partition.

Fix: Move the code to set numRows_ to createPartition() call so that
the callers need not explicitly set it after calling createPartition().

Added a test.

Change-Id: I5886684e26b453c76f331f6e807f813146c6bf3a
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M tests/metadata/test_refresh_partition.py
2 files changed, 32 insertions(+), 6 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/11056/1
--
To view, visit http://gerrit.cloudera.org:8080/11056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5886684e26b453c76f331f6e807f813146c6bf3a
Gerrit-Change-Number: 11056
Gerrit-PatchSet: 1
Gerrit-Owner: Bharath Vissapragada 


[Impala-ASF-CR] IMPALA-3825: Distribute Runtime Filtering Aggregation

2018-07-26 Thread Rahul Shivu Mahadev (Code Review)
Hello Michael Ho, Sailesh Mukil, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11055

to look at the new patch set (#2).

Change subject: IMPALA-3825: Distribute Runtime Filtering Aggregation
..

IMPALA-3825: Distribute Runtime Filtering Aggregation

This patch moves runtime filter aggregation from the coordinator
to other exec nodes.
* QueryState will now house the Aggregated filters.
* FilterState has been decoupled from coordinator code
* A global flag has been added to switch back to legacy aggregation

Change-Id: I94e183a0353fc46f8d3eccae029d2d52c5cdc40c
---
M be/src/common/global-flags.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
D be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
A be/src/runtime/filter-state.cc
A be/src/runtime/filter-state.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/service/impala-server.cc
M common/thrift/ImpalaInternalService.thrift
15 files changed, 689 insertions(+), 265 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/11055/2
--
To view, visit http://gerrit.cloudera.org:8080/11055
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I94e183a0353fc46f8d3eccae029d2d52c5cdc40c
Gerrit-Change-Number: 11055
Gerrit-PatchSet: 2
Gerrit-Owner: Rahul Shivu Mahadev 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-3825: Distribute Runtime Filtering Aggregation This patch moves runtime filter aggregation from the coordinator to other exec nodes. * QueryState will now house the Aggregated f

2018-07-26 Thread Rahul Shivu Mahadev (Code Review)
Rahul Shivu Mahadev has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11055


Change subject: IMPALA-3825: Distribute Runtime Filtering Aggregation This 
patch moves runtime filter aggregation from the coordinator to other exec 
nodes. * QueryState will now house the Aggregated filters. * FilterState has 
been decoupled from coordinator code * A glob
..

IMPALA-3825: Distribute Runtime Filtering Aggregation
This patch moves runtime filter aggregation from the coordinator
to other exec nodes.
* QueryState will now house the Aggregated filters.
* FilterState has been decoupled from coordinator code
* A global flag has been added to switch back to legacy aggregation

Change-Id: I94e183a0353fc46f8d3eccae029d2d52c5cdc40c
---
M be/src/common/global-flags.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
D be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
A be/src/runtime/filter-state.cc
A be/src/runtime/filter-state.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/service/impala-server.cc
M common/thrift/ImpalaInternalService.thrift
15 files changed, 689 insertions(+), 265 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/11055/1
--
To view, visit http://gerrit.cloudera.org:8080/11055
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I94e183a0353fc46f8d3eccae029d2d52c5cdc40c
Gerrit-Change-Number: 11055
Gerrit-PatchSet: 1
Gerrit-Owner: Rahul Shivu Mahadev 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil