[Impala-ASF-CR] IMPALA-5144: Remove sortby() hint

2017-05-19 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-5144: Remove sortby() hint
..


Patch Set 3:

Lars, technically I can't +2 your patch. Ping Alex, it should be trivial to 
review.

-- 
To view, visit http://gerrit.cloudera.org:8080/6885
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I83e1cd6fa7039035973676322deefbce00d3f594
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5340: Query profile displays stale query state

2017-05-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5340: Query profile displays stale query state
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6923
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I952319b7308a24d4e2dff924199c0c771bce25b3
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5340: Query profile displays stale query state

2017-05-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5340: Query profile displays stale query state
..


IMPALA-5340: Query profile displays stale query state

Previously, updates to the query state in ClientRequestState were
not immediately reflected in the query profile, potentially
leading to the profile showing an incorrect state for an extended
perioud during execution.

In particular, queries were being shown in the 'CREATED' state
long after they had started 'RUNNING'.

The fix is to update the profile whenever the state is updated.

Testing:
- Extended existing hs2 tests and added a beeswax test to check
  for expected query states in the profile

Change-Id: I952319b7308a24d4e2dff924199c0c771bce25b3
Reviewed-on: http://gerrit.cloudera.org:8080/6923
Reviewed-by: Dan Hecht 
Reviewed-by: Thomas Tauber-Marshall 
Tested-by: Impala Public Jenkins
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M tests/hs2/test_hs2.py
M tests/query_test/test_observability.py
4 files changed, 75 insertions(+), 13 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Thomas Tauber-Marshall: Looks good to me, approved
  Dan Hecht: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/6923
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I952319b7308a24d4e2dff924199c0c771bce25b3
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-5259: Add REFRESH FUNCTIONS statement

2017-05-19 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-5259: Add REFRESH FUNCTIONS  statement
..


Patch Set 7:

Added some toSql tests.

-- 
To view, visit http://gerrit.cloudera.org:8080/6878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5259: Add REFRESH FUNCTIONS statement

2017-05-19 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#7).

Change subject: IMPALA-5259: Add REFRESH FUNCTIONS  statement
..

IMPALA-5259: Add REFRESH FUNCTIONS  statement

Before this patch, Impala relied on INVALIDATE METADATA to load
externally added UDFs from HMS. The problem with this approach is that
INVALIDATE METADATA affects all databases and tables in the entire
cluster.

In this patch, we add a REFRESH FUNCTIONS  statement that reloads
the functions of a database from HMS. We return a list of updated and
removed db functions to the issuing Impalad in order to update its
local catalog cache.

Testing:
- Ran a private build which passed.

Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
---
M common/thrift/CatalogService.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Function.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M tests/custom_cluster/test_permanent_udfs.py
12 files changed, 326 insertions(+), 62 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6878/7
-- 
To view, visit http://gerrit.cloudera.org:8080/6878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-5259: Add REFRESH FUNCTIONS statement

2017-05-19 Thread Taras Bobrovytsky (Code Review)
Hello Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6878

to look at the new patch set (#7).

Change subject: IMPALA-5259: Add REFRESH FUNCTIONS  statement
..

IMPALA-5259: Add REFRESH FUNCTIONS  statement

Before this patch, Impala relied on INVALIDATE METADATA to load
externally added UDFs from HMS. The problem with this approach is that
INVALIDATE METADATA affects all databases and tables in the entire
cluster.

In this patch, we add a REFRESH FUNCTIONS  statement that reloads
the functions of a database from HMS. We return a list of updated and
removed db functions to the issuing Impalad in order to update its
local catalog cache.

Testing:
- Ran a private build which passed.

Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
---
M common/thrift/CatalogService.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Function.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java
M tests/custom_cluster/test_permanent_udfs.py
12 files changed, 326 insertions(+), 62 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6878/7
-- 
To view, visit http://gerrit.cloudera.org:8080/6878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-5338: Fix Kudu timestamp column default values

2017-05-19 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6936

Change subject: IMPALA-5338: Fix Kudu timestamp column default values
..

IMPALA-5338: Fix Kudu timestamp column default values

While support for TIMESTAMP columns in Kudu tables has been
committed (IMPALA-5137), it does not support TIMESTAMP
column default values.

This supports CREATE TABLE syntax to specify the default
values, but more importantly this fixes the loading of Kudu
tables that may have had default values set on
UNIXTIME_MICROS columns, e.g. if the table was created via
the python client. This involves fixing KuduColumn to hide
the LiteralExpr representing the default value because it
will be a BIGINT if the column type is TIMESTAMP. It is only
needed to call toSql() and toStringValue(), so helper
functions are added to KuduColumn to encapsulate special
logic for TIMESTAMP.

TODO: Add support and tests for ALTER setting the default
value (when IMPALA-4622 is committed).

Change-Id: I655910fb4805bb204a999627fa9f68e43ea8aaf2
---
M be/src/exec/kudu-scanner.cc
M be/src/exprs/timestamp-functions-ir.cc
M be/src/exprs/timestamp-functions.h
M be/src/runtime/timestamp-value.h
M be/src/runtime/timestamp-value.inline.h
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java
M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M tests/query_test/test_kudu.py
14 files changed, 206 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/6936/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I655910fb4805bb204a999627fa9f68e43ea8aaf2
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-5164: Fix flaky benchmarks

2017-05-19 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5164: Fix flaky benchmarks
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6935/1/be/src/benchmarks/free-lists-benchmark.cc
File be/src/benchmarks/free-lists-benchmark.cc:

Line 76: // of the allocator-> However, the balance may shift on larger 
machines with more cores
regexp replace damage


http://gerrit.cloudera.org:8080/#/c/6935/1/be/src/util/benchmark.cc
File be/src/util/benchmark.cc:

Line 47: // Start with a new stopwatch each time; loop until we get a 
reasonable estimate.
stale comment


-- 
To view, visit http://gerrit.cloudera.org:8080/6935
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7e0ddfaa9bd1340990ae2b6ff946d72bbd26c274
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5164: Fix flaky benchmarks

2017-05-19 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6935

Change subject: IMPALA-5164: Fix flaky benchmarks
..

IMPALA-5164: Fix flaky benchmarks

Improve benchmarks by detecting involuntary context switches.
If a server is heavily loaded due to some other jobs running,
benchmarking will not be reliable.  We can detect this by using
getrusage() to measure context switches.  If 90% or more of
benchmark time is lost due to switching, we abort the benchmark,
but catch the failure and exit silently.  We subtract out the
time during which we might have been switched out from the total
result time, and only count runs which were uninterrupted.

If 10% of the result is valid, that seems barely legitimate,
but we will warn when anything less than 50% of the benchmark
ran without switching.

Testing: ran the free-lists-benchmark until the first test passed
and then started "make -j 60" on an Impala directory.  Got this
result:

E0519 23:35:41.482010  8766 benchmark.cc:161] Benchmark failed to
complete due to context switching.
W0519 23:35:41.548840  8766 benchmark.cc:162] Exiting silently from
FreeLists 0 iters on 128 kb to avoid spurious test failure.

Also, fixed two benchmarks that had crashes on start due to static
initialization order problems and missing pieces of initialization.
Some benchmarks are still crashing (expr-benchmark and
network-perf-be), but those will require a lot more work to fix.

Change-Id: I7e0ddfaa9bd1340990ae2b6ff946d72bbd26c274
---
M be/src/benchmarks/free-lists-benchmark.cc
M be/src/benchmarks/hash-benchmark.cc
M be/src/util/benchmark.cc
3 files changed, 84 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/6935/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6935
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I7e0ddfaa9bd1340990ae2b6ff946d72bbd26c274
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 


[Impala-ASF-CR] IMPALA-5333: Add support for Impala to work with ADLS

2017-05-19 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5333: Add support for Impala to work with ADLS
..


Patch Set 3:

> I have a few high level questions about this patch. This patch
 > treats S3 and ADL the same way but after looking at the HDFS
 > classes and the ADL API I see some differences between ADL and S3
 > API. For instance, I don't see a way in ADL to recursively
 > enumerate all entries under a particular directory (something that
 > exists in S3). Our code today for S3 sort of relies on that
 > property so I am wondering if our code works as expected or if I am
 > missing something (quite possible). Also, the AdlFileSystem class
 > has functions for returning the block locations, which means that
 > we don't have to call the synthesize metadata calls as we do for
 > S3. In the long run, HDFS may expose replica locations from the ADL
 > local tiers (e.g. Cosmos), but I guess this is not relevant today.
 > Thoughts?

Very good questions, Dimitris.

- Could you point me to the exact API that does the recursive enumeration?

- It's true that we don't need to synthesize the block metadata. However, I was 
in conversation with the ADLS connector folks and ADLS doesn't actually expose 
the block locations internally. So if you look at the getFileBlockLocations() 
code in AdlFileSystem, it actually itself synthesizes the block metadata:
http://github.mtv.cloudera.com/CDH/hadoop/blob/4da5de1a7c965f63f0b798e9ec4ec72ed7249f18/hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/fs/adl/AdlFileSystem.java#L893

So functionally it's the same. But I agree, we should call the relevant API if 
there it is "supported" in some form or the other. I will update the patch.

-- 
To view, visit http://gerrit.cloudera.org:8080/6910
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic56b9988b32a330443f24c44f9cb2c80842f7542
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5339: Fix analysis with sort.columns and expr rewrites

2017-05-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5339: Fix analysis with sort.columns and expr rewrites
..


IMPALA-5339: Fix analysis with sort.columns and expr rewrites

IMPALA-4166 introduced a bug by duplicating code that adds sort
expressions. Upon re-analysis, this code would hit an
IndexOutOfBoundsException.

Change-Id: Ibebba29509ae7eaa691fe305500cda6bd41a179a
Reviewed-on: http://gerrit.cloudera.org:8080/6921
Reviewed-by: Lars Volker 
Tested-by: Impala Public Jenkins
---
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M testdata/workloads/functional-planner/queries/PlannerTest/insert-sort-by.test
4 files changed, 82 insertions(+), 6 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Lars Volker: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/6921
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ibebba29509ae7eaa691fe305500cda6bd41a179a
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-5339: Fix analysis with sort.columns and expr rewrites

2017-05-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5339: Fix analysis with sort.columns and expr rewrites
..


Patch Set 4: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6921
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibebba29509ae7eaa691fe305500cda6bd41a179a
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5259: Add REFRESH FUNCTIONS statement

2017-05-19 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#6).

Change subject: IMPALA-5259: Add REFRESH FUNCTIONS  statement
..

IMPALA-5259: Add REFRESH FUNCTIONS  statement

Before this patch, Impala relied on INVALIDATE METADATA to load
externally added UDFs from HMS. The problem with this approach is that
INVALIDATE METADATA affects all databases and tables in the entire
cluster.

In this patch, we add a REFRESH FUNCTIONS  statement that reloads
the functions of a database from HMS. We return a list of updated and
removed db functions to the issuing Impalad in order to update its
local catalog cache.

Testing:
- Ran a private build which passed.

Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
---
M common/thrift/CatalogService.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Function.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/custom_cluster/test_permanent_udfs.py
10 files changed, 303 insertions(+), 62 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6878/6
-- 
To view, visit http://gerrit.cloudera.org:8080/6878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-5259: Add REFRESH FUNCTIONS statement

2017-05-19 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-5259: Add REFRESH FUNCTIONS  statement
..


Patch Set 6:

Updated the toSql statement in the latest patch

-- 
To view, visit http://gerrit.cloudera.org:8080/6878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5259: Add REFRESH FUNCTIONS statement

2017-05-19 Thread Taras Bobrovytsky (Code Review)
Hello Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6878

to look at the new patch set (#6).

Change subject: IMPALA-5259: Add REFRESH FUNCTIONS  statement
..

IMPALA-5259: Add REFRESH FUNCTIONS  statement

Before this patch, Impala relied on INVALIDATE METADATA to load
externally added UDFs from HMS. The problem with this approach is that
INVALIDATE METADATA affects all databases and tables in the entire
cluster.

In this patch, we add a REFRESH FUNCTIONS  statement that reloads
the functions of a database from HMS. We return a list of updated and
removed db functions to the issuing Impalad in order to update its
local catalog cache.

Testing:
- Ran a private build which passed.

Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
---
M common/thrift/CatalogService.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Function.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/custom_cluster/test_permanent_udfs.py
10 files changed, 303 insertions(+), 62 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6878/6
-- 
To view, visit http://gerrit.cloudera.org:8080/6878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-1972/IMPALA-3882: Fix client request state map lock contention

2017-05-19 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has uploaded a new patch set (#9).

Change subject: IMPALA-1972/IMPALA-3882: Fix client_request_state_map_lock_ 
contention
..

IMPALA-1972/IMPALA-3882: Fix client_request_state_map_lock_ contention

Holding client_request_state_map_lock_ and CRS::lock_ together in certain
paths could potentially block the impalad from registering new queries.
The most common occurrence of this is while loading the webpage of a
query while the query planning is still in progress. Since we hold the
CRS::lock_ during planning, it blocks the web page from loading which
inturn blocks incoming queries by holding client_request_state_map_lock_.

This patch makes client_request_state_map_lock_ a terminal lock so that
we don't have interleaving locking with CRS::lock_.

Testing: Tested it locally by adding a long sleep in
JniFrontend.createExecRequest() and still was able to refresh the web UI
and run parallel queries. Also added a custom cluster test that does the
same sequence of actions by injecting a metadata loading pause.

Change-Id: Ie44daa93e3ae4d04d091261f3ec4891caffe8026
---
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
A tests/custom_cluster/test_query_concurrency.py
M www/query_plan.tmpl
7 files changed, 162 insertions(+), 53 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/6707/9
-- 
To view, visit http://gerrit.cloudera.org:8080/6707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie44daa93e3ae4d04d091261f3ec4891caffe8026
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 


[Impala-ASF-CR] IMPALA-1972/IMPALA-3882: Fix client request state map lock contention

2017-05-19 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has uploaded a new patch set (#8).

Change subject: IMPALA-1972/IMPALA-3882: Fix client_request_state_map_lock_ 
contention
..

IMPALA-1972/IMPALA-3882: Fix client_request_state_map_lock_ contention

Holding client_request_state_map_lock_ and CRS::lock_ together in certain
paths could potentially block the impalad from registering new queries.
The most common occurrence of this is while loading the webpage of a
query while the query planning is still in progress. Since we hold the
CRS::lock_ during planning, it blocks the web page from loading which
inturn blocks incoming queries by holding client_request_state_map_lock_.

This patch makes client_request_state_map_lock_ a terminal lock so that
we don't have interleaving locking with CRS::lock_.

Testing: Tested it locally by adding a long sleep in
JniFrontend.createExecRequest() and still was able to refresh the web UI
and run parallel queries. Also added a custom cluster test that does the
same sequence of actions by injecting a metadata loading pause.

Change-Id: Ie44daa93e3ae4d04d091261f3ec4891caffe8026
---
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
A tests/custom_cluster/test_query_concurrency.py
M www/query_plan.tmpl
7 files changed, 164 insertions(+), 54 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/6707/8
-- 
To view, visit http://gerrit.cloudera.org:8080/6707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie44daa93e3ae4d04d091261f3ec4891caffe8026
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 


[Impala-ASF-CR] IMPALA-1972/IMPALA-3882: Fix client request state map lock contention

2017-05-19 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-1972/IMPALA-3882: Fix client_request_state_map_lock_ 
contention
..


Patch Set 7:

(8 comments)

The new test fails deterministically without the patch when run locally.

http://gerrit.cloudera.org:8080/#/c/6707/7/be/src/service/impala-beeswax-server.cc
File be/src/service/impala-beeswax-server.cc:

PS7, Line 296: NULL
> nit: change that one too to at least keep functions consistent
Done


http://gerrit.cloudera.org:8080/#/c/6707/7/be/src/service/impala-http-handler.cc
File be/src/service/impala-http-handler.cc:

PS7, Line 721: just return
> that's not what the code does (it also sets plan_metadata_unavailable), ple
Done


PS7, Line 730: adopt_lock_t
> shouldn't that be deleted?
Oops, missed that during rebase.


http://gerrit.cloudera.org:8080/#/c/6707/7/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

Line 836: #endif
> actually, how about moving this to before UpdateQueryStatus() since that mo
Done


http://gerrit.cloudera.org:8080/#/c/6707/7/tests/custom_cluster/test_query_concurrency.py
File tests/custom_cluster/test_query_concurrency.py:

PS7, Line 32: The intention here is to check contention on the 
query_exec_state_map_lock_
> This is talking about how the old code worked, which won't make sense to pe
Rephrased.


PS7, Line 54: This creates lock contention on the coordinator by
: calling QuerySummaryHandler() method
> This is no longer true with your fix. How about saying:
Yea it doesn't make sense to someone reading it with the fix. . IMO, the test 
class comment conveys the intention already and this specific comment looks 
redundant. Removed it, please let me know if you disagree.


PS7, Line 74: time.sleep(2)
> I'm worried that this will be flaky, especially with ASAN.  Instead of this
Yep good idea to reduce flakiness. But I don't think passing a large value to 
get_inflight_queries() would achieve that since the /inflight_queries end point 
never times out. Instead we should repeatedly poll that page in loop with 
timeout. Redid the logic accordingly.


PS7, Line 83: time.sleep(2)
> this delay is a bit harder to eliminate.  How about we increase --stress_me
Same comment as above. I redid the logic, please let me know if that makes 
sense.


-- 
To view, visit http://gerrit.cloudera.org:8080/6707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie44daa93e3ae4d04d091261f3ec4891caffe8026
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5340: Query profile displays stale query state

2017-05-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5340: Query profile displays stale query state
..


Patch Set 3:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/597/

-- 
To view, visit http://gerrit.cloudera.org:8080/6923
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I952319b7308a24d4e2dff924199c0c771bce25b3
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5331: Use new libHDFS API to address "Unknown Error 255"

2017-05-19 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5331: Use new libHDFS API to address "Unknown Error 255"
..


Patch Set 6: Code-Review+1

LGTM. I think it'd be good to get MikeB or David to sign off on the test since 
it does some non-standard stuff.

-- 
To view, visit http://gerrit.cloudera.org:8080/6894
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I181e316ed63b70b94d4f7a7557d398a931bb171d
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5340: Query profile displays stale query state

2017-05-19 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5340: Query profile displays stale query state
..


Patch Set 3:

> I should also add - I haven't tested this with CM, just locally
 > with the debug webpage, but I played around in COTT and the
 > behavior there matches what I'm expecting, so I'm fairly confident
 > this will solve their problem.

Agree

-- 
To view, visit http://gerrit.cloudera.org:8080/6923
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I952319b7308a24d4e2dff924199c0c771bce25b3
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4623: Enable file handle cache

2017-05-19 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded a new patch set (#10).

Change subject: IMPALA-4623: Enable file handle cache
..

IMPALA-4623: Enable file handle cache

Currently, every scan range maintains a file handle, even
when multiple scan ranges are accessing the same file.
Opening the file handles causes load on the NameNode, which
can lead to scaling issues.

There are two parts to this transaction:
1. Enable file handle caching by default
2. Share the file handle between scan ranges from the same
file

The scan range no longer maintains its own Hdfs file
handle. On each read, the io thread will get the Hdfs file
handle from the cache (opening it if necessary) and use
that for the read. This allows multiple scan ranges on the
same file to use the same file handle. Since the file
offsets are no longer consistent for an individual scan
range, all Hdfs reads need to either use hdfsPread or do
a seek before reading. Additionally, since Hdfs read
statistics are maintained on the file handle, the read
statistics must be retrieved and cleared after each read.

To manage contention, the file handle cache is now
partitioned by a hash of the key into independent
caches with independent locks. The allowed capacity
of the file handle cache is split evenly among the
partitions. File handles are evicted independently
for each partition. The file handle cache maintains
ownership of the file handles at all times, but it
will not evict a file handle that is in use.

If max_cached_file_handles is set to 0 or the the
scan range is accessing data cached by Hdfs, the
scan range will get a file handle from the cache
and hold it until the scan range is closed. This
mimics the existing behavior, except the file
handle stays in the cache and is owned by the
cache. Since it is in use, it will not be evicted.

Tests:
query_test/test_hdfs_fd_caching.py copies the files from
an existing table into a new directory and uses that to
create an external table. It queries the external table,
then uses the hdfs commandline to manipulate the hdfs file
(delete, move, etc). It queries again to make sure we
don't crash. Then, it runs "invalidate metadata". It
checks the row counts before the modification and after
"invalidate metadata", but it does not check the results
in between.
custom_cluster/test_hdfs_fd_caching.py starts up a cluster
with a small file handle cache size. It verifies that a
file handle can be reused (i.e. rerunning a query does
not result in more file handles cached). It also verifies
that the cache capacity is enforced.

Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/runtime/buffered-block-mgr.h
A be/src/runtime/disk-io-mgr-handle-cache.h
A be/src/runtime/disk-io-mgr-handle-cache.inline.h
M be/src/runtime/disk-io-mgr-internal.h
M be/src/runtime/disk-io-mgr-reader-context.cc
M be/src/runtime/disk-io-mgr-scan-range.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M tests/custom_cluster/test_hdfs_fd_caching.py
M tests/metadata/test_refresh_partition.py
M tests/query_test/test_hdfs_fd_caching.py
13 files changed, 769 insertions(+), 410 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/10
-- 
To view, visit http://gerrit.cloudera.org:8080/6478
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5342: Add comments of loaded tables in the response of GetTables

2017-05-19 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5342: Add comments of loaded tables in the response of 
GetTables
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6933/1/fe/src/main/java/org/apache/impala/service/MetadataOp.java
File fe/src/main/java/org/apache/impala/service/MetadataOp.java:

Line 218: // comments[i][j] are the comments of tableNames[j] in dbs[i].
is the comment


Line 291: comment = 
table.getMetaStoreTable().getParameters().get("comment");
Is there an HMS constant for this?


http://gerrit.cloudera.org:8080/#/c/6933/1/fe/src/test/java/org/apache/impala/service/FrontendTest.java
File fe/src/test/java/org/apache/impala/service/FrontendTest.java:

Line 184: Db testDb = addTestDb(dbName, "Stores tables with comments");
Should we fix this for DB comments as well? Or does that already work as 
expected?


-- 
To view, visit http://gerrit.cloudera.org:8080/6933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I61f327168a93ceb4bd60b47474f39bfa405ae07d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5340: Query profile displays stale query state

2017-05-19 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-5340: Query profile displays stale query state
..


Patch Set 3: Code-Review+2

I should also add - I haven't tested this with CM, just locally with the debug 
webpage, but I played around in COTT and the behavior there matches what I'm 
expecting, so I'm fairly confident this will solve their problem.

-- 
To view, visit http://gerrit.cloudera.org:8080/6923
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I952319b7308a24d4e2dff924199c0c771bce25b3
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5259: Add REFRESH FUNCTIONS statement

2017-05-19 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-5259: Add REFRESH FUNCTIONS  statement
..


Patch Set 5: Code-Review+1

Forwarding the +1 from Alex.

-- 
To view, visit http://gerrit.cloudera.org:8080/6878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5259: Add REFRESH FUNCTIONS statement

2017-05-19 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#5).

Change subject: IMPALA-5259: Add REFRESH FUNCTIONS  statement
..

IMPALA-5259: Add REFRESH FUNCTIONS  statement

Before this patch, Impala relied on INVALIDATE METADATA to load
externally added UDFs from HMS. The problem with this approach is that
INVALIDATE METADATA affects all databases and tables in the entire
cluster.

In this patch, we add a REFRESH FUNCTIONS  statement that reloads
the functions of a database from HMS. We return a list of updated and
removed db functions to the issuing Impalad in order to update its
local catalog cache.

Testing:
- Ran a private build which passed.

Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
---
M common/thrift/CatalogService.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Function.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/custom_cluster/test_permanent_udfs.py
10 files changed, 294 insertions(+), 57 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6878/5
-- 
To view, visit http://gerrit.cloudera.org:8080/6878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-5259: Add REFRESH FUNCTIONS statement

2017-05-19 Thread Taras Bobrovytsky (Code Review)
Hello Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6878

to look at the new patch set (#5).

Change subject: IMPALA-5259: Add REFRESH FUNCTIONS  statement
..

IMPALA-5259: Add REFRESH FUNCTIONS  statement

Before this patch, Impala relied on INVALIDATE METADATA to load
externally added UDFs from HMS. The problem with this approach is that
INVALIDATE METADATA affects all databases and tables in the entire
cluster.

In this patch, we add a REFRESH FUNCTIONS  statement that reloads
the functions of a database from HMS. We return a list of updated and
removed db functions to the issuing Impalad in order to update its
local catalog cache.

Testing:
- Ran a private build which passed.

Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
---
M common/thrift/CatalogService.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Function.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/custom_cluster/test_permanent_udfs.py
10 files changed, 294 insertions(+), 57 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6878/5
-- 
To view, visit http://gerrit.cloudera.org:8080/6878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-5259: Add REFRESH FUNCTIONS statement

2017-05-19 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-5259: Add REFRESH FUNCTIONS  statement
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6878/4/fe/src/main/java/org/apache/impala/catalog/Db.java
File fe/src/main/java/org/apache/impala/catalog/Db.java:

Line 369:* Removes all functions.
> remove (same as fn name)
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5340: Query profile displays stale query state

2017-05-19 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5340: Query profile displays stale query state
..


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6923/3/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

Line 89:   "Query State: FINISHED" in profile, profile
> It repros it deterministically.
cool


-- 
To view, visit http://gerrit.cloudera.org:8080/6923
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I952319b7308a24d4e2dff924199c0c771bce25b3
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5340: Query profile displays stale query state

2017-05-19 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-5340: Query profile displays stale query state
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6923/3/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

Line 89:   "Query State: FINISHED" in profile, profile
> was that able to reproduce the bug at some reasonably high rate?
It repros it deterministically.


-- 
To view, visit http://gerrit.cloudera.org:8080/6923
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I952319b7308a24d4e2dff924199c0c771bce25b3
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5340: Query profile displays stale query state

2017-05-19 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5340: Query profile displays stale query state
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6923/3/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

Line 89:   "Query State: FINISHED" in profile, profile
was that able to reproduce the bug at some reasonably high rate?


-- 
To view, visit http://gerrit.cloudera.org:8080/6923
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I952319b7308a24d4e2dff924199c0c771bce25b3
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5340: Query profile displays stale query state

2017-05-19 Thread Thomas Tauber-Marshall (Code Review)
Hello Michael Ho, Matthew Jacobs, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6923

to look at the new patch set (#3).

Change subject: IMPALA-5340: Query profile displays stale query state
..

IMPALA-5340: Query profile displays stale query state

Previously, updates to the query state in ClientRequestState were
not immediately reflected in the query profile, potentially
leading to the profile showing an incorrect state for an extended
perioud during execution.

In particular, queries were being shown in the 'CREATED' state
long after they had started 'RUNNING'.

The fix is to update the profile whenever the state is updated.

Testing:
- Extended existing hs2 tests and added a beeswax test to check
  for expected query states in the profile

Change-Id: I952319b7308a24d4e2dff924199c0c771bce25b3
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M tests/hs2/test_hs2.py
M tests/query_test/test_observability.py
4 files changed, 75 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/6923/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6923
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I952319b7308a24d4e2dff924199c0c771bce25b3
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-5342: Add comments of loaded tables in the response of GetTables

2017-05-19 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6933

Change subject: IMPALA-5342: Add comments of loaded tables in the response of 
GetTables
..

IMPALA-5342: Add comments of loaded tables in the response of GetTables

This commit changes the response of HiveServer2 GetTables request to
return the comments (if any) of loaded tables. For unloaded tables
or for tables with no comments an empty string is returned.

Testing:
- Added a new Frontend test.

Change-Id: I61f327168a93ceb4bd60b47474f39bfa405ae07d
---
M fe/src/main/java/org/apache/impala/service/MetadataOp.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
2 files changed, 48 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/6933/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I61f327168a93ceb4bd60b47474f39bfa405ae07d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 


[Impala-ASF-CR] IMPALA-1972/IMPALA-3882: Fix client request state map lock contention

2017-05-19 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-1972/IMPALA-3882: Fix client_request_state_map_lock_ 
contention
..


Patch Set 7:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/6707/7/be/src/service/impala-beeswax-server.cc
File be/src/service/impala-beeswax-server.cc:

PS7, Line 296: NULL
nit: change that one too to at least keep functions consistent


http://gerrit.cloudera.org:8080/#/c/6707/7/be/src/service/impala-http-handler.cc
File be/src/service/impala-http-handler.cc:

PS7, Line 721: just return
that's not what the code does (it also sets plan_metadata_unavailable), please 
rephrase.  Could rephrase the whole comment as: 

If the query plan isn't generated, avoid waiting for the lock, which could take 
a while if catalog metadata is being loaded.


PS7, Line 730: adopt_lock_t
shouldn't that be deleted?


http://gerrit.cloudera.org:8080/#/c/6707/7/tests/custom_cluster/test_query_concurrency.py
File tests/custom_cluster/test_query_concurrency.py:

PS7, Line 32: The intention here is to check contention on the 
query_exec_state_map_lock_
This is talking about how the old code worked, which won't make sense to people 
reading the current code (after this change). It should say something like:

The intention is to check that the webserver does not hold any global locks or 
otherwise prevent impalad from servicing new requests.


PS7, Line 54: This creates lock contention on the coordinator by
: calling QuerySummaryHandler() method
This is no longer true with your fix. How about saying:

This is to verify that QuerySummaryHandler() does not hold any global locks 
that would, for example, prevent another query from starting.


PS7, Line 74: time.sleep(2)
I'm worried that this will be flaky, especially with ASAN.  Instead of this 
delay, couldn't we just wait for in_flight_queries to become 1?  And you could 
use the parameter to get_in_flight_queries() to do that by passing some largish 
value. That has the advantage that we'll wait only as long as necessary for the 
value to change to 1, so we can have a relatively long timeout (rather than 
delay).


PS7, Line 83: time.sleep(2)
this delay is a bit harder to eliminate.  How about we increase 
--stress_metadata_loading_pause_injection_ms to something really large, say 
1000 seconds (which doesn't matter -- we don't actually need the queries to 
finish planning to end the test, right?). 

And then we can use a larger timeout here, but we don't need to delay for it. 
We can just do:

inflight_query_ids = impalad.service.get_in_flight_queries(30)

which will poll the webui once per second and give up after 30 seconds.


-- 
To view, visit http://gerrit.cloudera.org:8080/6707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie44daa93e3ae4d04d091261f3ec4891caffe8026
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5259: Add REFRESH FUNCTIONS statement

2017-05-19 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-5259: Add REFRESH FUNCTIONS  statement
..


Patch Set 4: Code-Review+1

(1 comment)

Dimitris should sign off on the catalog changes.

http://gerrit.cloudera.org:8080/#/c/6878/4/fe/src/main/java/org/apache/impala/catalog/Db.java
File fe/src/main/java/org/apache/impala/catalog/Db.java:

Line 369:* Removes all functions.
remove (same as fn name)


-- 
To view, visit http://gerrit.cloudera.org:8080/6878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4623: Enable file handle cache

2017-05-19 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded a new patch set (#9).

Change subject: IMPALA-4623: Enable file handle cache
..

IMPALA-4623: Enable file handle cache

Currently, every scan range maintains a file handle, even
when multiple scan ranges are accessing the same file.
Opening the file handles causes load on the NameNode, which
can lead to scaling issues.

There are two parts to this transaction:
1. Enable file handle caching by default
2. Share the file handle between scan ranges from the same
file

The scan range no longer maintains its own Hdfs file
handle. On each read, the io thread will get the Hdfs file
handle from the cache (opening it if necessary) and use
that for the read. This allows multiple scan ranges on the
same file to use the same file handle. Since the file
offsets are no longer consistent for an individual scan
range, all Hdfs reads need to either use hdfsPread or do
a seek before reading. Additionally, since Hdfs read
statistics are maintained on the file handle, the read
statistics must be retrieved and cleared after each read.

To manage contention, the file handle cache is now
partitioned by a hash of the key into independent
caches with independent locks. The allowed capacity
of the file handle cache is split evenly among the
partitions. File handles are evicted independently
for each partition. The file handle cache maintains
ownership of the file handles at all times, but it
will not evict a file handle that is in use.

If max_cached_file_handles is set to 0 or the the
scan range is accessing data cached by Hdfs, the
scan range will get a file handle from the cache
and hold it until the scan range is closed. This
mimics the existing behavior, except the file
handle stays in the cache and is owned by the
cache. Since it is in use, it will not be evicted.

Tests:
query_test/test_hdfs_fd_caching.py copies the files from
an existing table into a new directory and uses that to
create an external table. It queries the external table,
then uses the hdfs commandline to manipulate the hdfs file
(delete, move, etc). It queries again to make sure we
don't crash. Then, it runs "invalidate metadata". It
checks the row counts before the modification and after
"invalidate metadata", but it does not check the results
in between.
custom_cluster/test_hdfs_fd_caching.py starts up a cluster
with a small file handle cache size. It verifies that a
file handle can be reused (i.e. rerunning a query does
not result in more file handles cached). It also verifies
that the cache capacity is enforced.

Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/runtime/buffered-block-mgr.h
A be/src/runtime/disk-io-mgr-handle-cache.h
A be/src/runtime/disk-io-mgr-handle-cache.inline.h
M be/src/runtime/disk-io-mgr-internal.h
M be/src/runtime/disk-io-mgr-reader-context.cc
M be/src/runtime/disk-io-mgr-scan-range.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M tests/custom_cluster/test_hdfs_fd_caching.py
M tests/metadata/test_refresh_partition.py
M tests/query_test/test_hdfs_fd_caching.py
13 files changed, 757 insertions(+), 399 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/9
-- 
To view, visit http://gerrit.cloudera.org:8080/6478
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4623: Enable file handle cache

2017-05-19 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-4623: Enable file handle cache
..


Patch Set 8:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr-handle-cache.h
File be/src/runtime/disk-io-mgr-handle-cache.h:

Line 88:   /// with a call to ReleaseFileHandle to release exclusive control.
> this should also evict handles if a more recent mtime for a file with exist
Added a TODO for this. I plan on dealing with this when doing eviction by 
timeout.

Here is a scenario that I might be concerned about for immediate eviction when 
seeing a higher mtime:

1. ScanRange #1 starts up
2. ScanRange #1 does a read, putting a file handle with mtime=5 in the cache.
3. The file is modified and now has an mtime=6.
4. ScanRange #2 starts up
5. ScanRange #2 does a read, putting a file handle with mtime=6 in the cache 
and removing the mtime=5 file handle.
6. ScanRange #1 wants to do another read and goes to get a file handle from the 
cache, but mtime=5 file handles aren't available. What happens now? It could 
return an error, which makes some sense. It could open a new file handle, but 
it would be looking at the mtime=6 file. The mtime=6 file might have different 
offsets, so it might be reading garbage. 

A lot of this is hypothetical, but it would be nice for a ScanRange to deal 
with a single version of a file. That is a property that the old code had.


Line 98:   static const uint32_t HASH_SEED = 0x87654321;
> where from?
No need for a non-zero hash seed for our purpose, so I have removed this and 
substituted zero in the code.


http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr-handle-cache.inline.h
File be/src/runtime/disk-io-mgr-handle-cache.inline.h:

Line 65:   // Keep going until find unused entry with the same mtime
> fix grammar
Done


Line 113:   // to its place in the map
> add it?
I started working on this. The templating starts to infect everything. I will 
keep looking into it for a followup, but I think I will leave this as a TODO 
for the first version.


Line 117: if (elem->fh.get() == fh) {
> single line
Done


Line 129:   // File can be unbuffered
> where does the corresponding buffering take place?
Added information in the comment. The buffering here is readahead buffering 
that Hdfs does for a read. That is what we are unbuffering.


http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr-scan-range.cc
File be/src/runtime/disk-io-mgr-scan-range.cc:

Line 417:   int last_read = -1;
> what does this mean? does it count anything, and if so, what?
This is the number of bytes read by either hdfsPread or hdfsRead. In case of an 
error, it returns -1. I changed the name to "last_bytes_read"


Line 429: return Status(GetHdfsErrorMsg("Error seeking HDFS file: 
", file_));
> also print position
Done


Line 435: return Status(GetHdfsErrorMsg("Error reading from HDFS file: 
", file_));
> leave todo that you need to retry once to deal with stale cache entries
Done


Line 549: remote_bytes_ += stats->totalBytesRead - 
stats->totalLocalBytesRead;
> why can't remote_bytes_ go into reader_?
The only real obstacle is this piece of logging:

https://github.com/apache/incubator-impala/blob/master/be/src/runtime/disk-io-mgr-scan-range.cc#L341
(It is in a different location after this code change.)

The logging is printing the number of unexpected remote bytes for this scan 
range. If the remote bytes go the reader_, I don't have a way to separate it 
out.

If we don't need the logging, then the state on the ScanRange can be a boolean, 
and we can send remote bytes directly to reader_.


http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

Line 1300
> this comment disappeared.
Added a comment in the new location. Hdfs can do some readahead buffering, so 
we want to reduce this buffering when putting a file handle in the cache.


Line 278: cached_read_options_(nullptr),
> move into .h file and remove here
Done


http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

Line 498: int64_t remote_bytes_;
> num_...
Done


http://gerrit.cloudera.org:8080/#/c/6478/8/tests/query_test/test_hdfs_fd_caching.py
File tests/query_test/test_hdfs_fd_caching.py:

Line 74: self.execute_query_expect_success(self.client, count_query)
> this should give some expected result
Done


Line 99: self.execute_query_expect_success(self.client, count_query)
> you should check for the expected result, depending on the modification typ
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6478
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe 

[Impala-ASF-CR] IMPALA-5340: Query profile displays stale query state

2017-05-19 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5340: Query profile displays stale query state
..


Patch Set 2: Code-Review+2

> (1 comment)
 > 
 > > Any good way to regression test this?
 > 
 > Obviously its a timing issue when we'll advance query states, and
 > without a good way to control the timing tests run the risk of
 > either being flaky or being too permissive to catch bugs.
 > 
 > One observation is with this change it will always be the case that
 > after a call to ExecuteStatement (for hs2, or the equivalent for
 > beeswax), the query will have advanced past CREATED to RUNNING or
 > FINISHED (which was not the case before this patch) and after all
 > rows have been fetched the state will be FINISHED.
 > 
 > We could check for those things in tests, though I don't think
 > they're required by the hs2 (or beeswax) spec so it may change in
 > the future.

Or we could get the profile and get_state() (or GetOperationStatus()) in a 
loop, and ensure that the query state reported by the profile and directly from 
the RPC never go "backwards" in the allowable transitions.

I could be convinced that this is overkill, though. Really, we're kind of 
missing a full set of tests for the debugserver (that should probably be run 
while various queries are executing), but that's out of scope for this change.

I'll let you decide whether a specific regression test is worth the cost for 
this case.

-- 
To view, visit http://gerrit.cloudera.org:8080/6923
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I952319b7308a24d4e2dff924199c0c771bce25b3
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5259: Add REFRESH FUNCTIONS statement

2017-05-19 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#4).

Change subject: IMPALA-5259: Add REFRESH FUNCTIONS  statement
..

IMPALA-5259: Add REFRESH FUNCTIONS  statement

Before this patch, Impala relied on INVALIDATE METADATA to load
externally added UDFs from HMS. The problem with this approach is that
INVALIDATE METADATA affects all databases and tables in the entire
cluster.

In this patch, we add a REFRESH FUNCTIONS  statement that reloads
the functions of a database from HMS. We return a list of updated and
removed db functions to the issuing Impalad in order to update its
local catalog cache.

Testing:
- Ran a private build which passed.

Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
---
M common/thrift/CatalogService.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Function.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/custom_cluster/test_permanent_udfs.py
10 files changed, 294 insertions(+), 57 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6878/4
-- 
To view, visit http://gerrit.cloudera.org:8080/6878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-5259: Add REFRESH FUNCTIONS statement

2017-05-19 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-5259: Add REFRESH FUNCTIONS  statement
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6878/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

PS2, Line 639: 
> Oh I see. Got it, my bad. Please revert, the new code has a bug (never rele
Oh, good catch. Fixed.


http://gerrit.cloudera.org:8080/#/c/6878/3/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

Line 420:   throw new 
DatabaseNotFoundException(Analyzer.DB_DOES_NOT_EXIST_ERROR_MSG + dbName);
> Please revert. We try to separate catalog and analysis code as much as poss
Done


Line 638:   throw new 
DatabaseNotFoundException(Analyzer.DB_DOES_NOT_EXIST_ERROR_MSG + dbName);
> please revert this error msg
Done


Line 861: throw new 
DatabaseNotFoundException(Analyzer.DB_DOES_NOT_EXIST_ERROR_MSG +
Reverted this one too.


-- 
To view, visit http://gerrit.cloudera.org:8080/6878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5340: Query profile displays stale query state

2017-05-19 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-5340: Query profile displays stale query state
..


Patch Set 2:

(1 comment)

> Any good way to regression test this?

Obviously its a timing issue when we'll advance query states, and without a 
good way to control the timing tests run the risk of either being flaky or 
being too permissive to catch bugs.

One observation is with this change it will always be the case that after a 
call to ExecuteStatement (for hs2, or the equivalent for beeswax), the query 
will have advanced past CREATED to RUNNING or FINISHED (which was not the case 
before this patch) and after all rows have been fetched the state will be 
FINISHED.

We could check for those things in tests, though I don't think they're required 
by the hs2 (or beeswax) spec so it may change in the future.

http://gerrit.cloudera.org:8080/#/c/6923/2/be/src/service/client-request-state.h
File be/src/service/client-request-state.h:

PS2, Line 413: must ensure lock_
> Any chance we can DCHECK it ?
>From looking through the boost documentation, I don't think so.


-- 
To view, visit http://gerrit.cloudera.org:8080/6923
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I952319b7308a24d4e2dff924199c0c771bce25b3
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5339: Fix analysis with sort.columns and expr rewrites

2017-05-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5339: Fix analysis with sort.columns and expr rewrites
..


Patch Set 4:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/596/

-- 
To view, visit http://gerrit.cloudera.org:8080/6921
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibebba29509ae7eaa691fe305500cda6bd41a179a
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5339: Fix analysis with sort.columns and expr rewrites

2017-05-19 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5339: Fix analysis with sort.columns and expr rewrites
..


Patch Set 4: Code-Review+2

Rebased, carrying Dimitris' +2

-- 
To view, visit http://gerrit.cloudera.org:8080/6921
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibebba29509ae7eaa691fe305500cda6bd41a179a
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5259: Add REFRESH FUNCTIONS statement

2017-05-19 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#4).

Change subject: IMPALA-5259: Add REFRESH FUNCTIONS  statement
..

IMPALA-5259: Add REFRESH FUNCTIONS  statement

Before this patch, Impala relied on INVALIDATE METADATA to load
externally added UDFs from HMS. The problem with this approach is that
INVALIDATE METADATA affects all databases and tables in the entire
cluster.

In this patch, we add a REFRESH FUNCTIONS  statement that reloads
the functions of a database from HMS. We return a list of updated and
removed db functions to the issuing Impalad in order to update its
local catalog cache.

Testing:
- Ran a private build which passed.

Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
---
M common/thrift/CatalogService.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Db.java
M fe/src/main/java/org/apache/impala/catalog/Function.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/custom_cluster/test_permanent_udfs.py
10 files changed, 294 insertions(+), 57 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6878/4
-- 
To view, visit http://gerrit.cloudera.org:8080/6878
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3625c88bb51cca833f3293c224d3f0feb00e6e0b
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-3973: optional 3rd and 4th arguments for instr().

2017-05-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3973: optional 3rd and 4th arguments for instr().
..


IMPALA-3973: optional 3rd and 4th arguments for instr().

Change-Id: I17268bdb480230938f94559fe1eabe34ac2448b7
Reviewed-on: http://gerrit.cloudera.org:8080/5589
Reviewed-by: Jim Apple 
Tested-by: Impala Public Jenkins
---
M docs/topics/impala_string_functions.xml
1 file changed, 173 insertions(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Verified
  Jim Apple: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5589
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I17268bdb480230938f94559fe1eabe34ac2448b7
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-Reviewer: zi+z...@cloudera.com


[Impala-ASF-CR] IMPALA-3973: optional 3rd and 4th arguments for instr().

2017-05-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3973: optional 3rd and 4th arguments for instr().
..


Patch Set 6: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5589
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I17268bdb480230938f94559fe1eabe34ac2448b7
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-Reviewer: zi+z...@cloudera.com
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3973: optional 3rd and 4th arguments for instr().

2017-05-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3973: optional 3rd and 4th arguments for instr().
..


Patch Set 6:

Build started: http://jenkins.impala.io:8080/job/gerrit-docs-submit/114/

-- 
To view, visit http://gerrit.cloudera.org:8080/5589
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I17268bdb480230938f94559fe1eabe34ac2448b7
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-Reviewer: zi+z...@cloudera.com
Gerrit-HasComments: No


[Impala-ASF-CR] Revert "IMPALA-2716: Hive/Impala incompatibility for timestamp data in Parquet"

2017-05-19 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: Revert "IMPALA-2716: Hive/Impala incompatibility for timestamp 
data in Parquet"
..


Patch Set 1: Code-Review+2

(3 comments)

I only looked at the files with conflicts.

http://gerrit.cloudera.org:8080/#/c/6896/1/be/src/runtime/timestamp-value.cc
File be/src/runtime/timestamp-value.cc:

Line 95:   return;
your changes here seemed like an improvement. consider bringing them back in 
another change (but doesn't have to be part of this revert).


http://gerrit.cloudera.org:8080/#/c/6896/1/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

Line 25: #include 
why do we need these headers?


http://gerrit.cloudera.org:8080/#/c/6896/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 123:   private final static long MIN_SYNTHETIC_BLOCK_SIZE = 1024 * 1024;
what did these have to do with your change? are they just dead variables (in 
which case it'd be nice to re-apply this later)?


-- 
To view, visit http://gerrit.cloudera.org:8080/6896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic66de277c622748540c1b9969152c2cabed1f3bd
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Dan Hecht 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5333: Add support for Impala to work with ADLS

2017-05-19 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-5333: Add support for Impala to work with ADLS
..


Patch Set 3:

I have a few high level questions about this patch. This patch treats S3 and 
ADL the same way but after looking at the HDFS classes and the ADL API I see 
some differences between ADL and S3 API. For instance, I don't see a way in ADL 
to recursively enumerate all entries under a particular directory (something 
that exists in S3). Our code today for S3 sort of relies on that property so I 
am wondering if our code works as expected or if I am missing something (quite 
possible). Also, the AdlFileSystem class has functions for returning the block 
locations, which means that we don't have to call the synthesize metadata calls 
as we do for S3. In the long run, HDFS may expose replica locations from the 
ADL local tiers (e.g. Cosmos), but I guess this is not relevant today. Thoughts?

-- 
To view, visit http://gerrit.cloudera.org:8080/6910
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic56b9988b32a330443f24c44f9cb2c80842f7542
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5340: Query profile displays stale query state

2017-05-19 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-5340: Query profile displays stale query state
..


Patch Set 2:

Any good way to regression test this?

-- 
To view, visit http://gerrit.cloudera.org:8080/6923
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I952319b7308a24d4e2dff924199c0c771bce25b3
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5331: Use new libHDFS API to address "Unknown Error 255"

2017-05-19 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5331: Use new libHDFS API to address "Unknown Error 255"
..


Patch Set 6:

> Unfortunately, Python 2.6 doesn't have check_output(). So I've used
 > Popen() instead now. Testing it below:
 > http://sandbox.jenkins.cloudera.com/view/Impala/view/Private-Utility/job/impala-private-build-and-test/5631/

All tests passed. Looks like the test works just fine and doesn't affect any 
other tests.

-- 
To view, visit http://gerrit.cloudera.org:8080/6894
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I181e316ed63b70b94d4f7a7557d398a931bb171d
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5339: Fix analysis with sort.columns and expr rewrites

2017-05-19 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-5339: Fix analysis with sort.columns and expr rewrites
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6921
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibebba29509ae7eaa691fe305500cda6bd41a179a
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3973: optional 3rd and 4th arguments for instr().

2017-05-19 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-3973: optional 3rd and 4th arguments for instr().
..


Patch Set 6: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/5589
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I17268bdb480230938f94559fe1eabe34ac2448b7
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-Reviewer: zi+z...@cloudera.com
Gerrit-HasComments: No


[native-toolchain-CR] Ported native-toolchain to work on ppc64le

2017-05-19 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Ported native-toolchain to work on ppc64le
..


Patch Set 4:

(1 comment)

Thanks for the changes. The code changes look good aside from one minor comment.

After that I'll need to do a build on x86 and just do some tests to make sure 
nothing broke there. If you can just confirm on your own that it builds on x86 
ok that would be good - it will save us a round-trip if there is anything wrong.

http://gerrit.cloudera.org:8080/#/c/6468/4/buildall.sh
File buildall.sh:

Line 105: CRCUTIL_VERSION=440ba7babeff77ffad992df3a10c767f184e946e-p1\
Can you add the old versions back in, just inside an "if ((BUILD_HISTORICAL))". 
The idea is that we can generate all of the old versions of things if needed.


-- 
To view, visit http://gerrit.cloudera.org:8080/6468
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7931752ca497bc7a5e3cc574bbb54637f382c72a
Gerrit-PatchSet: 4
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Valencia Edna Serrao 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Valencia Edna Serrao 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5339: Fix analysis with sort.columns and expr rewrites

2017-05-19 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-5339: Fix analysis with sort.columns and expr rewrites
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6921/2/testdata/workloads/functional-planner/queries/PlannerTest/insert-sort-by.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/insert-sort-by.test:

PS2, Line 327: expr rewrite
> Maybe also add a test with a subquery in the select in order to trigger a q
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6921
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibebba29509ae7eaa691fe305500cda6bd41a179a
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5339: Fix analysis with sort.columns and expr rewrites

2017-05-19 Thread Lars Volker (Code Review)
Hello Thomas Tauber-Marshall,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6921

to look at the new patch set (#3).

Change subject: IMPALA-5339: Fix analysis with sort.columns and expr rewrites
..

IMPALA-5339: Fix analysis with sort.columns and expr rewrites

IMPALA-4166 introduced a bug by duplicating code that adds sort
expressions. Upon re-analysis, this code would hit an
IndexOutOfBoundsException.

Change-Id: Ibebba29509ae7eaa691fe305500cda6bd41a179a
---
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableDef.java
M testdata/workloads/functional-planner/queries/PlannerTest/insert-sort-by.test
4 files changed, 82 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/6921/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6921
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibebba29509ae7eaa691fe305500cda6bd41a179a
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[native-toolchain-CR] Ported native-toolchain to work on ppc64le

2017-05-19 Thread Valencia Edna Serrao (Code Review)
Valencia Edna Serrao has posted comments on this change.

Change subject: Ported native-toolchain to work on ppc64le
..


Patch Set 2:

(14 comments)

Thanks for the comments on the patchset, Tim!

I've worked on the points you mentioned. Please let me know if any issues.

http://gerrit.cloudera.org:8080/#/c/6468/3/buildall.sh
File buildall.sh:

Line 106:   $SOURCE_DIR/source/crcutil/build.sh
> The -p2 patch should work for x86-64 too, right? Let's just bump the versio
Resolved. Necessary fix done in the ppc patch for crcutil


Line 262: 

> The control flow here changed for x86-64 - now if BUILD_HISTORICAL is true,
Resolved. Necessary fix done in ppc patches for breakpad.


Line 263: FLATBUFFERS_VERSION=1.6.0 $SOURCE_DIR/source/flatbuffers/build.sh
> Same as above - let's just bump the version for both platforms.
Ok


http://gerrit.cloudera.org:8080/#/c/6468/3/functions.sh
File functions.sh:

PS3, Line 305: o
> This logic is pretty weird - the patches will be numbered differently on di
The idea behind this was to apply the ppc patches only if arch is ppc.
However, as I have fixed the patches to be generic, I have already removed this 
'if' condition


http://gerrit.cloudera.org:8080/#/c/6468/3/source/breakpad/breakpad-88e5b2c8806bac3f2c80d2fe80094be5bd371601-patches/0003-Build-breakpad_88e5b2c-on-ppc64le.patch
File 
source/breakpad/breakpad-88e5b2c8806bac3f2c80d2fe80094be5bd371601-patches/0003-Build-breakpad_88e5b2c-on-ppc64le.patch:

> I really encourage you to get this patch into upstream breakpad. The proble
I understand your point. I've already started working on the breakpad 
upstreaming activity. 

Thanks for bringing this to my notice. I've fixed the issues and it runs 
smoothly now on x86 even when ppc patch is applied


http://gerrit.cloudera.org:8080/#/c/6468/3/source/breakpad/breakpad-88e5b2c_ppc.patch
File source/breakpad/breakpad-88e5b2c_ppc.patch:

> Was this accidentally checked in?
Yes.


http://gerrit.cloudera.org:8080/#/c/6468/3/source/crcutil/build.sh
File source/crcutil/build.sh:

Line 36: 
> Don't need this change
Ok. Removed it.


http://gerrit.cloudera.org:8080/#/c/6468/3/source/crcutil/crcutil-440ba7babeff77ffad992df3a10c767f184e946e-patches/0002_crc_interface_cc.patch
File 
source/crcutil/crcutil-440ba7babeff77ffad992df3a10c767f184e946e-patches/0002_crc_interface_cc.patch:

Line 7
> If I understand correctly, this is just a hack to remove code that doesn't 
I've fixed the issue that required this conditional. The conditional is no 
longer required.


Line 12
> The second return should be in a #else, right? Otherwise the function has t
With the issue now fixed with the new crcutil patch that I've uploaded, the 
second return is removed.


http://gerrit.cloudera.org:8080/#/c/6468/3/source/kudu/build.sh
File source/kudu/build.sh:

Line 93:  # Removing trailing slash
> Can you revert the formatting changes? it just makes it harder to see where
Done.


http://gerrit.cloudera.org:8080/#/c/6468/3/source/libunwind/build.sh
File source/libunwind/build.sh:

Line 35:   autoreconf -i
> This could be a patch too, right?
Sure. Done.

No problem. Thanks!


http://gerrit.cloudera.org:8080/#/c/6468/3/source/llvm/build-3.3.sh
File source/llvm/build-3.3.sh:

PS3, Line 85:   wrap ../llvm-
I noticed that CONFIGURE_FLAGS var has many ocurrences in the native-toolchain 
code, therefore, I'll be using the var 'CONFIGURE_FLAG_BUILD_SYS' instead of 
'CONFIGURE_FLAGS' for setting the ppc-specific config flags.


Line 87: --prefix=$LOCAL_INSTALL --with-pic $EXTRA_CONFIG_ARG \
> nit: don't need TARGET= above, can just use TARGET= here and on line 88
Ok.
I noticed that TARGET var has many occurences throughout native-toolchain code, 
therefore, I have used a new var 'LLVM_BUILD_TARGET' instead of 'TARGET' var.


http://gerrit.cloudera.org:8080/#/c/6468/3/source/llvm/build-source-tarball.sh
File source/llvm/build-source-tarball.sh:

Line 79:   # Invoke CMake with the correct configuration
> nit: don't need TARGET= above, can just use TARGET= here and on line 82
Ok.
I noticed that TARGET var has many occurences throughout native-toolchain code, 
therefore, I have used a new var 'LLVM_BUILD_TARGET' instead of 'TARGET' var.


-- 
To view, visit http://gerrit.cloudera.org:8080/6468
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7931752ca497bc7a5e3cc574bbb54637f382c72a
Gerrit-PatchSet: 2
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Valencia Edna Serrao 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Valencia Edna Serrao 
Gerrit-HasComments: Yes


[native-toolchain-CR] Ported native-toolchain to work on ppc64le

2017-05-19 Thread Valencia Edna Serrao (Code Review)
Valencia Edna Serrao has uploaded a new patch set (#4).

Change subject: Ported native-toolchain to work on ppc64le
..

Ported native-toolchain to work on ppc64le

Change-Id: I7931752ca497bc7a5e3cc574bbb54637f382c72a
---
M buildall.sh
M functions.sh
M init-compiler.sh
M init.sh
M source/autoconf/build.sh
A 
source/breakpad/breakpad-ffe3e478657dc7126fca6329dfcedc49f4c726d9-patches/0002-Build-breakpad-ffe3e47-on-ppc64le.patch
A 
source/crcutil/crcutil-440ba7babeff77ffad992df3a10c767f184e946e-patches/0001-Build-crcutil-440ba7b-on-ppc64le.patch
M source/cyrus-sasl/build.sh
M source/glog/build.sh
M source/kudu/build.sh
M source/libevent/build.sh
M source/libtool/build.sh
M source/libunwind/build.sh
A 
source/libunwind/libunwind-1.1-patches/0001-Build-libunwind-1.1-on-ppc64le.patch
M source/llvm/build-3.3.sh
M source/llvm/build-source-tarball.sh
M source/openldap/build.sh
M source/openssl/build.sh
M source/thrift/build.sh
19 files changed, 729 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/68/6468/4
-- 
To view, visit http://gerrit.cloudera.org:8080/6468
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7931752ca497bc7a5e3cc574bbb54637f382c72a
Gerrit-PatchSet: 4
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Valencia Edna Serrao 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Valencia Edna Serrao 


[Impala-ASF-CR] IMPALA-5180: Don't use non-deterministic exprs in partition pruning

2017-05-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5180: Don't use non-deterministic exprs in partition 
pruning
..


Patch Set 12: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6575
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I91054c6bf017401242259a1eff5e859085285546
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5180: Don't use non-deterministic exprs in partition pruning

2017-05-19 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5180: Don't use non-deterministic exprs in partition 
pruning
..


IMPALA-5180: Don't use non-deterministic exprs in partition pruning

Non-deterministic exprs which evaluate as constant should not be
used during HDFS partition pruning.  We consider Exprs which have no
SlotRefs as bound by default, and thus we end up trying to apply
them indisrciminately.  Constant propagation makes this situation
easier to run into and the behavior is rather unexpected.

The fix for now is to explicitly disallow non-deterministic Exprs
in partition pruning.

Change-Id: I91054c6bf017401242259a1eff5e859085285546
Reviewed-on: http://gerrit.cloudera.org:8080/6575
Reviewed-by: Alex Behm 
Tested-by: Impala Public Jenkins
---
M fe/src/main/java/org/apache/impala/analysis/PartitionSet.java
M fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/hdfs.test
4 files changed, 40 insertions(+), 10 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/6575
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I91054c6bf017401242259a1eff5e859085285546
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Zach Amsden