[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test

2016-12-09 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4467: Add support for DML statements in stress test
..


Patch Set 4:

(13 comments)

Looks pretty good!

http://gerrit.cloudera.org:8080/#/c/5093/4//COMMIT_MSG
Commit Message:

Line 24: following flag: --dml-mod-values 11 13 17. For each mod value 4 DML
Seems ok for now, but from a perspective we should consider changing these mod 
values to %rows of the original table. I think ultimately a user wants to 
control the "size" of the DML operations and that is more naturally expressed 
via percent of rows imo. The mod values only seem meaningful if you also know 
the table size, and applying the same mod values to different tables seems 
strange.


http://gerrit.cloudera.org:8080/#/c/5093/4/tests/stress/concurrent_select.py
File tests/stress/concurrent_select.py:

Line 368: self._select_probability = select_probability
check that this is indeed a float between 0 and 1


Line 747: self.modifies_table = False
This flag is pretty confusing. What do we use it for?

Even in your example, the table is "modified" in the sense that the upsert 
changes the state of the table, but not the user-visible contents.

Also, after we've deleted some rows, an insert/upsert may even change the 
user-visible contents.

 If we need this flag we should come up with a more accurate name.


Line 1297:   return
add comment why we need to skip compute stats


Line 1428: if table.name + "_original" in set(table.name for table in 
tables):
What does it mean if this is not true? Maybe add a LOG.debug message to explain


Line 1429:   cursor.execute("SHOW CREATE TABLE " + table.name)
Add a note or TODO that this will not work for certain types  of Kudu tables, 
e.g., if they have range partitions.

I assume that the Kudu tables used in our testing have simple hash partitioning 
based on the PK? Sounds like a potential coverage gap. On the other hand, for 
more complex partition schemes, this SHOW CREATE TABLE trick will not work.


Line 1440:   For each table in the database that cursor is connected to, create 
several queries,
Mention the limitations of this function:
1. Only generates DML statements against Kudu tables, and ignores non-Kudu 
tables.
2. Requires that the type of the first column of the primary key is an integer 
type
3. ...


Line 1443:   exists a table with a '_original' suffix that has unmodified, for 
example, tpch data.
that has unmodified -> that is never modified


Line 1450:   continue
LOG.debug what is happening here


Line 1452: if primary_key.exact_type not in (TinyInt, SmallInt, BigInt):
what about Int?


Line 1456:   continue
LOG.debug what is happening here because some of these limitations are not 
really obvious


Line 1481:   "UPDATE a SET {update_list} FROM {table_name} a JOIN 
{table_name}_original b "
This might give you strange results for tables with multiple primary keys. 
Maybe we should explicitly limit this  functionality to tables with a single 
primary-key column?

Seems like a fine limitation for now, but better to be explicit about what 
works as expected, and what probably does not.


Line 1559:   cursor.execute("SHOW CREATE TABLE " + table_name)
same deal with SHOW CREATE TABLE as above, add a note or TODO that this process 
will not work for some Kudu tables


-- 
To view, visit http://gerrit.cloudera.org:8080/5093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4639: Add pytest option and skip markers for tests that only run locally.

2016-12-09 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4639: Add pytest option and skip markers for tests that 
only run locally.
..


Patch Set 2:

(4 comments)

Is the motivation of this change to get Jenkins jobs run against a remote 
cluster green? Please briefly summarize the motivation of this change.

http://gerrit.cloudera.org:8080/#/c/5446/2/tests/common/skip.py
File tests/common/skip.py:

Line 122:   mini_cluster_only = 
pytest.mark.skipif(pytest.config.option.testing_remote_cluster,
Why skip and not xfail? With xfail we have the option of still running the 
tests, e.g., to detect crashes.


http://gerrit.cloudera.org:8080/#/c/5446/2/tests/conftest.py
File tests/conftest.py:

Line 108:   parser.addoption("--testing_remote_cluster", action="store_true", 
default=False,
I'm ok with adding this flag, but I think we should enforce its semantics. 
Currently, we can set this flag regardless of whether we are testing on a 
remote cluster or not which seems potentially error-prone.


Line 109:help="Indicates that tests are being run against a 
remote cluster.")
Mention what effect this flag has. It will cause tests to be skipped.


http://gerrit.cloudera.org:8080/#/c/5446/2/tests/query_test/test_mt_dop.py
File tests/query_test/test_mt_dop.py:

Line 95:   @SkipIfRemoteTesting.mini_cluster_only
Does test_nested_types.py or test_tpch_nested_queries.py succeed on a remote 
cluster? I'm surprised that only this test would be skipped of loading 
tpch_nested is not working.


-- 
To view, visit http://gerrit.cloudera.org:8080/5446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4638: Run queries with MT DOP through admission control.

2016-12-09 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4638: Run queries with MT_DOP through admission control.
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3e05697a59e14617fb6a13ce8e0410820823fd3a
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4638: Run queries with MT DOP through admission control.

2016-12-09 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4638: Run queries with MT_DOP through admission control.
..


IMPALA-4638: Run queries with MT_DOP through admission control.

The bug was a simple oversight: A TODO that was not addressed.

Testing: I examined query profiles before and after this
change with and without MT_DOP set. After this patch all
queries (with and without MT_DOP) go through admission control.

Change-Id: I3e05697a59e14617fb6a13ce8e0410820823fd3a
Reviewed-on: http://gerrit.cloudera.org:8080/5447
Reviewed-by: Alex Behm 
Tested-by: Internal Jenkins
---
M be/src/scheduling/simple-scheduler.cc
1 file changed, 1 insertion(+), 3 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I3e05697a59e14617fb6a13ce8e0410820823fd3a
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4611: Checking perms on S3 files is a very expensive no-op

2016-12-09 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4611: Checking perms on S3 files is a very expensive 
no-op
..


Patch Set 1: Code-Review+2

(3 comments)

http://gerrit.cloudera.org:8080/#/c/5449/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 797:* from that.
Expand the comment to mention that for S3 it always returns READ/WRITE access.


PS1, Line 802: getPermissions
nit: when referencing functions use parentheses (getPermissions()).


PS1, Line 806: TODO: Revisit if the S3A connector is updated to be able to 
manage S3 object
 : // permissions.
If there is a JIRA for this, can you add it here?


-- 
To view, visit http://gerrit.cloudera.org:8080/5449
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If9d1072c185a6162727019cdf1cb34d7f3f1c75c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4639: Add pytest option and skip markers for tests that only run locally.

2016-12-09 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4639: Add pytest option and skip markers for tests that 
only run locally.
..


Patch Set 2: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5446/2/tests/common/skip.py
File tests/common/skip.py:

PS2, Line 123: Setup for this t
Maybe better "This test is currently..."?


-- 
To view, visit http://gerrit.cloudera.org:8080/5446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4639: Add pytest option and skip markers for tests that only run locally.

2016-12-09 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4639: Add pytest option and skip markers for tests that 
only run locally.
..


Patch Set 2: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4639: Add pytest option and skip markers for tests that only run locally.

2016-12-09 Thread David Knupp (Code Review)
David Knupp has uploaded a new patch set (#2).

Change subject: IMPALA-4639: Add pytest option and skip markers for tests that 
only run locally.
..

IMPALA-4639: Add pytest option and skip markers for tests that only run locally.

Add new skip marker to test_mt_dop.TestMtDopParquet.test_parquet_nested

Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f
---
M tests/common/skip.py
M tests/conftest.py
M tests/metadata/test_compute_stats.py
M tests/query_test/test_mt_dop.py
4 files changed, 13 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/5446/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4639: Add pytest option and skip markers for tests that only run locally.

2016-12-09 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-4639: Add pytest option and skip markers for tests that 
only run locally.
..


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/5446/1//COMMIT_MSG
Commit Message:

PS1, Line 7: Add pytest option and skip markers for tests that only run locally.
> Please add Jira ID.
Done


http://gerrit.cloudera.org:8080/#/c/5446/1/tests/common/skip.py
File tests/common/skip.py:

PS1, Line 120: class SkipIfRemoteTesting:
 :   """Skip marker for those tests that may not work on remote 
clusters."""
 :   local_cluster_only = 
pytest.mark.skipif(pytest.config.option.skip_local_only_tests,
 :   reason="--skip_local_only_tests specified")
> I'm fine with the class name. I'm less enthused on the command line name. W
Done


PS1, Line 123: --skip_local_only_tests
> Could this ever be re-used by some other dev without that command line opti
No, if you don't specify the command line, then this won't have any effect.


PS1, Line 124: hbase_split
> Is this used anywhere?
Thanks, forgot to add that file where it gets used.


PS1, Line 125: HBase is not split on remote cluster the same as on the local 
cluster.
> This doesn't tell me much, probably because I don't know how splitting work
Hbase splitting is usually non-deterministic, but we enforce a particular split 
on our local tests via a setup shell script, and then conform our tests to 
expect that split. We don't have a way of doing this on the remote cluster, and 
it would require a big overhaul to our existing tests to work without it.


http://gerrit.cloudera.org:8080/#/c/5446/1/tests/query_test/test_mt_dop.py
File tests/query_test/test_mt_dop.py:

PS1, Line 95: @SkipIfRemoteTesting.local_cluster_only
> From just looking at this it is not clear to me whether this is skipped or 
This will only be skipped if --testing_remote_cluster is supplied on the 
command line.


PS1, Line 95: @SkipIfRemoteTesting.local_cluster_only
> I think also it's confusing when we have the distinction between a local HD
Done


Line 99: self.run_test_case('QueryTest/mt-dop-parquet-nested', vector)
> Why isn't this test expected to succeed on a cluster? It seems like it shou
It's a setup problem. We're still working though kinks on getting our entire 
data load working across the board on remote clusters. Even though the process 
completes without any errors, here are as yet silent failures. I can add a JIRA 
and reference it here.


-- 
To view, visit http://gerrit.cloudera.org:8080/5446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4638: Run queries with MT DOP through admission control.

2016-12-09 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4638: Run queries with MT_DOP through admission control.
..


Patch Set 2: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/640/

-- 
To view, visit http://gerrit.cloudera.org:8080/5447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3e05697a59e14617fb6a13ce8e0410820823fd3a
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4640: Fix number of rows displayed by parquet-reader tool

2016-12-09 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5453

Change subject: IMPALA-4640: Fix number of rows displayed by parquet-reader tool
..

IMPALA-4640: Fix number of rows displayed by parquet-reader tool

The variable just never got updated in the code. This change also adds
verification that all columns contain the same number of rows.

Change-Id: I281a784a0aa2df4ed1852dfb864587a0c1aa4d9a
---
M be/src/util/parquet-reader.cc
1 file changed, 11 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/5453/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5453
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I281a784a0aa2df4ed1852dfb864587a0c1aa4d9a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] Add pytest option and skip markers for tests that only run locally.

2016-12-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Add pytest option and skip markers for tests that only run 
locally.
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5446/1/tests/query_test/test_mt_dop.py
File tests/query_test/test_mt_dop.py:

PS1, Line 95: @SkipIfRemoteTesting.local_cluster_only
> From just looking at this it is not clear to me whether this is skipped or 
I think also it's confusing when we have the distinction between a local HDFS 
cluster and local filesystem.  Maybe this could be minicluster_only?


Line 99: self.run_test_case('QueryTest/mt-dop-parquet-nested', vector)
Why isn't this test expected to succeed on a cluster? It seems like it should 
work unless there's a product bug.


-- 
To view, visit http://gerrit.cloudera.org:8080/5446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4611: Checking perms on S3 files is a very expensive no-op

2016-12-09 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-4611: Checking perms on S3 files is a very expensive 
no-op
..


Patch Set 1: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5449
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If9d1072c185a6162727019cdf1cb34d7f3f1c75c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4403: Implement SHOW RANGE PARTITIONS for Kudu tables

2016-12-09 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4403: Implement SHOW RANGE PARTITIONS for Kudu tables
..


Patch Set 11: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5390
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idf5b2fdd02938a42fa59ec98884e4ac915dd1f65
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4403: Implement SHOW RANGE PARTITIONS for Kudu tables

2016-12-09 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4403: Implement SHOW RANGE PARTITIONS for Kudu tables
..


IMPALA-4403: Implement SHOW RANGE PARTITIONS for Kudu tables

Change-Id: Idf5b2fdd02938a42fa59ec98884e4ac915dd1f65
Reviewed-on: http://gerrit.cloudera.org:8080/5390
Reviewed-by: Lars Volker 
Reviewed-by: Matthew Jacobs 
Tested-by: Internal Jenkins
---
M common/thrift/Frontend.thrift
M fe/src/main/cup/sql-parser.cup
D fe/src/main/java/org/apache/impala/analysis/ShowPartitionsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_partition_ddl.test
M tests/query_test/test_kudu.py
12 files changed, 238 insertions(+), 72 deletions(-)

Approvals:
  Lars Volker: Looks good to me, but someone else must approve
  Matthew Jacobs: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/5390
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Idf5b2fdd02938a42fa59ec98884e4ac915dd1f65
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4638: Run queries with MT DOP through admission control.

2016-12-09 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4638: Run queries with MT_DOP through admission control.
..


Patch Set 2: Code-Review+2

rebase, carry +1, I talked to Dan and he said I should go ahead and GVO

-- 
To view, visit http://gerrit.cloudera.org:8080/5447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3e05697a59e14617fb6a13ce8e0410820823fd3a
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4638: Run queries with MT DOP through admission control.

2016-12-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4638: Run queries with MT_DOP through admission control.
..


Patch Set 1: Code-Review+1

I don't think I know the code well enough to fully understand the consequences, 
but it makes sense.

-- 
To view, visit http://gerrit.cloudera.org:8080/5447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3e05697a59e14617fb6a13ce8e0410820823fd3a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4566: Kudu client glog contention can cause timeouts

2016-12-09 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4566: Kudu client glog contention can cause timeouts
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5334
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie535d89ec2525232d4f6a29dd44f51cd6e18a0d2
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4566: Kudu client glog contention can cause timeouts

2016-12-09 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4566: Kudu client glog contention can cause timeouts
..


IMPALA-4566: Kudu client glog contention can cause timeouts

Under stressful workloads, there appears to be significant contention
in glog resulting from Kudu logging, causing timeouts and failed
queries.

An easy solution for now is to downgrade Kudu WARNINGs to INFOs as
WARNINGs grab a lock to flush the log. This is appropriate as Kudu
logs WARNINGs much more frequently than Impala and for things that
Impala would normally consider INFO-level.

Testing: Manually verified that the patch redirects Kudu WARNINGs
to the INFO log. Not tested under stress to verify if this actually
solves the contention problem.

Change-Id: Ie535d89ec2525232d4f6a29dd44f51cd6e18a0d2
Reviewed-on: http://gerrit.cloudera.org:8080/5334
Reviewed-by: Matthew Jacobs 
Tested-by: Internal Jenkins
---
M be/src/exec/kudu-util.cc
1 file changed, 4 insertions(+), 2 deletions(-)

Approvals:
  Matthew Jacobs: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/5334
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ie535d89ec2525232d4f6a29dd44f51cd6e18a0d2
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-4638: Run queries with MT DOP through admission control.

2016-12-09 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4638: Run queries with MT_DOP through admission control.
..


Patch Set 1: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3e05697a59e14617fb6a13ce8e0410820823fd3a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test

2016-12-09 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4467: Add support for DML statements in stress test
..


Patch Set 4: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test

2016-12-09 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#4).

Change subject: IMPALA-4467: Add support for DML statements in stress test
..

IMPALA-4467: Add support for DML statements in stress test

- Add support for insert, upsert, update and and delete statements.
- Add support for compute stats with mt_dop query options.
- Update impyla version in order to be able to have access to query
  error text for DML queries.
- Made flake8 fixes. flake8 on this file is clean.

For every Kudu table in the databases, we make a copy and add a
'_original' suffix to the table name. The DML queries will only make
modifications to the non original table, the original table will never
be modified. The orignal tables could be used to bring the non-original
table to the inital state. Two flags were added for doing this:
--reset-databases-before-binary-search and
--reset-databases-after-binary-search.

The DML queries are generated based on the mod values passed in with the
following flag: --dml-mod-values 11 13 17. For each mod value 4 DML
queries are generated. The DML operations will touch table rows where
primary_key % mod_value = 0. So, the larger the mod value, the more rows
would be affected. The DML queries are generated in such a way that the
data for the insert, upsert, and update queries is taken from the table
with the _original suffix. The stress test generates DML queries for
only kudu databases. For example, --tpch-kudu-db=tpch_100_kudu
--tpch-db=tpch_100 --generate-dml-queries would only generate queries
for the tpch_100_kudu database.

Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
---
M infra/python/deps/requirements.txt
M tests/stress/concurrent_select.py
M tests/util/parse_util.py
3 files changed, 652 insertions(+), 262 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/5093/4
-- 
To view, visit http://gerrit.cloudera.org:8080/5093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test

2016-12-09 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4467: Add support for DML statements in stress test
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/5093/2/tests/stress/concurrent_select.py
File tests/stress/concurrent_select.py:

PS2, Line 1464:   insert_query.modifies_table = False
> It's possible updatale_column_names is a bad name. In any case, it currentl
I think it would be more clear to leave it as is.


http://gerrit.cloudera.org:8080/#/c/5093/3/tests/stress/concurrent_select.py
File tests/stress/concurrent_select.py:

PS3, Line 1461: this will still be still
> Some extra words here.
Done


PS3, Line 1657: quereis
> spelling: queries
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test

2016-12-09 Thread Taras Bobrovytsky (Code Review)
Hello Michael Brown,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5093

to look at the new patch set (#4).

Change subject: IMPALA-4467: Add support for DML statements in stress test
..

IMPALA-4467: Add support for DML statements in stress test

- Add support for insert, upsert, update and and delete statements.
- Add support for compute stats with mt_dop query options.
- Update impyla version in order to be able to have access to query
  error text for DML queries.
- Made flake8 fixes. flake8 on this file is clean.

For every Kudu table in the databases, we make a copy and add a
'_original' suffix to the table name. The DML queries will only make
modifications to the non original table, the original table will never
be modified. The orignal tables could be used to bring the non-original
table to the inital state. Two flags were added for doing this:
--reset-databases-before-binary-search and
--reset-databases-after-binary-search.

The DML queries are generated based on the mod values passed in with the
following flag: --dml-mod-values 11 13 17. For each mod value 4 DML
queries are generated. The DML operations will touch table rows where
primary_key % mod_value = 0. So, the larger the mod value, the more rows
would be affected. The DML queries are generated in such a way that the
data for the insert, upsert, and update queries is taken from the table
with the _original suffix. The stress test generates DML queries for
only kudu databases. For example, --tpch-kudu-db=tpch_100_kudu
--tpch-db=tpch_100 --generate-dml-queries would only generate queries
for the tpch_100_kudu database.

Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
---
M infra/python/deps/requirements.txt
M tests/stress/concurrent_select.py
M tests/util/parse_util.py
3 files changed, 652 insertions(+), 262 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/5093/4
-- 
To view, visit http://gerrit.cloudera.org:8080/5093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4611: Checking perms on S3 files is a very expensive no-op

2016-12-09 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5449

Change subject: IMPALA-4611: Checking perms on S3 files is a very expensive 
no-op
..

IMPALA-4611: Checking perms on S3 files is a very expensive no-op

We call getPermissions() on partition directories to find out if
Impala has access to those files. On S3, this currently is a no-op
as the S3A connector does not try to set/get the permissions for S3
objects. So, it always returns the default set of permissions -> 777.
However, it still makes a roundtrip to S3 causing a slow down in the
Catalog.

We can return the READ_WRITE permission immediately if we know we are
accessing an S3 file, thereby avoiding the round trip to S3 for every
partition. This will greatly speedup metadata operations for S3 tables
and partitions, which is already known to be a big bottleneck.

If and when the S3A connector is able to manage permissions in
the future, we need to revisit this code. However, as permissions on
S3 are unsupported by Impala right now, we might as well gain on perf.

Change-Id: If9d1072c185a6162727019cdf1cb34d7f3f1c75c
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
1 file changed, 9 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/5449/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5449
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: If9d1072c185a6162727019cdf1cb34d7f3f1c75c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 


[Impala-ASF-CR] Add pytest option and skip markers for tests that only run locally.

2016-12-09 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: Add pytest option and skip markers for tests that only run 
locally.
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/5446/1/tests/common/skip.py
File tests/common/skip.py:

PS1, Line 123: --skip_local_only_tests
Could this ever be re-used by some other dev without that command line option 
being present? Then the error might be misleading.


PS1, Line 124: hbase_split
Is this used anywhere?


PS1, Line 125: HBase is not split on remote cluster the same as on the local 
cluster.
This doesn't tell me much, probably because I don't know how splitting works 
local and/or remote. Maybe you can rephrase to explain what exactly is 
different.


http://gerrit.cloudera.org:8080/#/c/5446/1/tests/query_test/test_mt_dop.py
File tests/query_test/test_mt_dop.py:

PS1, Line 95: @SkipIfRemoteTesting.local_cluster_only
>From just looking at this it is not clear to me whether this is skipped or 
>only runs locally / remotely.


-- 
To view, visit http://gerrit.cloudera.org:8080/5446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2605: Omit the sort and mini stress tests

2016-12-09 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-2605: Omit the sort and mini stress tests
..


Patch Set 2: Code-Review+1

I'm not going to stand in the way of progress, but for the record:
- We have no root cause for the hang
- We are removing test coverage
- We have no plan of restoring the test coverage

The fact that this test has not caught many failures is not necessarily an 
indication that it's useless. By that definition most tests would be useless.

Think about what would happen if this became an accepted procedure for getting 
rid of "inconvenient" test problems. I concede this case might be an exception, 
just wanted to explain my general reluctance given the facts.

I'm much more in favor of deleting code we don't need instead of leaving it and 
disabling it without understanding what is happening.

-- 
To view, visit http://gerrit.cloudera.org:8080/5401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd30abf8215415e0f2830b725e43b005daa2bb2d
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Add pytest option and skip markers for tests that only run locally.

2016-12-09 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: Add pytest option and skip markers for tests that only run 
locally.
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5446/1//COMMIT_MSG
Commit Message:

PS1, Line 7: Add pytest option and skip markers for tests that only run locally.
Please add Jira ID.


http://gerrit.cloudera.org:8080/#/c/5446/1/tests/common/skip.py
File tests/common/skip.py:

PS1, Line 120: class SkipIfRemoteTesting:
 :   """Skip marker for those tests that may not work on remote 
clusters."""
 :   local_cluster_only = 
pytest.mark.skipif(pytest.config.option.skip_local_only_tests,
 :   reason="--skip_local_only_tests specified")
I'm fine with the class name. I'm less enthused on the command line name. We 
already have one notion of "local", which is the local filesystem as opposed to 
HDFS et al. Now we are introducing another notion of "local". Is there a 
philosophical argument I'm missing against something like 
"--remote-cluster-tests" ? That is more precise and clear as to the nature of 
the testing.


-- 
To view, visit http://gerrit.cloudera.org:8080/5446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2057: Better error message for incorrect avro decimal column declaration

2016-12-09 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-2057: Better error message for incorrect avro decimal 
column declaration
..


IMPALA-2057: Better error message for incorrect avro decimal column
declaration

Adding a better error message when logical type is specified at a wrong
level or is not not specified in an avro decimal column declaration.

Change-Id: Iad23706128223b6537d565471ef5d8faa91b0b5a
Reviewed-on: http://gerrit.cloudera.org:8080/5255
Reviewed-by: Bharath Vissapragada 
Tested-by: Internal Jenkins
---
M fe/src/main/java/org/apache/impala/util/AvroSchemaParser.java
1 file changed, 24 insertions(+), 18 deletions(-)

Approvals:
  Bharath Vissapragada: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/5255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Iad23706128223b6537d565471ef5d8faa91b0b5a
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 


[Impala-ASF-CR] IMPALA-2057: Better error message for incorrect avro decimal column declaration

2016-12-09 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-2057: Better error message for incorrect avro decimal 
column declaration
..


Patch Set 5: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iad23706128223b6537d565471ef5d8faa91b0b5a
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] Add pytest option and skip markers for tests that only run locally.

2016-12-09 Thread David Knupp (Code Review)
David Knupp has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5446

Change subject: Add pytest option and skip markers for tests that only run 
locally.
..

Add pytest option and skip markers for tests that only run locally.

Add new skip marker to test_mt_dop.TestMtDopParquet.test_parquet_nested

Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f
---
M tests/common/skip.py
M tests/conftest.py
M tests/query_test/test_mt_dop.py
3 files changed, 11 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/5446/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Id4d6e46dc1e64ad20c727ccb19af7a9f3daf917f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-4638: Run queries with MT DOP through admission control.

2016-12-09 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5447

Change subject: IMPALA-4638: Run queries with MT_DOP through admission control.
..

IMPALA-4638: Run queries with MT_DOP through admission control.

The bug was a simple oversight: A TODO that was not addressed.

Testing: I examined query profiles before and after this
change with and without MT_DOP set. After this patch all
queries (with and without MT_DOP) go through admission control.

Change-Id: I3e05697a59e14617fb6a13ce8e0410820823fd3a
---
M be/src/scheduling/simple-scheduler.cc
1 file changed, 1 insertion(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/5447/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I3e05697a59e14617fb6a13ce8e0410820823fd3a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 


[Impala-ASF-CR] IMPALA-3524: Don't process spilled partitions with 0 probe rows

2016-12-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3524: Don't process spilled partitions with 0 probe rows
..


Patch Set 2:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/5389/2/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

Line 378: 
RETURN_IF_ERROR(input_partition_->build_partition()->build_rows()->PrepareForRead(
We don't even need to do this for many join modes. E.g. for INNER JOIN if the 
probe partition is empty, we can just discard the partition.

I think you actually implemented the optimisation for the harder join types 
(right outer join, etc) instead of the easier types (inner join, etc).


Line 387: runtime_profile_->AddInfoString(
The log message might be a bit confusing, since it's 

Maybe it's not worth exposing in the profile? Or exposing it as a simple 
counter.


Line 649:   if (output_unmatched_pos_ != -1) {
Can you factor out the two cases into different functions?


PS2, Line 651: DCHECK
DCHECK_EQ


Line 656: if (build_rows->num_rows() == build_rows->rows_returned()) {
Can't we skip this check and just look at eos below? BTS::GetNext() has the 
exact same check to return eos.  It's kind of weird that we're doing this and 
ignoring eos.


Line 666:   for (RowBatch::Iterator 
batch_iter(output_unmatched_batch_.get(), output_unmatched_pos_);
Can we just save the batch iterator instead of output_unmatched_pos_?


Line 671: if (join_op_ == TJoinOp::RIGHT_ANTI_JOIN) {
This output row creation is kind of subtle and duplicated - factor out into a 
helper function?


Line 691:   DCHECK(output_build_partitions_.empty());
Why is this true?


http://gerrit.cloudera.org:8080/#/c/5389/2/be/src/exec/partitioned-hash-join-node.h
File be/src/exec/partitioned-hash-join-node.h:

PS2, Line 460: output_unmatched_batch_
Normally we single-quote variable names in comments so they stand out a bit 
more. Many of the surrounding comments don't but we should do it for new ones.


Line 464:   std::unique_ptr output_unmatched_batch_;
We should mention that it's a batch from output_build_partitions_.front()


http://gerrit.cloudera.org:8080/#/c/5389/2/testdata/workloads/functional-query/queries/QueryTest/spilling.test
File testdata/workloads/functional-query/queries/QueryTest/spilling.test:

Line 631:  QUERY
Can you comment which join modes these are testing?


-- 
To view, visit http://gerrit.cloudera.org:8080/5389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I175b32dd9031e51218b38c37693ac3e31dfab47b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add a build flag for the undefined behavior sanitizer, aka "ubsan".

2016-12-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Add a build flag for the undefined behavior sanitizer, aka 
"ubsan".
..


Patch Set 2:

If they're relying on twos-complement we could just cast to unsigned for the 
bit-manipulation, right?

-- 
To view, visit http://gerrit.cloudera.org:8080/5082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I88c7234bd7c5eb7404490a0913d90470c10835e7
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4631: don't use floating point operations for time unit conversions

2016-12-09 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4631: don't use floating point operations for time unit 
conversions
..


Patch Set 5: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/639/

-- 
To view, visit http://gerrit.cloudera.org:8080/5434
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7237f579b201f5bd3930f66e9c2c8d700c37ffeb
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2605: Omit the sort and mini stress tests

2016-12-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-2605: Omit the sort and mini stress tests
..


Patch Set 2:

I don't think concurrent_select.py is that well suited without some adaption. 
You can run something like the below command on the mini-cluster but it takes 
quite a while to warm up. It would be suitable to run maybe in a post-commit 
job where we can let it run longer.

  ./tests/stress/concurrent_select.py --tpch-db tpch_parquet --max-queries=100

-- 
To view, visit http://gerrit.cloudera.org:8080/5401
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibd30abf8215415e0f2830b725e43b005daa2bb2d
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3989: Display skew warning for poorly formatted Parquet files

2016-12-09 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-3989: Display skew warning for poorly formatted Parquet 
files
..


Patch Set 4:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/5400/4/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

Line 75:   bool poorly_formatted_file_warned;
Presumably you did this to prevent excessive logging, but I think its 
unnecessary. Removing it won't result in excess logging because error messages 
are grouped by their error code (eg. in the shell you'll just see one instance 
of the message followed by '(1 of X similar)', and its better not to clutter up 
this struct.

If you're still worried about excessive logging, you could move the call to 
LogPoorlyFormattedParquetFileWarning to the end of NextRowGroup outside of the 
'while(true)'.

In fact, if you remove this, then LogPoorlyFormattedParquetFileWarning isn't 
really doing much and you could eliminate it and just do the logging inline in 
NextRowGroup.


http://gerrit.cloudera.org:8080/#/c/5400/4/common/thrift/generate_error_codes.py
File common/thrift/generate_error_codes.py:

PS4, Line 315:
Remove spaces, here and below.

Sorry, I know we're inconsistent about our Python style, but we try to mostly 
follow PEP8:
https://www.python.org/dev/peps/pep-0008/#indentation


http://gerrit.cloudera.org:8080/#/c/5400/4/tests/query_test/test_scanners.py
File tests/query_test/test_scanners.py:

PS4, Line 319: 'NumScannersWithMisalignedRowGroups'
NumScannersWithNoReads


PS4, Line 321: """
Move """ to next line.

Another place where we're inconsistent with our Python style:
https://www.python.org/dev/peps/pep-0008/#documentation-strings


PS4, Line 347: num_scanners_misaligned_row_groups
num_scanners_with_no_reads


PS4, Line 349: excuting
executing


PS4, Line 351: 'num_scanners_misaligned_row_groups'
num_scanners_with_no_reads


PS4, Line 352: """
Move to next line.


PS4, Line 362: 'NumScannersWithMisalignedRowGroups:
NumScannersWithNoReads


-- 
To view, visit http://gerrit.cloudera.org:8080/5400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf48d978383d73efdade733a892e795ebd53c76a
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4355: random query generator: modify statement execution flow to support DML

2016-12-09 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4355: random query generator: modify statement execution 
flow to support DML
..


Patch Set 1:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/5387/1//COMMIT_MSG
Commit Message:

PS1, Line 11: which
wording: randomly choose a table on which to execute


PS1, Line 12:  copied to run the DML statement 
I'm not really following the logic. We randomly choose a table because we need 
that table copied?


Line 36:   modes. The first is DML_SETUP: this is a DML statement that needs to 
be
long line. By the way, did you know that there is an automatic way to wrap text 
in vim? set textwidth=72 for example, 
http://stackoverflow.com/questions/823754/how-can-i-wrap-text-to-some-length-in-vim


http://gerrit.cloudera.org:8080/#/c/5387/1/tests/comparison/discrepancy_searcher.py
File tests/comparison/discrepancy_searcher.py:

Line 635: # 1. Don't have to spend more time parsing SHOW CREATE TABLE 
output just to get it
This point is not very clear. Why do we need to parse anything, do we already 
have a table object passed in to this function?


Line 637: # 2. Ensure we copy the table-under-test exactly. This includes 
primary keys,
It's not really clear to me either. Isn't the table object that's passed in 
already exact?


Line 695: result = 
query_result_comparator.compare_query_results(table_copy_statement)
So we copy the entire table for every query that we want to execute? Won't this 
be slow, or are we planning to use small tables for testing?


http://gerrit.cloudera.org:8080/#/c/5387/1/tests/comparison/query.py
File tests/comparison/query.py:

Line 111: self.execution = StatementExecutionMode.SELECT_STATEMENT
Maybe better to set it to None by default, like other variables? This way, you 
won't forget to set the mode when creating the object.


http://gerrit.cloudera.org:8080/#/c/5387/1/tests/comparison/statement_generator.py
File tests/comparison/statement_generator.py:

Line 32: class InsertStatementGenerator(object):
Is the plan to also support all DML statements for this release?


PS1, Line 53: (_,
since you're using the _ variable, you should name it something else.


-- 
To view, visit http://gerrit.cloudera.org:8080/5387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4c63a2223185d0e056cc5713796772e5d1b8414
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4631: don't use floating point operations for time unit conversions

2016-12-09 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4631: don't use floating point operations for time unit 
conversions
..


Patch Set 5: Code-Review+2

Rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/5434
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7237f579b201f5bd3930f66e9c2c8d700c37ffeb
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

2016-12-09 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
..


Patch Set 12: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers

2016-12-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#13).

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
..

IMPALA-3200: Implement suballocator for splitting buffers

This is useful for situations like hash tables, where we want to
make multiple non-spillable allocations of variable size from buffer
pool memory and not incur the overhead of interacting with the global
buffer pool. The allocator subdivides buffers to service allocations
and uses a buddy allocation algorithm to merge freed allocations into
larger chunks. This helps avoid external fragmentation and is quite
effective at reusing memory given the typical doubling allocation
patterns of hash tables in partitioned aggs and joins.

Testing:
The allocator has fairly robust internal consistency checks via
assertions and unique_ptrs. Includes a unit test that exercises
various allocation patterns.

I tested porting hash tables over to allocate memory using an earlier
version of the suballocator, which worked well (was able to run a
wide range of queries successfully with good performance).

Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
---
M be/src/common/names.h
M be/src/exec/hash-table.h
M be/src/runtime/bufferpool/CMakeLists.txt
A be/src/runtime/bufferpool/suballocator-test.cc
A be/src/runtime/bufferpool/suballocator.cc
A be/src/runtime/bufferpool/suballocator.h
6 files changed, 933 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/4715/13
-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers

2016-12-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
..


Patch Set 13:

Rebased onto the change that moves bufferpool/ under runtime/

-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers

2016-12-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
..


Patch Set 4:

(3 comments)

Sorry for the slow turnaround, got caught up in things.

http://gerrit.cloudera.org:8080/#/c/4715/4//COMMIT_MSG
Commit Message:

Line 14: larger chunks. This helps avoid fragmentation and is quite effective
> I think there are a few ways to get both what I am concerned about (buddy a
Gotcha.

I added the static_assert - that is a good idea.

I think the Atom idea is nice but I think we end up with the same problem with 
large buffers, since the buffer pool only supports powers-of-two.


http://gerrit.cloudera.org:8080/#/c/4715/11/be/src/bufferpool/suballocator-test.cc
File be/src/bufferpool/suballocator-test.cc:

Line 107:   // Make allocations smaller than the buffer size.
> What happened to the const?
The const was incorrect - I pushed a patchset without the fix. The problem is 
that GetUsedReservation() is a non-const method since it acquires the mutex.


http://gerrit.cloudera.org:8080/#/c/4715/4/be/src/bufferpool/suballocator.h
File be/src/bufferpool/suballocator.h:

Line 171:   Suballocation* prev_free_;
> Just so I'm clear, in the current system, a Suballocation may have several 
Yep, that's true. I realise that it's really not obvious why the lifetime of 
'buddy_' and 'prev_free_' is shorter than the unique_ptr. I documented that and 
then described the ownership cases in the class header.


-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3200: Implement suballocator for splitting buffers

2016-12-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#12).

Change subject: IMPALA-3200: Implement suballocator for splitting buffers
..

IMPALA-3200: Implement suballocator for splitting buffers

This is useful for situations like hash tables, where we want to
make multiple non-spillable allocations of variable size from buffer
pool memory and not incur the overhead of interacting with the global
buffer pool. The allocator subdivides buffers to service allocations
and uses a buddy allocation algorithm to merge freed allocations into
larger chunks. This helps avoid external fragmentation and is quite
effective at reusing memory given the typical doubling allocation
patterns of hash tables in partitioned aggs and joins.

Testing:
The allocator has fairly robust internal consistency checks via
assertions and unique_ptrs. Includes a unit test that exercises
various allocation patterns.

I tested porting hash tables over to allocate memory using an earlier
version of the suballocator, which worked well (was able to run a
wide range of queries successfully with good performance).

Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
---
M be/src/bufferpool/CMakeLists.txt
A be/src/bufferpool/suballocator-test.cc
A be/src/bufferpool/suballocator.cc
A be/src/bufferpool/suballocator.h
M be/src/common/names.h
M be/src/exec/hash-table.h
6 files changed, 933 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/4715/12
-- 
To view, visit http://gerrit.cloudera.org:8080/4715
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8bfe0e429f67ad273f7c7d0816703a9e6c3da788
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4403: Implement SHOW RANGE PARTITIONS for Kudu tables

2016-12-09 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4403: Implement SHOW RANGE PARTITIONS for Kudu tables
..


Patch Set 11: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/5390
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idf5b2fdd02938a42fa59ec98884e4ac915dd1f65
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4403: Implement SHOW RANGE PARTITIONS for Kudu tables

2016-12-09 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4403: Implement SHOW RANGE PARTITIONS for Kudu tables
..


Patch Set 11: Code-Review+1

Rebased, ran ParserTest, AnalyzeDDLTest, kudu_alter.test and 
kudu_partition_ddl.test locally.

-- 
To view, visit http://gerrit.cloudera.org:8080/5390
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idf5b2fdd02938a42fa59ec98884e4ac915dd1f65
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4631: don't use floating point operations for time unit conversions

2016-12-09 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4631: don't use floating point operations for time unit 
conversions
..


Patch Set 4: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/636/

-- 
To view, visit http://gerrit.cloudera.org:8080/5434
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7237f579b201f5bd3930f66e9c2c8d700c37ffeb
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: No


[Impala-ASF-CR] Bump Kudu server version to latest master (a70c905006)

2016-12-09 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: Bump Kudu server version to latest master (a70c905006)
..


Bump Kudu server version to latest master (a70c905006)

This also re-enabled kudu_alter.test, which was disabled in IMPALA-4628.

Change-Id: Ie5acdeffea7ed9a68ce0f48d1f68c6c922044704
Reviewed-on: http://gerrit.cloudera.org:8080/5427
Reviewed-by: Matthew Jacobs 
Tested-by: Internal Jenkins
---
M bin/impala-config.sh
M tests/query_test/test_kudu.py
2 files changed, 2 insertions(+), 3 deletions(-)

Approvals:
  Matthew Jacobs: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/5427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ie5acdeffea7ed9a68ce0f48d1f68c6c922044704
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] Bump Kudu server version to latest master (a70c905006)

2016-12-09 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: Bump Kudu server version to latest master (a70c905006)
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie5acdeffea7ed9a68ce0f48d1f68c6c922044704
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4467: Add support for DML statements in stress test

2016-12-09 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4467: Add support for DML statements in stress test
..


Patch Set 3: Code-Review+1

(5 comments)

http://gerrit.cloudera.org:8080/#/c/5093/2/tests/stress/concurrent_select.py
File tests/stress/concurrent_select.py:

PS2, Line 1417: 
> For performance, it's faster to check if an element exists in a set than a 
Done


PS2, Line 1464:   insert_query.modifies_table = False
> Not sure, is it always going to be the case that a column is updatable if a
It's possible updatale_column_names is a bad name. In any case, it currently 
returns all the columns that are not primary keys, very similar to your 
expression above.


PS2, Line 1466:   insert_query.name = "insert_{0}".format(table.name)
  :   insert_query.db_name = cursor.db_name
  :   insert_query.sql = (
> Done
That makes a notable difference. Thanks.


http://gerrit.cloudera.org:8080/#/c/5093/3/tests/stress/concurrent_select.py
File tests/stress/concurrent_select.py:

PS3, Line 1461: this will still be still
Some extra words here.


PS3, Line 1657: quereis
spelling: queries


-- 
To view, visit http://gerrit.cloudera.org:8080/5093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4631: don't use floating point operations for time unit conversions

2016-12-09 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-4631: don't use floating point operations for time unit 
conversions
..


Patch Set 4: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/5434
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7237f579b201f5bd3930f66e9c2c8d700c37ffeb
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4630: remove debug webpage easter egg

2016-12-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged.

Change subject: IMPALA-4630: remove debug webpage easter egg
..


IMPALA-4630: remove debug webpage easter egg

Change-Id: Ic2b1eb876dcec71a56bf76ea5f045818c6cd9a78
Reviewed-on: http://gerrit.cloudera.org:8080/5429
Tested-by: Impala Public Jenkins
Reviewed-by: Tim Armstrong 
---
M www/common-footer.tmpl
1 file changed, 3 insertions(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5429
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ic2b1eb876dcec71a56bf76ea5f045818c6cd9a78
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Martin Grund 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4630: remove debug webpage easter egg

2016-12-09 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4630: remove debug webpage easter egg
..


Patch Set 3: Code-Review+2

carry +2 from dan

-- 
To view, visit http://gerrit.cloudera.org:8080/5429
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic2b1eb876dcec71a56bf76ea5f045818c6cd9a78
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Martin Grund 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2057: Better error message for incorrect avro decimal column declaration

2016-12-09 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-2057: Better error message for incorrect avro decimal 
column declaration
..


Patch Set 5: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/5255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iad23706128223b6537d565471ef5d8faa91b0b5a
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3989: Display skew warning for poorly formatted Parquet files

2016-12-09 Thread Attila Jeges (Code Review)
Attila Jeges has posted comments on this change.

Change subject: IMPALA-3989: Display skew warning for poorly formatted Parquet 
files
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/5400/3/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 175:   ADD_COUNTER(scan_node_->runtime_profile(), 
"NumScannersWithNoReads", TUnit::UNIT);
> I think the name here is confusing - it seems like its really counting the 
Done


http://gerrit.cloudera.org:8080/#/c/5400/3/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

Line 868: void HdfsScanNodeBase::LogPoorlyFormattedParquetFileWarning(const 
std::string& filename) {
> 'string&'
Done


http://gerrit.cloudera.org:8080/#/c/5400/3/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

Line 279:   void LogPoorlyFormattedParquetFileWarning(const std::string& 
filename);
> 'string&' instead of '&filename'
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf48d978383d73efdade733a892e795ebd53c76a
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3989: Display skew warning for poorly formatted Parquet files

2016-12-09 Thread Attila Jeges (Code Review)
Attila Jeges has uploaded a new patch set (#4).

Change subject: IMPALA-3989: Display skew warning for poorly formatted Parquet 
files
..

IMPALA-3989: Display skew warning for poorly formatted Parquet files

Parquet files are scanned in the granularity of row groups. Each row
group belongs to one or more splits and each split is scanned by a
separate parquet scanner.

If some row groups span multiple splits, then we will most likely end
up seeing some scanners having remote reads and some scanners not
performing scans at all. This will attribute to skew across the
cluster where distribution of scans is uneven.

This change adds a counter (NumScannersWithNoReads) to the scan node's
runtime profile to track the number of parquet scanners that end up
doing no reads becuse of poor formatting. It also displays a warning
message when a misaligned row group is found.

Change-Id: Ibf48d978383d73efdade733a892e795ebd53c76a
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M common/thrift/generate_error_codes.py
M tests/query_test/test_scanners.py
6 files changed, 139 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/5400/4
-- 
To view, visit http://gerrit.cloudera.org:8080/5400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibf48d978383d73efdade733a892e795ebd53c76a
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-3989: Display skew warning for poorly formatted Parquet files

2016-12-09 Thread Attila Jeges (Code Review)
Attila Jeges has uploaded a new patch set (#4).

Change subject: IMPALA-3989: Display skew warning for poorly formatted Parquet 
files
..

IMPALA-3989: Display skew warning for poorly formatted Parquet files

Parquet files are scanned in the granularity of row groups. Each row
group belongs to one or more splits and each split is scanned by a
separate parquet scanner.

If some row groups span multiple splits, then we will most likely end
up seeing some scanners having remote reads and some scanners not
performing scans at all. This will attribute to skew across the
cluster where distribution of scans is uneven.

This change adds a counter (NumScannersWithNoReads) to the scan node's
runtime profile to track the number of parquet scanners that end up
doing no reads becuse of poor formatting. It also displays a warning
message when a misaligned row group is found.

Change-Id: Ibf48d978383d73efdade733a892e795ebd53c76a
---
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M common/thrift/generate_error_codes.py
M tests/query_test/test_scanners.py
6 files changed, 139 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/5400/4
-- 
To view, visit http://gerrit.cloudera.org:8080/5400
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibf48d978383d73efdade733a892e795ebd53c76a
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

2016-12-09 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
..


Patch Set 12: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/632/

-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

2016-12-09 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
..


Patch Set 12: Code-Review+2

fixed merge accidents

-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

2016-12-09 Thread Marcel Kornacker (Code Review)
Hello Impala Public Jenkins, Internal Jenkins, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4418

to look at the new patch set (#12).

Change subject: IMPALA-4014: Introduce query-wide execution state.
..

IMPALA-4014: Introduce query-wide execution state.

This introduces a global structure to coordinate execution
of fragment instances on a backend for a single query.

New classes:
- QueryExecMgr: subsumes FragmentMgr
- QueryState
- FragmentInstanceState: replaces FragmentExecState

Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/exec/catalog-op-executor.cc
M be/src/exec/data-source-scan-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/hash-table-test.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/union-node.cc
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
R be/src/runtime/fragment-instance-state.cc
A be/src/runtime/fragment-instance-state.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
A be/src/runtime/query-exec-mgr.cc
A be/src/runtime/query-exec-mgr.h
A be/src/runtime/query-state.cc
A be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/request-pool-service.cc
M be/src/scheduling/simple-scheduler.cc
M be/src/service/CMakeLists.txt
M be/src/service/fe-support.cc
D be/src/service/fragment-exec-state.h
D be/src/service/fragment-mgr.cc
D be/src/service/fragment-mgr.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
A be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
M be/src/testutil/desc-tbl-builder.h
M be/src/udf/udf.cc
M be/src/util/container-util.h
M be/src/util/thread.h
M be/src/util/uid-util.h
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
M fe/src/test/java/org/apache/impala/service/FrontendTest.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
67 files changed, 1,178 insertions(+), 782 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/4418/12
-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil