[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime

2016-11-21 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3200: move bufferpool under runtime
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime

2016-11-21 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3200: move bufferpool under runtime
..


IMPALA-3200: move bufferpool under runtime

It is arguably a subcomponent of the runtime system

Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd
Reviewed-on: http://gerrit.cloudera.org:8080/5165
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M .clang-tidy
M be/CMakeLists.txt
M be/src/runtime/CMakeLists.txt
R be/src/runtime/bufferpool/CMakeLists.txt
R be/src/runtime/bufferpool/buffer-allocator.cc
R be/src/runtime/bufferpool/buffer-allocator.h
R be/src/runtime/bufferpool/buffer-pool-counters.h
R be/src/runtime/bufferpool/buffer-pool-test.cc
R be/src/runtime/bufferpool/buffer-pool.cc
R be/src/runtime/bufferpool/buffer-pool.h
R be/src/runtime/bufferpool/reservation-tracker-counters.h
R be/src/runtime/bufferpool/reservation-tracker-test.cc
R be/src/runtime/bufferpool/reservation-tracker.cc
R be/src/runtime/bufferpool/reservation-tracker.h
M be/src/runtime/mem-tracker.cc
15 files changed, 25 insertions(+), 27 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables

2016-11-21 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables
..


Patch Set 2:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/5136/2/common/thrift/JniCatalog.thrift
File common/thrift/JniCatalog.thrift:

Line 185:   4: optional string location
> restore original numbers
Oops, sorry about that. Done


http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java:

Line 88:   throw new AnalysisException("Key-value partition specs are not 
supported for " +
> how about "ALTER TABLE ADD PARTITION is not supported for Kudu tables." or 
Done


http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableAddRangePartitionStmt.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddRangePartitionStmt.java:

Line 50: if (ifNotExists_) {
> single line
Done


Line 79: return;
> remove
Done


http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java:

Line 123: if (c.hasEncoding() || c.hasCompression() || 
c.hasBlockSize()) {
> Should we let Kudu handle this case and the one below? Does Kudu behave san
Unfortunately we can't do that in this case. There is no Kudu API call that 
accepts all these options, so we have to reject them here.


http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java
File fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java:

Line 103: if (newColDef_.hasKuduOptions()) {
> we cannot change the default value?
Currently not supported. They are working on changing this behavior but it's 
not there yet.


http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java:

Line 79:   throw new AnalysisException("Key-value partition specs are not 
supported for " +
> ALTER TABLE DROP PARTITION is not supported for Kudu tables
Done


http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableDropRangePartitionStmt.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableDropRangePartitionStmt.java:

Line 32: public class AlterTableDropRangePartitionStmt extends AlterTableStmt {
> code seems very similar to AlterTableAddRangePartitionStmt, should we just 
Done


Line 77: return;
> remove
Done


http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

Line 553:   case UPDATE_STATS:
> I think we should be able to avoid the copy+paste by adding an additional c
Good idea. Done


Line 2231:   KuduCatalogOpExecutor.renameTable((KuduTable) tbl,
> I don't think we should rename the Kudu table for external tables. An exter
Done


http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

Line 1831: String[] addDrop = {"add if not exists", "drop if exists"};
> what about just "add" and "drop"
Added. Done


Line 1868: AnalysisError("alter table functional_kudu.testtbl add columns 
(a1 int)",
> How do we know a1 is going to be non-nullable at this point (in analysis)? 
Done


Line 1875: AnalysisError("alter table functional_kudu.testtbl add columns " 
+
> use a loop to cover the cases separately (if you decide to keep these check
As I mentioned in a previous comment, we can't defer these to Kudu (not 
supported API), so they have to be checked during the analysis. Added the tests 
here.


http://gerrit.cloudera.org:8080/#/c/5136/2/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test:

Line 52:   distribute by range (id) (partition 1 < values <= 10) stored as kudu
> Test dropping all partitions and scanning/inserting into the empty table.
Done


Line 156: # Insert a row that doesn't have values for the new columns; defaults 
are used
> Sorry this doesn't make sense. I was thinking of ALTERing the default value
No worries, we will have to add tests like that when we have the ability to 
change default values.


Line 157: insert into tbl_to_alter (id,name,vali) values (3, 'test', 200)
> Can you try this same test but with NULLs for the new columns? The insert s
Done


Line 257: create table copy_of_tbl (a int primary key) distribute by hash (a) 
into 3 buckets
> new test for the external table "poi

[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables

2016-11-21 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has uploaded a new patch set (#4).

Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables
..

IMPALA-2890: Support ALTER TABLE statements for Kudu tables

With this commit, we add support for additional ALTER TABLE statements
against Kudu tables. The new supported ALTER TABLE operations for Kudu are:
- ADD/DROP range partitions. Syntax:
ALTER TABLE  ADD [IF NOT EXISTS] RANGE 
ALTER TABLE  DROP [IF EXISTS] RANGE 
- ADD/DROP/RENAME column. Syntax:
ALTER TABLE  ADD COLUMNS (col_spec, [col_spec, ...])
ALTER TABLE  DROP COLUMN 
ALTER TABLE  CHANGE COLUMN   
- Rename Kudu table using the 'kudu.table_name' table property. Example:
  ALTER TABLE  SET TBLPROPERTY ('kudu.tbl_name'=''),
  will change the underlying Kudu table name to .
- Renaming the HMS/Catalog table entry of a Kudu table is supported using the
  existing ALTER TABLE  RENAME TO  syntax.

Not supported:
- ALTER TABLE  REPLACE COLUMNS

Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
A 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetFileFormatStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetLocationStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/DistributeParam.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M tests/query_test/test_kudu.py
20 files changed, 964 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/5136/4
-- 
To view, visit http://gerrit.cloudera.org:8080/5136
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 


[Impala-ASF-CR] IMPALA-4511: Add missing total time counter() to PFE::Exec()

2016-11-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4511: Add missing total_time_counter() to PFE::Exec()
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5149
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifdd6c059bdf022303920720291ebb319a69d60ec
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4511: Add missing total time counter() to PFE::Exec()

2016-11-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4511: Add missing total_time_counter() to PFE::Exec()
..


IMPALA-4511: Add missing total_time_counter() to PFE::Exec()

Also add sanity check in debug build to catch this kind of bug in the
future.

Change-Id: Ifdd6c059bdf022303920720291ebb319a69d60ec
Reviewed-on: http://gerrit.cloudera.org:8080/5149
Reviewed-by: Dan Hecht 
Tested-by: Internal Jenkins
---
M be/src/runtime/plan-fragment-executor.cc
1 file changed, 11 insertions(+), 0 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Dan Hecht: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5149
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ifdd6c059bdf022303920720291ebb319a69d60ec
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins


[Impala-ASF-CR] IMPALA-4432: Handle internal codegen disabling properly

2016-11-21 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4432: Handle internal codegen disabling properly
..


Patch Set 7: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5105/7/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

PS7, Line 168: The 
This?


PS7, Line 195: or there are expressions in the fragment
 :   ///which cannot be interpreted.
on second thought, maybe #2 makes more sense if worded like you had it: "it's 
not disabled by the internal hint or ..." given how the second clause works. up 
to you if you want to revert.


-- 
To view, visit http://gerrit.cloudera.org:8080/5105
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0b6a9ed723c64ba21b861608583cc9b6607d3397
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3342: Add thread counters to monitor plan fragment execution

2016-11-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3342: Add thread counters to monitor plan fragment 
execution
..


Patch Set 10: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4633
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifa88aa6f3371fa42d11ecc122f43c7d83623c300
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yonghyun Hwang
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

2016-11-21 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
..


Patch Set 2:

(18 comments)

High-level comments. Let's get through them first and then I can dig in deeper.

http://gerrit.cloudera.org:8080/#/c/5148/2/fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java
File fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java:

Line 27:  * This class encapsulates the logic required for mapping HDFS
Much fluff, simplify to: Maps HDFS storage-UUIDs to per-host 0-based, 
sequential disk ids required by Impala backends.


Line 29:  * by the backend.
required by Impala backends


Line 31: public class DiskIdMapper {
Make this a singleton to clarify the intended use.


Line 36: // This is intentionally implemented as a static object to share 
the storageIDs across
move this part to the class comment


Line 38: // consistent across all the tables, without which, we can 
potentially confuse the
also mention that the static solution requires the least amount of memory


Line 48: private static ConcurrentHashMap
Don't think we need a ConcurrentHashMap or AtomicInteger here, because whenever 
we update this entry, we also need to protect storageIdToInt, so we need a 
coarser-grained lock.


Line 53:  * TODO (Bharath): This method is still prone to some race 
conditions. Fix it before
Correct. I don't think the ConcurrentHashMaps will help us because we need to 
atomically update both maps sometimes.


Line 71: storageIdGenerator.put(host, new AtomicInteger(diskId));
Couple of races here like you already pointed out.


http://gerrit.cloudera.org:8080/#/c/5148/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 294:   // Keeps track of all the parition paths that are filtered to add 
block metadata.
this comment style for classes and functions

/**
 *
 */


Line 368: if 
(!pathMd.filteredParitionPaths.contains(fileStatus.getPath())) continue;
Looks like this contains() and the indexOf() below can become very expensive.

Instead of the 3 arrays in pathMd we should consider using a HashMap from path 
to a class that contains the relevant metadata (FD and format).


Line 423: for (int i=0; i < fileMd.pathFds.size(); ++i) {
i = 0 (add spaces)


Line 458:   // Only build the storage IDs fs instances that support the 
BlockLocation API.
... for fs instances ...


Line 720:   private void loadAllPartitions(
After reading through this a few times I'm still confused about the high-level 
code flow. It would be great to summarize the high-level steps in the function 
comment.

My understanding is that we do something like this:
1. createPartition() for every partition; that populates pathMd; internally 
createPartition() lists all files in the corresponding partition directory
2. load the block location metadata based on the contents of pathMd

The process above seems a bit backwards to me. In the first step we list all 
files under the partition's directory... and then we do the same thing again in 
step 2.

Intuitively, I was thinking it should look something like this:

1. collect all dirs to call DFS.listFiles() on
2. for each dir, call listFiles(), examine the contents and map them to an 
msPartition. Based on that complete info do some filtering/checks and finally 
createPartition().

Another chat would probably help me understand the ideas here better.


Line 777: if (dirsToLoad.contains(partDir) ||
positive is more readable to me

if (!dirsToLoad.contains() && isChildPath()) {
  // Partition with a non-default data directory.
  dirsToLoad.add();
}


http://gerrit.cloudera.org:8080/#/c/5148/2/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java:

Line 275:* Returns true if the filesystem supports BlockLocation API.
What API is it specifically? LocatedFileStatus.getBlockLocations()?


Line 277:   public static boolean supportsBlockLocationAPI(FileSystem fs) {
supportsBlockLocationApi


Line 406:   public static boolean isPathChild(Path p1, Path p2) {
isChildPath(Path p, Path parent)


Line 408: Path parentDir = p1.getParent();
I think this can be improved by using the Path.getDepth(). You can do 
getParent() a fixed number of times and then do one equals() comparison.

(Basically you walk up the path starting at p1 until the parentDir is at the 
same depth as p2)


-- 
To view, visit http://gerrit.cloudera.org:8080/5148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4494: Fix crash in SimpleScheduler

2016-11-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4494: Fix crash in SimpleScheduler
..


Patch Set 6: Code-Review+1

(1 comment)

Thanks for your patience, I think this is looking good.

http://gerrit.cloudera.org:8080/#/c/5127/6/be/src/scheduling/simple-scheduler.cc
File be/src/scheduling/simple-scheduler.cc:

Line 250: << be_desc.address;
Could use continue; here to be consistent with the above checks.


-- 
To view, visit http://gerrit.cloudera.org:8080/5127
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6e1196a2fa47e5954c4a190aa326c135d039a77f
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Preview: IMPALA-4467: Add support for CRUD operations in stress test

2016-11-21 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#2).

Change subject: Preview: IMPALA-4467: Add support for CRUD operations in stress 
test
..

Preview: IMPALA-4467: Add support for CRUD operations in stress test

NOTE: This is a preview, I'm only looking for high level comments about
the overall design of the feature.

- Added support for upsert and delete statements. Update and insert
  still need to be done.
- Added support for compute stats.
- Added support for query options to queries.

Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
---
M tests/stress/concurrent_select.py
M tests/util/calculation_util.py
2 files changed, 254 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/5093/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-4516: Don't hold process wide lock connection to sessions map lock while cancelling queries

2016-11-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4516: Don't hold process wide lock 
connection_to_sessions_map_lock_ while cancelling queries
..


Patch Set 1:

Change looks safe to me. We should update the comment in impala-server.h to 
reflect that it's now a terminal lock

-- 
To view, visit http://gerrit.cloudera.org:8080/5173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9fe37955027ad9fec3fdbbbcb199245c79bcac71
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime

2016-11-21 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-3200: move bufferpool under runtime
..


Patch Set 3:

Build started: 
http://ec2-35-161-220-160.us-west-2.compute.amazonaws.com:8080/job/gerrit-verify-dryrun/38/

-- 
To view, visit http://gerrit.cloudera.org:8080/5165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4516: Don't hold process wide lock connection to sessions map lock while cancelling queries

2016-11-21 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4516: Don't hold process wide lock 
connection_to_sessions_map_lock_ while cancelling queries
..


Patch Set 1:

Stress test running successfully for 5 hours so far.

-- 
To view, visit http://gerrit.cloudera.org:8080/5173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9fe37955027ad9fec3fdbbbcb199245c79bcac71
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4516: Don't hold process wide lock connection to sessions map lock while cancelling queries

2016-11-21 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5173

Change subject: IMPALA-4516: Don't hold process wide lock 
connection_to_sessions_map_lock_ while cancelling queries
..

IMPALA-4516: Don't hold process wide lock connection_to_sessions_map_lock_ 
while cancelling queries

We hold the connection_to_sessions_map_lock_ while closing multiple
sessions, which could map to a large number of queries, which means an
even larger number of fragments. We hold this process wide lock and a
series of other locks while sending cancel RPCs to all the fragments
that fall under the above mentioned category.

This could slow down the responsiveness to the client by the daemon.
Moreover, holding the lock is unnecessary and we can do without it.

This patch fixes this path by dropping the lock before we call
CloseSessionInternal().

Change-Id: I9fe37955027ad9fec3fdbbbcb199245c79bcac71
---
M be/src/service/impala-server.cc
1 file changed, 18 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/73/5173/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I9fe37955027ad9fec3fdbbbcb199245c79bcac71
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4431: Add audit event log control mechanism to prevent disk overflow

2016-11-21 Thread Donghui Xu (Code Review)
Donghui Xu has posted comments on this change.

Change subject: IMPALA-4431: Add audit event log control mechanism to prevent 
disk  overflow
..


Patch Set 11:

(3 comments)

I have modified the code according to your opinion. The minimum number of audit 
event log file should be 1 and 0 represents no limit.

http://gerrit.cloudera.org:8080/#/c/4971/10/be/src/common/logging.cc
File be/src/common/logging.cc:

Line 181:   if (max_log_files < 1) return;
> Why do we not allow 1 here? Is there a particular reason why it wouldn't wo
Done


PS10, Line 182: Check audit event log files
  :   // Build glob pattern for input e.g. 
/tmp/impala_audit_event_log_1.0-*
  :   string fname = strings::Substitute(
> Maybe some of this could go on fewer lines.
Done


Line 188: }
> nit: only one newline here.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4971
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c3229cbdb6275f969c15258c9ccab6efeb24368
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4431: Add audit event log control mechanism to prevent disk overflow

2016-11-21 Thread Donghui Xu (Code Review)
Donghui Xu has uploaded a new patch set (#11).

Change subject: IMPALA-4431: Add audit event log control mechanism to prevent 
disk  overflow
..

IMPALA-4431: Add audit event log control mechanism to prevent disk 
overflow

There is no limit to the number of audit event log files. When audit
event log is enabled the growing number of log files will fill up the
disk space.

This patch adds checking and rotation mechanism on audit event log
files to prevent file number out of control which causes disk overflow.

Change-Id: I8c3229cbdb6275f969c15258c9ccab6efeb24368
---
M be/src/common/init.cc
M be/src/common/logging.cc
M be/src/common/logging.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
5 files changed, 36 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/4971/11
-- 
To view, visit http://gerrit.cloudera.org:8080/4971
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c3229cbdb6275f969c15258c9ccab6efeb24368
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-2523: Make HdfsTableSink aware of clustered input

2016-11-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-2523: Make HdfsTableSink aware of clustered input
..


IMPALA-2523: Make HdfsTableSink aware of clustered input

IMPALA-2521 introduced clustering for insert statements. This change
makes the HdfsTableSink aware of clustered inputs, so that partitions
are opened, written, and closed one by one.

This change also adds/modifies tests in several ways:

- clustered insert tests switch from selecting all rows from
  alltypessmall to alltypes. Together with varying settings for
  batch_size, this results in a larger number of row batches being
  written.
- clustered insert tests select from alltypes instead of
  functional.alltypes to make sure we also select from various input
  formats.
- clustered insert tests have been added to select from alltypestiny to
  create inserts with 1 and 2 rows per partition respectively.
- exhaustive insert tests now use different values for batch_size: 1,
  16, 0 (meaning default, 1024). This is limited to uncompressed parquet
  files, to maintain a reasonable runtime. On my machine execution of
  test.insert took 1778 seconds, compared to 1002 seconds with the just
  default row batch size.
- There is additional testing in test_insert_behaviour.py to make sure
  that insertion over several row batches only creates one file per
  partition.
- It renames the test_insert method to make it unique in the file and
  allow for effective filtering with -k.
- It adds tests to the Analyzer test suite.

Change-Id: Ibeda0bdabbfe44c8ac95bf7c982a75649e1b82d0
Reviewed-on: http://gerrit.cloudera.org:8080/4863
Reviewed-by: Lars Volker 
Reviewed-by: Tim Armstrong 
Tested-by: Internal Jenkins
---
M be/src/exec/hbase-table-sink.cc
M be/src/exec/hbase-table-writer.cc
M be/src/exec/hbase-table-writer.h
M be/src/exec/hdfs-avro-table-writer.cc
M be/src/exec/hdfs-avro-table-writer.h
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/hdfs-sequence-table-writer.cc
M be/src/exec/hdfs-sequence-table-writer.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/exec/hdfs-text-table-writer.cc
M be/src/exec/hdfs-text-table-writer.h
M bin/impala-config.sh
M common/thrift/DataSinks.thrift
M fe/src/main/java/org/apache/impala/analysis/DeleteStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/UpdateStmt.java
M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
M fe/src/main/java/org/apache/impala/planner/TableSink.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M testdata/workloads/functional-query/queries/QueryTest/insert.test
M tests/query_test/test_insert.py
M tests/query_test/test_insert_behaviour.py
26 files changed, 457 insertions(+), 161 deletions(-)

Approvals:
  Lars Volker: Looks good to me, but someone else must approve
  Internal Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4863
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ibeda0bdabbfe44c8ac95bf7c982a75649e1b82d0
Gerrit-PatchSet: 20
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 


[Impala-ASF-CR] IMPALA-2523: Make HdfsTableSink aware of clustered input

2016-11-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-2523: Make HdfsTableSink aware of clustered input
..


Patch Set 19: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4863
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibeda0bdabbfe44c8ac95bf7c982a75649e1b82d0
Gerrit-PatchSet: 19
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

2016-11-21 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
..


Patch Set 2:

Moved disk-id-mapping logic to a new class as discussed with Alex. getDiskId() 
still is prone to some race conditions, so I added a TODO. I just sent out the 
code for review while I'm working on it. That particular change is siloed and 
shouldn't affect rest of the review.

-- 
To view, visit http://gerrit.cloudera.org:8080/5148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

2016-11-21 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has uploaded a new patch set (#2).

Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
..

IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

This patch improves the block metadata loading (locations and disk
storage IDs) for partitioned and un-partitioned tables in the Catalog
server.

Without this patch:
--
We loop throuh each and every file in the table/partition directories
and call getFileBlockLocations() on it to obtain the block metadata.
This results in large no. of RPC calls to namenode, especially with
tables with large no. of files/partitions.

With this patch:
---
We move the block metadata querying to use listStatus() call which
accepts a directory as input and fetches the 'BlockLocation' objects
for every file recursively in that directory. This improves the
metadata loading in the following ways.

- For non-partitioned tables, we query all the BlockLocations in a
single RPC call in the base table directory and load the corresponding
disk IDs.

- For partitioned tables, we query the BlockLocations for all the
partitions residing under the base table directories in a single RPC
and then load every partition with non-default partition directory
separately.

Also, this patch does away with VolumeIds returned by the HDFS NN
and uses the new StorageIDs returned by the BlockLocation class.
These StorageIDs are UUID strings and hence are mapped to a
per-node 0-based index as expected by the backend. In the upcoming
versions of Hadoop APIs, getFileBlockStorageLocations() is deprecated
and instead the listStatus() returns BlockLocations with storage IDs
embedded. This patch makes use to improvement to reduce an addition
RPC to NN to fetch the storage locations.

Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
---
A fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
3 files changed, 304 insertions(+), 181 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/5148/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 


[Impala-ASF-CR] IMPALA-4432: Handle internal codegen disabling properly

2016-11-21 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-4432: Handle internal codegen disabling properly
..


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/5105/6/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

PS6, Line 169: HasScalarFnToCodegen
> Maybe ScalarFnRequiresCodegen() or similar? i.e. to make it clear that not 
Done


Line 171:   /// Returns true if there is hint to disable codegen. This can be 
true for single node
> "a"
Done


Line 180:   /// of the hint and all expressions in the fragment can be 
interpreted.
> given that this depends on the scalar_fns_to_codegen_ state which isn't que
Done. The exact point in which scalar_fn_to_codegen_ is usable may change in an 
upcoming change.


PS6, Line 191: followings
> following.
Done


Line 194:   ///which cannot be interpreted.
> I think this would be clearer without the double negative. eg.:
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5105
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0b6a9ed723c64ba21b861608583cc9b6607d3397
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4432: Handle internal codegen disabling properly

2016-11-21 Thread Michael Ho (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5105

to look at the new patch set (#7).

Change subject: IMPALA-4432: Handle internal codegen disabling properly
..

IMPALA-4432: Handle internal codegen disabling properly

There are some conditions in which codegen is disabled internally
even if it's enabled in the query option. For instance, the single
node optimization or the expression evaluation requests sent from
the FE to the BE. These internal disabling of codegen are advisory
as their purposes are to reduce the latency for tables with no or
very few rows. The internal disabling of codegen doesn't interact
well with UDFs which cannot be interpreted (e.g. IR UDF) as it
conflates the 'disable_codegen' query option set by the user.
As a result, it's hard to differentiate between when codegen is
disabled explicitly by users and when it is disabled internally.

This change fixes the problem above by adding an explicit flag in
TQueryCtx to indicate that codegen is disabled internally. This flag
is only advisory. For cases in which codegen is needed to function,
this internal flag is ignored and if codegen is disabled via query
option, an error is thrown. For this new flag to work with ScalarFnCall,
codegen needs to happen after ScalarFnCall::Prepare() because it's
hard to tell if a fragment contains any UDF that cannot be interpreted
until after ScalarFnCall::Prepare() is called. However, Prepare() needs
the codegen object to codegen so it needs to be created before Prepare().
We can either always create the codegen module or defer codegen to a point
after ScalarFnCall::Prepare(). The former has the downside of introducing
unnecessary latency for say single-node optimization so the latter is
implemented. It is needed as part of IMPALA-4192 any way.

After this change, ScalarFnCall expressions which need to be codegen'd
are inserted into a vector in RuntimeState in ScalarFnCall::Prepare().
Later in the codegen phase, these expressions' GetCodegendComputeFn()
will be called after codegen for operators is done. If any of these
expressions are already codegen'd indirectly by the operators,
GetCodegendComputeFn() will be a no-op. This preserves the behavior
that ScalarFnCall will always be codegen'd even if the fragment
doesn't contain any codegen enabled operators.

Change-Id: I0b6a9ed723c64ba21b861608583cc9b6607d3397
---
M be/src/exec/aggregation-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hash-join-node.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/sort-node.cc
M be/src/exec/topn-node.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/scalar-fn-call.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/service/fe-support.cc
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/planner/Planner.java
M tests/query_test/test_udfs.py
24 files changed, 180 insertions(+), 117 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/5105/7
-- 
To view, visit http://gerrit.cloudera.org:8080/5105
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0b6a9ed723c64ba21b861608583cc9b6607d3397
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs

2016-11-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-1430: enable codegen for native UDAs
..


Patch Set 4:

Rebased onto Michael's IMPALA-4432 commit

-- 
To view, visit http://gerrit.cloudera.org:8080/5161
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1430: enable codegen for native UDAs

2016-11-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#4).

Change subject: IMPALA-1430: enable codegen for native UDAs
..

IMPALA-1430: enable codegen for native UDAs

This uses the existing infrastructure for codegening builtin UDAs and
for codegening calls to UDFs. GetUdf() is refactored to support both
UDFs and UDAs.

IR UDAs are still not allowed by the frontend. It's unclear if we want
to enable them going forward because of the difficulties in testing and
supporting IR UDFs/UDAs.

Testing:
test_udfs.py tests UDAs with codegen enabled and disabled

Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exprs/agg-fn-evaluator.cc
M be/src/exprs/agg-fn-evaluator.h
M be/src/exprs/anyval-util.cc
M be/src/exprs/anyval-util.h
M be/src/exprs/expr.cc
M be/src/exprs/expr.h
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/scalar-fn-call.h
11 files changed, 158 insertions(+), 124 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/5161/4
-- 
To view, visit http://gerrit.cloudera.org:8080/5161
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id1708eaa96eb76fb9bec5eeabf209f81c88eec2f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Michael Ho


[Impala-ASF-CR] IMPALA-4283: Ensure Kudu-specific lineage and audit behavior

2016-11-21 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4283: Ensure Kudu-specific lineage and audit behavior
..


Patch Set 1:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/5151/1/fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
File fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java:

Line 689: if (mentionedColumns.isEmpty()) {
Can we reverse the logic as follows:

if (tbl instanceof KuduTable) {
  List mentionedColumns = insertStmt.getMentionedColumns();
  Preconditions.checkState(!mentionedColumns.isEmpty());
  ...
} else {
  ...
}

That way the Kudu-specific nature seems clearer.

Once this logic is reversed, it almost seems simpler to inline this at the 
single caller (which already has a similar instanceof KuduTable check).


http://gerrit.cloudera.org:8080/#/c/5151/1/fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
File fe/src/main/java/org/apache/impala/analysis/InsertStmt.java:

Line 795: for (int i = 0; i < resultExprs_.size(); ++i) {
We don't really need resultExprs_ right?

for (Integer i: mentionedColumns_) {
  result.add(columns.get(i).getName());
}


http://gerrit.cloudera.org:8080/#/c/5151/1/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

Line 186:   if (ctx_.isUpdateOrDelete()) return fragments;
Does this line do anything? I though we established that this is an INSERT or 
CTAS in L180.

I think we need to move this check somewhere else.


http://gerrit.cloudera.org:8080/#/c/5151/1/fe/src/test/java/org/apache/impala/analysis/AuditingTest.java
File fe/src/test/java/org/apache/impala/analysis/AuditingTest.java:

Line 374:   public void TestKuduStatements() throws AuthorizationException, 
AnalysisException {
nice!


http://gerrit.cloudera.org:8080/#/c/5151/1/testdata/workloads/functional-planner/queries/PlannerTest/lineage.test
File testdata/workloads/functional-planner/queries/PlannerTest/lineage.test:

Line 4656: # Insert into a Kudu table with a column list specified
Upsert


Line 4661: "queryText":"insert into functional_kudu.alltypes (int_col, id) 
select int_col, id from\nfunctional.alltypes where id < 10",
upsert?


Line 4925: 
What happens for DELETE and UPDATE statements? Might be worth adding a test.


-- 
To view, visit http://gerrit.cloudera.org:8080/5151
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idc4ca1cd63bcfa4370c240a5c4a4126ed6704f4d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4432: Handle internal codegen disabling properly

2016-11-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4432: Handle internal codegen disabling properly
..


Patch Set 6: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5105/6/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

PS6, Line 191: followings
following.


-- 
To view, visit http://gerrit.cloudera.org:8080/5105
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0b6a9ed723c64ba21b861608583cc9b6607d3397
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime

2016-11-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3200: move bufferpool under runtime
..


Patch Set 3: Code-Review+2

The Cmake script didn't handle be tests living in a subdirectory - fixed it.

-- 
To view, visit http://gerrit.cloudera.org:8080/5165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime

2016-11-21 Thread Tim Armstrong (Code Review)
Hello Marcel Kornacker, Internal Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5165

to look at the new patch set (#3).

Change subject: IMPALA-3200: move bufferpool under runtime
..

IMPALA-3200: move bufferpool under runtime

It is arguably a subcomponent of the runtime system

Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd
---
M .clang-tidy
M be/CMakeLists.txt
M be/src/runtime/CMakeLists.txt
R be/src/runtime/bufferpool/CMakeLists.txt
R be/src/runtime/bufferpool/buffer-allocator.cc
R be/src/runtime/bufferpool/buffer-allocator.h
R be/src/runtime/bufferpool/buffer-pool-counters.h
R be/src/runtime/bufferpool/buffer-pool-test.cc
R be/src/runtime/bufferpool/buffer-pool.cc
R be/src/runtime/bufferpool/buffer-pool.h
R be/src/runtime/bufferpool/reservation-tracker-counters.h
R be/src/runtime/bufferpool/reservation-tracker-test.cc
R be/src/runtime/bufferpool/reservation-tracker.cc
R be/src/runtime/bufferpool/reservation-tracker.h
M be/src/runtime/mem-tracker.cc
15 files changed, 25 insertions(+), 27 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/5165/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4502: test partition ddl predicates breaks on non-HDFS filesystems

2016-11-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4502: test_partition_ddl_predicates breaks on non-HDFS 
filesystems
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5164
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8606aa427cb6e50be3395cdde246abb53db5172c
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4502: test partition ddl predicates breaks on non-HDFS filesystems

2016-11-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4502: test_partition_ddl_predicates breaks on non-HDFS 
filesystems
..


IMPALA-4502: test_partition_ddl_predicates breaks on non-HDFS filesystems

This is because that test uses 'set cached' and 'set uncached' which
are not supported on non-HDFS filesystems. This patch creates a
separate test file for non-HDFS filesystems with only supported
queries and invokes the right file based on the filesystem.

Change-Id: I8606aa427cb6e50be3395cdde246abb53db5172c
Reviewed-on: http://gerrit.cloudera.org:8080/5164
Reviewed-by: Sailesh Mukil 
Tested-by: Internal Jenkins
---
R 
testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates-all-fs.test
A 
testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates-hdfs-only.test
M tests/metadata/test_ddl.py
3 files changed, 43 insertions(+), 23 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Sailesh Mukil: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5164
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I8606aa427cb6e50be3395cdde246abb53db5172c
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime

2016-11-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3200: move bufferpool under runtime
..


Patch Set 2: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/512/

-- 
To view, visit http://gerrit.cloudera.org:8080/5165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4478: Initial Kudu client mem tracking for sink

2016-11-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4478: Initial Kudu client mem tracking for sink
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I47f17a81e4362ab490019382fedc66c25f07080a
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4478: Initial Kudu client mem tracking for sink

2016-11-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4478: Initial Kudu client mem tracking for sink
..


IMPALA-4478: Initial Kudu client mem tracking for sink

The Kudu client allocates memory which is not tracked by
Impala. There are several sources, but the most significant
is the memory allocated by the KuduSession on the write
path. This can be >100MB, so it is important to track to
avoid OOM.

Moving forward, we should have a better way to track Kudu
client memory, but for now we must at least handle this
potentially problematic case.

This changes the KuduTableSink to consume 200MB which should
be enough for the 100MB write mutation buffer as well as
100MB worth of errors buffered in the client before Impala
takes ownership of them (and deletes them). This is left as
a flag because it may turn out to be too high for some users
and too low for others. When we have better support from
Kudu (including KUDU-1752), we should simplify this.

TODO: Handle DML w/ small or known resource requirements
(e.g. VALUES specified or query has LIMIT) specially to
avoid over-consumption.

Testing: Have verified acceptable behavior in the stress
test with a simple workload containing DML statements of
moderate cardinality.

Change-Id: I47f17a81e4362ab490019382fedc66c25f07080a
Reviewed-on: http://gerrit.cloudera.org:8080/5152
Reviewed-by: Matthew Jacobs 
Reviewed-by: Dan Hecht 
Tested-by: Internal Jenkins
---
M be/src/exec/kudu-table-sink.cc
1 file changed, 37 insertions(+), 3 deletions(-)

Approvals:
  Matthew Jacobs: Looks good to me, approved
  Internal Jenkins: Verified
  Dan Hecht: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I47f17a81e4362ab490019382fedc66c25f07080a
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables

2016-11-21 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java
File fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java:

Line 103: if (newColDef_.hasKuduOptions()) {
we cannot change the default value?


http://gerrit.cloudera.org:8080/#/c/5136/2/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test:

Line 156: # Insert a row that doesn't have values for the new columns; defaults 
are used
> New test somewhere:
Sorry this doesn't make sense. I was thinking of ALTERing the default value and 
then checking the old and new rows. Looks like that is currently not possible.


-- 
To view, visit http://gerrit.cloudera.org:8080/5136
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables

2016-11-21 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables
..


Patch Set 2:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/5136/2/common/thrift/JniCatalog.thrift
File common/thrift/JniCatalog.thrift:

Line 185:   4: optional string location
restore original numbers


http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java:

Line 88:   throw new AnalysisException("Key-value partition specs are not 
supported for " +
how about "ALTER TABLE ADD PARTITION is not supported for Kudu tables." or 
something like that


http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableAddRangePartitionStmt.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddRangePartitionStmt.java:

Line 50: if (ifNotExists_) {
single line


Line 79: return;
remove


http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java:

Line 123: if (c.hasEncoding() || c.hasCompression() || 
c.hasBlockSize()) {
Should we let Kudu handle this case and the one below? Does Kudu behave sanely?


http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java:

Line 79:   throw new AnalysisException("Key-value partition specs are not 
supported for " +
ALTER TABLE DROP PARTITION is not supported for Kudu tables


http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/analysis/AlterTableDropRangePartitionStmt.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableDropRangePartitionStmt.java:

Line 32: public class AlterTableDropRangePartitionStmt extends AlterTableStmt {
code seems very similar to AlterTableAddRangePartitionStmt, should we just 
merge them into AlterTableAddDropRangePartitionStmt?


Line 77: return;
remove


http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

Line 553:   case UPDATE_STATS:
I think we should be able to avoid the copy+paste by adding an additional check 
to L365 that will make us only go down this patch for ADD_REPLACE_COLUMNS, 
DROP_COLUMN, etc., but not for UPDATE_STATS or SET_TBL_PROPERTIES

Maybe something like altersKuduTable(TAlterTableType type), you get the idea


Line 2231:   KuduCatalogOpExecutor.renameTable((KuduTable) tbl,
I don't think we should rename the Kudu table for external tables. An external 
table is like a pointer, and changing the tblprpperty should just point it to 
another table imo.


http://gerrit.cloudera.org:8080/#/c/5136/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

Line 1831: String[] addDrop = {"add if not exists", "drop if exists"};
what about just "add" and "drop"


Line 1868: AnalysisError("alter table functional_kudu.testtbl add columns 
(a1 int)",
How do we know a1 is going to be non-nullable at this point (in analysis)? 
That's up to Kudu's default right?

Probably best to add an explicit NOT NULL in any case.


Line 1875: AnalysisError("alter table functional_kudu.testtbl add columns " 
+
use a loop to cover the cases separately (if you decide to keep these checks 
part of analysis and not defer to Kudu)


http://gerrit.cloudera.org:8080/#/c/5136/2/testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test:

Line 52:   distribute by range (id) (partition 1 < values <= 10) stored as kudu
Test dropping all partitions and scanning/inserting into the empty table.


Line 156: # Insert a row that doesn't have values for the new columns; defaults 
are used
New test somewhere:
* drop new_col2
* re-add new_col2 but with a different default value
* insert a new row, triggering the new default value
* check that old rows have the old default value and new rows have the new 
default value


Line 157: insert into tbl_to_alter (id,name,vali) values (3, 'test', 200)
Can you try this same test but with NULLs for the new columns? The insert 
should give errors.


Line 257: create table copy_of_tbl (a int primary key) distribute by hash (a) 
into 3 buckets
new test for the external table "pointer" behavior


http://gerrit.cloudera.org:8080/#/c/5136/2/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

Line 99: assert kudu_client.table_exists(new_kudu_tbl_name)
also check that the old 

[Impala-ASF-CR] IMPALA-4510: Selectively filter args for metric verification tests

2016-11-21 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4510: Selectively filter args for metric verification 
tests
..


Patch Set 5: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4510: Selectively filter args for metric verification tests

2016-11-21 Thread David Knupp (Code Review)
Hello Michael Brown,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5135

to look at the new patch set (#5).

Change subject: IMPALA-4510: Selectively filter args for metric verification 
tests
..

IMPALA-4510: Selectively filter args for metric verification tests

run-tests.py is a wrapper around impala-py.test. It abstracts away
the need to invoke separate runs for serial tests, parallel tests,
and metric verification tests.

Because it's possible for a user to specify certain test suites,
or even specific tests, on the command line when calling
run-tests.py, it had been necessary to override the command line
args when it came time to run the metric verification tests --
otherwise those other tests/suites would be rerun. Before this
patch, we had simply been stripping away all command line args.

However, that blanket approach causes problems when running tests
against a remote cluster, because we need to retain those command
line args that pertain to the remote cluster.

This patch selectively prunes unwanted command line args for the
last metric verification test stage, keeping the ones that we
need, and also adds extensive documentation for explaining why we
have to go through this fairly odd and elaborate step.

This patch was tested by running a sample test suite locally,
and against a remote cluster. Previously, the metric verification
stage had been failing for remote cluster tests (since they were
defaulting to localhost for services that were only available
remotely.) With the patch, the remote verfification tests were
passing.

Also, while I'm here, add a small change that exits immediately
if the user calls for --help. Before this, we actually still ran
the tests.

Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e
---
M tests/run-tests.py
1 file changed, 77 insertions(+), 14 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/5135/5
-- 
To view, visit http://gerrit.cloudera.org:8080/5135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-4510: Selectively filter args for metric verification tests

2016-11-21 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-4510: Selectively filter args for metric verification 
tests
..


Patch Set 4:

While I'm in this file, make a quick fix to not run tests if the user just 
passes in --help.

-- 
To view, visit http://gerrit.cloudera.org:8080/5135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4432: Handle internal codegen disabling properly

2016-11-21 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4432: Handle internal codegen disabling properly
..


Patch Set 6:

(4 comments)

Thanks, this is much easier to follow now.  It'd be good for Tim to take 
another pass now to make sure it still makes sense to him as well.

http://gerrit.cloudera.org:8080/#/c/5105/6/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

PS6, Line 169: HasScalarFnToCodegen
Maybe ScalarFnRequiresCodegen() or similar? i.e. to make it clear that not only 
do we codegen these, but that they require codegen.


Line 171:   /// Returns true if there is hint to disable codegen. This can be 
true for single node
"a"


Line 180:   /// of the hint and all expressions in the fragment can be 
interpreted.
given that this depends on the scalar_fns_to_codegen_ state which isn't 
query-static state, let's document at what point this routine can be used (e.g. 
after Prepare() phase or whatever).


Line 194:   ///which cannot be interpreted.
I think this would be clearer without the double negative. eg.:

1. It's enabled by query option.
2. Enabled by the internal hint and all expressions can be interpreted.


-- 
To view, visit http://gerrit.cloudera.org:8080/5105
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0b6a9ed723c64ba21b861608583cc9b6607d3397
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model

2016-11-21 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs 
from query model
..


Patch Set 4: Code-Review+1

I've simply chosen different names for the objects, and maintained the name 
Query for SELECT statements. We don't need that special pickling anymore. carry 
+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5162
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model

2016-11-21 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs 
from query model
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5162/1/tests/comparison/leopard/custom_pickle.py
File tests/comparison/leopard/custom_pickle.py:

PS1, Line 8: 
   :   Thanks to 
https://wiki.python.org/moin/UsingPickle/RenamingModules
> I'm testing another patch set that lets us just not use this code at all. I
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5162
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model

2016-11-21 Thread Michael Brown (Code Review)
Hello Matthew Jacobs,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5162

to look at the new patch set (#4).

Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs 
from query model
..

IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model

This patch adds support to the random query generator infrastructure to
model and write SQL INSERTs. It does not actually randomly generate
INSERTs at this time (tracked in IMPALA-4353 and umbrella task
IMPALA-3740) but does provide necessary building blocks to do so.

First, it's necessary to model the INSERTs as part of our data model.
This was done by taking the current notion of a Query and making it a
SelectQuery. We also then create an abstract Query containing some of
the more common methods and attributes. We then model an INSERT query,
INSERT clause, and VALUES clause (IMPALA-4343).

Second, it's necessary to test the basics of this data model. It made
sense to go ahead and implement the necessary SqlWriter methods to write
the SQL for these clauses (IMPALA-4354).

I could then use this writer with some existing and new tests that take
a query written into our data model and write the SQL, verifying they're
correct.

For INSERT into Kudu tables, the equivalent PostgreSQL queries need to
use "ON CONFLICT DO NOTHING", so all existing and new query tests verify
they can be written as PostgreSQL as well.

Testing:
- all the query generator tests pass
- I can run Leopard front_end.py and load older query generator reports,
  browse them, and re-run failed queries
- I can run Leopard controller.py to actually do a query generator
  run
- discrepancy_searcher.py --explain-only ran for hundreds of queries.
  There were no problems writing the SELECT queries

Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba
---
M tests/comparison/model_translator.py
M tests/comparison/query.py
M tests/comparison/tests/query_object_testdata.py
M tests/comparison/tests/test_query_objects.py
4 files changed, 623 insertions(+), 68 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/5162/4
-- 
To view, visit http://gerrit.cloudera.org:8080/5162
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model

2016-11-21 Thread Michael Brown (Code Review)
Hello Matthew Jacobs,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5162

to look at the new patch set (#3).

Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs 
from query model
..

IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model

This patch adds support to the random query generator infrastructure to
model and write SQL INSERTs. It does not actually randomly generate
INSERTs at this time (tracked in IMPALA-4353 and umbrella task
IMPALA-3740) but does provide necessary building blocks to do so.

First, it's necessary to model the INSERTs as part of our data model.
This was done by taking the current notion of a Query and making it a
SelectQuery. We also then create an abstract Query containing some of
the more common methods and attributes. We then model an INSERT query,
INSERT clause, and VALUES clause (IMPALA-4343).

Second, it's necessary to test the basics of this data model. It made
sense to go ahead and implement the necessary SqlWriter methods to write
the SQL for these clauses (IMPALA-4354).

I could then use this writer with some existing and new tests that take
a query written into our data model and write the SQL, verifying they're
correct.

For INSERT into Kudu tables, the equivalent PostgreSQL queries need to
use "ON CONFLICT DO NOTHING", so all existing and new query tests verify
they can be written as PostgreSQL as well.

Testing:
- all the query generator tests pass
- I can run Leopard front_end.py and load older query generator reports,
  browse them, and re-run failed queries
- I can run Leopard controller.py to actually do a query generator
  run
- discrepancy_searcher.py --explain-only ran for hundreds of queries.
  There were no problems writing the SELECT queries

Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba
---
M tests/comparison/model_translator.py
M tests/comparison/query.py
M tests/comparison/query_flattener.py
M tests/comparison/query_generator.py
M tests/comparison/tests/fake_query.py
M tests/comparison/tests/query_object_testdata.py
M tests/comparison/tests/test_query_objects.py
7 files changed, 627 insertions(+), 69 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/5162/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5162
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables

2016-11-21 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has uploaded a new patch set (#3).

Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables
..

IMPALA-2890: Support ALTER TABLE statements for Kudu tables

With this commit, we add support for additional ALTER TABLE statements
against Kudu tables. The new supported ALTER TABLE operations for Kudu are:
- ADD/DROP range partitions. Syntax:
ALTER TABLE  ADD [IF NOT EXISTS] RANGE 
ALTER TABLE  DROP [IF EXISTS] RANGE 
- ADD/DROP/RENAME column. Syntax:
ALTER TABLE  ADD COLUMNS (col_spec, [col_spec, ...])
ALTER TABLE  DROP COLUMN 
ALTER TABLE  CHANGE COLUMN   
- Rename Kudu table using the 'kudu.table_name' table property. Example:
  ALTER TABLE  SET TBLPROPERTY ('kudu.tbl_name'=''),
  will change the underlying Kudu table name to .
- Renaming the HMS/Catalog table entry of a Kudu table is supported using the
  existing ALTER TABLE  RENAME TO  syntax.

Not supported:
- ALTER TABLE  REPLACE COLUMNS

Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
A 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddRangePartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java
A 
fe/src/main/java/org/apache/impala/analysis/AlterTableDropRangePartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetFileFormatStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetLocationStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/DistributeParam.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M tests/query_test/test_kudu.py
21 files changed, 998 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/5136/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5136
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 


[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables

2016-11-21 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables
..


Patch Set 1:

(1 comment)

Forgotten comment and some minor test fixes.

http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java:

Line 142: throw new AnalysisException("Unsupported column options: " + 
c.toString());
> would be nice to say why they are unsupported (only supported on Kudu table
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5136
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4514: Fix broken exhaustive builds caused by non-nullable columns

2016-11-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4514: Fix broken exhaustive builds caused by 
non-nullable columns
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib6f4b4c88ef18b1731b7c9789aad602880e18035
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4514: Fix broken exhaustive builds caused by non-nullable columns

2016-11-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4514: Fix broken exhaustive builds caused by 
non-nullable columns
..


IMPALA-4514: Fix broken exhaustive builds caused by non-nullable columns

This commit fixes the broken exhaustive Impala builds. The issue was
caused by Kudu table that didn't properly specify nullability
constraints. Hence, some rows were rejected during data loading causing
some tests to fail.

Change-Id: Ib6f4b4c88ef18b1731b7c9789aad602880e18035
Reviewed-on: http://gerrit.cloudera.org:8080/5157
Reviewed-by: Alex Behm 
Tested-by: Internal Jenkins
---
M testdata/datasets/functional/functional_schema_template.sql
1 file changed, 2 insertions(+), 1 deletion(-)

Approvals:
  Internal Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5157
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ib6f4b4c88ef18b1731b7c9789aad602880e18035
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins


[Impala-ASF-CR] IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.

2016-11-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.
..


Patch Set 3:

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/519/

-- 
To view, visit http://gerrit.cloudera.org:8080/5125
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I83c91b540bc6d27cb4f21535fe12f3f8658c233e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Internal Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2523: Make HdfsTableSink aware of clustered input

2016-11-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-2523: Make HdfsTableSink aware of clustered input
..


Patch Set 19:

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/518/

-- 
To view, visit http://gerrit.cloudera.org:8080/4863
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibeda0bdabbfe44c8ac95bf7c982a75649e1b82d0
Gerrit-PatchSet: 19
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4440: lineage timestamps can go backwards across daylight savings transitions

2016-11-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4440: lineage timestamps can go backwards across 
daylight savings transitions
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5129
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34e435fc3511e65bc62906205cb558f2c116a8a9
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Internal Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4440: lineage timestamps can go backwards across daylight savings transitions

2016-11-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4440: lineage timestamps can go backwards across 
daylight savings transitions
..


IMPALA-4440: lineage timestamps can go backwards across daylight savings 
transitions

Using TimestampValue (or equivalent string representation) for
timestamps that require a point in time doesn't work because the same
time can represent multiple point in times.  For example, the timestamp:
'2016-11-13 01:01 AM' occurred twice last weekend.

Instead, we should use unix time directly rather than trying to derive
unix time from a (timezone-less) timestamp.

Note that there are other questionable uses of TimestampValue for
internal Impala service stuff, but I want to fix them separately as they
are not as important and fixing does add some risk.

While I'm here, remove a template TimestampValue constructor that was
unused and is confusing.

We don't have any end-to-end tests that exercise column lineage, so add
a simple custom cluster test that enables lineage and verifes the start
and end unix times are within appropriate bounds.  The other column
lineage graph fields are at least tested via planner tests.

Automated regression testing for the specifc daylight savings issue is
difficult as we'd have to cross the daylight savings boundary at just
the right time during query execution in order to reproduce
reliably. But open to ideas.

Testing:
- loop the new test overnight without any failures.
- exhaustive run.

Change-Id: I34e435fc3511e65bc62906205cb558f2c116a8a9
Reviewed-on: http://gerrit.cloudera.org:8080/5129
Reviewed-by: Dan Hecht 
Tested-by: Internal Jenkins
---
M be/src/runtime/timestamp-value.h
M be/src/service/impala-server.cc
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/analysis/ColumnLineageGraph.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
A tests/custom_cluster/test_lineage.py
6 files changed, 86 insertions(+), 30 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Dan Hecht: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5129
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I34e435fc3511e65bc62906205cb558f2c116a8a9
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Internal Jenkins


[Impala-ASF-CR] IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.

2016-11-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.
..


Patch Set 3:

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/517/

-- 
To view, visit http://gerrit.cloudera.org:8080/5125
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I83c91b540bc6d27cb4f21535fe12f3f8658c233e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Internal Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.

2016-11-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.
..


Patch Set 3:

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/516/

-- 
To view, visit http://gerrit.cloudera.org:8080/5125
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I83c91b540bc6d27cb4f21535fe12f3f8658c233e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Internal Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4510: Selectively filter args for metric verification tests

2016-11-21 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4510: Selectively filter args for metric verification 
tests
..


Patch Set 4: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4510: Selectively filter args for metric verification tests

2016-11-21 Thread David Knupp (Code Review)
David Knupp has uploaded a new patch set (#4).

Change subject: IMPALA-4510: Selectively filter args for metric verification 
tests
..

IMPALA-4510: Selectively filter args for metric verification tests

run-tests.py is a wrapper around impala-py.test. It abstracts away
the need to invoke separate runs for serial tests, parallel tests,
and metric verification tests.

Because it's possible for a user to specify certain test suites,
or even specific tests, on the command line when calling
run-tests.py, it had been necessary to override the command line
args when it came time to run the metric verification tests --
otherwise those other tests/suites would be rerun. Before this
patch, we had simply been stripping away all command line args.

However, that blanket approach causes problems when running tests
against a remote cluster, because we need to retain those command
line args that pertain to the remote cluster.

This patch selectively prunes unwanted command line args for the
last metric verification test stage, keeping the ones that we
need, and also adds extensive documentation for explaining why we
have to go through this fairly odd and elaborate step.

This patch was tested by running a sample test suite locally,
and against a remote cluster. Previously, the metric verification
stage had been failing for remote cluster tests (since they were
defaulting to localhost for services that were only available
remotely.) With the patch, the remote verfification tests were
passing.

Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e
---
M tests/run-tests.py
1 file changed, 69 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/5135/4
-- 
To view, visit http://gerrit.cloudera.org:8080/5135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-2523: Make HdfsTableSink aware of clustered input

2016-11-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-2523: Make HdfsTableSink aware of clustered input
..


Patch Set 19:

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/515/

-- 
To view, visit http://gerrit.cloudera.org:8080/4863
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibeda0bdabbfe44c8ac95bf7c982a75649e1b82d0
Gerrit-PatchSet: 19
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4510: Selectively filter args for metric verification tests

2016-11-21 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-4510: Selectively filter args for metric verification 
tests
..


Patch Set 2:

(7 comments)

Thanks for the review.

http://gerrit.cloudera.org:8080/#/c/5135/2//COMMIT_MSG
Commit Message:

PS2, Line 9: impala-pytest
> impala-py.test
Done


PS2, Line 10:  separate individual
> You can just use one of these two words. :)
Done


Line 28: 
> Please mention the testing you performed for this patch.
Done


http://gerrit.cloudera.org:8080/#/c/5135/2/tests/run-tests.py
File tests/run-tests.py:

Line 73:   Modify and return the command line arguments that will be passed to 
pytest.
> Can you also talk about what log_base_name and valid_dirs need to be for a 
Done


PS2, Line 107: # We need to account for the fact that '--foo bar' and 
'--foo=bar' might
 : # be supplied by the user.
 : raw_args = itertools.chain(*[arg.split('=') for arg in 
sys.argv[1:]])
> Crafty. It would help a reader to show a simple before/after example of wha
Done


PS2, Line 110: filtered_args = []
> The name here is strange to me. These args aren't being filtered, but in fa
Done


PS2, Line 113:   try:
 : pytest.config.getvalue(arg.strip('-'))  # Raises 
ValueError if invalid arg
 : filtered_args += [arg, str(raw_args.next())]
 :   except ValueError:
 : continue
> This is related to the L75 docstring comment above: it might make more sens
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4510: Selectively filter args for metric verification tests

2016-11-21 Thread David Knupp (Code Review)
David Knupp has uploaded a new patch set (#3).

Change subject: IMPALA-4510: Selectively filter args for metric verification 
tests
..

IMPALA-4510: Selectively filter args for metric verification tests

run-tests.py is a wrapper around impala-py.test. It abstracts away
the need to invoke separate runs for serial tests, parallel tests,
and metric verification tests.

Because it's possible for a user to specify certain test suites,
or even specific tests, on the command line when calling
run-tests.py, it had been necessary to override the command line
args when it came time to run the metric verification tests --
otherwise those other tests/suites would be rerun. Before this
patch, we had simply been stripping away all command line args.

However, that blanket approach causes problems when running tests
against a remote cluster, because we need to retain those command
line args that pertain to the remote cluster.

This patch selectively prunes unwanted command line args for the
last metric verification test stage, keeping the ones that we
need, and also adds extensive documentation for explaining why we
have to go through this fairly odd and elaborate step.

This patch was tested by running a sample test suite locally,
and against a remote cluster. Previously, the metric verification
stage had been failing for remote cluster tests (since they were
defaulting to localhost for services that were only available
remotely.) With the patch, the remote verfification tests were
passing.

Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e
---
M tests/run-tests.py
1 file changed, 69 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/5135/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 


[Impala-ASF-CR] IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.

2016-11-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.
..


Patch Set 3: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/514/

-- 
To view, visit http://gerrit.cloudera.org:8080/5125
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I83c91b540bc6d27cb4f21535fe12f3f8658c233e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Internal Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model

2016-11-21 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs 
from query model
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5162/1/tests/comparison/leopard/custom_pickle.py
File tests/comparison/leopard/custom_pickle.py:

PS1, Line 8: 
   :   Thanks to 
https://wiki.python.org/moin/UsingPickle/RenamingModules
> If you can't find the license available you'll probably need to check with 
I'm testing another patch set that lets us just not use this code at all. I'll 
mark this done if my testing passes and I update this review to use that patch 
set.


-- 
To view, visit http://gerrit.cloudera.org:8080/5162
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4494: Fix crash in SimpleScheduler

2016-11-21 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4494: Fix crash in SimpleScheduler
..


Patch Set 4:

(2 comments)

Thanks for the review. Please see PS 6.

http://gerrit.cloudera.org:8080/#/c/5127/4/be/src/scheduling/simple-scheduler.cc
File be/src/scheduling/simple-scheduler.cc:

Line 261: if (new_backend_config->NumBackends() > 0
> Hmm, yeah if it's possible to get a heartbeat before the initial statestore
Good point.


Line 264:   new_backend_config->AddBackend(local_backend_descriptor_);
> We could just combine this check and the current_membership_ check below, r
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5127
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6e1196a2fa47e5954c4a190aa326c135d039a77f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4494: Fix crash in SimpleScheduler

2016-11-21 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#6).

Change subject: IMPALA-4494: Fix crash in SimpleScheduler
..

IMPALA-4494: Fix crash in SimpleScheduler

The scheduler maintains a local list of active backends, which is
updated through messages from the statestore. Even the local backend
enters this list by registering with the statestore and being included
in a statestore update message. Thus, during restarts it can happen that
a query gets scheduled with exec_at_coord set to true, while the local
backend has not been registered with the scheduler. In this case the IP
address lookup in the internal BackendConfig fails and an empty IP
address is returned, leading to a nullptr dereference down the line.

This change adds an additional check when handling updates from the
statestore to make sure that the backend config always contains the
local backend. It also changes scheduling when exec_at_coord is true to
always use the local backend, irrespective of whether it is present in
the backend config.

Change-Id: I6e1196a2fa47e5954c4a190aa326c135d039a77f
---
M be/src/scheduling/simple-scheduler-test.cc
M be/src/scheduling/simple-scheduler.cc
2 files changed, 91 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/5127/6
-- 
To view, visit http://gerrit.cloudera.org:8080/5127
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6e1196a2fa47e5954c4a190aa326c135d039a77f
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3342: Add thread counters to monitor plan fragment execution

2016-11-21 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3342: Add thread counters to monitor plan fragment 
execution
..


Patch Set 9: Code-Review+2

Looks good to me.

-- 
To view, visit http://gerrit.cloudera.org:8080/4633
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifa88aa6f3371fa42d11ecc122f43c7d83623c300
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yonghyun Hwang
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.

2016-11-21 Thread Alex Behm (Code Review)
Hello Dimitris Tsirogiannis,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5125

to look at the new patch set (#3).

Change subject: IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.
..

IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.

TODO:
- Corresponding changes to DESCRIBE EXTENDED/FORMATTED.

Testing:
A private core/hdfs run passed.

Change-Id: I83c91b540bc6d27cb4f21535fe12f3f8658c233e
---
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java
M fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
A testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test
M tests/query_test/test_kudu.py
9 files changed, 213 insertions(+), 92 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/5125/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5125
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I83c91b540bc6d27cb4f21535fe12f3f8658c233e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 


[Impala-ASF-CR] IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.

2016-11-21 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-3809: Show Kudu-specific column metadata in DESCRIBE.
..


Patch Set 2: Code-Review+2

(4 comments)

http://gerrit.cloudera.org:8080/#/c/5125/2//COMMIT_MSG
Commit Message:

PS2, Line 10: Show column encoding and compression once available
> No longer relevant, no?
Done


http://gerrit.cloudera.org:8080/#/c/5125/2/fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java
File fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java:

PS2, Line 39: import 
org.apache.kudu.client.shaded.com.google.common.base.Strings;
> :) looks as if two import statements got merged into one. Anyways, replace 
Wow, really weird. Fixed.


PS2, Line 278: else {
 : defaultValCol.setString_val(kuduColumn.isNullable() ? 
"NULL" : "");
 :   }
> The Kudu team is working  on removing this constraint (i.e. either default 
Leaving the default value blank now.


http://gerrit.cloudera.org:8080/#/c/5125/2/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

PS2, Line 459: [("b", "string", "", "true", "false", "", "AUTO_ENCODING", 
"DEFAULT_COMPRESSION", "0"),
 :  ("c", "string", "", "false", "false", "", 
"AUTO_ENCODING", "DEFAULT_COMPRESSION", "0")]
> long lines
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5125
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I83c91b540bc6d27cb4f21535fe12f3f8658c233e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4432: Handle internal codegen disabling properly

2016-11-21 Thread Michael Ho (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5105

to look at the new patch set (#6).

Change subject: IMPALA-4432: Handle internal codegen disabling properly
..

IMPALA-4432: Handle internal codegen disabling properly

There are some conditions in which codegen is disabled internally
even if it's enabled in the query option. For instance, the single
node optimization or the expression evaluation requests sent from
the FE to the BE. These internal disabling of codegen are advisory
as their purposes are to reduce the latency for tables with no or
very few rows. The internal disabling of codegen doesn't interact
well with UDFs which cannot be interpreted (e.g. IR UDF) as it
conflates the 'disable_codegen' query option set by the user.
As a result, it's hard to differentiate between when codegen is
disabled explicitly by users and when it is disabled internally.

This change fixes the problem above by adding an explicit flag in
TQueryCtx to indicate that codegen is disabled internally. This flag
is only advisory. For cases in which codegen is needed to function,
this internal flag is ignored and if codegen is disabled via query
option, an error is thrown. For this new flag to work with ScalarFnCall,
codegen needs to happen after ScalarFnCall::Prepare() because it's
hard to tell if a fragment contains any UDF that cannot be interpreted
until after ScalarFnCall::Prepare() is called. However, Prepare() needs
the codegen object to codegen so it needs to be created before Prepare().
We can either always create the codegen module or defer codegen to a point
after ScalarFnCall::Prepare(). The former has the downside of introducing
unnecessary latency for say single-node optimization so the latter is
implemented. It is needed as part of IMPALA-4192 any way.

After this change, ScalarFnCall expressions which need to be codegen'd
are inserted into a vector in RuntimeState in ScalarFnCall::Prepare().
Later in the codegen phase, these expressions' GetCodegendComputeFn()
will be called after codegen for operators is done. If any of these
expressions are already codegen'd indirectly by the operators,
GetCodegendComputeFn() will be a no-op. This preserves the behavior
that ScalarFnCall will always be codegen'd even if the fragment
doesn't contain any codegen enabled operators.

Change-Id: I0b6a9ed723c64ba21b861608583cc9b6607d3397
---
M be/src/exec/aggregation-node.cc
M be/src/exec/exchange-node.cc
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hash-join-node.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/partitioned-aggregation-node.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/sort-node.cc
M be/src/exec/topn-node.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/scalar-fn-call.h
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/service/fe-support.cc
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/planner/Planner.java
M tests/query_test/test_udfs.py
24 files changed, 178 insertions(+), 117 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/5105/6
-- 
To view, visit http://gerrit.cloudera.org:8080/5105
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0b6a9ed723c64ba21b861608583cc9b6607d3397
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables

2016-11-21 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables
..


Patch Set 1:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/5136/1/common/thrift/JniCatalog.thrift
File common/thrift/JniCatalog.thrift:

Line 179:   2: optional CatalogObjects.TRangePartition range_partition_spec
> Seems to make more sense to completely separate the HDFS partition case fro
Done


http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java:

Line 56:   public AlterTableAddPartitionStmt(TableName tableName,
> Seems to make more sense to move this into a new AlterTableAddRangePartitio
Done


http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java:

Line 85:   throw new AnalysisException("ALTER TABLE REPLACE COLUMNS not 
currently " +
> is not supported on Kudu tables.
Done


http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java
File fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java:

Line 109:   if (col.getType() != newColDef_.getType()) {
> use equals()
Done


http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java
File 
fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java:

Line 52:   public AlterTableDropPartitionStmt(TableName tableName,
> Separate this into a new AlterTableDropRangePartitionStmt?
Done


http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java
File fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java:

Line 99:   throw new AnalysisException(String.format(
> Let's move this check into the respective alter statements that are not sup
Done


http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
File fe/src/main/java/org/apache/impala/analysis/ColumnDef.java:

Line 162:   public boolean hasEncoding() { return encoding_ != null; }
> use these in hasKuduOptions()
Done


Line 322: return new ColumnDef(col.getName(), new TypeDef(col.getType()));
> What about all the other column properties like comment/primary_key, etc,? 
Done


http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
File fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java:

Line 232: // We shouldn't output the columns for external tables
> good catch
Done


http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
File fe/src/main/java/org/apache/impala/catalog/KuduTable.java:

Line 174: return ImmutableList.copyOf(rangeDistribution.getColumnNames());
> getColumnNames() already returns an ImmutableList
Thanks. Done


Line 267: ColumnSchema col =
> add helper getColumnById()?
Done


Line 268:   
tableSchema.getColumnByIndex(tableSchema.getColumnIndex(colId));
> indent 4
Done


http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

Line 366:   if (tbl instanceof KuduTable) {
> Consider adding a separate switch() only for Kudu tables above, or some oth
Done


Line 2168: if (properties.containsKey(KuduTable.KEY_TABLE_NAME) &&
> This seems to only allow renaming the Kudu table if the old Kudu table alre
The Kudu table name property is always set even if the user does explicitly 
specify it, so I believe the case you mention is handled correctly here. Let me 
know if I misunderstood the comment.


Line 2172:   KuduCatalogOpExecutor.renameTable((KuduTable) tbl,
> will this succeed if the old table doesn't exist?
It will fail but the tbl properties haven't been updated at that point, so the 
HMS entry will not change.


http://gerrit.cloudera.org:8080/#/c/5136/1/fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java:

Line 300:   if (!ifNotExists) {
> we really need finer grained error handling here :(
Yeah, I don't like that either. KuduException is too generic and doesn't tell 
us much in this case.


Line 337:* Returns the bounds of a range partition in two PartialRow, 
RangePartitionBound pairs
> enclose the pair members in <>
Done


Line 393:   public static void dropColumn(KuduTable tbl, String colName)
> I assume we can't drop PK columns. Let's check for that in analysis.
Shouldn't we let Kudu handle this check?


Line 409:   public static void rename

[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables

2016-11-21 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has uploaded a new patch set (#2).

Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables
..

IMPALA-2890: Support ALTER TABLE statements for Kudu tables

With this commit, we add support for additional ALTER TABLE statements
against Kudu tables. The new supported ALTER TABLE operations for Kudu are:
- ADD/DROP range partitions. Syntax:
ALTER TABLE  ADD [IF NOT EXISTS] RANGE 
ALTER TABLE  DROP [IF EXISTS] RANGE 
- ADD/DROP/RENAME column. Syntax:
ALTER TABLE  ADD COLUMNS (col_spec, [col_spec, ...])
ALTER TABLE  DROP COLUMN 
ALTER TABLE  CHANGE COLUMN   
- Rename Kudu table using the 'kudu.table_name' table property. Example:
  ALTER TABLE  SET TBLPROPERTY ('kudu.tbl_name'=''),
  will change the underlying Kudu table name to .
- Renaming the HMS/Catalog table entry of a Kudu table is supported using the
  existing ALTER TABLE  RENAME TO  syntax.

Not supported:
- ALTER TABLE  REPLACE COLUMNS

Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
A 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddRangePartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java
A 
fe/src/main/java/org/apache/impala/analysis/AlterTableDropRangePartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetFileFormatStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetLocationStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/DistributeParam.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M tests/query_test/test_kudu.py
21 files changed, 997 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/5136/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5136
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 


[Impala-ASF-CR] IMPALA-4432: Handle internal codegen disabling properly

2016-11-21 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-4432: Handle internal codegen disabling properly
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/5105/5/be/src/exec/exec-node.cc
File be/src/exec/exec-node.cc:

PS5, Line 463: CodegenDisabledByHint(
> is this correct? we might still codegen if there's an IR udf, right? See al
Good point. Fixed in the new patch.


http://gerrit.cloudera.org:8080/#/c/5105/5/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

PS5, Line 171: codegen is disabled due to hints
> is this true?  doesn't this just indicate whether the hint would like to di
Rephrased to "if there is hint to disable codegen."


PS5, Line 175: CodegenDisabledByHint
> if my above question is right, then this should be called CodegenDisableHin
The new patch introduces both CodegenDisableHint() and CodegenDisabledByHint().

The former is needed in ScalarFnCall::Prepare(). Ideally, if there is hint to 
disable codegen, we should only codegen expressions which cannot be 
interpreted.  If we use CodegenDisabledByHint() (with the suggested change) in 
ScalarFnCall::Prepare() and once an expression is added to 
scalar_fn_calls_to_codegen_, all subsequent ScalarFnCall expressions will be 
codegen'd. This makes it a bit non-deterministic which expressions will be 
codegen'd in the presence of the disabling hints as it depends on the order of 
when Prepare() is called for the non-interpretable expression.

That said, the expressions may still be codegen'd when the operator is 
codegen'd but this logic may affect fe-support.cc and the above may matter in 
that case.


-- 
To view, visit http://gerrit.cloudera.org:8080/5105
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I0b6a9ed723c64ba21b861608583cc9b6607d3397
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.

2016-11-21 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
..


Patch Set 3:

(18 comments)

Looked over the ref count logic so far.

http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS4, Line 502:  QueryState* qs = 
ExecEnv::GetInstance()->query_exec_mgr()->GetQueryState(query_id_);
ScopedQueryState ?


http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

PS4, Line 59: boost::
remove


Line 66: } else {
Why is there no refcnt increment here?


PS4, Line 69: 
remove here and below.


PS4, Line 86: 
ImpaladMetrics::IMPALA_SERVER_NUM_FRAGMENTS_IN_FLIGHT->Increment(1L);
:   ImpaladMetrics::IMPALA_SERVER_NUM_FRAGMENTS->Increment(1L);
:   return Status::OK();
Why is this not in the QueryState? It seems like it should be part of 
RegisterFInstance().


PS4, Line 95: stance completed. in
Call this ExecFInstance() to make it clear that it blocks until instance 
completion.


PS4, Line 115: VLOG_FILE << "GetQueryState(): query_i
You have a bug here - if a fragment instance finishes quickly (perhaps it 
didn't prepare() successfully) the query state will get GC'ed before another 
fragment instance can be started. 

I realise this will be addressed by batching fragment start-up, but until then 
I suggest a workaround like adding the number of expected fragment instances to 
the fragment instance params, so that when the QEM is created it can take the 
right number of references.


PS4, Line 124: id QueryExecMgr::ReleaseQueryState(QueryState* qs) {
 :   VLOG_FILE << "
Prefer atomics rather than heavyweight mutexes for the ref count. Fewer locks 
-> fewer locking bugs.


PS4, Line 126:   << " refcnt=" <<
What are you checking for here - overflow or that the refcnt_ didn't start 
below 0? If it's the latter, move this to line 125 and check before increment 
that its GE(..., 0).


PS4, Line 147: 
How? Only one caller to ReleaseQueryState() can set refcnt == 0.


http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-exec-mgr.h
File be/src/runtime/query-exec-mgr.h:

PS4, Line 47: In both cases it brackets the instance execution with an 
increment/decrement
:   /// of the refcount.
This needs clarification - why is this relevant to callers? I think you mean 
that the QEM retains a reference to the QueryState until this FInstance has 
completed.


PS4, Line 58: Otherwise returns nullptr
AFAICT, GetQueryState() will return a QueryState even if its refcount == 0. It 
should not, because then you can avoid the complex logic in ReleaseQueryState() 
that deals with the possibility that some other caller took a refcount between 
setting the count to 0 and trying to GC the query state.


PS4, Line 66: /// TODO: isn't this available in std:: now?
remove, once we switch to std::mutex we'll get this one as well.


PS4, Line 72: clean up.
Comment on how it's allocated.


PS4, Line 75: 
Clean up what? Does this block until instance has finished?


http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS4, Line 42: released_r
set this to 0. Callers should always manage the refcount with a balanced number 
of increment / decrement operations, otherwise bugs will be harder to 
understand.


http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

PS4, Line 79: /// don't access query_ctx().desc_tbl, it's most likely for a 
different fragment instance
I don't understand this comment. Is this going to change after batching? 
Otherwise let's think about ways to avoid having to tell a caller not to touch 
a certain part of this data structure.


PS4, Line 107: 
Does access to this need to be synchronised?


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2523: Make HdfsTableSink aware of clustered input

2016-11-21 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-2523: Make HdfsTableSink aware of clustered input
..


Patch Set 19:

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/510/

-- 
To view, visit http://gerrit.cloudera.org:8080/4863
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibeda0bdabbfe44c8ac95bf7c982a75649e1b82d0
Gerrit-PatchSet: 19
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Ivanfi 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4494: Fix crash in SimpleScheduler

2016-11-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4494: Fix crash in SimpleScheduler
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5127/4/be/src/scheduling/simple-scheduler.cc
File be/src/scheduling/simple-scheduler.cc:

Line 261: if (new_backend_config->NumBackends() > 0
> The reasoning behind the check was that it prevents empty messages from the
Hmm, yeah if it's possible to get a heartbeat before the initial statestore 
message that could cause other problems. E.g. if you have a delta with a single 
subscriber in the first heartbeat, then you could end up scheduling everything 
on that node.


Line 264:   new_backend_config->AddBackend(local_backend_descriptor_);
> So far current_membership_ has mirrored the statestore's view. Adding the l
We could just combine this check and the current_membership_ check below, 
right? It seems ok to preemptively add the local backend here at the same time 
as we add it to be sent to the statestore.


-- 
To view, visit http://gerrit.cloudera.org:8080/5127
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6e1196a2fa47e5954c4a190aa326c135d039a77f
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4502: test partition ddl predicates breaks on non-HDFS filesystems

2016-11-21 Thread Sailesh Mukil (Code Review)
Hello Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5164

to look at the new patch set (#3).

Change subject: IMPALA-4502: test_partition_ddl_predicates breaks on non-HDFS 
filesystems
..

IMPALA-4502: test_partition_ddl_predicates breaks on non-HDFS filesystems

This is because that test uses 'set cached' and 'set uncached' which
are not supported on non-HDFS filesystems. This patch creates a
separate test file for non-HDFS filesystems with only supported
queries and invokes the right file based on the filesystem.

Change-Id: I8606aa427cb6e50be3395cdde246abb53db5172c
---
R 
testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates-all-fs.test
A 
testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates-hdfs-only.test
M tests/metadata/test_ddl.py
3 files changed, 43 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/5164/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5164
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8606aa427cb6e50be3395cdde246abb53db5172c
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4502: test partition ddl predicates breaks on non-HDFS filesystems

2016-11-21 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4502: test_partition_ddl_predicates breaks on non-HDFS 
filesystems
..


Patch Set 3: Code-Review+2

(1 comment)

Carry +2

http://gerrit.cloudera.org:8080/#/c/5164/2/testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates-hdfs-only.test
File 
testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates-hdfs-only.test:

Line 19: alter table p1_hdfs partition (j<2, k in ("b", "c")) set cached in 
'testPool'
> I think we can get rid of the "show" tests in this file.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5164
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8606aa427cb6e50be3395cdde246abb53db5172c
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime

2016-11-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3200: move bufferpool under runtime
..


Patch Set 2: Code-Review+2

Carry +2

-- 
To view, visit http://gerrit.cloudera.org:8080/5165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime

2016-11-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3200: move bufferpool under runtime
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5165/1/be/src/runtime/bufferpool/buffer-allocator.h
File be/src/runtime/bufferpool/buffer-allocator.h:

Line 18: #ifndef IMPALA_BUFFER_ALLOCATOR_H
> let's also add a _RUNTIME to the include guards.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime

2016-11-21 Thread Tim Armstrong (Code Review)
Hello Marcel Kornacker,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5165

to look at the new patch set (#2).

Change subject: IMPALA-3200: move bufferpool under runtime
..

IMPALA-3200: move bufferpool under runtime

It is arguably a subcomponent of the runtime system

Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd
---
M .clang-tidy
M be/CMakeLists.txt
M be/src/runtime/CMakeLists.txt
R be/src/runtime/bufferpool/CMakeLists.txt
R be/src/runtime/bufferpool/buffer-allocator.cc
R be/src/runtime/bufferpool/buffer-allocator.h
R be/src/runtime/bufferpool/buffer-pool-counters.h
R be/src/runtime/bufferpool/buffer-pool-test.cc
R be/src/runtime/bufferpool/buffer-pool.cc
R be/src/runtime/bufferpool/buffer-pool.h
R be/src/runtime/bufferpool/reservation-tracker-counters.h
R be/src/runtime/bufferpool/reservation-tracker-test.cc
R be/src/runtime/bufferpool/reservation-tracker.cc
R be/src/runtime/bufferpool/reservation-tracker.h
M be/src/runtime/mem-tracker.cc
15 files changed, 24 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/5165/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-4502: test partition ddl predicates breaks on non-HDFS filesystems

2016-11-21 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4502: test_partition_ddl_predicates breaks on non-HDFS 
filesystems
..


Patch Set 2: Code-Review+2

(1 comment)

Thanks for fixing this!

http://gerrit.cloudera.org:8080/#/c/5164/2/testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates-hdfs-only.test
File 
testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates-hdfs-only.test:

Line 19: show partitions p1_hdfs
I think we can get rid of the "show" tests in this file.


-- 
To view, visit http://gerrit.cloudera.org:8080/5164
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8606aa427cb6e50be3395cdde246abb53db5172c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2494: Support for byte array-encoded decimals in Parquet scanner

2016-11-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-2494: Support for byte array-encoded decimals in Parquet 
scanner
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/5115/3/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

Line 1126: switch (slot_desc->type().GetByteSize()) {
I think the code might work out simpler if we added parquet::Type as a template 
parameter to ScalarColumnReader and Encode/Decode.

I think that keeps things more consistent since then type parameters are always 
handled in the same way (rather than handling this one case with a runtime 
check). I think it makes sense conceptually to have both the source parquet 
type and destination Impala types be template parameters (it's mostly 
coincidental that each Impala type mapped to a single Parquet type up until 
now).

It seems like the right framework for future changes of this nature where we 
want to have multiple parquet types mapping to an Impala type - instead of 
having to convince ourselves that the performance regression of the runtime 
check is tolerable each time, we could just use the template parameter.


http://gerrit.cloudera.org:8080/#/c/5115/1/be/src/exec/parquet-common.h
File be/src/exec/parquet-common.h:

PS1, Line 328: sizeof(T))) re
> Hm, I think it's accurate - either it's set (> 0) and fixed, or it's not se
It's not that bad but it did take a bit of thought to understand how "fixed 
length" makes sense when decoding from a variable-length byte array.


PS1, Line 338: cimal Done.
Actually, looking at DecodeFromFixedLenByteArray, it expects this to be >= 1, 
so we need to check that too.


-- 
To view, visit http://gerrit.cloudera.org:8080/5115
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If95171e65aa48f08b08b8e87f4555dc75e867977
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3342: Add thread counters to monitor plan fragment execution

2016-11-21 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3342: Add thread counters to monitor plan fragment 
execution
..


Patch Set 9: Code-Review+1

Looks okay to me once Henry signs off.

-- 
To view, visit http://gerrit.cloudera.org:8080/4633
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifa88aa6f3371fa42d11ecc122f43c7d83623c300
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Yonghyun Hwang
Gerrit-Reviewer: anujphadke 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1788: Fold constant expressions.

2016-11-21 Thread Alex Behm (Code Review)
Alex Behm has uploaded a new patch set (#2).

Change subject: IMPALA-1788: Fold constant expressions.
..

IMPALA-1788: Fold constant expressions.

Adds a new ExprRewriteRule for replacing constant expressions
with their literal equivalent via BE evaluation. Applies the
new rule together with the existing ones on the parse tree,
after analysis.

Limitations
- Constant folding is applied on the unresolved expressions.
  As a result, it only works for expressions that are constant
  within a single query block, as opposed to expressions that
  may become constant after fully substituting inline-view exprs.
- Exprs are not normalized, so some opportunities for constant
  folding are missed for certain expr-tree shapes.

This patch includes the following interesting changes:
- Introduces a timestamp literal that can only be produced
  by constant folding (not expressible directly via SQL).
- To make sure that rewrites have no user-visible effect,
  the original result types and column labels of the top-level
  statement are restored after the rewrites are performed.
- Does not fold exprs if their evaluation resulted in a
  warning or error, or if the resulting value is not
  representable by corresponding FE LiteralExpr.
- Fixes an existing issue with converting strings between
  the FE/BE. String produced in the BE that have characters
  with a value > 127 are not correctly deserialized into a
  Java String via thrift. We detect this case during constant
  folding and abandon folding of such exprs.
- Fixes several issues with detecting/reporting errors in
  NativeEvalConstExprs().
- Cleans up ExprContext::GetValue() into
  ExprContext::GetConstantValue() which clarifies its only use
  of evaluating exprs from the FE.

Testing:
- Modifies expr-test.cc to run all tests through the constant
  folding path.
- Adds basic planner and rewrite rule tests.

Change-Id: If672b703db1ba0bfc26e5b9130161798b40a69e9
---
M be/src/exprs/expr-context.cc
M be/src/exprs/expr-context.h
M be/src/exprs/expr-test.cc
M be/src/exprs/expr.cc
M be/src/exprs/literal.cc
M be/src/exprs/literal.h
M be/src/runtime/runtime-state.h
M be/src/service/fe-support.cc
M common/thrift/Exprs.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java
M fe/src/main/java/org/apache/impala/analysis/NullLiteral.java
M fe/src/main/java/org/apache/impala/analysis/PartitionSet.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/StringLiteral.java
A fe/src/main/java/org/apache/impala/analysis/TimestampLiteral.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsPartitionPruner.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/rewrite/ExprRewriteRule.java
M fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java
A fe/src/main/java/org/apache/impala/rewrite/FoldConstantsRule.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/testutil/TestUtils.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/conjunct-ordering.test
A 
testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/empty.test
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/subquery-rewrite.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/values.test
M testdata/workloads/functional-query/queries/QueryTest/udf.test
47 files changed, 887 insertions(+), 285 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/cha

[Impala-ASF-CR] PREVIEW: IMPALA-1788: Fold constant expressions during plan generation.

2016-11-21 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: PREVIEW: IMPALA-1788: Fold constant expressions during plan 
generation.
..


Patch Set 1:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/5109/1/be/src/exprs/expr-context.cc
File be/src/exprs/expr-context.cc:

Line 155:   void* value = GetValue(NULL);
> how does this handle/bail out of cases where exprctx is accidentally not a 
It the non-constant expr has a SlotRef it will hit a SIGSEGV  because we are 
passing a NULL TupleRow. We can make this more defensive if you prefer, but it 
might hide bugs. We should really never call this on non-constant exprs.


http://gerrit.cloudera.org:8080/#/c/5109/1/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

Line 1541: // The rewrite rule for extracting conjuncts simplifies these to 
NULL.
> rather than special-casing it here, it would make more sense to special-cas
Not necessary anymore. Types are preserved.


http://gerrit.cloudera.org:8080/#/c/5109/1/be/src/exprs/literal.h
File be/src/exprs/literal.h:

Line 26: #include "runtime/string-value.h"
> duplicated
Done


http://gerrit.cloudera.org:8080/#/c/5109/1/common/thrift/Exprs.thrift
File common/thrift/Exprs.thrift:

Line 41:   TIMESTAMP_LITERAL
> please group all the literals together, so we can easily see what's still m
Done. Also got rid of LITERAL_PRED which isn't used.


http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java:

Line 242:   fnName = path.get(path.size() - 1);
> why is this necessary?
To make isConstant() work even when a FunctionCallExpr has not yet been 
analyzed.

Added comment.


http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java
File fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java:

Line 204:   if (Double.isNaN(val.double_val) || 
Double.isInfinite(val.double_val)) return null;
> nice catch!
Done. These changes went in as a separate JIRA/commit. Rebased this patch on it.


Line 226:   for (byte b: bytes) {
> single line
Done


Line 233: result = new StringLiteral(strVal, constExpr.getType(), 
false);
> did you add a new test that blows up with the previous default unescaping?
These cases are covered by expr-test.cc.
I also added a few interesting cases to ExprRewriteRulesTest for convenience, 
but they don't give any new coverage in addition to expr-test.cc


http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/analysis/TimestampLiteral.java
File fe/src/main/java/org/apache/impala/analysis/TimestampLiteral.java:

Line 34:  * produce, so it better to defer to a single source of truth (the BE 
implementation).
> it -> it's
Done


http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/planner/AggregationNode.java
File fe/src/main/java/org/apache/impala/planner/AggregationNode.java:

Line 164: if (analyzer.getQueryOptions().enable_expr_rewrites && rewriter 
!= null) {
> pass in null for the rewriter if it's disabled, so you don't have to check 
Not relevant anymore in the new approach.


http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

Line 523:   // constant into nested-loop joins.
> refer to cross products, to avoid confusion with future index nlj.
Not relevant anymore in new approach.


http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java:

Line 373
> do we still need this function for anything?
Added a constant folder to Analyzer.GlobalState and removed 
Expr.foldConstantChildren()


http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/planner/PlanNode.java
File fe/src/main/java/org/apache/impala/planner/PlanNode.java:

Line 434:   public void initNoRewrite(Analyzer analyzer) throws ImpalaException 
{
> why not just init(Analyzer)? that would also make it clear that you're not 
Not relevant anymore in new approach.


http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
File fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java:

Line 577:   emptySetNode.init(ctx_.getRewriter(), analyzer);
> instead of passing this in, maybe each PlanNode should have access to the c
Not relevant anymore in new approach (but we should probably still do that in 
the future)


http://gerrit.cloudera.org:8080/#/c/5109/1/fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java
File fe/src/main/java/org/apache/impala/rewrite/ExprRewriter.java:

Line 93:   public void rewriteList(List exprs, Analyzer analyzer) throws 
AnalysisException {
> why rewrite as opposed to appl

[Impala-ASF-CR] IMPALA-4478: Initial Kudu client mem tracking for sink

2016-11-21 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4478: Initial Kudu client mem tracking for sink
..


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5152/2/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

PS2, Line 34: define
oops, I guess this should really be "static const int".


-- 
To view, visit http://gerrit.cloudera.org:8080/5152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I47f17a81e4362ab490019382fedc66c25f07080a
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model

2016-11-21 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs 
from query model
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5162/1/tests/comparison/leopard/custom_pickle.py
File tests/comparison/leopard/custom_pickle.py:

PS1, Line 8: 
   :   Thanks to 
https://wiki.python.org/moin/UsingPickle/RenamingModules
> How do I go about finding this out? I had previously looked for copyrights 
If you can't find the license available you'll probably need to check with 
lawyers. Jim or Henry can probably point you in the right direction.


-- 
To view, visit http://gerrit.cloudera.org:8080/5162
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model

2016-11-21 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs 
from query model
..


Patch Set 2: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5162
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime

2016-11-21 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-3200: move bufferpool under runtime
..


Patch Set 1: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5165/1/be/src/runtime/bufferpool/buffer-allocator.h
File be/src/runtime/bufferpool/buffer-allocator.h:

Line 18: #ifndef IMPALA_BUFFER_ALLOCATOR_H
let's also add a _RUNTIME to the include guards.


-- 
To view, visit http://gerrit.cloudera.org:8080/5165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4478: Initial Kudu client mem tracking for sink

2016-11-21 Thread Matthew Jacobs (Code Review)
Hello Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5152

to look at the new patch set (#2).

Change subject: IMPALA-4478: Initial Kudu client mem tracking for sink
..

IMPALA-4478: Initial Kudu client mem tracking for sink

The Kudu client allocates memory which is not tracked by
Impala. There are several sources, but the most significant
is the memory allocated by the KuduSession on the write
path. This can be >100MB, so it is important to track to
avoid OOM.

Moving forward, we should have a better way to track Kudu
client memory, but for now we must at least handle this
potentially problematic case.

This changes the KuduTableSink to consume 200MB which should
be enough for the 100MB write mutation buffer as well as
100MB worth of errors buffered in the client before Impala
takes ownership of them (and deletes them). This is left as
a flag because it may turn out to be too high for some users
and too low for others. When we have better support from
Kudu (including KUDU-1752), we should simplify this.

TODO: Handle DML w/ small or known resource requirements
(e.g. VALUES specified or query has LIMIT) specially to
avoid over-consumption.

Testing: Have verified acceptable behavior in the stress
test with a simple workload containing DML statements of
moderate cardinality.

Change-Id: I47f17a81e4362ab490019382fedc66c25f07080a
---
M be/src/exec/kudu-table-sink.cc
1 file changed, 37 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/5152/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I47f17a81e4362ab490019382fedc66c25f07080a
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4478: Initial Kudu client mem tracking for sink

2016-11-21 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4478: Initial Kudu client mem tracking for sink
..


Patch Set 2: Code-Review+2

carrying +2

-- 
To view, visit http://gerrit.cloudera.org:8080/5152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I47f17a81e4362ab490019382fedc66c25f07080a
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model

2016-11-21 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs 
from query model
..


Patch Set 1: Code-Review+1

(3 comments)

Thanks for the review. Carry +1

http://gerrit.cloudera.org:8080/#/c/5162/1/tests/comparison/leopard/custom_pickle.py
File tests/comparison/leopard/custom_pickle.py:

> std file header?
Done


PS1, Line 8: 
   :   Thanks to 
https://wiki.python.org/moin/UsingPickle/RenamingModules
> are we sure licensing is OK?
How do I go about finding this out? I had previously looked for copyrights on 
this page, or on code snippets from the wiki generally, and found none. (The 
only copyright I found was for MoinMoin wiki itself, which isn't what I need.)


http://gerrit.cloudera.org:8080/#/c/5162/1/tests/comparison/query.py
File tests/comparison/query.py:

PS1, Line 43: Since all supported
: queries may have a WITH clause, getting table expressions 
from the WITH clause is
: supported here.
> FYI, other DML statements won't support WITH clauses.
Done. We talked offline about this and it makes sense to leave this as-is, the 
reason being that we could eventually support WITH in DELETE and UPDATE, just 
not now. Since UPSERT is "mostly" INSERT, it currently supports WITH already.


-- 
To view, visit http://gerrit.cloudera.org:8080/5162
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model

2016-11-21 Thread Michael Brown (Code Review)
Hello Matthew Jacobs,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5162

to look at the new patch set (#2).

Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs 
from query model
..

IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model

This patch adds support to the random query generator infrastructure to
model and write SQL INSERTs. It does not actually randomly generate
INSERTs at this time (tracked in IMPALA-4353 and umbrella task
IMPALA-3740) but does provide necessary building blocks to do so.

First, it's necessary to model the INSERTs as part of our data model.
This was done by taking the current notion of a Query and making it a
SelectQuery. We also then create an abstract Query containing some of
the more common methods and attributes. We then model an INSERT query,
INSERT clause, and VALUES clause (IMPALA-4343).

Second, it's necessary to test the basics of this data model. It made
sense to go ahead and implement the necessary SqlWriter methods to write
the SQL for these clauses (IMPALA-4354).

I could then use this writer with some existing and new tests that take
a query written into our data model and write the SQL, verifying they're
correct.

For INSERT into Kudu tables, the equivalent PostgreSQL queries need to
use "ON CONFLICT DO NOTHING", so all existing and new query tests verify
they can be written as PostgreSQL as well.

When last doing an end to end test of the changes, I encounterd a
problem I hadn't anticipated (but should have): the Leopard framework no
longer had the ability to unpickle query objects, because the name had
changed. I found a solution on the Python wiki here

https://wiki.python.org/moin/UsingPickle/RenamingModules

and adapted it to my needs.

Last, I made some flake8 adjustments in a few files where I also changed
the pickling.

Testing:
- all the query generator tests pass
- I can run Leopard front_end.py and load older query generator reports,
  browse them, and re-run failed queries
- I can run Leopard controller.py to actually do a query generator
  run
- discrepancy_searcher.py --explain-only ran for hundreds of queries.
  There were no problems writing the SELECT queries

Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba
---
M tests/comparison/leopard/controller.py
A tests/comparison/leopard/custom_pickle.py
M tests/comparison/leopard/front_end.py
M tests/comparison/leopard/job.py
M tests/comparison/leopard/report.py
M tests/comparison/leopard/schedule_item.py
M tests/comparison/model_translator.py
M tests/comparison/query.py
M tests/comparison/query_flattener.py
M tests/comparison/query_generator.py
M tests/comparison/tests/fake_query.py
M tests/comparison/tests/query_object_testdata.py
M tests/comparison/tests/test_query_objects.py
13 files changed, 728 insertions(+), 112 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/5162/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5162
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-4511: Add missing total time counter() to PFE::Exec()

2016-11-21 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4511: Add missing total_time_counter() to PFE::Exec()
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/5149
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifdd6c059bdf022303920720291ebb319a69d60ec
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Dan Hecht 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4478: Initial Kudu client mem tracking for sink

2016-11-21 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4478: Initial Kudu client mem tracking for sink
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5152/1/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

PS1, Line 34: 100 * 1024 * 1024
> how about pulling this into a #define and using it here and below, to make 
Done


Line 49: " required (bytes) for a KuduTableSink. This flag is subject to 
change or removal.");
> Can you add something here that indicates that it is typically 2 * kudu_mut
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I47f17a81e4362ab490019382fedc66c25f07080a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4363: Add Parquet timestamp validation

2016-11-21 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4363: Add Parquet timestamp validation
..


Patch Set 7:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/4968/7/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

Line 473: ErrorMsg msg;
this has a non-trivial destructor, so we should probably only construct it 
after checking NeedsValidation().  Also, why are NeedsValidation() and 
NeedsConversion() mutually exclusive?  Actually, doesn't this break 
--convert_legacy_hive_parquet_utc_timestamps since we'll never take the 
conversion path for timestamp now? (Do we not have tests for that?)  That is, I 
think both NeedsConversion() and NeedsValidation() can be true, and the code 
needs to be adjusted for that.

Also consider tucking lines 475 to 480 inside ValidateSlot() and that way the 
ErrorMsg doesn't have to be passed back and forth, and it'll naturally keep the 
construction after the NeedsValidation() check.


PS7, Line 474: val_ptr
you'll also have to be careful that we pass the right pointer here, depending 
on how you address the comment above.


http://gerrit.cloudera.org:8080/#/c/4968/7/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

Line 147:   /// Verifies that the timestamp date falls into a valid range 
(years 1400..).
update this comment to be clear that we also validate time.


PS7, Line 156: l
doesn't this have to be "ll" (long long)?


PS7, Line 160: time_.ticks() < NUMBER_OF_NANOSECONDS_IN_24H
I'm slightly worried about how this interacts with leap seconds.  Are you sure 
this is always illegal?  If it doesn't crash, what behavior did you see (i.e. 
how do you know this is the correct bounds)?


-- 
To view, visit http://gerrit.cloudera.org:8080/4968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4502: test partition ddl predicates breaks on non-HDFS filesystems

2016-11-21 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4502: test_partition_ddl_predicates breaks on non-HDFS 
filesystems
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5164/1/tests/metadata/test_ddl.py
File tests/metadata/test_ddl.py:

Line 396: self.run_test_case('QueryTest/partition-ddl-predicates-all-fs', 
vector,
> Good points. I'm more worried about having to copy+paste tests in two diffe
Makes sense. I went with option 1 and made the hdfs-specific file not have a 
lot of repeat queries. It takes ~33 seconds to run overall now on HDFS, so not 
too bad.


-- 
To view, visit http://gerrit.cloudera.org:8080/5164
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8606aa427cb6e50be3395cdde246abb53db5172c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4502: test partition ddl predicates breaks on non-HDFS filesystems

2016-11-21 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded a new patch set (#2).

Change subject: IMPALA-4502: test_partition_ddl_predicates breaks on non-HDFS 
filesystems
..

IMPALA-4502: test_partition_ddl_predicates breaks on non-HDFS filesystems

This is because that test uses 'set cached' and 'set uncached' which
are not supported on non-HDFS filesystems. This patch creates a
separate test file for non-HDFS filesystems with only supported
queries and invokes the right file based on the filesystem.

Change-Id: I8606aa427cb6e50be3395cdde246abb53db5172c
---
R 
testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates-all-fs.test
A 
testdata/workloads/functional-query/queries/QueryTest/partition-ddl-predicates-hdfs-only.test
M tests/metadata/test_ddl.py
3 files changed, 66 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/5164/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5164
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8606aa427cb6e50be3395cdde246abb53db5172c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-3200: move bufferpool under runtime

2016-11-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5165

Change subject: IMPALA-3200: move bufferpool under runtime
..

IMPALA-3200: move bufferpool under runtime

It is arguably a subcomponent of the runtime system

Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd
---
M .clang-tidy
M be/CMakeLists.txt
M be/src/runtime/CMakeLists.txt
R be/src/runtime/bufferpool/CMakeLists.txt
R be/src/runtime/bufferpool/buffer-allocator.cc
R be/src/runtime/bufferpool/buffer-allocator.h
R be/src/runtime/bufferpool/buffer-pool-counters.h
R be/src/runtime/bufferpool/buffer-pool-test.cc
R be/src/runtime/bufferpool/buffer-pool.cc
R be/src/runtime/bufferpool/buffer-pool.h
R be/src/runtime/bufferpool/reservation-tracker-counters.h
R be/src/runtime/bufferpool/reservation-tracker-test.cc
R be/src/runtime/bufferpool/reservation-tracker.cc
R be/src/runtime/bufferpool/reservation-tracker.h
M be/src/runtime/mem-tracker.cc
15 files changed, 14 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/5165/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I3507066fb5cb955f0872d31a4619bd8bb7d647cd
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4478: Initial Kudu client mem tracking for sink

2016-11-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4478: Initial Kudu client mem tracking for sink
..


Patch Set 1: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5152/1/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

Line 49: " required (bytes) for a KuduTableSink. This flag is subject to 
change or removal.");
Can you add something here that indicates that it is typically 2 * 
kudu_mutation_buffer_size. Otherwise there's no info in the help text that the 
two options are correlated.


-- 
To view, visit http://gerrit.cloudera.org:8080/5152
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I47f17a81e4362ab490019382fedc66c25f07080a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model

2016-11-21 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs 
from query model
..


Patch Set 1: Code-Review+1

(3 comments)

http://gerrit.cloudera.org:8080/#/c/5162/1/tests/comparison/leopard/custom_pickle.py
File tests/comparison/leopard/custom_pickle.py:

std file header?


PS1, Line 8: 
   :   Thanks to 
https://wiki.python.org/moin/UsingPickle/RenamingModules
are we sure licensing is OK?


http://gerrit.cloudera.org:8080/#/c/5162/1/tests/comparison/query.py
File tests/comparison/query.py:

PS1, Line 43: Since all supported
: queries may have a WITH clause, getting table expressions 
from the WITH clause is
: supported here.
FYI, other DML statements won't support WITH clauses.


-- 
To view, visit http://gerrit.cloudera.org:8080/5162
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4510: Selectively filter args for metric verification tests

2016-11-21 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4510: Selectively filter args for metric verification 
tests
..


Patch Set 2:

(8 comments)

I understand the problem, and I think I see what you're doing to fix it, but it 
might make things a tad clearer if you address my comments below. Thanks!

http://gerrit.cloudera.org:8080/#/c/5135/2//COMMIT_MSG
Commit Message:

PS2, Line 9: impala-pytest
impala-py.test


PS2, Line 10:  separate individual
You can just use one of these two words. :)


Line 28: 
Please mention the testing you performed for this patch.


http://gerrit.cloudera.org:8080/#/c/5135/2/tests/run-tests.py
File tests/run-tests.py:

Line 73:   Modify and return the command line arguments that will be passed to 
pytest.
Can you also talk about what log_base_name and valid_dirs need to be for a 
caller?


Line 75:   The raw command line arguments need to be modified because of the 
way our
Should all of this be in the docstring? It seems like you could instead inline 
some of this information in relevant places below.


PS2, Line 107: # We need to account for the fact that '--foo bar' and 
'--foo=bar' might
 : # be supplied by the user.
 : raw_args = itertools.chain(*[arg.split('=') for arg in 
sys.argv[1:]])
Crafty. It would help a reader to show a simple before/after example of what 
sys.argv[1:] starts as, and then what the resulting raw_args is.


PS2, Line 110: filtered_args = []
The name here is strange to me. These args aren't being filtered, but in fact 
they are getting through your filter, right? What's being filtered actually the 
options that raise ValueError below, right?


PS2, Line 113:   try:
 : pytest.config.getvalue(arg.strip('-'))  # Raises 
ValueError if invalid arg
 : filtered_args += [arg, str(raw_args.next())]
 :   except ValueError:
 : continue
This is related to the L75 docstring comment above: it might make more sense 
for a reader to see here that you're saying "If this is a pytest config argh, 
it should be carried along; otherwise it should be filtered"  or something 
similar.


-- 
To view, visit http://gerrit.cloudera.org:8080/5135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I069172f44c1307d55f85779cdb01fecc0ba1799e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-HasComments: Yes


  1   2   >