[Impala-ASF-CR] IMPALA-4847: Simplify HdfsTable block metadata loading code

2017-08-10 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-4847: Simplify HdfsTable block metadata loading code
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7652/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

PS1, Line 252: for all partitions in 'partitions'
nit: "of all the 'partitions' that ..."


Line 258:*   and disk IDs.
Expand this to mention that we synthesize block metadata for FS that don't 
support disk ids. Also, mention (maybe in the beginning) that 'pathDir' could 
be in HDFS, S3 and ADLS.


PS1, Line 331: if (partitions == null || partitions.isEmpty()) return;
 : RemoteIterator fileStatusIter =
 : FileSystemUtil.listFiles(fs, partDir, false);
 : if (fileStatusIter == null) return;
 : while (fileStatusIter.hasNext()) {
 :   LocatedFileStatus fileStatus = fileStatusIter.next();
 :   if (!FileSystemUtil.isValidDataFile(fileStatus)) continue;
 :   // For the purpose of synthesizing block metadata, we 
assume that all partitions
 :   // with the same location have the same file format.
 :   FileDescriptor fd = 
FileDescriptor.createWithSynthesizedBlockMd(fileStatus,
 :   partitions.get(0).getFileFormat(), hostIndex_);
 :   // Update the partitions' metadata that this file belongs 
to.
 :   for (HdfsPartition partition: partitions) {
 : partition.getFileDescriptors().add(fd);
 :   }
 : }
With the exception of L340, this function resembles loadBlockMetadata. Maybe 
see if there is a nice way to merge these two.


PS1, Line 581: Path tblLocation = 
FileSystemUtil.createFullyQualifiedPath(getHdfsBaseDirPath());
It looks like this is only used in L590. Maybe move it closer to that.


PS1, Line 695: .keys()
remove


-- 
To view, visit http://gerrit.cloudera.org:8080/7652
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I963d647bd2ba11e3843c6ef2ac6be113c74280bf
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4833: Compute precise per-host reservation size

2017-08-10 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4833: Compute precise per-host reservation size
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7630/1/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

Line 484:   6: i64 min_reservation_bytes
> Are these optional or required? Not sure what the default is in thrift, we 
The default is effectively optional:
https://thrift.apache.org/docs/idl#field-requiredness

While the doc says the default required-ness is good, I agree it's helpful to 
be explicit, especially in our code where we typically don't rely on the 
default. The protocol_version thing is misleading because we've been iterating 
on these interfaces regularly without bumping the version. That said, it looks 
like the convention here is to avoid explicitly requiring these fields, and 
instead indicating they're required in a particular version. (Technically 
that's a good practice, since required fields can be evil. Unfortunately we're 
pretty inconsistent.)


http://gerrit.cloudera.org:8080/#/c/7630/1/common/thrift/Planner.thrift
File common/thrift/Planner.thrift:

Line 66:   7: optional i64 min_reservation_bytes
> It's kind-of confusing that these are optional, since the backend handles t
Done


http://gerrit.cloudera.org:8080/#/c/7630/1/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

Line 365:   // Different fragments do not synchronize their Open() and 
Close(), so the backend
> I think this comment is important for understand why we sum up the fragment
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7630
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idbcd1e9b1be14edc4017b4907e83f9d56059fbac
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4833: Compute precise per-host reservation size

2017-08-10 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new patch set (#2).

Change subject: IMPALA-4833: Compute precise per-host reservation size
..

IMPALA-4833: Compute precise per-host reservation size

Before this change, the per-host reservation size was computed
by the Planner. However, scheduling happens after planning,
so the Planner must assume that all fragments run on all
hosts, and the reservation size is likely much larger than
it needs to be.

This moves the computation of the per-host reservation size
to the BE where it can be computed more precisely. This also
includes a number of plan/profile changes.

Change-Id: Idbcd1e9b1be14edc4017b4907e83f9d56059fbac
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/query-state.cc
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/client-request-state.cc
M common/thrift/Frontend.thrift
M common/thrift/ImpalaInternalService.thrift
M common/thrift/Planner.thrift
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level0.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level1.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level3.test
M testdata/workloads/functional-query/queries/QueryTest/stats-extrapolation.test
M tests/custom_cluster/test_coordinators.py
A tests/custom_cluster/test_mem_reservations.py
24 files changed, 348 insertions(+), 219 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/7630/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7630
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idbcd1e9b1be14edc4017b4907e83f9d56059fbac
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5773: Correctly account for memory used in data stream receiver queue

2017-08-10 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new patch set (#2).

Change subject: IMPALA-5773: Correctly account for memory used in data stream 
receiver queue
..

IMPALA-5773: Correctly account for memory used in data stream receiver queue

DataStreamRecvrs keep one or more queues of batches received to provide
some buffering. Each queue has a fixed byte size capacity. The estimate
of the contribution of a new RowBatch to that queue was using the
compressed size of the TRowBatch it would be deserialized from, which is
the wrong value (since the batch is uncompressed after deserialization).

* Add RowBatch::Get[Des|S]erializedSize(const TRowBatch&) to RowBatch
* Fix the estimate to use the uncompressed size.
* Add a DataStreamReceiver child profile to the exchg node so that the
  peak memory used by the receiver can be monitored easily.

Confirmed that the following query:

select count(distinct concat(cast(l_comment as char(120)),
 cast(l_comment as char(120)),
 cast(l_comment as char(120)),
 cast(l_comment as char(120)),
 cast(l_comment as char(120)),
 cast(l_comment as char(120))) from lineitem;

succeeds with a mem-limit of 800Mb. Before this patch it would fail in a
one-node cluster as the datastream recvr would buffer more batches than
the memory limit would allow.

Change-Id: I9e90f9596ee984438e3373af05e84d361702ca6a
---
M be/src/benchmarks/row-batch-serialize-benchmark.cc
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/data-stream-recvr.cc
M be/src/runtime/data-stream-recvr.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
7 files changed, 38 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/7646/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7646
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9e90f9596ee984438e3373af05e84d361702ca6a
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5773: Correctly account for memory used in data stream receiver queue

2017-08-10 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-5773: Correctly account for memory used in data stream 
receiver queue
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7646/1/be/src/runtime/row-batch.cc
File be/src/runtime/row-batch.cc:

PS1, Line 343: int
> Would it make sense to change to int64_t to avoid potential overflows? Not 
Done


http://gerrit.cloudera.org:8080/#/c/7646/1/be/src/runtime/row-batch.h
File be/src/runtime/row-batch.h:

Line 304:   /// deserialized form.
> Maybe mention compression explicitly. E.g. serialized (i.e. compressed) or 
Done - not all serialized batches are compressed, so need to be a bit careful 
there, but have tried to word it clearly.


-- 
To view, visit http://gerrit.cloudera.org:8080/7646
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9e90f9596ee984438e3373af05e84d361702ca6a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4407: Move Impala setup procedures to main repo

2017-08-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4407: Move Impala setup procedures to main repo
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1030/

-- 
To view, visit http://gerrit.cloudera.org:8080/7587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I728abfa806ecd9461dfb443278c2a464714d984c
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4407: Move Impala setup procedures to main repo

2017-08-10 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4407: Move Impala setup procedures to main repo
..


Patch Set 6:

> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1029/

https://issues.apache.org/jira/browse/IMPALA-5765

-- 
To view, visit http://gerrit.cloudera.org:8080/7587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I728abfa806ecd9461dfb443278c2a464714d984c
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4407: Move Impala setup procedures to main repo

2017-08-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4407: Move Impala setup procedures to main repo
..


Patch Set 6: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1029/

-- 
To view, visit http://gerrit.cloudera.org:8080/7587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I728abfa806ecd9461dfb443278c2a464714d984c
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] PREVIEW: IMPALA-2615: support [[nodiscard]] on Status

2017-08-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: PREVIEW: IMPALA-2615: support [[nodiscard]] on Status
..


Patch Set 5:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/7253/5/be/CMakeLists.txt
File be/CMakeLists.txt:

Line 73: if (CMAKE_CXX_COMPILER_VERSION VERSION_GREATER 7.0)
> Should this be limited to GCC, too?
My thought was that it's only setting GCC flags, so it shouldn't matter.


Line 74:   #  -Wno-placement-new: avoid spurious warnings from boost::function.
> Can you describe this error a bit more?
Added a link to the Boost fix for it.


Line 75:   #  -faligned-new: new will automatically align types. Otherwise "new 
Counter()" in the
> It is my understanding that new produces things that are as aligned as void
Filed KUDU-2094 against Kudu. It looks like Counter contains class Cell, which 
is meant to be cacheline aligned. So this is a bug in the Kudu util code.


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/exec/hbase-table-scanner.cc
File be/src/exec/hbase-table-scanner.cc:

Line 764: discard_result(JniUtil::FreeGlobalRef(env, resultscanner_));
> What is the rationale for the difference between this line and 761, where y
This triggered me to look more closely at why DeleteGlobalRef might throw an 
exception. It looks like it can't actually throw an exception (there is no 
THROWS section here: 
http://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/functions.html)

The pattern of checking for exceptions seems to have started in this commit: 
85b1154ba1c2edcccd673eb92c4d30c9a635442e where the exception checked for was 
actually from the method call.

Then it got copy-pasted and was made into a utility function.


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/exec/hbase-table-scanner.h
File be/src/exec/hbase-table-scanner.h:

Line 88:   static Status Init() WARN_UNUSED_RESULT;
> Once we move to GCC7, we no longer need WARN_UNUSED_RESULT?
For Status at least it will be automatic. We would just use it for other return 
types. E.g. bool.


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/rpc/auth-provider.h
File be/src/rpc/auth-provider.h:

Line 51:   /// that
> fix wrap
Done


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/rpc/thrift-client.cc
File be/src/rpc/thrift-client.cc:

Line 40:   if (!init_status_.ok()) return init_status_;
> RETURN_IF_ERROR?
Done


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/rpc/thrift-client.h
File be/src/rpc/thrift-client.h:

Line 161:   if (!init_status_.ok()) return;
> and LOG?
I think that should be up to the caller.


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/runtime/data-stream-recvr.cc
File be/src/runtime/data-stream-recvr.cc:

Line 346:   Status status = mgr_->DeregisterRecvr(fragment_instance_id(), 
dest_node_id());
> const
Done


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

Line 100:   discard_result(WaitForPrepare());  // make sure Prepare() finished
> Why is discard OK here?
It doesn't matter if prepare failed since we're cancelling it


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/runtime/hbase-table-factory.cc
File be/src/runtime/hbase-table-factory.cc:

Line 95: discard_result(JniUtil::FreeGlobalRef(env, connection_));
> Why is discard OK here?
We can't propagate the status here.

We also shouldn't be running this function outside of backend tests, since the 
HBaseTableFactory is a singleton that's only set up once per daemon process.


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/scheduling/scheduler-test-util.cc
File be/src/scheduling/scheduler-test-util.cc:

Line 510:   Status status = scheduler_->Init();
> const
Done


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/service/fe-support.cc
File be/src/service/fe-support.cc:

Line 72:   THROW_IF_ERROR(LlvmCodeGen::InitializeLlvm(true), env, 
JniUtil::internal_exc_class());
> I'm surprised you want to throw in a function with extern "C" linkage.
This propagates a Java exception to the Java code calling into this function 
via JNI. It's done elsewhere in this file.


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

Line 697: Status status = profile_logger_->Flush();
> const
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I972543af2e9f98b12dcbb5479b4c1a7d53952197
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5778: clarify --read size option.

2017-08-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5778: clarify --read_size option.
..


IMPALA-5778: clarify --read_size option.

Remove BTS_BLOCK_OVERFLOW error code, which is no longer used and
referenced --read_size.

Improve the flag description. The output is now:

-read_size ((Advanced) The preferred I/O request size in bytes to issue to
  HDFS or the local filesystem. Increasing the read size will increase
  memory requirements. Decreasing the read size may decrease I/O
  throughput.) type: int32 default: 8388608

Testing:
Tested that Impala built and basic queries could run.

Change-Id: I3c20a9d55f89170b11f569c90b7f2949ddbe4211
Reviewed-on: http://gerrit.cloudera.org:8080/7623
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins
---
M be/src/runtime/disk-io-mgr.cc
M common/thrift/generate_error_codes.py
2 files changed, 6 insertions(+), 5 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7623
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I3c20a9d55f89170b11f569c90b7f2949ddbe4211
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5778: clarify --read size option.

2017-08-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5778: clarify --read_size option.
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7623
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3c20a9d55f89170b11f569c90b7f2949ddbe4211
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[native-toolchain-CR] IMPALA-5714: Add linker's version script for OpenSSL library

2017-08-10 Thread Michael Ho (Code Review)
Michael Ho has abandoned this change.

Change subject: IMPALA-5714: Add linker's version script for OpenSSL library
..


Abandoned

Decided that we won't need it for KRPC.

-- 
To view, visit http://gerrit.cloudera.org:8080/7484
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I72b39c5e15db268d35210c013e885f36764f1f89
Gerrit-PatchSet: 6
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-1478: Improve error message when subquery is used in the ON clause

2017-08-10 Thread Pranay Singh (Code Review)
Pranay Singh has uploaded a new patch set (#4).

Change subject: IMPALA-1478: Improve error message when subquery is used in the 
 ON clause
..

IMPALA-1478: Improve error message when subquery is used in the
 ON clause

Fix: Print the error stating that "Suquery not allowed in the ON clause"
 Add test case for testing the failure when a subquery is used in
 the ON clause.

Change-Id: I0d1dc47987de7ea04402e1ead31d81cddf2f96f2
---
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
2 files changed, 9 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/7588/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7588
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0d1dc47987de7ea04402e1ead31d81cddf2f96f2
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Pranay Singh


[Impala-ASF-CR] PREVIEW: IMPALA-2615: support [[nodiscard]] on Status

2017-08-10 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: PREVIEW: IMPALA-2615: support [[nodiscard]] on Status
..


Patch Set 5:

(14 comments)

Halfway through reviewing; though we should hash some of this out before I do 
the second half.

http://gerrit.cloudera.org:8080/#/c/7253/5/be/CMakeLists.txt
File be/CMakeLists.txt:

Line 73: if (CMAKE_CXX_COMPILER_VERSION VERSION_GREATER 7.0)
Should this be limited to GCC, too?


Line 74:   #  -Wno-placement-new: avoid spurious warnings from boost::function.
Can you describe this error a bit more?


Line 75:   #  -faligned-new: new will automatically align types. Otherwise "new 
Counter()" in the
It is my understanding that new produces things that are as aligned as void * 
(aka 8 bytes). Can you describe a bit more what's going on with this warning?


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/exec/hbase-table-scanner.cc
File be/src/exec/hbase-table-scanner.cc:

Line 764: discard_result(JniUtil::FreeGlobalRef(env, resultscanner_));
What is the rationale for the difference between this line and 761, where you 
LOG(WARNING)? See also below.


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/exec/hbase-table-scanner.h
File be/src/exec/hbase-table-scanner.h:

Line 88:   static Status Init() WARN_UNUSED_RESULT;
Once we move to GCC7, we no longer need WARN_UNUSED_RESULT?


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/rpc/auth-provider.h
File be/src/rpc/auth-provider.h:

Line 51:   /// that
fix wrap


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/rpc/thrift-client.cc
File be/src/rpc/thrift-client.cc:

Line 40:   if (!init_status_.ok()) return init_status_;
RETURN_IF_ERROR?


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/rpc/thrift-client.h
File be/src/rpc/thrift-client.h:

Line 161:   if (!init_status_.ok()) return;
and LOG?


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/runtime/data-stream-recvr.cc
File be/src/runtime/data-stream-recvr.cc:

Line 346:   Status status = mgr_->DeregisterRecvr(fragment_instance_id(), 
dest_node_id());
const


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

Line 100:   discard_result(WaitForPrepare());  // make sure Prepare() finished
Why is discard OK here?


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/runtime/hbase-table-factory.cc
File be/src/runtime/hbase-table-factory.cc:

Line 95: discard_result(JniUtil::FreeGlobalRef(env, connection_));
Why is discard OK here?


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/scheduling/scheduler-test-util.cc
File be/src/scheduling/scheduler-test-util.cc:

Line 510:   Status status = scheduler_->Init();
const


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/service/fe-support.cc
File be/src/service/fe-support.cc:

Line 72:   THROW_IF_ERROR(LlvmCodeGen::InitializeLlvm(true), env, 
JniUtil::internal_exc_class());
I'm surprised you want to throw in a function with extern "C" linkage.


http://gerrit.cloudera.org:8080/#/c/7253/5/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

Line 697: Status status = profile_logger_->Flush();
const


-- 
To view, visit http://gerrit.cloudera.org:8080/7253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I972543af2e9f98b12dcbb5479b4c1a7d53952197
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4407: Move Impala setup procedures to main repo

2017-08-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4407: Move Impala setup procedures to main repo
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1029/

-- 
To view, visit http://gerrit.cloudera.org:8080/7587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I728abfa806ecd9461dfb443278c2a464714d984c
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4407: Move Impala setup procedures to main repo

2017-08-10 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4407: Move Impala setup procedures to main repo
..


Patch Set 6: Code-Review+2

rebase carry +2

-- 
To view, visit http://gerrit.cloudera.org:8080/7587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I728abfa806ecd9461dfb443278c2a464714d984c
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4407: Move Impala setup procedures to main repo

2017-08-10 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4407: Move Impala setup procedures to main repo
..


Patch Set 5: Code-Review+2

Thanks for doing this! LGTM.

-- 
To view, visit http://gerrit.cloudera.org:8080/7587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I728abfa806ecd9461dfb443278c2a464714d984c
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] MPALA-5776: Write partial tuple to the correct mempool

2017-08-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: MPALA-5776: Write partial tuple to the correct mempool
..


Patch Set 2:

(10 comments)

This is very subtle. I think the solution works but maybe there are some ways 
we can make it clearer why it works.

http://gerrit.cloudera.org:8080/#/c/7639/2//COMMIT_MSG
Commit Message:

Line 7: MPALA-5776: Write partial tuple to the correct mempool
Missing I.


http://gerrit.cloudera.org:8080/#/c/7639/2/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

Line 169: row_batch->tuple_data_pool()->AcquireData(boundary_pool_.get(), 
false);
Does this even make sense? are there any cases where it's valid for the row 
batch to reference data in the boundary pool directly?

It seems like either this is unnecessary or we should be transferring the 
contents of boundary_pool_ below instead of clearing it.


Line 299: boundary_column_.Clear();
This is subtle because the contents of boundary_column_ are needed until 
WriteFields() below, I think. Maybe it would be clearer to document this or 
even move it below to the point where the data is no longer needed.


PS2, Line 765: batches
I'm not sure that I understand the reference to batches. Do they mean blocks?


Line 806:   partial_tuple_->DeepCopy(tuple_, *scan_node_->tuple_desc(), 
pool);
Can you add a column clarifying what the intent is. E.g. "Copy over all 
materialized slots in the partial tuple".


Line 808:   boundary_row_.Reset();
It might be clearer if we factored out this reset logic for boundary_ and 
partial_tuple_ into a separate function and documented the pre-conditions for 
it. I.e. the row has been fully materialized and all memory has been copied 
into the RowBatch pool (I think that's the precondition but not sure).


PS2, Line 810: emptied
cleared? The comment makes me thing that all the memory has been freed, but 
Clear() just enables recycling of the chunks.


Line 814:   partial_tuple_ = Tuple::Create(tuple_byte_size_, 
boundary_pool_.get());
Is it possible to just initialize partial_tuple_ when it's needed? I.e. before 
we call InitTuple(template_tuple_, partial_tuple_). Then maybe we can get rid 
of partial_tuple_empty_ and represent that as partition_tuple_ == nullptr.


http://gerrit.cloudera.org:8080/#/c/7639/2/be/src/exec/hdfs-text-scanner.h
File be/src/exec/hdfs-text-scanner.h:

Line 188:   /// the boundary pool.
Hah! If only the code had matched the comment...


Line 194:   /// Mem pool for boundary_row_ and boundary_column_.
Needs updating since it also backs partial_tuple_.


-- 
To view, visit http://gerrit.cloudera.org:8080/7639
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I60ba5c113aefd17f697c1888fd46a237ef396540
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5327: Handle return of JNI GetStringUTFChar

2017-08-10 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change.

Change subject: IMPALA-5327: Handle return of JNI GetStringUTFChar
..


Patch Set 3:

> (5 comments)

Comments addressed. The code is still a little redundant for resource releasing

-- 
To view, visit http://gerrit.cloudera.org:8080/7642
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I47eed045f21c6ed3a8decf422ce8429fc66c4322
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5327: Handle return of JNI GetStringUTFChar

2017-08-10 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#3).

Change subject: IMPALA-5327: Handle return of JNI GetStringUTFChar
..

IMPALA-5327: Handle return of JNI GetStringUTFChar

GetStringUTFChars may return NULL or throw exception and it is not
handled in two places. This patch checks the return value/exception,
logs if there is error and return error to the caller.

Change-Id: I47eed045f21c6ed3a8decf422ce8429fc66c4322
---
M be/src/util/jni-util.cc
M be/src/util/jni-util.h
M be/src/util/logging-support.cc
3 files changed, 49 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/7642/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7642
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I47eed045f21c6ed3a8decf422ce8429fc66c4322
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5327: Handle return of JNI GetStringUTFChar

2017-08-10 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new patch set (#2).

Change subject: IMPALA-5327: Handle return of JNI GetStringUTFChar
..

IMPALA-5327: Handle return of JNI GetStringUTFChar

GetStringUTFChars may return NULL or throw exception and it is not
handled in two places. This patch checks the return value/exception,
logs if there is error and return error to the caller.

Change-Id: I47eed045f21c6ed3a8decf422ce8429fc66c4322
---
M be/src/util/jni-util.cc
M be/src/util/jni-util.h
M be/src/util/logging-support.cc
3 files changed, 47 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/7642/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7642
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I47eed045f21c6ed3a8decf422ce8429fc66c4322
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5412 Fix scan result with partitions on same file

2017-08-10 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5412 Fix scan result with partitions on same file
..


Patch Set 3:

(1 comment)

> (2 comments)
 > 
 > Gabor told me that he won't be able to work on this for around 10
 > days. From my point of view this is pretty close, so I think we
 > should consider filing a follow-on JIRA for the extra test coverage
 > and getting this in soonish to give it time to bake.

Works for me.

http://gerrit.cloudera.org:8080/#/c/7625/3/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

PS3, Line 589:   c
nit: 4 spaces


-- 
To view, visit http://gerrit.cloudera.org:8080/7625
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5677: limit clean page memory consumption

2017-08-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7653

Change subject: IMPALA-5677: limit clean page memory consumption
..

IMPALA-5677: limit clean page memory consumption

Adds the following flag:

  -buffer_pool_clean_pages_limit ((Advanced) Limit on bytes of clean spilled
pages that will be accumulated in the buffer pool. Specified as number of
bytes ('[bB]?'), megabytes ('[mM]'), gigabytes
('[gG]'), or percentage of the buffer pool limit ('%').
Defaults to bytes if no unit is given..) type: string default: "10%"

This helps prevent Impala accumulating excessive amounts of clean pages,
which can cause some problems in practice:
* The OS buffer cache is squeezed, reducing I/O performance from HDFS
  and potentially reducing the ability of the OS to absorb writes from
  Impala without blocking.
* Impala process memory consumption can expand more than users or admins
  might expect. E.g. if one query is running with a mem_limit of 1GB,
  many people will be surprised if the process inflates to the full
  process limit of 100GB. Impala doesn't provide any guarantees except
  from staying within the process mem_limit, but this could be a
  surprising divergence from past behaviour.

Observability:
A new metric buffer-pool.clean-pages-limit is added.

Testing:
Added a backend test to directly test that clean pages are evicted.
Ran in a loop to flush out any flakiness.

Ran exhaustive tests.

Change-Id: Ib6b687ab4bdddf07d9d6c997ff814aa3976042f9
---
M be/src/common/global-flags.cc
M be/src/runtime/bufferpool/buffer-allocator-test.cc
M be/src/runtime/bufferpool/buffer-allocator.cc
M be/src/runtime/bufferpool/buffer-allocator.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/bufferpool/buffer-pool.cc
M be/src/runtime/bufferpool/buffer-pool.h
M be/src/runtime/bufferpool/suballocator-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/test-env.cc
M be/src/util/memory-metrics.cc
M be/src/util/memory-metrics.h
M common/thrift/metrics.json
14 files changed, 234 insertions(+), 72 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/7653/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7653
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib6b687ab4bdddf07d9d6c997ff814aa3976042f9
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4847: Simplify HdfsTable block metadata loading code

2017-08-10 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7652

Change subject: IMPALA-4847: Simplify HdfsTable block metadata loading code
..

IMPALA-4847: Simplify HdfsTable block metadata loading code

This commit is a part of ground work for the upcoming multi
threaded block metadata loading patches.

The patch for IMPALA-4172 introduced code that groups the block
location requests for partition directories that reside under the
table directory into a single call to the NN in order to reduce the
number of RPCs. However, it turns out that the hdfs client library
internally makes one RPC per directory thus defeating the
purpose of optimization. Also, this made the code unnecessarily
complex since we need to map each file to its corresponding partition
at runtime.

This patch undos that optimization and makes HDFS calls per partition,
which is much easier to understand. This also helps the upcoming patch
on multi threaded block metadata loading since there is much less shared
state when loading multiple partitions in parallel.

Change-Id: I963d647bd2ba11e3843c6ef2ac6be113c74280bf
---
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
1 file changed, 36 insertions(+), 105 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/7652/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7652
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I963d647bd2ba11e3843c6ef2ac6be113c74280bf
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 


[Impala-ASF-CR] IMPALA-5412 Fix scan result with partitions on same file

2017-08-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5412 Fix scan result with partitions on same file
..


Patch Set 3:

(2 comments)

Gabor told me that he won't be able to work on this for around 10 days. From my 
point of view this is pretty close, so I think we should consider filing a 
follow-on JIRA for the extra test coverage and getting this in soonish to give 
it time to bake.

http://gerrit.cloudera.org:8080/#/c/7625/3/tests/metadata/test_partition_metadata.py
File tests/metadata/test_partition_metadata.py:

PS3, Line 86: Imapala
There's still an extra 'a' :)


PS3, Line 125:  
Trailing whitespace.


-- 
To view, visit http://gerrit.cloudera.org:8080/7625
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5327: Handle return of JNI GetStringUTFChar

2017-08-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5327: Handle return of JNI GetStringUTFChar
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7642/1/be/src/util/jni-util.cc
File be/src/util/jni-util.cc:

Line 183: LOG_AT_LEVEL(google::FATAL) << get_str_fail_message;
LOG(FATAL) actually kills the process. How about LOG(ERROR) instead?


PS1, Line 184: MEM_ALLOC_FAILED
This error message requires a parameter (see the string in 
common/thrift/generate_error_codes.py). How about just returning 
Status(get_str_fail_message);


Line 193:   LOG_AT_LEVEL(google::FATAL) << get_str_fail_message;
Same here.


PS1, Line 202: prefix + msg_str)
We usually prefer Substitute() for cases like this. E.g.

  return Status(Substitute("$0: $1", prefix, msg_str));


http://gerrit.cloudera.org:8080/#/c/7642/1/be/src/util/logging-support.cc
File be/src/util/logging-support.cc:

Line 48:   auto get_str_fail_message = "GetStringUTFChars failed. Probable OOM 
on JVM side";
Same comments here. How about we add a function to the JNIUtil class that does 
this to avoid repeating the logic. E.g.

Status GetStringUTFChars();


-- 
To view, visit http://gerrit.cloudera.org:8080/7642
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I47eed045f21c6ed3a8decf422ce8429fc66c4322
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4407: Move Impala setup procedures to main repo

2017-08-10 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4407: Move Impala setup procedures to main repo
..


Patch Set 5:

Sailesh since you also took a look at this and left comments, do you want to 
give the +2 after taking another look?

-- 
To view, visit http://gerrit.cloudera.org:8080/7587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I728abfa806ecd9461dfb443278c2a464714d984c
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] PREVIEW: IMPALA-2615: support [[nodiscard]] on Status

2017-08-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#5).

Change subject: PREVIEW: IMPALA-2615: support [[nodiscard]] on Status
..

PREVIEW: IMPALA-2615: support [[nodiscard]] on Status

This is the set of changes required to get Impala to compile on
GCC 7 using the [[nodiscard]] attribute, which generates a warning
whenever a status is dropped. It is not enabled on the current
default compiler GCC 4.9.2 or Clang 3.8 so I added WARN_UNUSED_RESULT
in various classes so that we can catch the dropped statuses with
our current toolchain.

The changes are:
* Use the new [[nodiscard]] attribute and fix all the dropped
  statuses. Many were innocuous or very improbably but some appear to
  be actual bugs. Adds a discard_result() function that explicitly
  ignores the result of a function.
* Fix miscellaneous compile errors and warnings.
* Remove use of ptr_vector, which pulls in headers with deprecated
  things.
* Fix a memory lifetime bug with default_fs_ (it was masked by
  the old refcounted std::string implementation).

Change-Id: I972543af2e9f98b12dcbb5479b4c1a7d53952197
---
M be/CMakeLists.txt
M be/src/benchmarks/bit-packing-benchmark.cc
M be/src/benchmarks/expr-benchmark.cc
M be/src/benchmarks/hash-benchmark.cc
M be/src/benchmarks/network-perf-benchmark.cc
M be/src/benchmarks/row-batch-serialize-benchmark.cc
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-util.cc
M be/src/catalog/catalogd-main.cc
M be/src/codegen/codegen-symbol-emitter.cc
M be/src/codegen/llvm-codegen-test.cc
M be/src/common/compiler-util.h
M be/src/common/init.cc
M be/src/common/status.h
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-table-scanner.cc
M be/src/exec/hbase-table-scanner.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-table-sink.h
M be/src/exec/kudu-util.h
M be/src/experiments/compression-test.cc
M be/src/exprs/expr-codegen-test.cc
M be/src/exprs/expr-test.cc
M be/src/rpc/auth-provider.h
M be/src/rpc/thrift-client.cc
M be/src/rpc/thrift-client.h
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/client-cache.cc
M be/src/runtime/client-cache.h
M be/src/runtime/collection-value-builder.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/data-stream-recvr.cc
M be/src/runtime/data-stream-test.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/hbase-table-factory.cc
M be/src/runtime/parallel-executor.cc
M be/src/runtime/tmp-file-mgr-test.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler-test.cc
M be/src/service/client-request-state.cc
M be/src/service/fe-support.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impalad-main.cc
M be/src/service/query-options-test.cc
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/statestore/statestored-main.cc
M be/src/testutil/death-test-util.h
M be/src/testutil/fault-injection-util.cc
M be/src/testutil/impalad-query-executor.cc
M be/src/testutil/in-process-servers.cc
M be/src/testutil/in-process-servers.h
M be/src/util/benchmark.cc
M be/src/util/bit-util-test.cc
M be/src/util/codec.h
M be/src/util/filesystem-util.h
M be/src/util/hdfs-util-test.cc
M be/src/util/jni-util.h
M be/src/util/memory-metrics.h
M be/src/util/metrics-test.cc
M be/src/util/network-util.h
M be/src/util/parquet-reader.cc
M be/src/util/runtime-profile.cc
M be/src/util/runtime-profile.h
M be/src/util/thread-pool.h
M be/src/util/thread.cc
M be/src/util/thread.h
77 files changed, 380 insertions(+), 278 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/7253/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7253
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I972543af2e9f98b12dcbb5479b4c1a7d53952197
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5764: Allow overriding packaged components

2017-08-10 Thread Zach Amsden (Code Review)
Zach Amsden has abandoned this change.

Change subject: IMPALA-5764: Allow overriding packaged components
..


Abandoned

Gerrit created another change for some reason.

-- 
To view, visit http://gerrit.cloudera.org:8080/7648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: If3e4748aae54d3c4f723591bfd2ce87594338cfd
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 


[Impala-ASF-CR] IMPALA-5778: clarify --read size option.

2017-08-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5778: clarify --read_size option.
..


Patch Set 2: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/7623
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3c20a9d55f89170b11f569c90b7f2949ddbe4211
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5773: Correctly account for memory used in data stream receiver queue

2017-08-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5773: Correctly account for memory used in data stream 
receiver queue
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7646/1/be/src/runtime/row-batch.cc
File be/src/runtime/row-batch.cc:

PS1, Line 343: int
Would it make sense to change to int64_t to avoid potential overflows? Not sure 
if that causes problems at any callsites.


http://gerrit.cloudera.org:8080/#/c/7646/1/be/src/runtime/row-batch.h
File be/src/runtime/row-batch.h:

Line 304:   /// deserialized form.
Maybe mention compression explicitly. E.g. serialized (i.e. compressed) or 
deserialized (i.e. uncompressed).


-- 
To view, visit http://gerrit.cloudera.org:8080/7646
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9e90f9596ee984438e3373af05e84d361702ca6a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5764: Allow overriding packaged components

2017-08-10 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new patch set (#2).

Change subject: IMPALA-5764: Allow overriding packaged components
..

IMPALA-5764: Allow overriding packaged components

For allowing multiple different distributions to build against the same
codebase, allow overriding various environment variables as well as what
packages to download into the toolchain.  This allows multiple sets of
packaged components to exist in the toolchain and testdata concurrently,
and to easily swap between them using environment overrides.

For example, one could specify overrides to packages and disambiguate
the resulting database paths:

PACKAGED_COMPONENTS_NAME=apache_components
PACKAGED_COMPONENTS_PATH='/apache_bucket/'
PACKAGED_COMPONENTS=avro,hadoop,hbase,hive,sentry,parquet
IMPALA_AVRO_VERSION=2.0.2-apache-local
IMPALA_HADOOP_VERSION=3.0.0-apache-experimental
IMPALA_HBASE_VERSION=1.2.0-apache-SNAPSHOT
IMPALA_HIVE_VERSION=1.1.0-apache-SNAPSHOT
IMPALA_SENTRY_VERSION=1.5.1-apache-SNAPSHOT
IMPALA_PARQUET_VERSION=1.5.0-apache-SNAPSHOT

And these packages would be downloaded from the '/apache_components/'
S3 bucket into the toolchain's apache_componenets directory.

Change-Id: I95c2662e6f62adc924cc5de7a371202126046545
---
M README.md
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
M testdata/cluster/admin
4 files changed, 61 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/7581/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I95c2662e6f62adc924cc5de7a371202126046545
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-5412 Fix scan result with partitions on same file

2017-08-10 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has posted comments on this change.

Change subject: IMPALA-5412 Fix scan result with partitions on same file
..


Patch Set 1:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/base-sequence-scanner.cc
File be/src/exec/base-sequence-scanner.cc:

Line 90:   context->partition_descriptor()->id(),stream_->filename()));
> nit:space after ,
Done


PS1, Line 163: ,s
> space
Done


PS1, Line 309: 
> 4 spaces
Done


http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 613: context_->partition_descriptor()->id(), filename());
> 4 spaces
Done


http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/hdfs-scan-node-base.cc
File be/src/exec/hdfs-scan-node-base.cc:

PS1, Line 266: ,
> space
Done


PS1, Line 266: string
> Do we need to call the string constructor explicitly? Seems a bit weird.
Done


PS1, Line 266: string
> remove string constructor here
Done


http://gerrit.cloudera.org:8080/#/c/7625/1/be/src/exec/hdfs-scan-node-base.h
File be/src/exec/hdfs-scan-node-base.h:

Line 197:   /// Allocate a new scan range object, stored in the runtime state's 
object pool. For
> This comment needs updating since partition_id is now required.
Done


PS1, Line 237:   /// Returns nullptr if the search doesn't find the descriptor.
> not true; it has a dcheck
Done


PS1, Line 360: pair_hash
> misleading name because the hash is computed on the value.
apparently, unordered_map doesn't accept pair as key by default, only if a 
custom hash is provided.


Line 360:   struct pair_hash {
> We have some utilities like this already in util/container-util.h, let's mo
Done


Line 363: return std::hash{}(p.second);
> Let's hash both values in the pair and combine them with boost::hash_combin
Done


PS1, Line 369: pair
> Can you typedef this and using that here and in the .cc file ?
Done


PS1, Line 381: pair
> same
Done


http://gerrit.cloudera.org:8080/#/c/7625/1/tests/metadata/test_partition_metadata.py
File tests/metadata/test_partition_metadata.py:

Line 71: data = self.execute_scalar("select sum(i), sum(j) from %s" % 
FQ_TBL_NAME)
> Can we run this query with num_nodes=1 to verify that the same bug doesn't 
Done


PS1, Line 81: ad
> as?
Done


PS1, Line 81: imapala
> Impala
Done


PS1, Line 109: output
> 'output' isn't clear (it doesn't match up with the queries). say this is wh
Done


Line 110: # (note, that shortcoming on avro writer would result the 
inserted value as NULL,
> How about we only query the second column to avoid this complication?
Done


PS1, Line 122: output:
 : # [NULL, 1] 3 times
 : # [NULL, 2] 3 times
> again, this is confusing
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7625
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5764: Allow overriding packaged components

2017-08-10 Thread Zach Amsden (Code Review)
Zach Amsden has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7648

Change subject: IMPALA-5764: Allow overriding packaged components
..

IMPALA-5764: Allow overriding packaged components

For allowing multiple different distributions to build against the same
codebase, allow overriding various environment variables as well as what
packages to download into the toolchain.  This allows multiple sets of
packaged components to exist in the toolchain and testdata concurrently,
and to easily swap between them using environment overrides.

For example, one could specify overrides to packages and disambiguate
the resulting database paths:

PACKAGED_COMPONENTS_NAME=apache_components
PACKAGED_COMPONENTS_PATH='/apache_bucket/'
PACKAGED_COMPONENTS=avro,hadoop,hbase,hive,sentry,parquet
IMPALA_AVRO_VERSION=2.0.2-apache-local
IMPALA_HADOOP_VERSION=3.0.0-apache-experimental
IMPALA_HBASE_VERSION=1.2.0-apache-SNAPSHOT
IMPALA_HIVE_VERSION=1.1.0-apache-SNAPSHOT
IMPALA_SENTRY_VERSION=1.5.1-apache-SNAPSHOT
IMPALA_PARQUET_VERSION=1.5.0-apache-SNAPSHOT

And these packages would be downloaded from the '/apache_components/'
S3 bucket into the toolchain's apache_componenets directory.

Change-Id: If3e4748aae54d3c4f723591bfd2ce87594338cfd
---
M README.md
M bin/bootstrap_toolchain.py
M bin/impala-config.sh
M testdata/cluster/admin
4 files changed, 61 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/7648/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7648
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: If3e4748aae54d3c4f723591bfd2ce87594338cfd
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 


[Impala-ASF-CR] IMPALA-5412 Fix scan result with partitions on same file

2017-08-10 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has uploaded a new patch set (#3).

Change subject: IMPALA-5412 Fix scan result with partitions on same file
..

IMPALA-5412 Fix scan result with partitions on same file

The maps storing file descriptors and file metadata were using
filename as a key. Multiple partitions pointing to the same
filesystem location resulted that these map entries were
occasionally overwritted by the other partition poing to
the same.

As a solution the map key was enhanced to contain a pair of
partition ID and file name.

Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/scanner-context.cc
M be/src/util/container-util.h
M tests/metadata/test_partition_metadata.py
8 files changed, 120 insertions(+), 38 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/7625/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7625
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5743: Support TLS version configuration for Thrift servers

2017-08-10 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-5743: Support TLS version configuration for Thrift 
servers
..


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/7606/1/be/src/rpc/thrift-server-test.cc
File be/src/rpc/thrift-server-test.cc:

PS1, Line 257: TEST(SslTest, StringToProtocol) {
> Please add a brief description explaining what this test does, especially b
Not yet, but I'll run tests on CentOS 6 once the toolchain changes have been 
published.


PS1, Line 341: // AES256 is v1.2+ only.
> Do we know if thrift bubbles up sensible errors for cipher-SSL version inco
Do you mean if the cipher is incompatible with the TLS version requested? In 
that case yes - it says that no matching cipher is found, which is not perfect 
but a good start.


http://gerrit.cloudera.org:8080/#/c/7606/1/be/src/rpc/thrift-server.h
File be/src/rpc/thrift-server.h:

PS1, Line 31: #include "rpc/auth-provider.h"
> We can change this to a forward declare right?
Done


PS1, Line 32: #include "util/metrics.h"
> Same as above
This one is a bit harder because of the templated definitions that are trickier 
to forward-declare.


Line 165:   /// is used only for password-protected .PEM files.
> Should have caught this in the other review, but please add a comment for t
Done


http://gerrit.cloudera.org:8080/#/c/7606/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

PS1, Line 181: Supported versions are "
 : #if OPENSSL_VERSION_NUMBER >= 0x1000100L
 : "TLSv1.0, TLSv1.1 and TLSv1.2");
 : #else
 : "TLSv1.0");
 : #endif
> Should we also mention what the strings representing these different versio
These are the strings that represent those versions :)


-- 
To view, visit http://gerrit.cloudera.org:8080/7606
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4c68a6c9658ddbfbe8025f2021fd5ed7a9dec5a5
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5743: Support TLS version configuration for Thrift servers

2017-08-10 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new patch set (#2).

Change subject: IMPALA-5743: Support TLS version configuration for Thrift 
servers
..

IMPALA-5743: Support TLS version configuration for Thrift servers

* Add --ssl_minimum_version which controls the minimum SSL/TLS version
  that clients and servers will use when negotiating a secure
  connection.
* Two kinds of version specification are allowed: 'TLSv1.1' enables
  TLSv1.1 and all subsequent verisons. 'TLSv1.1_only' enables only
  TLSv1.1. The latter is not exposed in user-facing text as it is
  typically only used for testing.
* Handle case where platform may not support TLSv1.1 or v1.2 by checking
  OpenSSL version number.
* Bump Thrift toolchain version to -p10.

Testing:
* New tests in thrift-server-test.cc. In particular, test all 36
  configurations of client and server protocol versions, and ensure that
  the expected successes or failures are seen.

Change-Id: I4c68a6c9658ddbfbe8025f2021fd5ed7a9dec5a5
---
M be/src/rpc/thrift-client.cc
M be/src/rpc/thrift-client.h
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-server.cc
M be/src/rpc/thrift-server.h
M be/src/service/impala-server.cc
M bin/impala-config.sh
7 files changed, 179 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/7606/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7606
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4c68a6c9658ddbfbe8025f2021fd5ed7a9dec5a5
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 


[native-toolchain-CR] IMPALA-5743: Allow TLS version configuration

2017-08-10 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-5743: Allow TLS version configuration
..


Patch Set 3: Code-Review+2 Verified+1

Had to #ifdef code for openssl versions that don't support TLSv1.1 or later. 
This passed a full toolchain build.

-- 
To view, visit http://gerrit.cloudera.org:8080/7558
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ida75e74682606eefcc59a17cb2dd2b4e71862e9c
Gerrit-PatchSet: 3
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[native-toolchain-CR] IMPALA-5743: Allow TLS version configuration

2017-08-10 Thread Henry Robinson (Code Review)
Hello Sailesh Mukil,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7558

to look at the new patch set (#3).

Change subject: IMPALA-5743: Allow TLS version configuration
..

IMPALA-5743: Allow TLS version configuration

* Backport THRIFT-2258 to allow for configuration of supported TLS
  versions.
* Rework patch so that either one specific version may be selected, or
  all versions including and following a specific version. So TLSv1_1
  only allows TLS 1.1 connections, but TLSv1_1_plus allows TLS 1.1 and
  TLS 1.2 connections. This makes testing easier.
* SSLv2 and v3 are explicitly disabled, always, as per previous
  patch. They are insecure and superseded by TLS.
* Disable building Thrift's tutorial to slightly improve build times.

Thrift patch is available at:

https://github.com/henryr/thrift/commit/2bdc74de0a0874129e68371796362b1130227e42

Change-Id: Ida75e74682606eefcc59a17cb2dd2b4e71862e9c
---
M buildall.sh
M source/thrift/build.sh
A 
source/thrift/thrift-0.9.0-patches/0010-THRIFT-2258-Add-TLS-configuration.patch
3 files changed, 136 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/58/7558/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7558
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ida75e74682606eefcc59a17cb2dd2b4e71862e9c
Gerrit-PatchSet: 3
Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-5773: Correctly account for memory used in data stream receiver queue

2017-08-10 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7646

Change subject: IMPALA-5773: Correctly account for memory used in data stream 
receiver queue
..

IMPALA-5773: Correctly account for memory used in data stream receiver queue

DataStreamRecvrs keep one or more queues of batches received to provide
some buffering. Each queue has a fixed byte size capacity. The estimate
of the contribution of a new RowBatch to that queue was using the
compressed size of the TRowBatch it would be deserialized from, which is
the wrong value (since the batch is uncompressed after deserialization).

* Add RowBatch::Get[Des|S]erializedSize(const TRowBatch&) to RowBatch
* Fix the estimate to use the uncompressed size.
* Add a DataStreamReceiver child profile to the exchg node so that the
  peak memory used by the receiver can be monitored easily.

Confirmed that the following query:

select count(distinct concat(cast(l_comment as char(120)),
 cast(l_comment as char(120)),
 cast(l_comment as char(120)),
 cast(l_comment as char(120)),
 cast(l_comment as char(120)),
 cast(l_comment as char(120))) from lineitem;

succeeds with a mem-limit of 800Mb. Before this patch it would fail in a
one-node cluster as the datastream recvr would buffer more batches than
the memory limit would allow.

Change-Id: I9e90f9596ee984438e3373af05e84d361702ca6a
---
M be/src/benchmarks/row-batch-serialize-benchmark.cc
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/data-stream-recvr.cc
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
6 files changed, 31 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/7646/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7646
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I9e90f9596ee984438e3373af05e84d361702ca6a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 


[Impala-ASF-CR] IMPALA-4737: Prevent SIGUSR1 from killing daemons when minidumps are disabled

2017-08-10 Thread John Sherman (Code Review)
John Sherman has posted comments on this change.

Change subject: IMPALA-4737: Prevent SIGUSR1 from killing daemons when 
minidumps are disabled
..


Patch Set 2:

Should any startup scripts be modified to trap '' SIGUSR1 before launching the 
daemons since there is still technically a race from daemon startup to when the 
mini-dump signal handler is registered? (Though from what I can tell the 
typical usage of the SIGUSR1 mini-dump handler would very likely never 
encounter this race, so maybe not worth fixing?)

-- 
To view, visit http://gerrit.cloudera.org:8080/7631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I13d866a2eec832500131954a7f693c33585ea51e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: John Sherman 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5764: Allow overriding packaged components

2017-08-10 Thread Zach Amsden (Code Review)
Zach Amsden has posted comments on this change.

Change subject: IMPALA-5764: Allow overriding packaged components
..


Patch Set 1:

needs rebase against merged changes, coming presently

-- 
To view, visit http://gerrit.cloudera.org:8080/7581
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I95c2662e6f62adc924cc5de7a371202126046545
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4794: Partition distinct expr for skew data

2017-08-10 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7643

Change subject: IMPALA-4794: Partition distinct expr for skew data
..

IMPALA-4794: Partition distinct expr for skew data

Currently in an aggregation with grouping and distinct expr, there
are 2 aggregation phases. The first phase aggregates data and
partitions data with grouping expr and then do first merge phase and
second phase. The performance depends on the partitioning of grouping
expr, which could be skew. This patch partitions data in the first
phase by (grouping expr, distinct expr). It introduces an additional
exchange node and a merging node but the data is supposed to be more
balanced with finer grained partitioning and the performance is
supposed to be more stable.

Testing: In planner test, some plans with distinct aggregation change.
The pattern is that the distinct expr is added to exchange node,
followed by an additional aggregation and exchange node.

Change-Id: I7bdada0e328b555900c7b7ff8aabc8eb15ae8fa9
---
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/distinct.test
M testdata/workloads/functional-planner/queries/PlannerTest/insert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
7 files changed, 192 insertions(+), 104 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/7643/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7643
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I7bdada0e328b555900c7b7ff8aabc8eb15ae8fa9
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 


[Impala-ASF-CR] IMPALA-4669: [SECURITY] Add security library to build

2017-08-10 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new patch set (#15).

Change subject: IMPALA-4669: [SECURITY] Add security library to build
..

IMPALA-4669: [SECURITY] Add security library to build

* Minor compilation fix
* Add krb5 as a non-toolchain dependency
* Handle legacy versions of libkrb5.so by providing implementation of
  krb5_is_config_principal().
* Link against openssl from the toolchain if 1.0.0 or higher not found
  on build machine.

Change-Id: I4f327810066bee7f3ac107b0295480fb9ed45e14
---
M CMakeLists.txt
M LICENSE.txt
M be/CMakeLists.txt
M be/src/common/config.h.in
M be/src/kudu/security/init.cc
M be/src/kudu/security/token_verifier.cc
M cmake_modules/FindKerberos.cmake
7 files changed, 112 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/5717/15
-- 
To view, visit http://gerrit.cloudera.org:8080/5717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4f327810066bee7f3ac107b0295480fb9ed45e14
Gerrit-PatchSet: 15
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-5327: Handle return of JNI GetStringUTFChar

2017-08-10 Thread Tianyi Wang (Code Review)
Tianyi Wang has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7642

Change subject: IMPALA-5327: Handle return of JNI GetStringUTFChar
..

IMPALA-5327: Handle return of JNI GetStringUTFChar

GetStringUTFChars may return NULL or throw exception and it is not
handled in two places. This patch checks the return value/exception,
logs if there is error and return error to the caller.

Change-Id: I47eed045f21c6ed3a8decf422ce8429fc66c4322
---
M be/src/util/jni-util.cc
M be/src/util/logging-support.cc
2 files changed, 34 insertions(+), 11 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/7642/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7642
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I47eed045f21c6ed3a8decf422ce8429fc66c4322
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang 


[Impala-ASF-CR] IMPALA-4669: [SECURITY] Add security library to build

2017-08-10 Thread Henry Robinson (Code Review)
Henry Robinson has uploaded a new patch set (#14).

Change subject: IMPALA-4669: [SECURITY] Add security library to build
..

IMPALA-4669: [SECURITY] Add security library to build

* Minor compilation fix
* Set toolchain version to include gflags 2.2.0 and glog 0.3.4-p2
* Add krb5 as a dependency
* Handle legacy versions of libkrb5.so by providing implementation of 
krb5_is_config_principal().
* Link against openssl from the toolchain if 1.0.1 or higher not found
  on build machine.

Change-Id: I4f327810066bee7f3ac107b0295480fb9ed45e14
---
M CMakeLists.txt
M LICENSE.txt
M be/CMakeLists.txt
M be/src/common/config.h.in
M be/src/kudu/security/init.cc
M be/src/kudu/security/token_verifier.cc
M bin/bootstrap_toolchain.py
M cmake_modules/FindKerberos.cmake
8 files changed, 116 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/5717/14
-- 
To view, visit http://gerrit.cloudera.org:8080/5717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4f327810066bee7f3ac107b0295480fb9ed45e14
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-4737: Prevent SIGUSR1 from killing daemons when minidumps are disabled

2017-08-10 Thread Lars Volker (Code Review)
Hello Sailesh Mukil,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7631

to look at the new patch set (#2).

Change subject: IMPALA-4737: Prevent SIGUSR1 from killing daemons when 
minidumps are disabled
..

IMPALA-4737: Prevent SIGUSR1 from killing daemons when minidumps are disabled

If a user disabled minidumps before this change, we did not register the
signal handler for SIGUSR1 at all. Sending SIGUSR1 to a daemon would
subsequently kill it.

This change registers the SIG_IGN handler to ignore the signal if
minidumps are disabled.

Change-Id: I13d866a2eec832500131954a7f693c33585ea51e
---
M be/src/util/minidump.cc
M tests/custom_cluster/test_breakpad.py
2 files changed, 33 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/31/7631/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I13d866a2eec832500131954a7f693c33585ea51e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4737: Prevent SIGUSR1 from killing daemons when minidumps are disabled

2017-08-10 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4737: Prevent SIGUSR1 from killing daemons when 
minidumps are disabled
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7631/1//COMMIT_MSG
Commit Message:

PS1, Line 13: a dummy handler
> nit: registers the SIG_IGN handler.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I13d866a2eec832500131954a7f693c33585ea51e
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4737: Prevent SIGUSR1 from killing daemons when minidumps are disabled

2017-08-10 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4737: Prevent SIGUSR1 from killing daemons when 
minidumps are disabled
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7631/1//COMMIT_MSG
Commit Message:

PS1, Line 13: a dummy handler
nit: registers the SIG_IGN handler.


-- 
To view, visit http://gerrit.cloudera.org:8080/7631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I13d866a2eec832500131954a7f693c33585ea51e
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4737: Prevent SIGUSR1 from killing daemons when minidumps are disabled

2017-08-10 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4737: Prevent SIGUSR1 from killing daemons when 
minidumps are disabled
..


Patch Set 1: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/7631
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I13d866a2eec832500131954a7f693c33585ea51e
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5352: Age out unused file handles from the cache

2017-08-10 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7640

Change subject: IMPALA-5352: Age out unused file handles from the cache
..

IMPALA-5352: Age out unused file handles from the cache

Currently, a file handle in the file handle cache will
only be evicted if the cache reaches its capacity. This
means that file handles can be retained for an indefinite
amount of time. This is true even for files that have
been deleted, replaced, or modified. Since a file handle
maintains a file descriptor for local files, this can
prevent the disk space from being freed. Additionally,
unused file handles are wasted memory.

This adds code to evict file handles that have been
unused for longer than a specified threshold. A thread
periodically checks the file handle cache to see if
any file handle should be evicted. The threshold is
specified by 'unused_file_handle_timeout_mins'; it
defaults to 6 hours.

This adds a test to custom_cluster/test_hdfs_fd_caching.py
to verify the eviction behavior.

Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
---
M be/src/runtime/disk-io-mgr-handle-cache.h
M be/src/runtime/disk-io-mgr-handle-cache.inline.h
M be/src/runtime/disk-io-mgr.cc
M tests/custom_cluster/test_hdfs_fd_caching.py
4 files changed, 138 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/7640/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7640
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-4407: Move Impala setup procedures to main repo

2017-08-10 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4407: Move Impala setup procedures to main repo
..


Patch Set 5: Code-Review+1

(1 comment)

Carry +1s from Michael B. and Lars.

http://gerrit.cloudera.org:8080/#/c/7587/4/bin/bootstrap_development.sh
File bin/bootstrap_development.sh:

PS4, Line 23:  and im
> double-word
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I728abfa806ecd9461dfb443278c2a464714d984c
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4407: Move Impala setup procedures to main repo

2017-08-10 Thread Jim Apple (Code Review)
Hello Michael Brown, Lars Volker,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7587

to look at the new patch set (#5).

Change subject: IMPALA-4407: Move Impala setup procedures to main repo
..

IMPALA-4407: Move Impala setup procedures to main repo

Before this change, Impala has relied on a chef setup in
https://github.com/awleblang/impala-setup for setting up a development
environment. This has a number of downsides:

1. It makes understanding what the script is doing difficult: there
are 40k or so lines in that repo last I checked.

2. It makes porting to new distributions difficult unless the
providers of various chef "recipes" have already ported their code.

3. It makes coordinated changes between the main repo and the
impala-setup repo more awkward.

This patch adds a shell script to replace that repo. It works on
Ubuntu 14.04 and 16.04, while impala-setup repo only works on 14.04
and the now-unmaintained Ubuntu 15.04.

Change-Id: I728abfa806ecd9461dfb443278c2a464714d984c
---
M bin/bootstrap_development.sh
1 file changed, 165 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/7587/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I728abfa806ecd9461dfb443278c2a464714d984c
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4407: Move Impala setup procedures to main repo

2017-08-10 Thread Michael Brown (Code Review)
Michael Brown has posted comments on this change.

Change subject: IMPALA-4407: Move Impala setup procedures to main repo
..


Patch Set 4: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7587/3/bin/bootstrap_development.sh
File bin/bootstrap_development.sh:

PS3, Line 145:   sudo sed -ri 's/local +all +all +peer/local al
> Changed so it doesn't run tests - I think it might be confusing otherwise t
I like your compromise.


http://gerrit.cloudera.org:8080/#/c/7587/4/bin/bootstrap_development.sh
File bin/bootstrap_development.sh:

PS4, Line 23: and and
double-word


-- 
To view, visit http://gerrit.cloudera.org:8080/7587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I728abfa806ecd9461dfb443278c2a464714d984c
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5778: clarify --read size option.

2017-08-10 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5778: clarify --read_size option.
..


Patch Set 1: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7623
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3c20a9d55f89170b11f569c90b7f2949ddbe4211
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Balazs Jeszenszky 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4407: Move Impala setup procedures to main repo

2017-08-10 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4407: Move Impala setup procedures to main repo
..


Patch Set 3:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/7587/3/bin/bootstrap_development.sh
File bin/bootstrap_development.sh:

PS3, Line 21: system
: # configurations, so it is best to run this in a fresh install. 
It also sets up the
: # .bashrc of the calling user with some environment variables
> I know that the impala-setup rep didn't do this, but do you think it would 
Done


Line 57:  maven ninja-build ntp ntpdate python-dev python-setuptools 
postgresql
> Please add the following too:
Done


PS3, Line 90: # IMPALA-3932, IMPALA-3926
: SET_LD_LIBRARY_PATH='export 
LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu/${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}'
: echo "$SET_LD_LIBRARY_PATH" >> ~/.bashrc
: eval "$SET_LD_LIBRARY_PATH"
> 1. Is this needed for Ubuntu 14? I don't do this on my personal Ubuntu 14 s
1. Fixed

2. Agreed. While we could grep .bashrc for this, that will sometimes show it as 
being already done when, in fact, changes that are below that line in the 
.bashrc override it. I have moved the output location to 
impala-config.local.sh, but I think the same applies.


PS3, Line 96: sudo -u postgres psql -c "CREATE ROLE hiveuser LOGIN PASSWORD 
'password';" postgres
> If you run this script twice, this line always fails. Could this be enhance
Done


PS3, Line 115: echo "NoHostAuthenticationForLocalhost yes" >> ~/.ssh/config
> It might be polite to have a prompt for this since it's a security change.
Added one at the top


PS3, Line 118: echo "127.0.0.1 $(hostname -s) $(hostname)" | sudo tee -a 
/etc/hosts
 : sudo sed -i 's/127.0.1.1/127.0.0.1/g' /etc/hosts
> This will keep appending to /etc/hosts if you run the script more than once
I agree. See above about .bashrc, though


PS3, Line 124: echo "* - nofile 1048576" | sudo tee -a /etc/security/limits.conf
> 1. This keeps appending to limits.conf
1. See above
2. Added #TODO. It can't just be $(whoami), because it might need that for 
postgres or root.


PS3, Line 127: export IMPALA_HOME="$(pwd)"
> What do you think about setting this in .bashrc also?
Done


PS3, Line 131: git clone https://github.com/cloudera/impala-lzo.git
 : ln -s impala-lzo Impala-lzo
 : git clone https://github.com/cloudera/hadoop-lzo.git
> This fails if you run the script twice. Could you add -d tests here similar
Done


PS3, Line 139: if [[ $VERSION = "14.04" ]]
 : then
 :   unset LD_LIBRARY_PATH
 : fi
> Oh, this sort of answers my question above. What about when the user create
Fixed


PS3, Line 145: time -p ./buildall.sh -noclean -format -testdata
> I'm torn about this, because it takes hours. Maybe consider ./buildall.sh -
Changed so it doesn't run tests - I think it might be confusing otherwise to 
users who think they have bootstrapped but then can't run a test because the 
data isn't loaded, especially because loading data is somewhat uncommon in my 
experience as a step to get a working dev environment (among other open source 
projects).this


-- 
To view, visit http://gerrit.cloudera.org:8080/7587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I728abfa806ecd9461dfb443278c2a464714d984c
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4407: Move Impala setup procedures to main repo

2017-08-10 Thread Jim Apple (Code Review)
Hello Lars Volker,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7587

to look at the new patch set (#4).

Change subject: IMPALA-4407: Move Impala setup procedures to main repo
..

IMPALA-4407: Move Impala setup procedures to main repo

Before this change, Impala has relied on a chef setup in
https://github.com/awleblang/impala-setup for setting up a development
environment. This has a number of downsides:

1. It makes understanding what the script is doing difficult: there
are 40k or so lines in that repo last I checked.

2. It makes porting to new distributions difficult unless the
providers of various chef "recipes" have already ported their code.

3. It makes coordinated changes between the main repo and the
impala-setup repo more awkward.

This patch adds a shell script to replace that repo. It works on
Ubuntu 14.04 and 16.04, while impala-setup repo only works on 14.04
and the now-unmaintained Ubuntu 15.04.

Change-Id: I728abfa806ecd9461dfb443278c2a464714d984c
---
M bin/bootstrap_development.sh
1 file changed, 165 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/7587/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7587
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I728abfa806ecd9461dfb443278c2a464714d984c
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5412 Fix scan result with partitions on same file

2017-08-10 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has uploaded a new patch set (#2).

Change subject: IMPALA-5412 Fix scan result with partitions on same file
..

IMPALA-5412 Fix scan result with partitions on same file

The maps storing file descriptors and file metadata were using filename as a 
key.
Multiple partitions pointing to the same filesystem location resulted that these
map entries were occasionally overwritted by the other partition poing to the 
same.

As a solution the map key was enhanced to contain a pair of partition ID and 
file name.

Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/scanner-context.cc
M tests/metadata/test_partition_metadata.py
7 files changed, 109 insertions(+), 34 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/25/7625/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7625
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie74b305377248045c0d87b911943e1cabb7223e9
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5768: Better developer documentation

2017-08-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5768: Better developer documentation
..


Patch Set 6: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5768: Better developer documentation

2017-08-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged.

Change subject: IMPALA-5768: Better developer documentation
..


IMPALA-5768: Better developer documentation

Guide to important environment variables for
build, impala paths and config cleanup.

Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Reviewed-on: http://gerrit.cloudera.org:8080/7350
Tested-by: Impala Public Jenkins
Reviewed-by: Tim Armstrong 
---
M README.md
M bin/impala-config.sh
2 files changed, 75 insertions(+), 25 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I16d34cb4fa0c60c5ad6d9c8764cc0ec21c5cb368
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zach Amsden