[Impala-ASF-CR] IMPALA-3981: Fixed the bug of "Crash when access statestore/catalog memz webpage"

2016-08-15 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3981: Fixed the bug of "Crash when access 
statestore/catalog memz webpage"
..


Patch Set 1:

I think we can do something a bit cleaner, after thinking about it. 

Everywhere that AddDefaultPathHandler() is called, there should be a 
MetricGroup* object that is the top-level metric group for the process. You can 
pass that into AddDefaultPathHandler() just like the process mem tracker, and 
use that to get at the JVM metrics - if they exist. 

To do that you should rewrite 

webserver->RegisterUrlCallback("/memz","memz.tmpl",
bind(, process_mem_tracker, _1, _2));

to something like

  auto callback = [metric_group, process_mem_tracker](const auto& args, auto* 
doc) {
MemUsageHandler(process_mem_tracker, metric_group, args, doc);
  }
  webserver->RegisterUrlCallback("/memz", callback);

(This is using C++11's new lambda syntax which I find much cleaner than 
boost::bind).

You should deal with the cases where a) metric_group == NULL and b) there is no 
"jvm" subgroup in the metric group.

-- 
To view, visit http://gerrit.cloudera.org:8080/3998
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Kathy Sun 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3981: Fixed the bug of "Crash when access statestore/catalog memz webpage"

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3981: Fixed the bug of "Crash when access 
statestore/catalog memz webpage"
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/3998/1/be/src/util/default-path-handlers.cc
File be/src/util/default-path-handlers.cc:

Line 112: 
No need to add the extra line.


Line 114:   if(ExecEnv::GetInstance()!= NULL){
Need extra whitespace.


  if (ExecEnv::GetInstance() != NULL) {


Line 115: 
ExecEnv::GetInstance()->metrics()->GetChildGroup("jvm")->ToJson(false, 
document, );
Long line - wrap it before 


Line 117: for(SizeType i = 0; i < jvm["metrics"].Size(); ++i){
Missing spaces here from before the change:

for (SizeType i = 0; i < jvm["metrics"].Size(); ++i) {


Line 118:   if (strstr(jvm["metrics"][i]["name"].GetString(), "total") != 
nullptr){
Missing space before {


-- 
To view, visit http://gerrit.cloudera.org:8080/3998
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Kathy Sun 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3201: reservation implementation for new buffer pool

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3201: reservation implementation for new buffer pool
..


Patch Set 3:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/3993/3/be/src/bufferpool/reservation-tracker.h
File be/src/bufferpool/reservation-tracker.h:

PS3, Line 32: e.g. bytes of memory
> is this really "e.g." and "resources" given the tight coupling with MemTrac
The coupling with MemTracker is optional: if the mem_tracker is NULL, 
everything works. There's no reason you couldn't use the code to account for 
some other resource if mem_tracker_ is NULL.

I've gone back and forth about this. I don't usually like speculative 
generality, but there's very little about the code or abstraction that's 
specific to memory reservations. I think it would be unfortunate if we have a 
need in the future to track other resource reservations and had to reinvent the 
wheel. I'm happy to change this.


Line 39: /// directly, or distributed to children of the tracker for its own 
reservation.
> are there cases where a client would use reservation directly from a non-le
If an exec node subdivides its allocation using ReservationTrackers below the 
node-level tracker. I think it naturally occurs there, since you would usually 
count some usage again the exec node's tracker.


PS3, Line 41: consume
> to avoid confusion with memtracker consume/release, how about using the ter
Done. Missed it on an earlier pass.


PS3, Line 51: consumption
> usage
Done


PS3, Line 55: optionally
> what cases do we not want to have an associated MemTracker?
I noticed the same thing about the invariant. Hopefully the new version is 
better.

I don't think it makes sense to associate the root MemTracker and 
ReservationTracker right now, since the root MemTracker usage is derived from a 
metric value rather than maintained in the usual way. Eventually if all the 
memory is allocated from the buffer pool we could change that (or get rid of 
the MemTrackers entirely).

Also for ReservationTrackers below the node-level ReservationTracker. E.g. if 
we're allocating temporary trackers to subdivide the reservation. I think it 
would be inconvenient to allocate many corresponding MemTrackers.


PS3, Line 79: InitChildTracker
> For InitRootTracker/InitChildTracker(), maybe these should be static method
Right, it's already enforced with a DCHECK.

The main reason I did it this way is to avoid having to make another heap 
allocation and decide whether it should return a raw/unique/scoped/whatever 
pointer. I don't feel strongly, I think it works ok either way, so I could 
switch it to return a unique_ptr (or something like that).


PS3, Line 83: The reservation must be completely
:   /// released before calling Close().
> This seems to contradict the previous sentence.  Oh, I guess you mean the r
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3993
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I35cc89e863efb4cc506657bfdaaaf633a10bbab6
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3201: reservation implementation for new buffer pool

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#4).

Change subject: IMPALA-3201: reservation implementation for new buffer pool
..

IMPALA-3201: reservation implementation for new buffer pool

This patch implements the reservation bookkeeping logic for the new
buffer pool.

The ReservationTrackers integrate with MemTrackers by counting
reservations as consumption against a MemTracker. This reflects that
fact that the reservations are 'hard': the memory is committed and
cannot be used for other purposes.

Includes new reservation mechanism where reservations are always
guaranteed and buffers can't be pinned without a reservation.
Reservations are tracked via a hierarchy of ReservationTrackers.

Includes basic tests for buffer pool client registration and various
reservation operations.

Change-Id: I35cc89e863efb4cc506657bfdaaaf633a10bbab6
---
M be/CMakeLists.txt
A be/src/bufferpool/CMakeLists.txt
A be/src/bufferpool/buffer-pool-test.cc
A be/src/bufferpool/buffer-pool.cc
M be/src/bufferpool/buffer-pool.h
A be/src/bufferpool/reservation-tracker.cc
A be/src/bufferpool/reservation-tracker.h
M be/src/runtime/mem-tracker.h
M be/src/util/uid-util.h
9 files changed, 1,058 insertions(+), 44 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/3993/4
-- 
To view, visit http://gerrit.cloudera.org:8080/3993
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I35cc89e863efb4cc506657bfdaaaf633a10bbab6
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-1619, IMPALA-3018: Address various small memory allocation related bugs

2016-08-15 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-1619, IMPALA-3018: Address various small memory 
allocation related bugs
..


Patch Set 2: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/3807
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6eb9a4472a65cf68edb0323b13d745277ead2e1d
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-1619, IMPALA-3018: Address various small memory allocation related bugs

2016-08-15 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-1619, IMPALA-3018: Address various small memory 
allocation related bugs
..


IMPALA-1619, IMPALA-3018: Address various small memory allocation related bugs

This patch addresses a potential overflow in calculation FreePool::Rellocate()
and its handling of zero-length allocations. This patch also adds code to
gracefully handle malloc() failures when initializing/resizing hash tables.

Change-Id: I6eb9a4472a65cf68edb0323b13d745277ead2e1d
Reviewed-on: http://gerrit.cloudera.org:8080/3807
Reviewed-by: Tim Armstrong 
Reviewed-by: Dan Hecht 
Tested-by: Internal Jenkins
---
M be/src/exec/hash-table.cc
M be/src/runtime/free-pool-test.cc
M be/src/runtime/free-pool.h
3 files changed, 16 insertions(+), 3 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Dan Hecht: Looks good to me, approved
  Tim Armstrong: Looks good to me, but someone else must approve



-- 
To view, visit http://gerrit.cloudera.org:8080/3807
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I6eb9a4472a65cf68edb0323b13d745277ead2e1d
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3470: DecompressorTest is flaky.

2016-08-15 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3470: DecompressorTest is flaky.
..


IMPALA-3470: DecompressorTest is flaky.

Make sure the random number is greater than 0 and stream is indeed truncated.

Change-Id: I7ebaa403abf45e31f38d6cf4e557d6274d877a8a
Reviewed-on: http://gerrit.cloudera.org:8080/3954
Reviewed-by: Juan Yu 
Tested-by: Internal Jenkins
---
M be/src/util/decompress-test.cc
1 file changed, 2 insertions(+), 1 deletion(-)

Approvals:
  Juan Yu: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/3954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I7ebaa403abf45e31f38d6cf4e557d6274d877a8a
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Juan Yu 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3470: DecompressorTest is flaky.

2016-08-15 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3470: DecompressorTest is flaky.
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/3954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7ebaa403abf45e31f38d6cf4e557d6274d877a8a
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Juan Yu 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-CR](cdh5-trunk) IMPALA-2809: Improve ByteSwap with builtin function or SSSE3 or AVX2.

2016-08-15 Thread Youwei Wang (Code Review)
Hello Jim Apple, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3081

to look at the new patch set (#40).

Change subject: IMPALA-2809: Improve ByteSwap with builtin function or SSSE3 or 
AVX2.
..

IMPALA-2809: Improve ByteSwap with builtin function or SSSE3 or AVX2.

Using SSSE3/AVX2 intrinsic to accelerate the function
"static inline void ByteSwap(void* dst, const void* src, int len)" of BitUtil 
class,
and a scalar byte-swap routine is added as fallback.
Also the runtime selector for CPUs of different capacity is included,
as well as performance test and data verification.
Brief performance comparison is listed here:
CPU: Intel(R) Core(TM) i5-4460  CPU@3.20GHz
Result:
I0725 20:47:02.402506  2078 bswap-benchmark.cc:117] Machine Info: Intel(R) 
Core(TM) i5-4460  CPU @ 3.20GHz
ByteSwap benchmark:Function  iters/ms   10%ile   50%ile   90%ile 
10%ile 50%ile 90%ile
 
(relative) (relative) (relative)
-
 FastScalar675  725  731
 1X 1X 1X
  SSSE3   6.12e+03  6.2e+03 6.23e+03  
9.06X  8.55X  8.53X
   AVX2   1.87e+04 1.88e+04 1.89e+04  
27.7X  25.9X  25.9X
   SIMD   1.82e+04 1.88e+04 1.89e+04
27X  25.9X  25.9X

Change-Id: I392ed5a8d5683f30f161282c228c1aedd7b648c1
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/bswap-benchmark.cc
M be/src/exprs/string-functions-ir.cc
M be/src/util/bit-util-test.cc
M be/src/util/bit-util.cc
M be/src/util/bit-util.h
6 files changed, 378 insertions(+), 53 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/81/3081/40
-- 
To view, visit http://gerrit.cloudera.org:8080/3081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I392ed5a8d5683f30f161282c228c1aedd7b648c1
Gerrit-PatchSet: 40
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Youwei Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Youwei Wang 


[Impala-CR](cdh5-trunk) IMPALA-2809: Improve ByteSwap with builtin function or SSSE3 or AVX2.

2016-08-15 Thread Youwei Wang (Code Review)
Hello Jim Apple, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3081

to look at the new patch set (#39).

Change subject: IMPALA-2809: Improve ByteSwap with builtin function or SSSE3 or 
AVX2.
..

IMPALA-2809: Improve ByteSwap with builtin function or SSSE3 or AVX2.

Using SSSE3/AVX2 intrinsic to accelerate the function
"static inline void ByteSwap(void* dst, const void* src, int len)" of BitUtil 
class,
and a scalar byte-swap routine is added as fallback.
Also the runtime selector for CPUs of different capacity is included,
as well as performance test and data verification.
Brief performance comparison is listed here:
CPU: Intel(R) Core(TM) i5-4460  CPU@3.20GHz
Result:
I0725 20:47:02.402506  2078 bswap-benchmark.cc:117] Machine Info: Intel(R) 
Core(TM) i5-4460  CPU @ 3.20GHz
ByteSwap benchmark:Function  iters/ms   10%ile   50%ile   90%ile 
10%ile 50%ile 90%ile
 
(relative) (relative) (relative)
-
 FastScalar675  725  731
 1X 1X 1X
  SSSE3   6.12e+03  6.2e+03 6.23e+03  
9.06X  8.55X  8.53X
   AVX2   1.87e+04 1.88e+04 1.89e+04  
27.7X  25.9X  25.9X
   SIMD   1.82e+04 1.88e+04 1.89e+04
27X  25.9X  25.9X

Change-Id: I392ed5a8d5683f30f161282c228c1aedd7b648c1
---
M be/src/benchmarks/CMakeLists.txt
A be/src/benchmarks/bswap-benchmark.cc
M be/src/exprs/string-functions-ir.cc
M be/src/util/bit-util-test.cc
M be/src/util/bit-util.cc
M be/src/util/bit-util.h
6 files changed, 378 insertions(+), 53 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/81/3081/39
-- 
To view, visit http://gerrit.cloudera.org:8080/3081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I392ed5a8d5683f30f161282c228c1aedd7b648c1
Gerrit-PatchSet: 39
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Youwei Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Youwei Wang 


[Impala-CR](cdh5-trunk) IMPALA-2809: Improve ByteSwap with builtin function or SSSE3 or AVX2.

2016-08-15 Thread Youwei Wang (Code Review)
Youwei Wang has posted comments on this change.

Change subject: IMPALA-2809: Improve ByteSwap with builtin function or SSSE3 or 
AVX2.
..


Patch Set 38:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/3081/38/be/src/util/bit-util-test.cc
File be/src/util/bit-util-test.cc:

Line 99: void TestByteswapSimdUnit(const int64_t CpuFlag) {
> name formatting
Done


Line 192:   if (LIKELY(CpuInfo::IsSupported(CpuInfo::SSSE3))) {
> remove likely, this is not performance-critical
Done


Line 203: TestByteswapSimdUnit(CpuInfo::SSSE3);
> we already did this above
Done


Line 211:   for (int i = 0; i <= 32; ++i) {
> single line
Done


http://gerrit.cloudera.org:8080/#/c/3081/38/be/src/util/bit-util.cc
File be/src/util/bit-util.cc:

Line 19:   // ByteSwapScalarLoop is only used in bit-util.cc, so put it in this 
anonymous
> make it static instead? or get remove the extra level of indentation.
Done


Line 25: for (int i = 0; i < len; ++i) {
> single line
Done


Line 135: // This constant is concluded from the definition of _mm_set_epi8;
> included?
Hi Marcel. Since this constant is figured out by the definition of 
_mm_set_epi8, so I guess using the word "conclude" here may be a proper choice. 
Thank you.


Line 144: _mm_loadu_si128(reinterpret_cast(src)), 
mask128i));
> indent 4 spaces (and elsewhere)
Done


Line 149: inline void SimdByteSwap::ByteSwapAVX2_Unit(uint8_t* dst, const 
uint8_t* src) {
> should this be called ByteSwap256 or something like that? i had no idea wha
Done


http://gerrit.cloudera.org:8080/#/c/3081/38/be/src/util/bit-util.h
File be/src/util/bit-util.h:

Line 244: // Declaration for the namespace of SIMD byteswap utilities
> we don't use that elsewhere. make it a class with static functions instead.
Done


Line 246:   // Extra type declarations
> i don't understand this comment
Done


Line 249:   // Declaration for scalar utility functions
> don't prefix the comment for a declaration with 'Declaration for'. also, th
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I392ed5a8d5683f30f161282c228c1aedd7b648c1
Gerrit-PatchSet: 38
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Youwei Wang 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Youwei Wang 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3981: Fixed the bug of "Crash when access statestore/catalog memz webpage"

2016-08-15 Thread Kathy Sun (Code Review)
Kathy Sun has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/3998

Change subject: IMPALA-3981: Fixed the bug of "Crash when access 
statestore/catalog memz webpage"
..

IMPALA-3981: Fixed the bug of "Crash when access statestore/catalog
memz webpage"

Add ExecEnv instance not NULL check

Testing: Ran locally and looked at impalad/memz, statestored/memz
and catalogd/memz

Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29
---
M be/src/util/default-path-handlers.cc
M www/memz.tmpl
2 files changed, 11 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/3998/1
-- 
To view, visit http://gerrit.cloudera.org:8080/3998
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: If59b10f20044d1a468f27810a3029fe18fb19f29
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Kathy Sun 


Re: Error encounted when doing 'bin/load_data.py -w all'

2016-08-15 Thread Amos Bird
Hi Alex,
Thanks for the reply! I have indeed get them all up and running. Here is
the jps list:

13184
12833 LlamaAMMain
14081 HQuorumPeer
40870 Jps
11302 DataNode
15081 RunJar
15722 RunJar
13259
11980 NodeManager
11244 DataNode
14700 HRegionServer
11278 NameNode
14353 HRegionServer
11218 DataNode
11923 NodeManager
11955 ResourceManager
12982 Bootstrap
14166 HMaster
11896 NodeManager
16089 RunJar
14527 HRegionServer

I also played around a few SQLs inside impala. It works fine. In fact I
was able to create the test data before. I have no clue what happened in
the newest impala build.


Alex Behm writes:

> Hi Amos!
>
> load-data.py assumes that you have a running cluster. You need to first get
> these working:
> testdata/bin/run-all.sh
> bin/start-impala-cluster.py
>
> The first command starts all dependent services like HDFS, YARN, Hive
> Metastore, Hive HS2, etc.
> The second command starts an Impala mini-cluster with 3 nodes. This command
> assumes all dependent services are already running/
>
> Hope it helps!
>
> Alex
>
> On Mon, Aug 15, 2016 at 5:20 AM, Amos Bird  wrote:
>
>>
>> I was trying to build a new test warehouse. After successfully running
>> 'bin/create_testdata.sh', I did 'bin/load_data.py -w all'. Unfortunately it
>> ended up with this:
>>
>> ERROR : Job Submission failed with exception 'java.io.IOException(java.
>> util.concurrent.ExecutionException: java.io.IOException: Cannot create an
>> instance of InputFormat class org.apache.hadoop.mapred.TextInputFormat as
>> specified in mapredWork!)'
>> java.io.IOException: java.util.concurrent.ExecutionException:
>> java.io.IOException: Cannot create an instance of InputFormat class
>> org.apache.hadoop.mapred.TextInputFormat as specified in mapredWork!
>>   at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(
>> CombineHiveInputFormat.java:544)
>>   at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(
>> JobSubmitter.java:332)
>>   at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(
>> JobSubmitter.java:324)
>>   at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(
>> JobSubmitter.java:200)
>>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1307)
>>   at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1304)
>>   at java.security.AccessController.doPrivileged(Native Method)
>>   at javax.security.auth.Subject.doAs(Subject.java:422)
>>   at org.apache.hadoop.security.UserGroupInformation.doAs(
>> UserGroupInformation.java:1693)
>>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1304)
>>   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:578)
>>   at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:573)
>>   at java.security.AccessController.doPrivileged(Native Method)
>>   at javax.security.auth.Subject.doAs(Subject.java:422)
>>   at org.apache.hadoop.security.UserGroupInformation.doAs(
>> UserGroupInformation.java:1693)
>>   at org.apache.hadoop.mapred.JobClient.submitJobInternal(
>> JobClient.java:573)
>>   at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:564)
>>   at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(
>> ExecDriver.java:430)
>>   at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(
>> MapRedTask.java:137)
>>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
>>   at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(
>> TaskRunner.java:100)
>>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1782)
>>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1539)
>>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1318)
>>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1127)
>>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1120)
>>   at org.apache.hive.service.cli.operation.SQLOperation.
>> runQuery(SQLOperation.java:191)
>>   at org.apache.hive.service.cli.operation.SQLOperation.access$
>> 100(SQLOperation.java:79)
>>   at org.apache.hive.service.cli.operation.SQLOperation$2$1.
>> run(SQLOperation.java:245)
>>   at java.security.AccessController.doPrivileged(Native Method)
>>   at javax.security.auth.Subject.doAs(Subject.java:422)
>>   at org.apache.hadoop.security.UserGroupInformation.doAs(
>> UserGroupInformation.java:1693)
>>   at org.apache.hive.service.cli.operation.SQLOperation$2.run(
>> SQLOperation.java:258)
>>   at java.util.concurrent.Executors$RunnableAdapter.
>> call(Executors.java:511)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(
>> ThreadPoolExecutor.java:1142)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(
>> ThreadPoolExecutor.java:617)
>>   at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.util.concurrent.ExecutionException: java.io.IOException:
>> Cannot create an instance of InputFormat class 
>> org.apache.hadoop.mapred.TextInputFormat
>> as specified in mapredWork!
>>   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>>   at 

[Impala-ASF-CR] IMPALA-3201: buffer pool header only

2016-08-15 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3201: buffer pool header only
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/3992
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id771dea2eb4c1aa13c30d59e8b184a7d1bca8d34
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3201: buffer pool header only

2016-08-15 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3201: buffer pool header only
..


IMPALA-3201: buffer pool header only

Implementation will follow in a subsequent commit.

The BufferPool interface and reservation bookkeeping are documented.

Includes documentation for some guarantees of new reservation mechanism
where reservations are always guaranteed and pages can't be pinned and
buffers can't be allocated without without a reservation. Reservations
are tracked via a hierarchy of ReservationTrackers.

Change-Id: Id771dea2eb4c1aa13c30d59e8b184a7d1bca8d34
Reviewed-on: http://gerrit.cloudera.org:8080/3992
Reviewed-by: Tim Armstrong 
Tested-by: Internal Jenkins
---
A be/src/bufferpool/buffer-pool.h
1 file changed, 318 insertions(+), 0 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/3992
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Id771dea2eb4c1aa13c30d59e8b184a7d1bca8d34
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3201: reservation implementation for new buffer pool

2016-08-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3201: reservation implementation for new buffer pool
..


Patch Set 3:

(7 comments)

Some initial comments. I think if you clean up some of the terminology it will 
be easier to follow and do the rest of the review.

http://gerrit.cloudera.org:8080/#/c/3993/3/be/src/bufferpool/reservation-tracker.h
File be/src/bufferpool/reservation-tracker.h:

PS3, Line 32: e.g. bytes of memory
is this really "e.g." and "resources" given the tight coupling with MemTracker? 
 Maybe we should just say "memory reservations" to be clearer about what this 
is.


Line 39: /// directly, or distributed to children of the tracker for its own 
reservation.
are there cases where a client would use reservation directly from a non-leaf 
node in the hierarchy?


PS3, Line 41: consume
to avoid confusion with memtracker consume/release, how about using the term 
"use" (to go along with IncreaseUsage()/DecreaseUsage())?


PS3, Line 51: consumption
usage


PS3, Line 55: optionally
what cases do we not want to have an associated MemTracker?
and it seems the last two can be specified as a more precise single invariant 
(though I'm not sure since I don't quite follow the last one).


PS3, Line 79: InitChildTracker
For InitRootTracker/InitChildTracker(), maybe these should be static methods 
that also construct the ReservationTracker() and then make the constructor 
private.  That way it's clear that each reservation tracker should be 
initialized to exactly one of these "personalities" (I assume that's the case).


PS3, Line 83: The reservation must be completely
:   /// released before calling Close().
This seems to contradict the previous sentence.  Oh, I guess you mean the 
reservation usage must be returned before calling Close()?


-- 
To view, visit http://gerrit.cloudera.org:8080/3993
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I35cc89e863efb4cc506657bfdaaaf633a10bbab6
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Remove dead and untested code

2016-08-15 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: Remove dead and untested code
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/3989
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I49c27cbfef03ef97befa9a607b3d8d7ac6e22a43
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Remove dead and untested code

2016-08-15 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: Remove dead and untested code
..


Remove dead and untested code

Remove some code that the code coverage build revealed was untested.
It is all dead, aside from SetErrorMsg(), which only had one callsite.

Change-Id: I49c27cbfef03ef97befa9a607b3d8d7ac6e22a43
Reviewed-on: http://gerrit.cloudera.org:8080/3989
Reviewed-by: Dan Hecht 
Tested-by: Internal Jenkins
---
M be/src/common/status.h
M be/src/exprs/scalar-fn-call.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/service/query-exec-state.h
8 files changed, 2 insertions(+), 220 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Dan Hecht: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/3989
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I49c27cbfef03ef97befa9a607b3d8d7ac6e22a43
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3962: Clean up scratch tuple batch on scan failures

2016-08-15 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-3962: Clean up scratch tuple batch on scan failures
..


Patch Set 1:

To answer the question of why the stress test hit this: the query was cancelled 
(I noticed that while debugging it last week but couldn't quite complete the 
whole picture). The cancellation was noticed eventually by 
HdfsParquetScanner::AssembleCollection() which eventually bubbles 
'continue_execution == false' up the stack to the loop in AssembleRows(), 
causing this bug. So, it may indeed be a good idea to add some debug actions 
there for debug builds.

-- 
To view, visit http://gerrit.cloudera.org:8080/3991
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3952: Clear scratch batch mem pool if Open() failed.

2016-08-15 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3952: Clear scratch batch mem pool if Open() failed.
..


Patch Set 3: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/89/

-- 
To view, visit http://gerrit.cloudera.org:8080/3953
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I55f32ed698a5b6fed8c28af1391aa07e1560e782
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3962: Clean up scratch tuple batch on scan failures

2016-08-15 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-3962: Clean up scratch tuple batch on scan failures
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/3991/1/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 343:   context_->ClearStreams();
> why is this under !advance_row_group_?  If we did this here, then it seems 
Yes. I am not 100% sure it's side-effect free (e.g. are these calls idempotent 
?). Let me think more about this.


Line 373:   if (context_->cancelled()) return Status::CANCELLED;
> why is this needed? won't the next GetNextInternal() notice this (inside Co
In case of parse error in AssembleRows(), we may not have called CommitRows().  
However, it may be worth some more rethinking of how the code should be 
structured so we can avoid this redundant check here.


Line 513: *skip_row_group = true;
> doesn't this path have the same bug?
Yes, missed that after rebase.


-- 
To view, visit http://gerrit.cloudera.org:8080/3991
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3964: Fix crash when a count(*) is performed on a nested collection.

2016-08-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3964: Fix crash when a count(*) is performed on a nested 
collection.
..


Patch Set 4: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/3940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id0955c85d18dfba4bd29a35ec95d0355da050607
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Christopher Channing 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Christopher Channing 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3952: Clear scratch batch mem pool if Open() failed.

2016-08-15 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-3952: Clear scratch batch mem pool if Open() failed.
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/3953
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I55f32ed698a5b6fed8c28af1391aa07e1560e782
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3953: Fixes for KuduScanNode BE test failure

2016-08-15 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3953: Fixes for KuduScanNode BE test failure
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/3888
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I16be206c60a692d2a26d719de8cc45e859b06e97
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3953: Fixes for KuduScanNode BE test failure

2016-08-15 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3953: Fixes for KuduScanNode BE test failure
..


IMPALA-3953: Fixes for KuduScanNode BE test failure

After a previous fix for IMPALA-3857, KuduScanNodeTest
TestLimitsAreEnforced (BE test) occasionally throws when a
scanner thread takes a lock_ that isn't valid, crashing the
process.

It looks like the issue is likely that TestScanEmptyString
isn't closing its KuduScanNode, and a lingering
ScannerThread may end up touching invalid memory later.

This fixes the test case and also:
1) Adds some missing synchronization in KuduScanNode which
   was found in the process of investigating this issue.
2) Adds a DCHECK on ~KuduScanNode() to ensure it was closed.

This was tested by running KuduScanNodeTest in a loop for 5
hours. Without the fix, the failure was produced within
several hours.

Change-Id: I16be206c60a692d2a26d719de8cc45e859b06e97
Reviewed-on: http://gerrit.cloudera.org:8080/3888
Reviewed-by: Tim Armstrong 
Reviewed-by: Dan Hecht 
Tested-by: Internal Jenkins
---
M be/src/exec/kudu-scan-node-test.cc
M be/src/exec/kudu-scan-node.cc
2 files changed, 4 insertions(+), 0 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Dan Hecht: Looks good to me, approved
  Tim Armstrong: Looks good to me, but someone else must approve



-- 
To view, visit http://gerrit.cloudera.org:8080/3888
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I16be206c60a692d2a26d719de8cc45e859b06e97
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in flight.

2016-08-15 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in 
flight.
..


Patch Set 4: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/3832
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4be4fad8e6f2303db19ea1e2bd0f13523781ae8e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in flight.

2016-08-15 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in 
flight.
..


IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in flight.

Fix multiple bugs that could occur when a block was unpinned then pinned
again while the write was in flight. There were two problems:

1. A block's buffer could be transferred while a write is in flight,
  leaving the block in an invalid state. The fix is to wait for
  the in-flight write to complete before transferring the buffer.
2. On certain code paths in WriteComplete(), condition variables weren't
  signalled, leading to threads waiting for write completion not being
  woken up. The fix is to clarify when condition variables will be
  signalled and ensure that the appropriate condition variables are
  always signalled when the write completes.

Testing:
Added a targeted unit test that exercises these code paths using a
debug option that controls timing of writes.

Reran the stress test configuration that reproducibly triggered the
bug: TPC-H query 18 on a release build with a single impalad.
It succeeded.

Change-Id: I4be4fad8e6f2303db19ea1e2bd0f13523781ae8e
Reviewed-on: http://gerrit.cloudera.org:8080/3832
Reviewed-by: Tim Armstrong 
Tested-by: Internal Jenkins
---
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-block-mgr.cc
M be/src/runtime/buffered-block-mgr.h
3 files changed, 121 insertions(+), 27 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/3832
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I4be4fad8e6f2303db19ea1e2bd0f13523781ae8e
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose

2016-08-15 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-3893, IMPALA-3901: impala-shell prints incorrect 
coordinator address, overly verbose
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3994/2/shell/impala_shell.py
File shell/impala_shell.py:

Line 1112: return self._execute_stmt(query, print_web_link=True)
I've added print_web_link=True here as I UPDATE and DELETE statements are 
executed through here and don't have a separate "do_" function.


-- 
To view, visit http://gerrit.cloudera.org:8080/3994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose

2016-08-15 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded a new patch set (#2).

Change subject: IMPALA-3893, IMPALA-3901: impala-shell prints incorrect 
coordinator address, overly verbose
..

IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, 
overly verbose

The webserver address was always configured as 0.0.0.0 (meaning that
the webserver could be reached on any IP for that machine) unless
otherwise specified. This is not a correct value to dispay to the
user. This patch returns the hostname of the node, when requested,
if the webserver host address is 0.0.0.0.

This patch also does not print the coordinator link for very simple
queries, as it's not necessary and is unnecessarily verbose.

This patch also does away with pinging the impalad an extra time per
query for finding the host time and webserver address. It instead
remembers the webserver address at connect time and displays client
local time for every query instead.

Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b
---
M be/src/service/impala-beeswax-server.cc
M be/src/util/webserver.cc
M common/thrift/ImpalaService.thrift
M shell/impala_client.py
M shell/impala_shell.py
5 files changed, 30 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/3994/2
-- 
To view, visit http://gerrit.cloudera.org:8080/3994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, overly verbose

2016-08-15 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/3994

Change subject: IMPALA-3893, IMPALA-3901: impala-shell prints incorrect 
coordinator address, overly verbose
..

IMPALA-3893, IMPALA-3901: impala-shell prints incorrect coordinator address, 
overly verbose

The webserver address was always configured as 0.0.0.0 (meaning that
the webserver could be reached on any IP for that machine) unless
otherwise specified. This is not a correct value to dispay to the
user. This patch returns the hostname of the node, when requested,
if the webserver host address is 0.0.0.0.

This patch also does not print the coordinator link for very simple
queries, as it's not necessary and is unnecessarily verbose.

This patch also does away with pinging the impalad an extra time per
query for finding the host time and webserver address. It instead
remembers the webserver address at connect time and displays client
local time for every query instead.

Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b
---
M be/src/service/impala-beeswax-server.cc
M be/src/util/webserver.cc
M common/thrift/ImpalaService.thrift
M shell/impala_client.py
M shell/impala_shell.py
5 files changed, 27 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/3994/1
-- 
To view, visit http://gerrit.cloudera.org:8080/3994
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I9d167b66f2dd8629e40a7094d21ea7ce6b43d23b
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-3201: reservation implementation for new buffer pool

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3201: reservation implementation for new buffer pool
..


Patch Set 3:

I cut back this patch so that it doesn't actually include the bulk of the 
buffer pool implementation, just client registration and the ReservationTracker 
stuff. I'm going to rework the buffer pool implementation separately.

All the unit tests pass.

-- 
To view, visit http://gerrit.cloudera.org:8080/3993
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I35cc89e863efb4cc506657bfdaaaf633a10bbab6
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3201: reservation implementation for new buffer pool

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#3).

Change subject: IMPALA-3201: reservation implementation for new buffer pool
..

IMPALA-3201: reservation implementation for new buffer pool

This patch implements the reservation bookkeeping logic for the new
buffer pool.

The ReservationTrackers integrate with MemTrackers by counting
reservations as consumption against a MemTracker. This reflects that
fact that the reservations are 'hard': the memory is committed and
cannot be used for other purposes.

Includes new reservation mechanism where reservations are always
guaranteed and buffers can't be pinned without a reservation.
Reservations are tracked via a hierarchy of ReservationTrackers.

Includes basic tests for buffer pool client registration and various
reservation operations.

Change-Id: I35cc89e863efb4cc506657bfdaaaf633a10bbab6
---
M be/CMakeLists.txt
A be/src/bufferpool/CMakeLists.txt
A be/src/bufferpool/buffer-pool-test.cc
A be/src/bufferpool/buffer-pool.cc
M be/src/bufferpool/buffer-pool.h
A be/src/bufferpool/reservation-tracker.cc
A be/src/bufferpool/reservation-tracker.h
M be/src/runtime/mem-tracker.h
M be/src/util/uid-util.h
9 files changed, 1,058 insertions(+), 44 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/3993/3
-- 
To view, visit http://gerrit.cloudera.org:8080/3993
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I35cc89e863efb4cc506657bfdaaaf633a10bbab6
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3962: Clean up scratch tuple batch on scan failures

2016-08-15 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-3962: Clean up scratch tuple batch on scan failures
..


Patch Set 1:

We probably need to create a malformed table with bad data in some columns.

-- 
To view, visit http://gerrit.cloudera.org:8080/3991
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3964: Fix crash when a count(*) is performed on a nested collection.

2016-08-15 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-3964: Fix crash when a count(*) is performed on a nested 
collection.
..


Patch Set 4: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/3940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id0955c85d18dfba4bd29a35ec95d0355da050607
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Christopher Channing 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Christopher Channing 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-CR](cdh5-trunk) IMPALA-3629: Codegen TransferScratchTuples() in hdfs-parquet-scanner

2016-08-15 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3629: Codegen TransferScratchTuples() in 
hdfs-parquet-scanner
..


Patch Set 8:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/3774/8/be/src/exec/hdfs-avro-scanner.cc
File be/src/exec/hdfs-avro-scanner.cc:

PS8, Line 1022:   Function* fn =
  :   codegen->GetFunction(IRFunction::DECODE_AVRO_DATA, true);
1 line


PS8, Line 1025:   int replaced = codegen->ReplaceCallSites(fn, 
materialize_tuple_fn,
  :   "MaterializeTuple");
1 line


PS8, Line 1033:   replaced = codegen->ReplaceCallSites(fn, eval_conjuncts_fn,
  :   "EvalConjuncts");
1 line


http://gerrit.cloudera.org:8080/#/c/3774/8/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

PS8, Line 582: (HdfsScanNode* node,
 : const vector& conjunct_ctxs,
 : llvm::Function** process_scratch_batch_fn
can some of these lines be combined?


PS8, Line 592: 
 :   llvm::Function* fn =
 :   codegen->GetFunction(IRFunction::PROCESS_SCRATCH_BATCH, 
true);
1l


PS8, Line 601: 
 :   int replaced = codegen->ReplaceCallSites(fn, eval_conjuncts_fn,
 :   "EvalConjuncts");
1


-- 
To view, visit http://gerrit.cloudera.org:8080/3774
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic327e437c7cd2b3f92cdb11c1e907bfee2d44ee8
Gerrit-PatchSet: 8
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3201: reservation implementation for new buffer pool

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2).

Change subject: IMPALA-3201: reservation implementation for new buffer pool
..

IMPALA-3201: reservation implementation for new buffer pool

This patch implements the reservation bookkeeping logic for the new
buffer pool.

The ReservationTrackers integrate with MemTrackers by counting
reservations as consumption against a MemTracker. This reflects that
fact that the reservations are 'hard': the memory is committed and
cannot be used for other purposes.

Includes new reservation mechanism where reservations are always
guaranteed and buffers can't be pinned without a reservation.
Reservations are tracked via a hierarchy of ReservationTrackers.

Includes basic tests for buffer pool client registration and various
reservation operations.

Change-Id: I35cc89e863efb4cc506657bfdaaaf633a10bbab6
---
M be/CMakeLists.txt
A be/src/bufferpool/CMakeLists.txt
A be/src/bufferpool/buffer-pool-test.cc
A be/src/bufferpool/buffer-pool.cc
M be/src/bufferpool/buffer-pool.h
A be/src/bufferpool/reservation-tracker.cc
A be/src/bufferpool/reservation-tracker.h
M be/src/runtime/mem-tracker.h
8 files changed, 1,054 insertions(+), 44 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/3993/2
-- 
To view, visit http://gerrit.cloudera.org:8080/3993
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I35cc89e863efb4cc506657bfdaaaf633a10bbab6
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 


[Impala-CR](cdh5-trunk) IMPALA-3201: headers and reservation logic for new buffer pool

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3201: headers and reservation logic for new buffer pool
..


Patch Set 23:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/2569/23/be/src/bufferpool/buffer-pool.h
File be/src/bufferpool/buffer-pool.h:

PS23, Line 107: duplicating it
> Isn't this made moot now that DuplicateHandle() is removed (and also clarif
Yeah I just flushed out my old comments.


-- 
To view, visit http://gerrit.cloudera.org:8080/2569
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I35cc89e863efb4cc506657bfdaaaf633a10bbab6
Gerrit-PatchSet: 23
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3856,IMPALA-3871: Fix BinaryPredicate normalization for Kudu

2016-08-15 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has uploaded a new patch set (#2).

Change subject: IMPALA-3856,IMPALA-3871: Fix BinaryPredicate normalization for 
Kudu
..

IMPALA-3856,IMPALA-3871: Fix BinaryPredicate normalization for Kudu

Change-Id: Iae7612433a2e27f8887abe6624f9ee0f4867b934
---
M fe/src/main/java/com/cloudera/impala/analysis/BinaryPredicate.java
M fe/src/main/java/com/cloudera/impala/analysis/Expr.java
M fe/src/main/java/com/cloudera/impala/analysis/LiteralExpr.java
A testdata/workloads/functional-query/queries/QueryTest/kudu-expr.test
M tests/query_test/test_kudu.py
5 files changed, 84 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/3986/2
-- 
To view, visit http://gerrit.cloudera.org:8080/3986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iae7612433a2e27f8887abe6624f9ee0f4867b934
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 


Code formatting with clang-format

2016-08-15 Thread Jim Apple
I would like to have a clang-format config file in our directory to help
new contributors understand how to format code and have a tool to do it for
them. Through the time I've been sending patches I've been accumulating a
.clang-format file that seems to minimize the style comments I get. You can
see it here:

https://gerrit.cloudera.org/#/c/3886

And you can save it and upload it to play with here:

http://zed0.co.uk/clang-format-configurator/

I would love to hear your thoughts.


[Impala-CR](cdh5-trunk) IMPALA-3201: headers and reservation logic for new buffer pool

2016-08-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3201: headers and reservation logic for new buffer pool
..


Patch Set 23:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/2569/23/be/src/bufferpool/buffer-pool.h
File be/src/bufferpool/buffer-pool.h:

PS23, Line 107: duplicating it
> I think it's implied by the DuplicateHandle() comment stating that the new 
Isn't this made moot now that DuplicateHandle() is removed (and also clarified 
given that PageHandle's can't be transferred)?


-- 
To view, visit http://gerrit.cloudera.org:8080/2569
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I35cc89e863efb4cc506657bfdaaaf633a10bbab6
Gerrit-PatchSet: 23
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Add .clang-format for Impala's C++ style

2016-08-15 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: Add .clang-format for Impala's C++ style
..


Patch Set 1:

> I ran this over a few patches. Some small things I noticed:
 > 
 > 1. The arg-breaking behaviour is different: the formatter prefers
 > to put all arguments on one line even if that means adding a
 > newline before the first one.
 
BinPackParameters can change this, along with 
AllowAllParametersOfDeclarationOnNextLine. If both are false, instead of all 
parameters being on the next line, every parameter will get its own line. That 
doesn't seem closer to our style, though. There may not be a closer 
clang-format config for this.

 > 2. for (auto& foo: bar) -> for (auto& foo : bar). I don't care
 > about this one, all the same to me.

I don't see a way to configure this one, unfortunately.

 > 3. Indentation on constructor member initialisation lists is by two
 > spaces, not four (i.e. line up with the ':'). Again, I don't care
 > about that.

This seems to match, for instance, analytic-eval-node.cc -- the colon is two 
space from the left margin (i.e. starts in column 3), and the member variables 
start in column 5.

This is what I get with this config with 
http://zed0.co.uk/clang-format-configurator/

I think I must be misunderstanding.

 > I don't see any problems with these small changes, but you might
 > want to socialise this a bit on impala-dev@ before committing the
 > formatter. It would be excellent for this to be as close to our
 > canonical style as possible, even if that means changing our
 > canonical style a bit.

OK, I'll email dev@.

-- 
To view, visit http://gerrit.cloudera.org:8080/3886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-CR](cdh5-trunk) IMPALA-3201: headers and reservation logic for new buffer pool

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has abandoned this change.

Change subject: IMPALA-3201: headers and reservation logic for new buffer pool
..


Abandoned

Moved to https://gerrit.cloudera.org/#/c/3993/

-- 
To view, visit http://gerrit.cloudera.org:8080/2569
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: I35cc89e863efb4cc506657bfdaaaf633a10bbab6
Gerrit-PatchSet: 24
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3201: headers and reservation logic for new buffer pool

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3201: headers and reservation logic for new buffer pool
..


Patch Set 1:

See https://gerrit.cloudera.org/#/c/2569/ for history. 

This isn't ready for review, I'm going to work on rebasing onto the header 
changes.

-- 
To view, visit http://gerrit.cloudera.org:8080/3993
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I35cc89e863efb4cc506657bfdaaaf633a10bbab6
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3470: DecompressorTest is flaky.

2016-08-15 Thread Juan Yu (Code Review)
Juan Yu has posted comments on this change.

Change subject: IMPALA-3470: DecompressorTest is flaky.
..


Patch Set 3: Code-Review+2

Rebase, Carry Dan's +2

-- 
To view, visit http://gerrit.cloudera.org:8080/3954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7ebaa403abf45e31f38d6cf4e557d6274d877a8a
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Juan Yu 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3201: headers and reservation logic for new buffer pool

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/3993

Change subject: IMPALA-3201: headers and reservation logic for new buffer pool
..

IMPALA-3201: headers and reservation logic for new buffer pool

This patch implements the core interface and reservation bookkeeping
logic for the new buffer pool. It is far from fully functional
but is capable of tracking reservations and allocating buffers based
on those reservations.

The ReservationTrackers integrate with MemTrackers by counting
reservations as consumption against a MemTracker. This reflects that
fact that the reservations are 'hard': the memory is committed and
cannot be used for other purposes.

Includes new reservation mechanism where reservations are always
guaranteed and buffers can't be pinned without a reservation.
Reservations are tracked via a hierarchy of ReservationTrackers.

Locks are fine-grained so that the buffer pool can scale to many
concurrent queries.

Includes basic tests for buffer pool setup, allocation and reservations.

Change-Id: I35cc89e863efb4cc506657bfdaaaf633a10bbab6
---
M be/CMakeLists.txt
A be/src/bufferpool/CMakeLists.txt
A be/src/bufferpool/buffer-allocator.cc
A be/src/bufferpool/buffer-allocator.h
A be/src/bufferpool/buffer-pool-test.cc
A be/src/bufferpool/buffer-pool.cc
A be/src/bufferpool/buffer-pool.h
A be/src/bufferpool/reservation-tracker.cc
A be/src/bufferpool/reservation-tracker.h
M be/src/runtime/mem-tracker.h
M be/src/util/internal-queue.h
M be/src/util/uid-util.h
M common/thrift/generate_error_codes.py
13 files changed, 2,183 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/3993/1
-- 
To view, visit http://gerrit.cloudera.org:8080/3993
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I35cc89e863efb4cc506657bfdaaaf633a10bbab6
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-CR](cdh5-trunk) IMPALA-3201: headers and reservation logic for new buffer pool

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#24).

Change subject: IMPALA-3201: headers and reservation logic for new buffer pool
..

IMPALA-3201: headers and reservation logic for new buffer pool

This patch implements the core interface and reservation bookkeeping
logic for the new buffer pool. It is far from fully functional
but is capable of tracking reservations and allocating buffers based
on those reservations.

The ReservationTrackers integrate with MemTrackers by counting
reservations as consumption against a MemTracker. This reflects that
fact that the reservations are 'hard': the memory is committed and
cannot be used for other purposes.

Includes new reservation mechanism where reservations are always
guaranteed and buffers can't be pinned without a reservation.
Reservations are tracked via a hierarchy of ReservationTrackers.

Locks are fine-grained so that the buffer pool can scale to many
concurrent queries.

Includes basic tests for buffer pool setup, allocation and reservations.

Change-Id: I35cc89e863efb4cc506657bfdaaaf633a10bbab6
---
M be/CMakeLists.txt
A be/src/bufferpool/CMakeLists.txt
A be/src/bufferpool/buffer-allocator.cc
A be/src/bufferpool/buffer-allocator.h
A be/src/bufferpool/buffer-pool-test.cc
A be/src/bufferpool/buffer-pool.cc
A be/src/bufferpool/buffer-pool.h
A be/src/bufferpool/reservation-tracker.cc
A be/src/bufferpool/reservation-tracker.h
M be/src/runtime/mem-tracker.h
M be/src/util/internal-queue.h
M be/src/util/uid-util.h
M common/thrift/generate_error_codes.py
13 files changed, 2,183 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/69/2569/24
-- 
To view, visit http://gerrit.cloudera.org:8080/2569
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I35cc89e863efb4cc506657bfdaaaf633a10bbab6
Gerrit-PatchSet: 24
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-CR](cdh5-trunk) IMPALA-3201: headers and reservation logic for new buffer pool

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3201: headers and reservation logic for new buffer pool
..


Patch Set 23:

(8 comments)

Addressed the comments. I will now move to Impala-ASF and reconcile it with the 
header patch.

http://gerrit.cloudera.org:8080/#/c/2569/23/be/src/bufferpool/buffer-allocator.cc
File be/src/bufferpool/buffer-allocator.cc:

PS23, Line 19: BufferAllocator::BufferAllocator(int64_t min_buffer_len): 
min_buffer_len_(min_buffer_len) {}
> nit: long line.
Done


PS23, Line 28: int64_t len
> Is this needed or are you intentionally leaving this for extension to mmap(
Exactly, this is just a stub implementation now but I wanted to have the more 
general interface.


http://gerrit.cloudera.org:8080/#/c/2569/23/be/src/bufferpool/buffer-pool.cc
File be/src/bufferpool/buffer-pool.cc:

PS23, Line 40: Decrement
> Increment
Done


PS23, Line 55: --pin_count
> DCHECK_GT(pin_count, 0); before this line ?!
Done


PS23, Line 354: "Buffer bytes limit $0 of buffer "
> nit: Long line
Done


http://gerrit.cloudera.org:8080/#/c/2569/23/be/src/bufferpool/buffer-pool.h
File be/src/bufferpool/buffer-pool.h:

Line 66: ///   stays mapped to the same buffer.
> Do pinning and unpinning always happen via a page handle ? If so, can you p
I think the "Page Handles" section more-or-less covers this - all page 
operations go through a page handle. Added a brief clarification here that 
clients don't interact with pages directly.


PS23, Line 101:  BufferPool::CloseHandle()
> Does closing a page handle implicitly unpin the page it references ?
Answered in the next section of this comment and also in the CloseHandle() 
method comment: "A page is destroyed once the last open handle is closed via 
CloseHandle()."

There's another corner-case where maybe you have two handles - one pinned and 
another unpinned, and you close the pinned one. In that case the underlying 
page becomes implicitly unpinned. This is ok since you couldn't access the 
page's buffer via the unpinned handle anyway.


PS23, Line 107: duplicating it
> Reading this header file, it's not entirely clear how handle's duplication 
I think it's implied by the DuplicateHandle() comment stating that the new 
handle starts with pin count 0. I added an additional clarification. The idea 
is to separate the duplication of handles from operations on reservations.


-- 
To view, visit http://gerrit.cloudera.org:8080/2569
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I35cc89e863efb4cc506657bfdaaaf633a10bbab6
Gerrit-PatchSet: 23
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3201: buffer pool header only

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3201: buffer pool header only
..


Patch Set 1:

I'm planning to address them once we get to the implementation. There was a lot 
of churn in the interface so I think the interface comments were not directly 
applicable.

-- 
To view, visit http://gerrit.cloudera.org:8080/3992
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id771dea2eb4c1aa13c30d59e8b184a7d1bca8d34
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] PREVIEW: IMPALA-2550 Introduce query-wide execution context.

2016-08-15 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: PREVIEW: IMPALA-2550 Introduce query-wide execution context.
..


Patch Set 5:

(25 comments)

http://gerrit.cloudera.org:8080/#/c/3817/5/be/src/runtime/fragment-exec-state.h
File be/src/runtime/fragment-exec-state.h:

Line 32: /// Execution state of a single plan fragment.
what differentiates this from RuntimeState. ie, are these really two separate 
abstractions (and if so, what are the differentiating characteristics)?


Line 40:   boost::mem_fn(::ReportStatusCb),
lambda


Line 42:   client_cache_(exec_env->impalad_client_cache()), 
exec_params_(params) {
break into two lines, all other members are on separate lines


Line 80:   // TODO: make pointer to shared per-query state
what does this todo mean?


http://gerrit.cloudera.org:8080/#/c/3817/5/be/src/runtime/plan-fragment-executor.h
File be/src/runtime/plan-fragment-executor.h:

Line 64: class PlanFragmentExecutor {
needs new name. FragmentInstanceExecutor?


http://gerrit.cloudera.org:8080/#/c/3817/3/be/src/runtime/query-exec-state.h
File be/src/runtime/query-exec-state.h:

Line 56:   const DescriptorTbl& desc_tbl() const { return *desc_tbl_; }
> 'Accept' was just a way to say that it was getting it from the QueryExecMgr
my point was: don't use superfluous ('incoming': all fragment instances are 
incoming) words or words that convey no meaning ('accept'?) in names, they 
don't help the reader of your code.


http://gerrit.cloudera.org:8080/#/c/3817/5/be/src/runtime/query-exec-state.h
File be/src/runtime/query-exec-state.h:

Line 55:   ObjectPool* obj_pool() const { return obj_pool_.get(); }
.get()?


Line 79:   QueryExecMgr* query_exec_mgr_;
presumably "not owned"


Line 85:   ObjectPool* obj_pool_;
owner?


Line 101:   /// referenced as a shared_ptr to allow asynchronous calls to 
CancelPlanFragment()
why do we need a shared_ptr? we should get rid of implicit mem mgmt with 
shared_ptrs. anyone with an active FragmentExecState should also be included in 
the ref count.

also, this doesn't need to be a map, you can use an array/vector instead (and 
use instance_id - query_id as the index.


Line 108:   std::shared_ptr GetFragmentExecState(
remove shared_ptr or unique_ptr since we're now switching to an explicit mem 
mgmt model


Line 115:   void FragmentInstanceExecThread(const TUniqueId& 
fragment_instance_id);
why not just ExecInstance?


http://gerrit.cloudera.org:8080/#/c/3817/5/be/src/service/client-request-exec-state.h
File be/src/service/client-request-exec-state.h:

Line 44: /// Execution state of a query. This captures everything necessary
clarify comment. this isn't just the "execution state".


http://gerrit.cloudera.org:8080/#/c/3817/5/be/src/service/query-exec-mgr.h
File be/src/service/query-exec-mgr.h:

Line 40: /// offer itself up for destruction when it is no longer being used 
and the
'no longer being used' is vague. mention (and make concrete in code) the 
reference counting mechanism.


Line 59:   /// no-op (because the fragment is unregistered).
you're also introducing terminology 'registered' and 'unregistered'. what are 
the semantics of that? is this distinct from live/refcount > 0 and 
not-live/refcount == 0?


Line 60:   Status RegisterFragmentInstanceWithQuery(const 
TExecPlanFragmentParams& params);
why doesn't this return the qes? you'd typically want to access that right 
after this call.


Line 62:   /// Return a pointer to the QueryExecState if one can be found for 
the given query id
describe the effect on the reference count (a function comment should describe 
the externally observable behavior of a function, and a change in the refcount 
is one of the main side effects of calling this function).

this is also missing elsewhere.


Line 69:   /// Decrements query_exec_state->num_active_references and schedules 
it for destruction
reference counting is a central concept here, it needs to be described in a 
class comment.

also, i couldn't find the reference count anywhere so far.

also, why isn't this a member function of qes itself?


Line 75:   Status GetQueryId(const TUniqueId& fragment_instance_id, TUniqueId* 
query_id);
why do we need this?


Line 87:   /// This class c'tor takes a query_id and guarantees the caller one 
of two things:
i think something like a lock_guard might be more appropriate - it guarantees 
unlocking/decreasing the refcount, but it doesn't mediate access. what you have 
will get unwieldy very quickly.


Line 97:   /// TODO: Is this really necessary? The downside is that this class 
needs to duplicate
i don't understand the purpose of preventing access to the qes.

transfer ownership where?


Line 99:   class ScopedQueryExecStateRef {
why is this local to qem?


Line 124: QueryExecMgr* query_exec_mgr_;
why do you need this?


Line 126: const TUniqueId query_id_;
do you need this if query_exec_state_ is null?


Line 137:   typedef 

[Impala-ASF-CR] IMPALA-3201: buffer pool header only

2016-08-15 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-3201: buffer pool header only
..


Patch Set 1:

Please feel free to address them in the next patch though.

-- 
To view, visit http://gerrit.cloudera.org:8080/3992
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id771dea2eb4c1aa13c30d59e8b184a7d1bca8d34
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


Logging and Debugging Impala C++ UDF

2016-08-15 Thread Manish Nema
Hi Folks,
 I am using CDH 5.7 , I am porting existing DB2 implementation of UDF to
Impala, I was wondering how to debug/log information from UDF, could
somebody help me with that

Thanks,
Manish


[Impala-ASF-CR] IMPALA-3201: buffer pool header only

2016-08-15 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-3201: buffer pool header only
..


Patch Set 1:

Sorry to jump in late at this point but there are also some comments at 
https://gerrit.cloudera.org/#/c/2569

-- 
To view, visit http://gerrit.cloudera.org:8080/3992
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id771dea2eb4c1aa13c30d59e8b184a7d1bca8d34
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Add .clang-format for Impala's C++ style

2016-08-15 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: Add .clang-format for Impala's C++ style
..


Patch Set 1:

I ran this over a few patches. Some small things I noticed:

1. The arg-breaking behaviour is different: the formatter prefers to put all 
arguments on one line even if that means adding a newline before the first one.

2. for (auto& foo: bar) -> for (auto& foo : bar). I don't care about this one, 
all the same to me.

3. Indentation on constructor member initialisation lists is by two spaces, not 
four (i.e. line up with the ':'). Again, I don't care about that. 

I don't see any problems with these small changes, but you might want to 
socialise this a bit on impala-dev@ before committing the formatter. It would 
be excellent for this to be as close to our canonical style as possible, even 
if that means changing our canonical style a bit.

-- 
To view, visit http://gerrit.cloudera.org:8080/3886
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I274c5993c7be344fc4b7729d21a13da993f9f3aa
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3201: buffer pool header only

2016-08-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3201: buffer pool header only
..


Patch Set 1:

Just so we have the trail: https://gerrit.cloudera.org/#/c/3688/

-- 
To view, visit http://gerrit.cloudera.org:8080/3992
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id771dea2eb4c1aa13c30d59e8b184a7d1bca8d34
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3201: buffer pool header only

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3201: buffer pool header only
..


Patch Set 1: Code-Review+2

Carry +2 from cdh5-trunk

-- 
To view, visit http://gerrit.cloudera.org:8080/3992
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id771dea2eb4c1aa13c30d59e8b184a7d1bca8d34
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-CR](cdh5-trunk) IMPALA-3201: buffer pool header only

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has abandoned this change.

Change subject: IMPALA-3201: buffer pool header only
..


Abandoned

move to  http://gerrit.cloudera.org:8080/3992

-- 
To view, visit http://gerrit.cloudera.org:8080/3688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: Id771dea2eb4c1aa13c30d59e8b184a7d1bca8d34
Gerrit-PatchSet: 8
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3201: buffer pool header only

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/3992

Change subject: IMPALA-3201: buffer pool header only
..

IMPALA-3201: buffer pool header only

Implementation will follow in a subsequent commit.

The BufferPool interface and reservation bookkeeping are documented.

Includes documentation for some guarantees of new reservation mechanism
where reservations are always guaranteed and pages can't be pinned and
buffers can't be allocated without without a reservation. Reservations
are tracked via a hierarchy of ReservationTrackers.

Change-Id: Id771dea2eb4c1aa13c30d59e8b184a7d1bca8d34
---
A be/src/bufferpool/buffer-pool.h
1 file changed, 318 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/3992/1
-- 
To view, visit http://gerrit.cloudera.org:8080/3992
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Id771dea2eb4c1aa13c30d59e8b184a7d1bca8d34
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-CR](cdh5-trunk) IMPALA-3201: buffer pool header only

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3201: buffer pool header only
..


Patch Set 7:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/3688/7/be/src/bufferpool/buffer-pool.h
File be/src/bufferpool/buffer-pool.h:

PS7, Line 124: free up memory reservations
> this instance wasn't updated to be clearer on terminology:
Done


Line 155:   /// remainder will not be usable.
> alternatively, express size as multiple of min_buffer_len
My thought was that it was best to consistently denominate buffers in bytes 
rather than use two different units. There's a trade-off here in that we have 
to document what happens in this case, but I like the consistency.


Line 201:   /// closed and 'buffer_handle' will reference the buffer from 
'page_handle'. This
> "will be closed" = "will be destroyed", i assume? we seem to use close and 
The handle is closed and the page is destroyed. I'll use the second here since 
it's clearer.


PS7, Line 217: frees
> also not updated:
Done


Line 218:   /// reservation in 'src_client'. 'src' must be open and 'dst' must 
be closed before
> how do you get a closed BufferHandle, other than calling allocate and free 
The default constructor of BufferHandle creates a closed buffer handle: it's 
only opened by calling a buffer pool method with this as an argument. So the 
code looks like:

BufferHandle dst;
TransferBuffer(src_client, , dst_client, );


Line 218:   /// reservation in 'src_client'. 'src' must be open and 'dst' must 
be closed before
> we can decide to change this later, if the code that calls this ends up loo
Sounds good.


Line 279:   const BufferHandle* buffer_handle() const;
> presumably an error to call FreeBuffer on the returned handle directly?
Right, that's what the const modifier is enforcing.

Extended comment to explain it.


-- 
To view, visit http://gerrit.cloudera.org:8080/3688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id771dea2eb4c1aa13c30d59e8b184a7d1bca8d34
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-3201: buffer pool header only

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3201: buffer pool header only
..


Patch Set 8: Code-Review+2

Carry +2

-- 
To view, visit http://gerrit.cloudera.org:8080/3688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id771dea2eb4c1aa13c30d59e8b184a7d1bca8d34
Gerrit-PatchSet: 8
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-CR](cdh5-trunk) IMPALA-3201: buffer pool header only

2016-08-15 Thread Tim Armstrong (Code Review)
Hello Marcel Kornacker, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3688

to look at the new patch set (#8).

Change subject: IMPALA-3201: buffer pool header only
..

IMPALA-3201: buffer pool header only

Implementation will follow in a subsequent commit.

The BufferPool interface and reservation bookkeeping are documented.

Includes documentation for some guarantees of new reservation mechanism
where reservations are always guaranteed and pages can't be pinned and
buffers can't be allocated without without a reservation. Reservations
are tracked via a hierarchy of ReservationTrackers.

Change-Id: Id771dea2eb4c1aa13c30d59e8b184a7d1bca8d34
---
A be/src/bufferpool/buffer-pool.h
1 file changed, 318 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/88/3688/8
-- 
To view, visit http://gerrit.cloudera.org:8080/3688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id771dea2eb4c1aa13c30d59e8b184a7d1bca8d34
Gerrit-PatchSet: 8
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3962: Clean up scratch tuple batch on scan failures

2016-08-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3962: Clean up scratch tuple batch on scan failures
..


Patch Set 1:

is there a better / more direct way to test this rather than relying on the 
stress test?

-- 
To view, visit http://gerrit.cloudera.org:8080/3991
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3962: Clean up scratch tuple batch on scan failures

2016-08-15 Thread Michael Ho (Code Review)
Michael Ho has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/3991

Change subject: IMPALA-3962: Clean up scratch tuple batch on scan failures
..

IMPALA-3962: Clean up scratch tuple batch on scan failures

The parquet scanner doesn't clean up scratch_batch_ properly
which causes it to process a partially filled scratch_batch_
if any of the column reader fails. This change cleans up the
scratch batch if any of the parquet column readers fails.
The clean up involves freeing the mem_pool of scratch_batch_
and setting number of tuples in scratch_batch_ to 0.

Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5
---
M be/src/exec/hdfs-parquet-scanner.cc
1 file changed, 6 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/3991/1
-- 
To view, visit http://gerrit.cloudera.org:8080/3991
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: If1e27a1517d09ccaabdae1492b7e1fbf661ae3e5
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 


[Impala-ASF-CR] Remove dead and untested code

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Remove dead and untested code
..


Patch Set 1:

I did manually verify that the functions didn't have any callsites, so I'm 
convinced it's actually dead.

Building with shared objects doesn't work for me, but it hasn't worked for me 
since I upgraded to Ubuntu 5.10

I only build with gold, but we should get some coverage in the GVO.

-- 
To view, visit http://gerrit.cloudera.org:8080/3989
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I49c27cbfef03ef97befa9a607b3d8d7ac6e22a43
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3786: Replace "cloudera" with "apache" (part 1)

2016-08-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3786: Replace "cloudera" with "apache" (part 1)
..


Patch Set 2:

> I ran private Jenkins jobs, which passed, so our builds should be
 > fine. I don't really know anything about what downstream builds
 > exist or might be affected, like a cdh packaging build. Do you know
 > who would be a good person to ask about that?
 > 
 > I'm happy to squash the two reviews before committing, if people
 > think that's cleaner.

If you don't squash them, won't the build fail until the second one is 
committed?

-- 
To view, visit http://gerrit.cloudera.org:8080/3936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3767dd1ee86df767075fdf1b371eb6b0b06668db
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3786: Replace "cloudera" with "apache" (part 1)

2016-08-15 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-3786: Replace "cloudera" with "apache" (part 1)
..


Patch Set 2: -Code-Review

I ran private Jenkins jobs, which passed, so our builds should be fine. I don't 
really know anything about what downstream builds exist or might be affected, 
like a cdh packaging build. Do you know who would be a good person to ask about 
that?

I'm happy to squash the two reviews before committing, if people think that's 
cleaner.

-- 
To view, visit http://gerrit.cloudera.org:8080/3936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3767dd1ee86df767075fdf1b371eb6b0b06668db
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-CR](cdh5-trunk) IMPALA-3201: buffer pool header only

2016-08-15 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-3201: buffer pool header only
..


Patch Set 7: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3688/7/be/src/bufferpool/buffer-pool.h
File be/src/bufferpool/buffer-pool.h:

Line 218:   /// reservation in 'src_client'. 'src' must be open and 'dst' must 
be closed before
> how do you get a closed BufferHandle, other than calling allocate and free 
we can decide to change this later, if the code that calls this ends up looking 
awkward, i just wanted to point this out as something odd.


-- 
To view, visit http://gerrit.cloudera.org:8080/3688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id771dea2eb4c1aa13c30d59e8b184a7d1bca8d34
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-3629: Codegen TransferScratchTuples() in hdfs-parquet-scanner

2016-08-15 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-3629: Codegen TransferScratchTuples() in 
hdfs-parquet-scanner
..


Patch Set 8:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/3774/7/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

PS7, Line 162:   codegend_process_scratch_batch_fn_ = reinterpret_cast
> Is this check redundant as GetCodegenFn() should return NULL if codegen is 
Done


PS7, Line 553: //
> nit: 2 more spaces
Done


PS7, Line 585: 
> Did you mean to use these to replace some constants ? They don't seem to be
Done


http://gerrit.cloudera.org:8080/#/c/3774/7/be/src/exec/hdfs-parquet-scanner.h
File be/src/exec/hdfs-parquet-scanner.h:

PS7, Line 469: Processes a single row batch for TransferScratchTuples, looping 
over scratch_b
> Not really clear how this fn is different from TransferScratchTuples or why
Done


http://gerrit.cloudera.org:8080/#/c/3774/7/be/src/exec/hdfs-scanner.cc
File be/src/exec/hdfs-scanner.cc:

PS7, Line 558:   *write_complete_tuple_fn = codegen->FinalizeFunction(fn);
 :   if (*write_complete_
> Doesn't this trigger the DCHECK in some code which check the returned funct
Done


http://gerrit.cloudera.org:8080/#/c/3774/7/be/src/exec/hdfs-sequence-scanner.cc
File be/src/exec/hdfs-sequence-scanner.cc:

PS7, Line 63: writ
> nit: wrong indentation
Done


http://gerrit.cloudera.org:8080/#/c/3774/7/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

PS7, Line 699: DCHECK(*write_aligned_tuples_fn != NULL);
> Please see comments in CodegenWriteAlignedTuples() about this DCHECK().
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3774
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic327e437c7cd2b3f92cdb11c1e907bfee2d44ee8
Gerrit-PatchSet: 8
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-3629: Codegen TransferScratchTuples() in hdfs-parquet-scanner

2016-08-15 Thread Thomas Tauber-Marshall (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3774

to look at the new patch set (#8).

Change subject: IMPALA-3629: Codegen TransferScratchTuples() in 
hdfs-parquet-scanner
..

IMPALA-3629: Codegen TransferScratchTuples() in hdfs-parquet-scanner

We currently don't do any codegen in the hdfs-parquet scanner,
which limits performance. This patch creates a new function,
ProcessScratchBatch, which contains the inner loop of
TransferScratchTuples, allowing us to cross-compile only this
performance critical part. It also uses CodegenEvalConjuncts to
replace the call to EvalConjuncts in ProcessScratchBatch with a
codegen'd version.

Additionally, it modifies the Codegen functions in
hdfs-avro/text/sequence-scanner to take an output parameter for
the codegen'd function and return a Status in order to improve
logging around codegen failures.

Change-Id: Ic327e437c7cd2b3f92cdb11c1e907bfee2d44ee8
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exec/CMakeLists.txt
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
A be/src/exec/hdfs-parquet-scanner-ir.cc
M be/src/exec/hdfs-parquet-scanner.cc
M be/src/exec/hdfs-parquet-scanner.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/hdfs-sequence-scanner.cc
M be/src/exec/hdfs-sequence-scanner.h
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
15 files changed, 273 insertions(+), 125 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/74/3774/8
-- 
To view, visit http://gerrit.cloudera.org:8080/3774
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic327e437c7cd2b3f92cdb11c1e907bfee2d44ee8
Gerrit-PatchSet: 8
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Remove dead and untested code

2016-08-15 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: Remove dead and untested code
..


Patch Set 1:

With and without shared objects? With clang and gcc for all of the modes? With 
and without the gold linker?

This may be a bit overzealous; just trying to make sure we don't accidentally 
break anybody.

-- 
To view, visit http://gerrit.cloudera.org:8080/3989
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I49c27cbfef03ef97befa9a607b3d8d7ac6e22a43
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-CR](cdh5-trunk) IMPALA-3201: buffer pool header only

2016-08-15 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-3201: buffer pool header only
..


Patch Set 7:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/3688/7/be/src/bufferpool/buffer-pool.h
File be/src/bufferpool/buffer-pool.h:

Line 155:   /// remainder will not be usable.
alternatively, express size as multiple of min_buffer_len


Line 201:   /// closed and 'buffer_handle' will reference the buffer from 
'page_handle'. This
"will be closed" = "will be destroyed", i assume? we seem to use close and 
destroy to mean the same thing.


Line 218:   /// reservation in 'src_client'. 'src' must be open and 'dst' must 
be closed before
how do you get a closed BufferHandle, other than calling allocate and free in 
sequence? should this maybe not require a src and dest after all, but do it 
in-place?


Line 279:   const BufferHandle* buffer_handle() const;
presumably an error to call FreeBuffer on the returned handle directly?


-- 
To view, visit http://gerrit.cloudera.org:8080/3688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id771dea2eb4c1aa13c30d59e8b184a7d1bca8d34
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Remove dead and untested code

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Remove dead and untested code
..


Patch Set 1:

It builds on ASAN, debug and release. Did you have any other compilation modes 
in mind?

-- 
To view, visit http://gerrit.cloudera.org:8080/3989
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I49c27cbfef03ef97befa9a607b3d8d7ac6e22a43
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-CR](cdh5-trunk) IMPALA-3823: Add timer to measure Parquet footer reads

2016-08-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3823: Add timer to measure Parquet footer reads
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3576/5/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 1891:   
process_footer_avg_timer_->UpdateCounter(_footer_process_timer);
Looking in AveragedCounter::UpdateCounter(), there's a couple of things that 
look concerning:
1) That method maintains a map from Counter* -> int, where the pointer is this 
argument.  But this argument is a stack parameter, so this doesn't seem like it 
will work like intended.
2) See the comment about synchronizing that map (appears to be the callers 
responsibility).


-- 
To view, visit http://gerrit.cloudera.org:8080/3576
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icf87bad90037dd0cea63b10c537382ec0f980cbf
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2347: Reuse metastore client connections in Catalog

2016-08-15 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-2347: Reuse metastore client connections in Catalog
..


IMPALA-2347: Reuse metastore client connections in Catalog

Currently we create a new connection to metastore every time Catalog
connects to HMS. This was intentionally done to circumvent HIVE-5181.
Given it is fixed already in Hive, this patch intends to refactor the
HMS client usage on the catalog to reuse the connections. Additionally
this patch makes MetaStoreClient implement AutoCloseable interface and
hence all the callers can use the try-with-resources to create a new
metastore client and needn't explicitly call release(). Also, this
patch increases the default initial metastore pool size to 10 from a
previous value of 5, which is less even for a decent DDL load.

In terms of design, this patch switches the metastore client
implementation to RetryingMetaStoreClient from previous implementation
of HiveMetaStoreClient. The reason for this switch is to handle HMS
failures from Catalog side where the entire metastore client pool
cache becomes stale in the event of a metastore restart and there is
no proper way to deal with it. RetryingMetaStoreClient has inbuilt
retry mechanism which reconnects stale connections in the event of
failures. For more details on retries and corresponding configurations,
check org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.

Change-Id: I517c0e1efef2584cd8d34017b33574f2ad69bd52
Reviewed-on: http://gerrit.cloudera.org:8080/3984
Reviewed-by: Bharath Vissapragada 
Tested-by: Internal Jenkins
---
M fe/src/main/java/com/cloudera/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/com/cloudera/impala/catalog/Catalog.java
M fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/com/cloudera/impala/catalog/DataSourceTable.java
M fe/src/main/java/com/cloudera/impala/catalog/HBaseTable.java
M fe/src/main/java/com/cloudera/impala/catalog/HdfsTable.java
M fe/src/main/java/com/cloudera/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/com/cloudera/impala/catalog/IncompleteTable.java
M fe/src/main/java/com/cloudera/impala/catalog/KuduTable.java
M fe/src/main/java/com/cloudera/impala/catalog/MetaStoreClientPool.java
M fe/src/main/java/com/cloudera/impala/catalog/Table.java
M fe/src/main/java/com/cloudera/impala/catalog/TableLoader.java
M fe/src/main/java/com/cloudera/impala/catalog/View.java
M fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java
M fe/src/main/java/com/cloudera/impala/util/MetaStoreUtil.java
M fe/src/test/java/com/cloudera/impala/catalog/CatalogTest.java
16 files changed, 124 insertions(+), 190 deletions(-)

Approvals:
  Bharath Vissapragada: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/3984
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I517c0e1efef2584cd8d34017b33574f2ad69bd52
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Internal Jenkins


[Impala-ASF-CR] IMPALA-2347: Reuse metastore client connections in Catalog

2016-08-15 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-2347: Reuse metastore client connections in Catalog
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/3984
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I517c0e1efef2584cd8d34017b33574f2ad69bd52
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Internal Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] PREVIEW: IMPALA-2550 Introduce query-wide execution context.

2016-08-15 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: PREVIEW: IMPALA-2550 Introduce query-wide execution context.
..


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/3817/5//COMMIT_MSG
Commit Message:

PS5, Line 15: recieves
receives


PS5, Line 17:  
A comma would be helpful here.


PS5, Line 24: and is
"and it is" ?


PS5, Line 24: which
This seems like an incomplete thought. Perhaps the "which" just needs to be 
removed?


-- 
To view, visit http://gerrit.cloudera.org:8080/3817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1619, IMPALA-3018: Address various small memory allocation related bugs

2016-08-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-1619, IMPALA-3018: Address various small memory 
allocation related bugs
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/3807
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6eb9a4472a65cf68edb0323b13d745277ead2e1d
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3964: Fix crash when a count(*) is performed on a nested collection.

2016-08-15 Thread Christopher Channing (Code Review)
Hello Alex Behm, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3940

to look at the new patch set (#4).

Change subject: IMPALA-3964: Fix crash when a count(*) is performed on a nested 
collection.
..

IMPALA-3964: Fix crash when a count(*) is performed on a nested collection.

The Bug: Prior to this patch, a DCHECK was used to verify that the
underlying memory pool for the scratch batch was empty in a count based
scenario. For IMPALA-3964 (where a count(*) is performed on a nested
collection), if a Parquet column chunk is compressed, upon reading each
new data page it would be decompressed and eventually placed in to the
underlying scratch batch memory pool causing the aforementioned DCHECK
to fail. This was not picked up in the test suite as the TPCH nested
Parquet data is not compressed.

The Fix: Removed the erroneous DCHECK. Added logic to determine if any
memory in the scratch batch needs to be freed (due to the transfer that
occurs from the decompressed data pool), if so, it will be done.
Augmented the load_nested.py script to snappy compress each of the
tables within the 'tpch_nested_parquet' database. This is consistent with
how the flat TPCH Parquet data set is stored. Regarding test coverage,
there are already a number of tests that will perform nested collection
counts against the tables in the 'tpch_nested_parquet' database. For
uncompressed nested Parquet, the 'test_nested_types.py' test suite
leverages the 'ComplexTypesTbl' table to provide good coverage.

Change-Id: Id0955c85d18dfba4bd29a35ec95d0355da050607
---
M be/src/exec/hdfs-parquet-scanner.cc
M testdata/bin/load_nested.py
2 files changed, 7 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/3940/4
-- 
To view, visit http://gerrit.cloudera.org:8080/3940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id0955c85d18dfba4bd29a35ec95d0355da050607
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Christopher Channing 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Christopher Channing 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3964: Fix crash when a count(*) is performed on a nested collection.

2016-08-15 Thread Christopher Channing (Code Review)
Christopher Channing has posted comments on this change.

Change subject: IMPALA-3964: Fix crash when a count(*) is performed on a nested 
collection.
..


Patch Set 3:

(1 comment)

Made the change to use FreeAll() rather than transferring (also tweaked the 
commit message to reflect this).

http://gerrit.cloudera.org:8080/#/c/3940/3/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 574:   
dst_batch->tuple_data_pool()->AcquireData(scratch_batch_->mem_pool(), false);
> FreeAll() sounds good.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id0955c85d18dfba4bd29a35ec95d0355da050607
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Christopher Channing 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Christopher Channing 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3470: DecompressorTest is flaky.

2016-08-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3470: DecompressorTest is flaky.
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/3954
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7ebaa403abf45e31f38d6cf4e557d6274d877a8a
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Juan Yu 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Juan Yu 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3786: Replace "cloudera" with "apache" (part 1)

2016-08-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3786: Replace "cloudera" with "apache" (part 1)
..


Patch Set 2:

Is there anything that needs to be updated on the packaging side of things? 
i.e. will a nightly build still work?

also, I guess you will squash this and the next commit together before 
committing?

-- 
To view, visit http://gerrit.cloudera.org:8080/3936
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3767dd1ee86df767075fdf1b371eb6b0b06668db
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3856,IMPALA-3871: Fix BinaryPredicate normalization for Kudu

2016-08-15 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-3856,IMPALA-3871: Fix BinaryPredicate normalization for 
Kudu
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/3986/1/fe/src/main/java/com/cloudera/impala/analysis/BinaryPredicate.java
File fe/src/main/java/com/cloudera/impala/analysis/BinaryPredicate.java:

Line 126: SlotRef ref = predicate.getBoundSlot();
I think we should make this procedure more specific. getBoundSlot() internally 
uses unwrapSlotRef() which drills through implicit casts, but we don't actually 
want that here.

Imo, we should check instanceof SlotRef on the children and not rely on 
getBoundSlot() or unwrapSlotRef()


Line 135:   predicate.foldConstantChildren(analyzer);
Some of the functions used here feel a little clunky with respect to exception 
behavior.

It's weird that this function throws for unsupported literal types. 

I think we should change foldConstantChildren() in one of the following ways:
* either not throw at all and just leave exprs that failed to fold
* only throw when the BE eval failed and add a boolean return value indicating 
whether all exprs succeeded in folding or not

Happy to discuss if you have other ideas for cleaning this up.


http://gerrit.cloudera.org:8080/#/c/3986/1/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

Line 100:   def test_kudu_expr(self, vector):
might be safer to mark this as serial due to setup and teardown. I know it's 
not required if only this one test is is not run serially, but as soon as you 
add another it's going to break.


-- 
To view, visit http://gerrit.cloudera.org:8080/3986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iae7612433a2e27f8887abe6624f9ee0f4867b934
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Alex Behm 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3952: Clear scratch batch mem pool if Open() failed.

2016-08-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3952: Clear scratch batch mem pool if Open() failed.
..


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/3953
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I55f32ed698a5b6fed8c28af1391aa07e1560e782
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] PREVIEW: IMPALA-2550 Introduce query-wide execution context.

2016-08-15 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: PREVIEW: IMPALA-2550 Introduce query-wide execution context.
..


Patch Set 5:

(1 comment)

please do a pass to disambiguate fragments and their instances (e.g., 
FragmentExecState).

http://gerrit.cloudera.org:8080/#/c/3817/5/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

Line 499:   
spaces


-- 
To view, visit http://gerrit.cloudera.org:8080/3817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] PREVIEW: IMPALA-2550 Introduce query-wide execution context.

2016-08-15 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: PREVIEW: IMPALA-2550 Introduce query-wide execution context.
..


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/3817/4/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS4, Line 418: // TODO: Should we create a QueryExecState for the coordinator 
fragment?
> I thought the coordinator does not have a QEM (or previously a FragmentMgr)
for the sake of keeping this patch small, it's probably a good idea to leave 
out the restructuring of the coordinator (plus, this has lots of conflicts with 
my next set of mt-related changes)


http://gerrit.cloudera.org:8080/#/c/3817/4/be/src/runtime/fragment-exec-state.h
File be/src/runtime/fragment-exec-state.h:

Line 77:   // TODO: make pointer to shared per-query state
remove, unless there's something to do


http://gerrit.cloudera.org:8080/#/c/3817/4/be/src/service/query-exec-mgr.h
File be/src/service/query-exec-mgr.h:

PS4, Line 80: GetQueryExecStateFromFragmentId
> As we spoke everything coming in to a backend specifies only the fragment_i
we can also encode the instance id as query id + instance offset/instance 
number. that way we can build static arrays, indexed by instance number, that 
live in the qes.


Line 91:   class ScopedQueryExecStateRef {
> Done
please find a more concise name (maybe take inspiration from 'lock_guard'?)


-- 
To view, visit http://gerrit.cloudera.org:8080/3817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] PREVIEW, please see the ongoing discussion in the Jira IMPALA-3944: Fix for BackendConfig::RemoveBackend

2016-08-15 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: PREVIEW, please see the ongoing discussion in the Jira 
IMPALA-3944: Fix for BackendConfig::RemoveBackend
..


Patch Set 1:

I made a comment on the JIRA, but I'd question this approach.

I think that in practice this bug only affects the mini cluster, right? Outside 
of the minicluster, nobody should run multiple impalads per host (that are 
associated with the same Impala cluster).

I think we should consider simplifying the SimpleScheduler to not attempt to 
schedule on multiple impalads on the same host.

-- 
To view, visit http://gerrit.cloudera.org:8080/3982
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia77d54ae72ae87fbf279623c4924be2943a84a3e
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in flight.

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in 
flight.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3832/2//COMMIT_MSG
Commit Message:

Line 28: It succeeded.
> I'm fine with the unit test coverage, but just thinking that if this worklo
Agreed, I think it's timing-dependent though so I couldn't see a way to make it 
reproducible - I think the main thing would be to run the stress test on a 
variety of data sizes and hardware configs.

I plan to port many of the BufferedBlockMgr unit tests to the new buffer pool 
so we should have coverage of this interesting sequence of operations.


-- 
To view, visit http://gerrit.cloudera.org:8080/3832
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4be4fad8e6f2303db19ea1e2bd0f13523781ae8e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in flight.

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in 
flight.
..


Patch Set 4: Code-Review+2

rebase, carry +2

-- 
To view, visit http://gerrit.cloudera.org:8080/3832
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4be4fad8e6f2303db19ea1e2bd0f13523781ae8e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in flight.

2016-08-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in 
flight.
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/3832
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4be4fad8e6f2303db19ea1e2bd0f13523781ae8e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in flight.

2016-08-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in 
flight.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3832/2//COMMIT_MSG
Commit Message:

Line 28: It succeeded.
> Hard to say - I only saw it in a specific configuration I was running local
I'm fine with the unit test coverage, but just thinking that if this workload 
provides an interesting stress on spilling, then it'd be nice to carry forward 
to the buffer pool.  But I suppose we don't have any reason to believe it would 
stress the new code in the same way.


-- 
To view, visit http://gerrit.cloudera.org:8080/3832
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4be4fad8e6f2303db19ea1e2bd0f13523781ae8e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Remove dead and untested code

2016-08-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: Remove dead and untested code
..


Patch Set 1: Code-Review+2

Nice!

-- 
To view, visit http://gerrit.cloudera.org:8080/3989
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I49c27cbfef03ef97befa9a607b3d8d7ac6e22a43
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3953: Fixes for KuduScanNode BE test failure

2016-08-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3953: Fixes for KuduScanNode BE test failure
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/3888
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I16be206c60a692d2a26d719de8cc45e859b06e97
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-CR](cdh5-trunk) IMPALA-3823: Add timer to measure Parquet footer reads

2016-08-15 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-3823: Add timer to measure Parquet footer reads
..


Patch Set 5: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3576/5/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 1885:   // this time to the averaged timer.
Can you add a TODO that this should be a histogram, not an averaged counter?


-- 
To view, visit http://gerrit.cloudera.org:8080/3576
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icf87bad90037dd0cea63b10c537382ec0f980cbf
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3063: Separate join inversion from join ordering.

2016-08-15 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-3063: Separate join inversion from join ordering.
..


Patch Set 2:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/3846/2/fe/src/main/java/com/cloudera/impala/analysis/Analyzer.java
File fe/src/main/java/com/cloudera/impala/analysis/Analyzer.java:

Line 125:   private boolean isStraightJoin_ = false;
nice clean-up


http://gerrit.cloudera.org:8080/#/c/3846/2/fe/src/main/java/com/cloudera/impala/planner/ExchangeNode.java
File fe/src/main/java/com/cloudera/impala/planner/ExchangeNode.java:

Line 81
do we still have these checks somewhere?


Line 95
what about this?


http://gerrit.cloudera.org:8080/#/c/3846/2/fe/src/main/java/com/cloudera/impala/planner/JoinNode.java
File fe/src/main/java/com/cloudera/impala/planner/JoinNode.java:

Line 478: for (BinaryPredicate p: eqJoinConjuncts_) 
Collections.swap(p.getChildren(), 0, 1);
add BinaryPredicate.reverse()?


http://gerrit.cloudera.org:8080/#/c/3846/2/fe/src/main/java/com/cloudera/impala/planner/Planner.java
File fe/src/main/java/com/cloudera/impala/planner/Planner.java:

Line 345:* 1. The right-hand side is a SingularRowSrcNode. We place such a 
node on the
the first sentence makes it sound like it inverts *if* the rhs is already a 
singularrowsrc, in other words, it moves it to the lhs.


Line 355:   private void invertJoins(PlanNode root, boolean isSingleNodeExec) {
explain 2nd param


Line 360:   for (PlanNode child: root.getChildren()) {
single line


Line 372:   //we cannot execute it efficiently.
the same is true for right-anything if it's a broadcast. do we know we're not 
going to run into that?

why does a repartitioning null-aware right anti join not work?


Line 373:   if (joinNode.isStraightJoin() || 
joinOp.isNullAwareLeftAntiJoin()) {
what does it mean to deal with 'straight join' here and not for the entire 
block?


Line 380: // Always place a singular row src on the build side because 
it has
has produces


Line 385: // There is no backend support for distributed non-equi right 
outer/semi joins.
this seems to contradict the condition, which triggers if !issinglenodeexec 
(maybe phrase that flag as isDistrExec?)


PS2, Line 388: build-side hash table
"the materialized rhs", no? (this still applies to nlj?)


Line 398: !(joinOp.isInnerJoin() && 
!joinNode.getEqJoinConjuncts().isEmpty())) {
why can't this be applied to inner equi joins?


Line 404: // Re-compute tuple ids since their order must correspond to the 
order of children.
why?


Line 414:   private PlanNode useNljForSmallBuilds(PlanNode root, Analyzer 
analyzer)
the name doesn't really reflect the (much narrower) semantics, find a better 
one.


Line 423: if (joinNode.getJoinOp().isNullAwareLeftAntiJoin()) return root;
also check that this is really a hash join


-- 
To view, visit http://gerrit.cloudera.org:8080/3846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If86db7753fc585bb4c69612745ec010327a4
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3964: Fix crash when a count(*) is performed on a nested collection.

2016-08-15 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-3964: Fix crash when a count(*) is performed on a nested 
collection.
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3940/3/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 574:   
dst_batch->tuple_data_pool()->AcquireData(scratch_batch_->mem_pool(), false);
> FreeAll() also works for both cases (I ran a private build to confirm), so 
FreeAll() sounds good.


-- 
To view, visit http://gerrit.cloudera.org:8080/3940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id0955c85d18dfba4bd29a35ec95d0355da050607
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Christopher Channing 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Christopher Channing 
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-1766: Misc. statistical functions. Implemented aggregate corr().

2016-08-15 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-1766: Misc. statistical functions. Implemented aggregate 
corr().
..


Patch Set 2:

> As a reminder, the gerrit "Impala" project is now deprecated. To
 > lear now to switch, read
 > 
 > http://mail-archives.apache.org/mod_mbox/incubator-impala-dev/201607.mbox/%3ccac-psx2shyqsoxgcckrqmqbo-jjwdp6cmgotu_1snuqvkdv...@mail.gmail.com%3E
 > 
 > To make sure you receive all messages that look like this,
 > subscribe to
 > 
 > http://mail-archives.apache.org/mod_mbox/incubator-impala-dev/

How is it going?

-- 
To view, visit http://gerrit.cloudera.org:8080/2534
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7347c173ec4f80037d45dd463c17eb81ceef14a1
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Abdur Rafay 
Gerrit-Reviewer: Abdur Rafay
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Skye Wanderman-Milne 
Gerrit-HasComments: No


[Impala-CR](cdh5-trunk) IMPALA-889: Add support for ISO-SQL trim()

2016-08-15 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-889: Add support for ISO-SQL trim()
..


Patch Set 10:

> Youwei, any plans for this?

Any news?

-- 
To view, visit http://gerrit.cloudera.org:8080/3213
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602
Gerrit-PatchSet: 10
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Youwei Wang 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Youwei Wang 
Gerrit-HasComments: No


[Impala-ASF-CR] Remove dead and untested code

2016-08-15 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: Remove dead and untested code
..


Patch Set 1:

If this build under all of our different compilation modes, I'm happy with it.

-- 
To view, visit http://gerrit.cloudera.org:8080/3989
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I49c27cbfef03ef97befa9a607b3d8d7ac6e22a43
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3898: Add a pytest skipif decorator based on presence of Impala LZO.

2016-08-15 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-3898: Add a pytest skipif decorator based on presence of 
Impala LZO.
..


Patch Set 7:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/3782/7/tests/common/environ.py
File tests/common/environ.py:

Line 252: def _is_impalalzo_present():
> I don't think we have negative or negative Hive integration tests explicitl
> "impala is able to create tables with that compression and then  
 > restarted without it enabled"

Do you have any info for restarting impala without LZO enabled?


 > "It make make sense to call this _is_impalad_impalalzo_enabled() to  
 > distinguish it from these potential other cases."

Also, just wanted to clarify -- this method has been entirely removed as of 
patches 8 and 9. You're not lobbying for bringing it back, right?


Line 257:   impala_lzo_root = os.environ.get('IMPALA_LZO')
> I think it would be a cleaner interface to add a py.test argument specifyin
> "I think it would be a cleaner interface to add a py.test argument
> specifying compression schemes to test that defaulted to not include 
> impala-lzo"

So, one thought in response to this suggestion (after some discussion with 
Michael B.) In general, we want our test runs to be as inclusive as possible be 
default. Making people specify an option to include LZO whenever they run 
impala-py.test or run-tests.py seems cumbersome, and seems like something that 
could be easily overlooked.


I just wanted to bring that point up before making a change like this.


 > "I also suggest filing a feature request for Impala to have an interface  
 > to return available compression schemes"

This would be useful. Part of why this relatively simple change got so 
convoluted is that there is (apparently) no good way to way to test if Impala 
LZO is supported -- so it's been necessary to rely on imperfect checks of env 
variables, or .so files.


Line 260:   return os.path.isfile(os.path.join(impala_lzo_root, 'build', 
'libimpalalzo.so'))
> This is going to fail for remote clusters, so suggest adding a comment (or 
So just to reinforce what was decided earlier, after an off-line discussion 
with Dan on Friday, this method has been removed in favor of simply checking 
whether or not the IMPALA_LZO environment variable has been set.


-- 
To view, visit http://gerrit.cloudera.org:8080/3782
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If61a7799205cd00d440196303a42db32c522f5b1
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Harrison Sheinblatt 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Remove dead and untested code

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/3989

Change subject: Remove dead and untested code
..

Remove dead and untested code

Remove some code that the code coverage build revealed was untested.
It is all dead, aside from SetErrorMsg(), which only had one callsite.

Change-Id: I49c27cbfef03ef97befa9a607b3d8d7ac6e22a43
---
M be/src/common/status.h
M be/src/exprs/scalar-fn-call.cc
M be/src/runtime/coordinator.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/service/query-exec-state.h
8 files changed, 2 insertions(+), 220 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/3989/1
-- 
To view, visit http://gerrit.cloudera.org:8080/3989
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I49c27cbfef03ef97befa9a607b3d8d7ac6e22a43
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3952: Clear scratch batch mem pool if Open() failed.

2016-08-15 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3952: Clear scratch batch mem pool if Open() failed.
..


Patch Set 2: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3953/1/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

PS1, Line 210: else if (!FLAGS_enable_partitioned_hash_join ||
 :   !FLAGS_enable_partitioned_aggregation) {
> I thought about executing the branch in any case, but I felt it's better to
I see, thanks that makes sense.


-- 
To view, visit http://gerrit.cloudera.org:8080/3953
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I55f32ed698a5b6fed8c28af1391aa07e1560e782
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


  1   2   >