[Impala-ASF-CR] Expose $IMPALA MAVEN OPTIONS for configuring Maven.

2017-11-13 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8496 )

Change subject: Expose $IMPALA_MAVEN_OPTIONS for configuring Maven.
..


Patch Set 4: Code-Review+2

Rebase, carry + 2


--
To view, visit http://gerrit.cloudera.org:8080/8496
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2c62185476fd2388c7cda8884276b79a77370127
Gerrit-Change-Number: 8496
Gerrit-PatchSet: 4
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 13 Nov 2017 21:58:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6148: Specifying thirdparty deps as URLs

2017-11-09 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8456 )

Change subject: IMPALA-6148: Specifying thirdparty deps as URLs
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8456
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4628d86022d4bd8b762313f7056d76416a58b422
Gerrit-Change-Number: 8456
Gerrit-PatchSet: 6
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Thu, 09 Nov 2017 23:22:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6148: Specifying thirdparty deps as URLs

2017-11-09 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8456 )

Change subject: IMPALA-6148: Specifying thirdparty deps as URLs
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8456
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4628d86022d4bd8b762313f7056d76416a58b422
Gerrit-Change-Number: 8456
Gerrit-PatchSet: 7
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Thu, 09 Nov 2017 23:22:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6148: Specifying thirdparty deps as URLs

2017-11-08 Thread Joe McDonnell (Code Review)
Joe McDonnell has removed a vote on this change.

Change subject: IMPALA-6148: Specifying thirdparty deps as URLs
..


Removed Verified+1 by Joe McDonnell 
--
To view, visit http://gerrit.cloudera.org:8080/8456
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I4628d86022d4bd8b762313f7056d76416a58b422
Gerrit-Change-Number: 8456
Gerrit-PatchSet: 5
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zach Amsden 


[Impala-ASF-CR] IMPALA-6148: Specifying thirdparty deps as URLs

2017-11-08 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8456 )

Change subject: IMPALA-6148: Specifying thirdparty deps as URLs
..


Patch Set 5: Verified+1 Code-Review+2

Rebase, carry +2.


--
To view, visit http://gerrit.cloudera.org:8080/8456
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4628d86022d4bd8b762313f7056d76416a58b422
Gerrit-Change-Number: 8456
Gerrit-PatchSet: 5
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 08 Nov 2017 23:03:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4835 (prep only): create io subfolder and namespace

2017-11-07 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8424 )

Change subject: IMPALA-4835 (prep only): create io subfolder and namespace
..


Patch Set 6:

(1 comment)

One small comment

http://gerrit.cloudera.org:8080/#/c/8424/6/be/src/runtime/io/disk-io-mgr.h
File be/src/runtime/io/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/8424/6/be/src/runtime/io/disk-io-mgr.h@184
PS6, Line 184: /// TODO: Break this up the .h/.cc into multiple files under an 
/io subdirectory.
Can remove this TODO



--
To view, visit http://gerrit.cloudera.org:8080/8424
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b
Gerrit-Change-Number: 8424
Gerrit-PatchSet: 6
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 07 Nov 2017 19:39:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6148: Specifying thirdparty deps as URLs

2017-11-06 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8456 )

Change subject: IMPALA-6148: Specifying thirdparty deps as URLs
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8456/3/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/8456/3/bin/impala-config.sh@138
PS3, Line 138: # Download URLs for toolchain dependencies can be overridden by
 : # IMPALA__URLs in impala-config-*.sh. We unset them 
here first:
 : for var in "${!IMPALA_@}"; do
 :   if [[ "$var" == IMPALA_*_URL ]]; then
 : unset $var
 :   fi
 : done
If I understand this correctly, setting these in the environment won't work. 
Instead, they must be set in impala-config-branch.sh or impala-config-local.sh. 
Is that what we want?

One thing that we do for some environment variables crucial to the build is to 
have an OVERRIDE version of these variables. Sometimes we also respect the 
environment itself. See:
DOWNLOAD_CDH_COMPONENTS
HADOOP_INCLUDE_DIR_OVERRIDE
HADOOP_LIB_DIR_OVERRIDE
HIVE_SRC_DIR_OVERRIDE



--
To view, visit http://gerrit.cloudera.org:8080/8456
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4628d86022d4bd8b762313f7056d76416a58b422
Gerrit-Change-Number: 8456
Gerrit-PatchSet: 3
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 07 Nov 2017 01:34:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder

2017-11-03 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8034 )

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
..


Patch Set 15:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h
File be/src/util/dict-encoding.h:

http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@77
PS15, Line 77:   void ClearIndices() {
> We generally use the more concise one-line formatting for very short single
To add to what Tim said, this code didn't actually change, so it is nice to 
avoid touching code if you can. It makes it easier to track down what code 
change last impacted a piece of code.


http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@152
PS15, Line 152:   /// Destructor, release the bytes used by Memtracker.
  :   ~DictEncoder() {
  : ReleaseBytes();
  : DCHECK(dict_bytes_cnt_ == 0);
  :   }
I think we can omit this destructor, since DictEncoderBase does the same thing.


http://gerrit.cloudera.org:8080/#/c/8034/15/be/src/util/dict-encoding.h@296
PS15, Line 296:   /// Destructor, release the bytes used by Memtracker and 
close.
  :   ~DictDecoder() {
  : ReleaseBytes();
  : DCHECK(dict_bytes_cnt_ == 0);
  :}
I think we can omit this destructor, as DictDecoderBase does this.



--
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-Change-Number: 8034
Gerrit-PatchSet: 15
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 03 Nov 2017 23:14:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4835 (prep only): create io subfolder and namespace

2017-11-02 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8424 )

Change subject: IMPALA-4835 (prep only): create io subfolder and namespace
..


Patch Set 4:

Made a first pass through this and it makes sense so far. I will make a second 
pass soon.


--
To view, visit http://gerrit.cloudera.org:8080/8424
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b
Gerrit-Change-Number: 8424
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 02 Nov 2017 21:24:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder

2017-10-31 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8034 )

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
..


Patch Set 14:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/util/dict-encoding.h
File be/src/util/dict-encoding.h:

http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/util/dict-encoding.h@63
PS12, Line 63:   /// dict_encoded_size() bytes.
 :   virtual void Writ
> implemented the logic of Close() inside ClearIndices()
Close() should call ClearIndices() as one of its steps, ClearIndices() still 
needs to exist separately and cannot incorporate the logic of Close().



--
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-Change-Number: 8034
Gerrit-PatchSet: 14
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 01 Nov 2017 01:16:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resourcesa

2017-10-30 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8323 )

Change subject: IMPALA-1575: part 2: yield admission control resourcesa
..


Patch Set 4: Code-Review+1

This looks good to me. It is much clearer than my original patch.


--
To view, visit http://gerrit.cloudera.org:8080/8323
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac
Gerrit-Change-Number: 8323
Gerrit-PatchSet: 4
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 30 Oct 2017 23:10:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder

2017-10-27 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8034 )

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
..


Patch Set 12:

(5 comments)

I think this is very close.

http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-scanner.h
File be/src/exec/hdfs-parquet-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-scanner.h@352
PS12, Line 352:   MemTracker* dict_mem_tracker() { return 
scan_node_->mem_tracker(); }
See comment in HdfsParquetTableWriter. Also, see how we pass the mem_tracker to 
data_page_pool_.


http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-table-writer.h
File be/src/exec/hdfs-parquet-table-writer.h:

http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-table-writer.h@81
PS12, Line 81:   /// Memory tracker to track the memory used by encoder.
 :   MemTracker* dict_mem_tracker();
When I'm reading the code, I'm thinking that this is hiding more than it is 
illuminating. We use this in exactly one place. I would rather have the exact 
code right there with a good comment explaining which mem_tracker we are using.


http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-table-writer.cc
File be/src/exec/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-table-writer.cc@179
PS12, Line 179:   dict_encoder_base_->ClearIndices();
  :   dict_encoder_base_->Close();
When Close() does a call to ClearIndices(), this can be simplified to just a 
Close() call.


http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/exec/hdfs-parquet-table-writer.cc@322
PS12, Line 322:plain_encoded_value_size_,
  :parent_->dict_mem_tracker()));
Nit: Indentation in this case should be even with the "D" in new DictEncoder. 
It goes against every editor's typical behavior, but it's the standard we have. 
Also, feel free to keep plain_encoded_value_size_ on the first line. (It's on 
the edge of our 90 char limit.)


http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/util/dict-encoding.h
File be/src/util/dict-encoding.h:

http://gerrit.cloudera.org:8080/#/c/8034/12/be/src/util/dict-encoding.h@63
PS12, Line 63:   void Close() {
 : ReleaseBytes();
I think Close() should do ClearIndices().



--
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-Change-Number: 8034
Gerrit-PatchSet: 12
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 27 Oct 2017 23:59:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6068: Fix dataload for complextypes fileformat

2017-10-24 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8350 )

Change subject: IMPALA-6068: Fix dataload for complextypes_fileformat
..


Patch Set 6: Code-Review+2

Rebased, carry +2


--
To view, visit http://gerrit.cloudera.org:8080/8350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7152306b2907198204a6d8d282a0bad561129b82
Gerrit-Change-Number: 8350
Gerrit-PatchSet: 6
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Wed, 25 Oct 2017 00:18:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6068: Fix dataload for complextypes fileformat

2017-10-24 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8350 )

Change subject: IMPALA-6068: Fix dataload for complextypes_fileformat
..


Patch Set 5: Code-Review+1

Carry Alex's +1


--
To view, visit http://gerrit.cloudera.org:8080/8350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7152306b2907198204a6d8d282a0bad561129b82
Gerrit-Change-Number: 8350
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Tue, 24 Oct 2017 20:53:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6068: Fix dataload for complextypes fileformat

2017-10-24 Thread Joe McDonnell (Code Review)
Hello Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8350

to look at the new patch set (#4).

Change subject: IMPALA-6068: Fix dataload for complextypes_fileformat
..

IMPALA-6068: Fix dataload for complextypes_fileformat

Dataload typically follows a pattern of loading data into
a text version of a table, and then using an insert
overwrite from the text table to populate the table for
other file formats. This insert is always done in Impala
for Parquet and Kudu. Otherwise it runs in Hive.

Since Impala doesn't support writing nested data, the
population of complextypes_fileformat tries to hack
the insert to run in Hive by including it in the ALTER
part of the table definition. ALTER runs immediately
after CREATE and always runs in Hive. The problem is
that ALTER also runs before the base table
(functional.complextypes_fileformat) is populated.
The insert succeeds, but it is inserting zero rows.

This code change introduces a way to force the Parquet
load to run using Hive. This lets complextypes_fileformat
specify that the insert should happen in Hive and fixes
the ordering so that the table is populated correctly.

This is also useful for loading custom Parquet files
into Parquet tables. Hive supports the DATA LOAD LOCAL
syntax, which can read a file from the local filesystem.
This means that several locations that currently use
the hdfs commandline can be modified to use this SQL.
This change speeds up dataload by a few minutes, as it
avoids the overhead of the hdfs commandline.

Any other location that could use DATA LOAD LOCAL is
also switched over to use it. This includes the
testescape* tables which now print the appropriate
DATA LOAD commands as a result of text_delims_table.py.
Any location that already uses DATA LOAD LOCAL is also
switched to indicate that it must run in Hive. Any
location that was doing an HDFS command in the LOAD
section is moved to the LOAD_DEPENDENT_HIVE section.

Testing: Ran dataload and core tests. Also verified that
functional_parquet.complextypes_fileformat has rows.

Change-Id: I7152306b2907198204a6d8d282a0bad561129b82
---
M testdata/bin/create-load-data.sh
M testdata/bin/generate-schema-statements.py
M testdata/common/text_delims_table.py
M testdata/common/widetable.py
M testdata/datasets/functional/functional_schema_template.sql
5 files changed, 160 insertions(+), 152 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/8350/4
--
To view, visit http://gerrit.cloudera.org:8080/8350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7152306b2907198204a6d8d282a0bad561129b82
Gerrit-Change-Number: 8350
Gerrit-PatchSet: 4
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Joe McDonnell 


[Impala-ASF-CR] IMPALA-6068: Fix dataload for complextypes fileformat

2017-10-24 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8350 )

Change subject: IMPALA-6068: Fix dataload for complextypes_fileformat
..


Patch Set 3:

(1 comment)

Found a couple issues in testing. Backed out some unrelated optimizations to 
fix the issue. I will merge those separately.

http://gerrit.cloudera.org:8080/#/c/8350/3/testdata/common/text_delims_table.py
File testdata/common/text_delims_table.py:

http://gerrit.cloudera.org:8080/#/c/8350/3/testdata/common/text_delims_table.py@60
PS3, Line 60:   print ("LOAD DATA LOCAL INPATH '%s' "
> Mention that this is intended to be printed into one of the sections of our
Added comments to this file and widetable.py to clarify this.



--
To view, visit http://gerrit.cloudera.org:8080/8350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7152306b2907198204a6d8d282a0bad561129b82
Gerrit-Change-Number: 8350
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Tue, 24 Oct 2017 20:52:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources

2017-10-23 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8303 )

Change subject: IMPALA-1575: Part 1: eagerly release query exec resources
..


Patch Set 9: Code-Review+1

(1 comment)

This makes sense to me.

http://gerrit.cloudera.org:8080/#/c/8303/9/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/8303/9/be/src/runtime/query-state.cc@314
PS9, Line 314: exec_resource_refcnt_.Add(1);  // decremented in ExecFInstance()
Does it make sense for these exec_resource_refcnt_.Add(1) locations to use 
AcquireExecResourceRefcount?



--
To view, visit http://gerrit.cloudera.org:8080/8303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341
Gerrit-Change-Number: 8303
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 23 Oct 2017 22:18:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6068: Fix dataload for complextypes fileformat

2017-10-23 Thread Joe McDonnell (Code Review)
Hello Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8350

to look at the new patch set (#3).

Change subject: IMPALA-6068: Fix dataload for complextypes_fileformat
..

IMPALA-6068: Fix dataload for complextypes_fileformat

Dataload typically follows a pattern of loading data into
a text version of a table, and then using an insert
overwrite from the text table to populate the table for
other file formats. This insert is always done in Impala
for Parquet and Kudu. Otherwise it runs in Hive.

Since Impala doesn't support writing nested data, the
population of complextypes_fileformat tries to hack
the insert to run in Hive by including it in the ALTER
part of the table definition. ALTER runs immediately
after CREATE and always runs in Hive. The problem is
that ALTER also runs before the base table
(functional.complextypes_fileformat) is populated.
The insert succeeds, but it is inserting zero rows.

This code change introduces a way to force the Parquet
load to run using Hive. This lets complextypes_fileformat
specify that the insert should happen in Hive and fixes
the ordering so that the table is populated correctly.

This is also useful for loading custom Parquet files
into Parquet tables. Hive supports the DATA LOAD LOCAL
syntax, which can read a file from the local filesystem.
This means that several locations that currently use
the hdfs commandline can be modified to use this SQL.
This change speeds up dataload by a few minutes, as it
avoids the overhead of the hdfs commandline.

Any other location that could use DATA LOAD LOCAL is
also switched over to use it. This includes the
testescape* tables which now print the appropriate
DATA LOAD commands as a result of text_delims_table.py.
Any location that already uses DATA LOAD LOCAL is also
switched to indicate that it must run in Hive. Any
location that was doing an HDFS command in the LOAD
section is moved to the LOAD_DEPENDENT_HIVE section.

Testing: Ran dataload and core tests. Also verified that
functional_parquet.complextypes_fileformat has rows.

Change-Id: I7152306b2907198204a6d8d282a0bad561129b82
---
M testdata/bin/create-load-data.sh
M testdata/bin/generate-schema-statements.py
M testdata/common/text_delims_table.py
M testdata/datasets/functional/functional_schema_template.sql
4 files changed, 154 insertions(+), 156 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/8350/3
--
To view, visit http://gerrit.cloudera.org:8080/8350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7152306b2907198204a6d8d282a0bad561129b82
Gerrit-Change-Number: 8350
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Joe McDonnell 


[Impala-ASF-CR] IMPALA-6068: Fix dataload for complextypes fileformat

2017-10-23 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8350 )

Change subject: IMPALA-6068: Fix dataload for complextypes_fileformat
..


Patch Set 2:

(1 comment)

The next upload does two things:
1. It adds asserts to make sure CREATE and CREATE_HIVE can't both be specified 
and DEPENDENT_LOAD and DEPENDENT_LOAD_HIVE can't both be specified.
2. It changes the testescape* tables to use LOAD DATA statements (by printing 
them from text_delims_table.py).
3. Fixes a typo in one of the SQLs

http://gerrit.cloudera.org:8080/#/c/8350/2/testdata/bin/generate-schema-statements.py
File testdata/bin/generate-schema-statements.py:

http://gerrit.cloudera.org:8080/#/c/8350/2/testdata/bin/generate-schema-statements.py@528
PS2, Line 528:   insert = eval_section(section['DEPENDENT_LOAD'])
> Are these two sections intended to be mutually exclusive? If so, then we sh
Added an assert that we don't specify both.



--
To view, visit http://gerrit.cloudera.org:8080/8350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7152306b2907198204a6d8d282a0bad561129b82
Gerrit-Change-Number: 8350
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Mon, 23 Oct 2017 22:08:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6068: Fix dataload for complextypes fileformat

2017-10-23 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/8350 )

Change subject: IMPALA-6068: Fix dataload for complextypes_fileformat
..

IMPALA-6068: Fix dataload for complextypes_fileformat

Dataload typically follows a pattern of loading data into
a text version of a table, and then using an insert
overwrite from the text table to populate the table for
other file formats. This insert is always done in Impala
for Parquet and Kudu. Otherwise it runs in Hive.

Since Impala doesn't support writing nested data, the
population of complextypes_fileformat tries to hack
the insert to run in Hive by including it in the ALTER
part of the table definition. ALTER runs immediately
after CREATE and always runs in Hive. The problem is
that ALTER also runs before the base table
(functional.complextypes_fileformat) is populated.
The insert succeeds, but it is inserting zero rows.

This code change introduces a way to force the Parquet
load to run using Hive. This lets complextypes_fileformat
specify that the insert should happen in Hive and fixes
the ordering so that the table is populated correctly.

This is also useful for loading custom Parquet files
into Parquet tables. Hive supports the DATA LOAD LOCAL
syntax, which can read a file from the local filesystem.
This means that several locations that currently use
the hdfs commandline can be modified to use this SQL.
This change speeds up dataload by a few minutes, as it
avoids the overhead of the hdfs commandline.

Any other location that could use DATA LOAD LOCAL is
also switched over to use it. Any location that already
uses DATA LOAD LOCAL is also switched to indicate that
it must run in Hive.

Testing: Ran dataload and verified that
functional_parquet.complextypes_fileformat has rows.

Change-Id: I7152306b2907198204a6d8d282a0bad561129b82
---
M testdata/bin/create-load-data.sh
M testdata/bin/generate-schema-statements.py
M testdata/datasets/functional/functional_schema_template.sql
3 files changed, 141 insertions(+), 138 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/8350/2
--
To view, visit http://gerrit.cloudera.org:8080/8350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7152306b2907198204a6d8d282a0bad561129b82
Gerrit-Change-Number: 8350
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-6068: Fix dataload for complextypes fileformat

2017-10-20 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8350


Change subject: IMPALA-6068: Fix dataload for complextypes_fileformat
..

IMPALA-6068: Fix dataload for complextypes_fileformat

Dataload typically follows a pattern of loading data into
a text version of a table, and then using an insert
overwrite from the text table to populate the table for
other file formats. This insert is always done in Impala
for Parquet and Kudu. Otherwise it runs in Hive.

Since Impala doesn't support writing nested data, the
population of complextypes_fileformat tries to hack
the insert to run in Hive by including it in the ALTER
part of the table definition. ALTER runs immediately
after CREATE and always runs in Hive. The problem is
that ALTER also runs before the base table
(functional.complextypes_fileformat) is populated.
The insert succeeds, but it is inserting zero rows.

This code change introduces a way to force the Parquet
load to run using Hive. This lets complextypes_fileformat
specify that the insert should happen in Hive and fixes
the ordering so that the table is populated correctly.

This is also useful for loading custom Parquet files
into Parquet tables. Hive supports the DATA LOAD LOCAL
syntax, which can read a file from the local filesystem.
This means that several locations that currently use
the hdfs commandline can be modified to use this SQL.
This change speeds up dataload by a few minutes, as it
avoids the overhead of the hdfs commandline.

Any other location that could use DATA LOAD LOCAL is
also switched over to use it. Any location that already
uses DATA LOAD LOCAL is also switched to indicate that
it must run in Hive.

Testing: Ran dataload and verified that
functional_parquet.complextypes_fileformat has rows.

Change-Id: I7152306b2907198204a6d8d282a0bad561129b82
---
M testdata/bin/create-load-data.sh
M testdata/bin/generate-schema-statements.py
M testdata/datasets/functional/functional_schema_template.sql
3 files changed, 133 insertions(+), 130 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/8350/1
--
To view, visit http://gerrit.cloudera.org:8080/8350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7152306b2907198204a6d8d282a0bad561129b82
Gerrit-Change-Number: 8350
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-1575: Part 1: eagerly release query exec resources

2017-10-20 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8303 )

Change subject: IMPALA-1575: Part 1: eagerly release query exec resources
..


Patch Set 9:

(1 comment)

Making my way through this.

http://gerrit.cloudera.org:8080/#/c/8303/9/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/8303/9/be/src/runtime/coordinator.cc@842
PS9, Line 842:   if (!needs_finalization_ && !status.ok()) return status;
Just checking my own understanding of this. If this case triggers, the resource 
refcnt should be released in CancelInternal(). Is that right?



--
To view, visit http://gerrit.cloudera.org:8080/8303
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I41ff374b0403f10a145f7fee9b3145953ee32341
Gerrit-Change-Number: 8303
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Fri, 20 Oct 2017 17:33:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6070: Parallel data load.

2017-10-18 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8320 )

Change subject: IMPALA-6070: Parallel data load.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8320/1/testdata/bin/create-load-data.sh
File testdata/bin/create-load-data.sh:

http://gerrit.cloudera.org:8080/#/c/8320/1/testdata/bin/create-load-data.sh@505
PS1, Line 505:   # Tests depend on the kudu data being clean, so load the data 
from scratch.
 :   run-step "Loading Kudu functional" load-kudu.log \
 : load-data "functional-query" "core" "kudu/none/none" 
force
 :   run-step "Loading Kudu TPCH" load-kudu-tpch.log \
 : load-data "tpch" "core" "kudu/none/none" force
It should be possible to do the same thing for these. That will only save about 
4 minutes, but this runs even when loading from a snapshot.



--
To view, visit http://gerrit.cloudera.org:8080/8320
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I836c4e1586f229621c102c4f4ba22ce7224ab9ac
Gerrit-Change-Number: 8320
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Comment-Date: Wed, 18 Oct 2017 23:16:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs

2017-10-17 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8294 )

Change subject: IMPALA-6067: Enable s3 access via IAM roles for EC2 VMs
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8294/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/8294/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@3000
PS1, Line 3000:   AnalysisError(String.format("load data inpath '%s' %s 
into table " +
  :   "tpch.lineitem", 
"s3a://bucket/test-warehouse/test.out", overwrite),
  :   "INPATH location 
's3a://bucket/test-warehouse/test.out' must point to an " +
  :   "HDFS, S3A or ADL filesystem.");
Impala cannot load data from s3n. I think this test is intended to verify our 
error message when given an s3n path, so I don't think an s3a path will work 
here. The source of the error is LoadDataStmt::analyzePaths().



--
To view, visit http://gerrit.cloudera.org:8080/8294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14cd9d4453a91baad3c379aa7e4944993fca95ae
Gerrit-Change-Number: 8294
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 17 Oct 2017 23:30:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder

2017-10-12 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8034 )

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
..


Patch Set 8:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-scanner.cc@274
PS8, Line 274:   // Before closing the Column readers, the accounted bytes in 
mem_tracker for
 :   // dict_decoder_ is released.
 :   if (mem_tracker_ != nullptr) {
 : for (ParquetColumnReader* col_reader : column_readers_) 
col_reader->Cleanup();
 :   }
I think this code won't be necessary if col_reader->Close() does the dictionary 
close.


http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-scanner.cc@348
PS8, Line 348: CloseAndUnregisterFromParent
Same question here as elsewhere: Do we really need 
CloseAndUnregisterFromParent() or will Close() do?


http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-table-writer.cc
File be/src/exec/hdfs-parquet-table-writer.cc:

http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-table-writer.cc@178
PS8, Line 178: dict_encoder_base_->ClearIndices();
This should call the new base Close() function (which should do ClearIndices()).


http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/hdfs-parquet-table-writer.cc@1057
PS8, Line 1057: CloseAndUnregisterFromParent();
I think we want to limit use of CloseAndUnregisterFromParent() to cases that 
really need it. I think the ordinary Close() should work here. Check if that 
works.


http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/parquet-column-readers.h
File be/src/exec/parquet-column-readers.h:

http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/parquet-column-readers.h@234
PS8, Line 234:   /// Delete the bytes used for memory tracking.
 :   virtual void Cleanup() {}
I think this can be folded into Close() and removed.


http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/exec/parquet-column-readers.cc@269
PS8, Line 269:   virtual void Cleanup() {
 : dict_decoder_.Close();
 :   }
I think this can be folded into Close() without a separate Cleanup() function.


http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/util/dict-encoding.h
File be/src/util/dict-encoding.h:

http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/util/dict-encoding.h@107
PS8, Line 107:   DictEncoder(MemPool* pool, int encoded_value_size, MemTracker* 
mem_tracker) :
 :   DictEncoderBase(pool), buckets_(HASH_TABLE_SIZE, 
Node::INVALID_INDEX),
 :   encoded_value_size_(encoded_value_size),
 :   dict_mem_tracker_(mem_tracker),
 :   dict_bytes_cnt_(0) { }
As I understand it, the DictEncoderBase/DictEncoder split is that 
DictEncoderBase contains everything that does not rely on T and DictEncoder 
contains the type-specific stuff. The same split applies for 
DictDecoderBase/DictDecoder. Since the memory tracking is not type specific, I 
think it makes sense to put it in the base classes. This should simplify the 
code in HdfsParquetWriter, because then you can call Close() on the 
DictEncoderBase rather than having to go to the typed version. Close() on 
DictEncoderBase should also do ClearIndices().


http://gerrit.cloudera.org:8080/#/c/8034/8/be/src/util/dict-encoding.h@251
PS8, Line 251: ReleaseBytes();
Is there any codepath that should legitimately use this? If not, DCHECK that 
dict_bytes_cnt_ is zero.



--
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-Change-Number: 8034
Gerrit-PatchSet: 8
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Thu, 12 Oct 2017 17:23:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4623: [DOCS] Document file handle caching

2017-10-05 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8200 )

Change subject: IMPALA-4623: [DOCS] Document file handle caching
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/8200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I261c29eff80dc376528bba29ffb7d8e0f895e25f
Gerrit-Change-Number: 8200
Gerrit-PatchSet: 3
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Comment-Date: Thu, 05 Oct 2017 23:38:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4623: [DOCS] Document file handle caching

2017-10-05 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8200 )

Change subject: IMPALA-4623: [DOCS] Document file handle caching
..


Patch Set 3: Code-Review+1

This looks right to me. Are we expecting any other reviewers?


--
To view, visit http://gerrit.cloudera.org:8080/8200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I261c29eff80dc376528bba29ffb7d8e0f895e25f
Gerrit-Change-Number: 8200
Gerrit-PatchSet: 3
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Comment-Date: Thu, 05 Oct 2017 23:22:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4623: [DOCS] Document file handle caching

2017-10-04 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8200 )

Change subject: IMPALA-4623: [DOCS] Document file handle caching
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8200/1/docs/topics/impala_known_issues.xml
File docs/topics/impala_known_issues.xml:

http://gerrit.cloudera.org:8080/#/c/8200/1/docs/topics/impala_known_issues.xml@338
PS1, Line 338: continuously appended by an HDFS mechanism
This also applies if an HDFS file is overwritten in place.


http://gerrit.cloudera.org:8080/#/c/8200/1/docs/topics/impala_scalability.xml
File docs/topics/impala_scalability.xml:

http://gerrit.cloudera.org:8080/#/c/8200/1/docs/topics/impala_scalability.xml@967
PS1, Line 967: although the encryption layer
 : adds overhead that might lessen the benefit of the 
caching.
I'm not familiar with this overhead. What is this referring to?


http://gerrit.cloudera.org:8080/#/c/8200/1/docs/topics/impala_scalability.xml@973
PS1, Line 973: 20 thousand
Just curious: How do you decide to use "20 thousand" vs "20,000"?


http://gerrit.cloudera.org:8080/#/c/8200/1/docs/topics/impala_scalability.xml@991
PS1, Line 991: evict any stale file handles from the cache
The file handles won't actually be evicted directly. The new metadata will mean 
that new statements will no longer use that file handle and eventually it will 
get aged out. I'm not sure if this distinction is important for documentation, 
but I think the important thing is that the memory may not be freed 
immediately. (This is something we are likely to change in a future release.)


http://gerrit.cloudera.org:8080/#/c/8200/1/docs/topics/impala_scalability.xml@995
PS1, Line 995: To evaluate the effectiveness of file handle caching for a 
particular workload, issue the
 : PROFILE statement in 
impala-shell or examine query
 : profiles in the Impala web UI. Look for the ratio of 
CachedFileHandlesHitCount
 : (ideally, should be high) to 
CachedFileHandlesMissCount (ideally, should be low).
 : Before starting any evaluation, run some representative 
queries to warm up the cache,
 : because the first time each data file is accessed is 
always recorded as a cache miss.
I'm not sure this belongs here, but information about the cache across the 
whole impalad is available via the metrics page under impala-server:
impala-server.io.mgr.cached-file-handles-miss-count
impala-server.io.mgr.cached-file-handles-hit-count

The total number of file handles in the cache is:
impala-server.io.mgr.num-cached-file-handles



--
To view, visit http://gerrit.cloudera.org:8080/8200
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I261c29eff80dc376528bba29ffb7d8e0f895e25f
Gerrit-Change-Number: 8200
Gerrit-PatchSet: 1
Gerrit-Owner: John Russell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Comment-Date: Thu, 05 Oct 2017 02:37:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet

2017-10-02 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8056 )

Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed 
parquet
..


Patch Set 5:

(2 comments)

I'm close to a +1 on this. Only minor thoughts left for me. I just invited Tim 
to the review.

http://gerrit.cloudera.org:8080/#/c/8056/5/tests/query_test/test_scanners_fuzz.py
File tests/query_test/test_scanners_fuzz.py:

http://gerrit.cloudera.org:8080/#/c/8056/5/tests/query_test/test_scanners_fuzz.py@103
PS5, Line 103: with no compression.
Nit: I would emphasize that this has a new table. Something like: "into a new 
table with no compression"


http://gerrit.cloudera.org:8080/#/c/8056/5/tests/query_test/test_scanners_fuzz.py@110
PS5, Line 110: for orig_tbl_name in tbl_list:
 :   src_table_name = "parquet_uncomp_src_" + orig_tbl_name
 :   dst_table_name = "parquet_uncomp_dst_" + orig_tbl_name
Nit:
I think I would prefer to have "dst_table_name" be "fuzz_table_name". Also, in 
other code, I think it would be good to be explicit about this being the 
fuzz_db and fuzz_table rather than dst_db and dst_table.



--
To view, visit http://gerrit.cloudera.org:8080/8056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc
Gerrit-Change-Number: 8056
Gerrit-PatchSet: 5
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 02 Oct 2017 22:05:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder

2017-10-02 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8034 )

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h
File be/src/util/dict-encoding.h:

http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h@412
PS3, Line 412: ConsumeBytes(sizeof(value));
> using num_values * size of type might not work when dealing with variable s
For strings, the StringValue is a pointer and a length. In the case of the 
dictionary, the pointer is pointing into the dictionary (allocated from 
dictionary_pool_ in InitDictionary()). Since the memory for the dictionary page 
is already tracked, we only need to track these StringValue's, which are 
fixed-size. Either way, we should aggregate the Consume() calls here.



--
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-Change-Number: 8034
Gerrit-PatchSet: 3
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Mon, 02 Oct 2017 21:48:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder

2017-09-28 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8034 )

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/exec/data-sink.h
File be/src/exec/data-sink.h:

http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/exec/data-sink.h@126
PS3, Line 126:   /// This memtracker tracks all the allocations done by parquet 
dictonary encoder.
 :   boost::scoped_ptr dict_mem_tracker_;
Can we move this down to HdfsParquetTableWriter?


http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/exec/exec-node.h
File be/src/exec/exec-node.h:

http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/exec/exec-node.h@325
PS3, Line 325:   /// Tracks all the memory allocation by parquet dictonary 
decoders
 :   boost::scoped_ptr dict_mem_tracker_;
Can this move down to HdfsParquetScanner? Other ExecNodes don't need this.


http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h
File be/src/util/dict-encoding.h:

http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h@113
PS3, Line 113:
Remove trailing spaces. This also applies to lines that don't have any code 
(there are a couple others in this file). These are easily visible in the 
Gerrit UI.

If using emacs, the following will highlight trailing whitespace:
(setq-default show-trailing-whitespace t)


http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h@164
PS3, Line 164: mtrackp
For class fields, the convention is to end the field name with an underscore. 
Also, look at other code that uses MemTrackers and see what variable name that 
code uses. It is good to harmonize these things across the codebase.


http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h@412
PS3, Line 412: ConsumeBytes(sizeof(value));
The parquet DictionaryPageHeader contains a num_values field. Look where we 
call CreateDictionaryDecoder and you'll see that we reference to it on the next 
line. I think we should consider doing a single ConsumeBytes using num_values * 
size of type.



--
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-Change-Number: 8034
Gerrit-PatchSet: 3
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Thu, 28 Sep 2017 16:55:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5975: Work around broken beeline clients

2017-09-26 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8132 )

Change subject: IMPALA-5975: Work around broken beeline clients
..


Patch Set 2: Code-Review+2

Carrying +2


--
To view, visit http://gerrit.cloudera.org:8080/8132
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id8b9f3dde4445513f1f389785a002c6cc6b3dada
Gerrit-Change-Number: 8132
Gerrit-PatchSet: 2
Gerrit-Owner: Zach Amsden 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Zach Amsden 
Gerrit-Comment-Date: Tue, 26 Sep 2017 23:25:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet

2017-09-25 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8056 )

Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed 
parquet
..


Patch Set 4:

(4 comments)

I like the overall approach. I have a few small naming/style issues, but I 
think this is getting closer.

http://gerrit.cloudera.org:8080/#/c/8056/4/tests/query_test/test_scanners_fuzz.py
File tests/query_test/test_scanners_fuzz.py:

http://gerrit.cloudera.org:8080/#/c/8056/4/tests/query_test/test_scanners_fuzz.py@101
PS4, Line 101: """ Clone an existing parquet table with codec as none in the
 : unique database. This cloned table is passed to 
run_fuzz_test
 : which clones the table and corrupts the table. The test 
later
 : checks that there is no crash while performing SQL 
queries on
 : a corrupt table.
 : """
I think this comment should focus on why this test is different from the 
others. For example, it should explain that the parquet tables in the default 
schema are always compressed. So, in order to test uncompressed parquet, we 
need to create a new source table. I think you can skip the last two sentences.


http://gerrit.cloudera.org:8080/#/c/8056/4/tests/query_test/test_scanners_fuzz.py@111
PS4, Line 111: db_name = unique_database
I would prefer to emphasize that the source and destination are the 
unique_database. To make that clearer, I think I would get rid of this variable 
and just use unique_database directly in each location.


http://gerrit.cloudera.org:8080/#/c/8056/4/tests/query_test/test_scanners_fuzz.py@117
PS4, Line 117: functional_parquet.alltypes
Can we extend this to do fuzzing on decimal_tbl as well? I was thinking this 
could be a loop that runs fuzzing over a list of tables (that happens to have 
two entries).


http://gerrit.cloudera.org:8080/#/c/8056/4/tests/query_test/test_scanners_fuzz.py@118
PS4, Line 118: .format(fq_tbl_name))
This indentation is a bit awkward. I don't think .format should be on its own 
line. One way to get around this is to use only 4 space indentation for the 
second line (" select"...) and then put the .format on that line.



--
To view, visit http://gerrit.cloudera.org:8080/8056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc
Gerrit-Change-Number: 8056
Gerrit-PatchSet: 4
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Mon, 25 Sep 2017 23:25:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5174: Suppress kudu flags that aren't relevant to Impala

2017-09-19 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5174: Suppress kudu flags that aren't relevant to Impala
..


Patch Set 1: Code-Review+1

I think this makes sense. At some point, should we add a test that tracks when 
parameters are added or removed from the UI? It seems like something that could 
happen inadvertently. It would add some overhead to adding/removing params.

-- 
To view, visit http://gerrit.cloudera.org:8080/8074
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I499c903cde92595c4a02803ecbf98ac1d41517b4
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5941: Fix Metastore schema creation in create-test-configuration.sh

2017-09-18 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5941: Fix Metastore schema creation in 
create-test-configuration.sh
..


Patch Set 3:

Ran full dataload + tests and everything worked. I think this is ready to merge.

-- 
To view, visit http://gerrit.cloudera.org:8080/8081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic312df4597c7d211d4ecd551d572f751aea0cd24
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet

2017-09-15 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed 
parquet
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8056/2/tests/query_test/test_scanners_fuzz.py
File tests/query_test/test_scanners_fuzz.py:

PS2, Line 97: """
> I'll use the functional-parquet to create the cloned table which will be pa
One thing that I failed to mention clearly enough is that the uncompressed 
parquet table should not be created in the default schemas. Instead, it should 
be created in unique_database. This is why the function signature of 
run_fuzz_test will change. The normal tests will pass in table_format and a 
table from the default schema. Your new test will pass in unique_database and 
the table you created.


-- 
To view, visit http://gerrit.cloudera.org:8080/8056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Pranay Singh
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet

2017-09-15 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed 
parquet
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8056/2/tests/query_test/test_scanners_fuzz.py
File tests/query_test/test_scanners_fuzz.py:

PS2, Line 97: fq_tbl_name = "functional_parquet" + "." + tbl_name
> Do not create a table in the default schemas. Instead, with the changes I d
Another option is to keep the clone code in run_fuzz_test, but change the 
arguments so that it specifies a source database, source table, destination 
database, and destination table. The existing code would simply pass in the 
appropriate existing table. The uncompressed parquet code would create an 
uncompressed parquet table and pass that in.


-- 
To view, visit http://gerrit.cloudera.org:8080/8056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Pranay Singh
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5941: Fix Metastore schema creation in create-test-configuration.sh

2017-09-15 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5941: Fix Metastore schema creation in 
create-test-configuration.sh
..


Patch Set 1:

(1 comment)

We have had it this way for a very long time, so I'm kicking off some Jenkins 
jobs to test data load.

http://gerrit.cloudera.org:8080/#/c/8081/1/bin/create-test-configuration.sh
File bin/create-test-configuration.sh:

Line 99:-f 
${HIVE_HOME}/scripts/metastore/upgrade/postgres/hive-schema-1.1.0.postgres.sql
> Very minor comment: it might be simpler to use a relative path here and avo
Fixed, that is definitely cleaner. I tested it on my box and it seems to work.


-- 
To view, visit http://gerrit.cloudera.org:8080/8081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic312df4597c7d211d4ecd551d572f751aea0cd24
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5941: Fix Metastore schema creation in create-test-configuration.sh

2017-09-15 Thread Joe McDonnell (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/8081

to look at the new patch set (#2).

Change subject: IMPALA-5941: Fix Metastore schema creation in 
create-test-configuration.sh
..

IMPALA-5941: Fix Metastore schema creation in create-test-configuration.sh

The Hive Metastore schema script includes other SQL
scripts using \i, which expects absolute paths. Since
we currently invoke it from outside the schema script
directory, it is unable to find those included scripts.

The fix is to switch to the Hive Metastore script
directory when invoking the schema script.

Change-Id: Ic312df4597c7d211d4ecd551d572f751aea0cd24
---
M bin/create-test-configuration.sh
1 file changed, 5 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/8081/2
-- 
To view, visit http://gerrit.cloudera.org:8080/8081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic312df4597c7d211d4ecd551d572f751aea0cd24
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet

2017-09-15 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed 
parquet
..


Patch Set 2:

(2 comments)

Take a look at these suggestions and let me know if they make sense.

http://gerrit.cloudera.org:8080/#/c/8056/2/tests/query_test/test_scanners_fuzz.py
File tests/query_test/test_scanners_fuzz.py:

PS2, Line 97: fq_tbl_name = "functional_parquet" + "." + tbl_name
Do not create a table in the default schemas. Instead, with the changes I 
described elsewhere, create a separate clone function that will create an 
uncompressed parquet table directly in unique_database and then pass that into 
run_fuzz_test.

The existing tests will run a simple table clone function that is equivalent to 
the current code.


PS2, Line 129: self.execute_query("create table %s.%s like %s" % 
(unique_database, table, table))
 : fuzz_table_location = 
get_fs_path("/test-warehouse/{0}.db/{1}".format(
 : unique_database, table))
Pull this logic out into its own function e.g. clone_table. run_fuzz_test 
should take in a table that has already been created in unique_database. It 
should not do the clone itself. This allows you to use a different clone 
function to create a parquet table without compression.


-- 
To view, visit http://gerrit.cloudera.org:8080/8056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Pranay Singh
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder

2017-09-15 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8034/2/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

PS2, Line 213: dict_decoder_(parent_->dictionary_pool_.get()),
We should think about having a separate MemPool for parts of the dictionary 
that will not be referenced by the RowBatch and can be freed at the end of the 
row group.

Since some datatypes are copied by value rather than pointing into the 
dictionary, this might be used for those as well.


http://gerrit.cloudera.org:8080/#/c/8034/2/be/src/util/dict-encoding.h
File be/src/util/dict-encoding.h:

PS2, Line 229: memcpy(val_ptr, dict_[index], sizeof(T));
The memcpy is doing a dereference here and is not different from *val_ptr = 
*dict_[index];


PS2, Line 238:   std::vector dict_;
Using a T* means that GetValue() must do a dereference to obtain the value. 
This is a very performance sensitive codepath, so we need to avoid this extra 
dereference. This should maintain the value directly in the vector.


-- 
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Pranay Singh
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5941: Fix Metastore schema creation in create-test-configuration.sh

2017-09-15 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/8081

Change subject: IMPALA-5941: Fix Metastore schema creation in 
create-test-configuration.sh
..

IMPALA-5941: Fix Metastore schema creation in create-test-configuration.sh

The Hive Metastore schema script includes other SQL
scripts using \i, which expects absolute paths. Since
we currently invoke it from outside the schema script
directory, it is unable to find those included scripts.

The fix is to switch to the Hive Metastore script
directory when invoking the schema script.

Change-Id: Ic312df4597c7d211d4ecd551d572f751aea0cd24
---
M bin/create-test-configuration.sh
1 file changed, 4 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/8081/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic312df4597c7d211d4ecd551d572f751aea0cd24
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-5525 Extend TestScannersFuzzing to test uncompressed parquet

2017-09-13 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5525 Extend TestScannersFuzzing to test uncompressed 
parquet
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8056/1/tests/query_test/test_scanners_fuzz.py
File tests/query_test/test_scanners_fuzz.py:

Line 68
Avoid unrelated whitespace diffs.

One way of getting a graphical diff that can help with this is to use the tool 
meld. For example:

git difftool -y -t meld  

Where  could be asf-gerrit/master or origin/master or whatnot.


PS1, Line 96: fq_tbl_name = "functional_parquet" + "." + tbl_name
I'm wary of creating tables in our default schemas. This won't get cleaned up, 
and it is subtle behavior. If we can create the new table in the 
unique_database that would be nice


PS1, Line 98: create = ("create table {0} stored as parquet as select * 
from functional.alltypes"
:   .format(fq_tbl_name))
I think we need to verify that the right options are being set when we create 
this table. As I understand it, you need to specify the query option 
compression_codec = none to create a parquet file without compression. 

https://www.cloudera.com/documentation/enterprise/5-8-x/topics/impala_compression_codec.html


-- 
To view, visit http://gerrit.cloudera.org:8080/8056
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I760de7203a51cf82b16016fa8043cadc7c8325bc
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Joe McDonnell 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-1649 Pass precision to Print TCounterType::DOUBLE VALUE

2017-09-13 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-1649 Pass precision to Print TCounterType::DOUBLE_VALUE
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7725/2/be/src/util/pretty-printer.h
File be/src/util/pretty-printer.h:

PS2, Line 142:   case TUnit::DOUBLE_VALUE: {
 : double output = static_cast(value);
 : ss << std::setprecision(precision) << output << " ";
I think TUnit::DOUBLE_VALUE is so similar to TUnit::UNIT that we should be 
looking to remove uses of DOUBLE_VALUE and allow UNIT to specify a precision.


-- 
To view, visit http://gerrit.cloudera.org:8080/7725
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icc63d8a59bdf175341097df143798fb1a957d93f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Joe McDonnell 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder

2017-09-11 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8034/1/be/src/exec/exec-node.h
File be/src/exec/exec-node.h:

PS1, Line 215:  MemTracker* decoder_mem_tracker() { return 
decoder_mem_tracker_.get(); }
The dictionary is only used for Parquet, so I think the MemPool should not be 
in the ExecNode, as that is used for a large number of other things. Look into 
moving this down to HdfsParquetScanner (or reuse dictionary_pool_).


http://gerrit.cloudera.org:8080/#/c/8034/1/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

PS1, Line 213: dict_decoder_(new 
MemPool(parent->scan_node_->decoder_mem_tracker())),
It is important not to create objects on a per-column basis unless truly 
necessary. I think it makes more sense to have a single MemPool up at the 
HdfsParquetScanner level. Look at how dictionary_pool_ works. I think it might 
be better to reuse that pool.


http://gerrit.cloudera.org:8080/#/c/8034/1/be/src/util/dict-encoding.h
File be/src/util/dict-encoding.h:

PS1, Line 234: *val_ptr = *dict_[index];
dict_ needs to return T's directly rather than T*'s. This is a performance 
critical path, and an extra dereference is too expensive.


-- 
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Joe McDonnell 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-09-06 Thread Joe McDonnell (Code Review)
Hello Sailesh Mukil, Dan Hecht, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7730

to look at the new patch set (#12).

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..

IMPALA-5750: Catch exceptions from boost thread creation

The boost thread constructor will throw boost::thread_resource_error
if it is unable to spawn a thread on the system
(e.g. due to a ulimit). This uncaught exception crashes
Impala. Systems with a large number of nodes and threads
are hitting this limit.

This change catches the exception from the thread
constructor and converts it to a Status. This requires
several changes:
1. util/thread.h's Thread constructor is now private
and all Threads are constructed via a new Create()
static factory method.
2. util/thread-pool.h's ThreadPool requires that Init()
be called after the ThreadPool is constructed.
3. To propagate the Status, Threads cannot be created in
constructors, so this is moved to initialization methods
that can return Status.
4. Threads now use unique_ptr's for management in all
cases. Threads cannot be used as stack-allocated local
variables or direct declarations in classes.

Query execution code paths will now handle the error:
1. If the scan node fails to spawn any scanner thread,
it will abort the query.
2. Failing to spawn a fragment instance from the query
state in StartFInstances() will correctly report the error
to the coordinator and tear down the query.

Testing:
This introduces the parameter thread_creation_fault_injection,
which will cause Thread::Create() calls in eligible
locations to fail randomly roughly 1% of the time.
Quite a few locations of Thread::Create() and
ThreadPool::Init() are necessary for startup and cannot
be eligible. However, all the locations used for query
execution are marked as eligible and governed by this
parameter. The code was tested by setting this parameter
to true and running queries to verify that queries either
run to completion with the correct result or fail with
appropriate status.

Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
---
M be/src/benchmarks/thread-create-benchmark.cc
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/catalog/catalogd-main.cc
M be/src/common/global-flags.cc
M be/src/common/init.cc
M be/src/common/init.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/rpc/TAcceptQueueServer.cpp
M be/src/rpc/auth-provider.h
M be/src/rpc/authentication.cc
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-server.cc
M be/src/rpc/thrift-server.h
M be/src/rpc/thrift-thread.cc
M be/src/rpc/thrift-thread.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/disk-io-mgr-handle-cache.h
M be/src/runtime/disk-io-mgr-handle-cache.inline.h
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/parallel-executor.cc
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/service/child-query.cc
M be/src/service/child-query.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/impalad-main.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/statestore/statestored-main.cc
M be/src/testutil/in-process-servers.cc
M be/src/testutil/in-process-servers.h
M be/src/util/thread-pool-test.cc
M be/src/util/thread-pool.h
M be/src/util/thread.cc
M be/src/util/thread.h
M common/thrift/generate_error_codes.py
55 files changed, 455 insertions(+), 215 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/7730/12
-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-09-06 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..


Patch Set 11:

(1 comment)

Rebased all the way and incorporated changes from IMPALA-5892.

http://gerrit.cloudera.org:8080/#/c/7730/11/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

PS11, Line 361: This
> The "this" was confusing the first time I read it (I thought it was referri
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5888: free other local allocations in Parquet

2017-09-06 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5888: free other local allocations in Parquet
..


Patch Set 2: Code-Review+1

This makes sense to me.

-- 
To view, visit http://gerrit.cloudera.org:8080/7933
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7792552510b54aa95044e44218e3351a36d6f9a8
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5892: Allow reporting status independent of fragment instance

2017-09-06 Thread Joe McDonnell (Code Review)
Hello Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7943

to look at the new patch set (#3).

Change subject: IMPALA-5892: Allow reporting status independent of fragment 
instance
..

IMPALA-5892: Allow reporting status independent of fragment instance

Queries can hit an error that is not specific to a
particular fragment instance. For example, QueryState::StartFInstances()
calls DescriptorTbl::Create() before any fragment instances
start. This location has no reason to report status via a
particular fragment, and there is currently no way to report
status otherwise. This leads to a query hang, because the
status is never propagated back to the coordinator.

This adds the ability to report status that is not associated
with a particular fragment instance. By reporting status,
the coordinator will now correctly abort the query in the
case of the QueryState::StartFInstances() scenario described.

Change-Id: I4cd98022f1d62a999c7c80ff5474fa8d069eb12c
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/query-state.cc
M common/thrift/ImpalaInternalService.thrift
6 files changed, 73 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/7943/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7943
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4cd98022f1d62a999c7c80ff5474fa8d069eb12c
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-5892: Allow reporting status independent of fragment instance

2017-09-06 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5892: Allow reporting status independent of fragment 
instance
..


Patch Set 2:

(3 comments)

Also rebased all the way.

http://gerrit.cloudera.org:8080/#/c/7943/2/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

Line 221:   DCHECK(requests_fragment_detail || omits_fragment_args);
> how about just:
Done


http://gerrit.cloudera.org:8080/#/c/7943/2/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 873:   // in that case).  Coordinator fragment failed in this case so we 
log the query_id.
> not your change (and doesn't have to be addressed now) but this comment doe
I think the first sentence makes some sense. If there is already an error in 
query_status_, then UpdateStatus() will return that rather than the status 
passed in. If a fragment hit an error and called CancelInternal(), that would 
call ReleaseResources(), which closes the coord_sink_. That should cause 
GetNext() to return an error, but not the original error.

The second sentence doesn't match anything that I can see.


http://gerrit.cloudera.org:8080/#/c/7943/2/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

PS2, Line 613: =cdfklnoru
> some random characters snuck in
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7943
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4cd98022f1d62a999c7c80ff5474fa8d069eb12c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-09-05 Thread Joe McDonnell (Code Review)
Hello Sailesh Mukil, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7730

to look at the new patch set (#11).

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..

IMPALA-5750: Catch exceptions from boost thread creation

The boost thread constructor will throw boost::thread_resource_error
if it is unable to spawn a thread on the system
(e.g. due to a ulimit). This uncaught exception crashes
Impala. Systems with a large number of nodes and threads
are hitting this limit.

This change catches the exception from the thread
constructor and converts it to a Status. This requires
several changes:
1. util/thread.h's Thread constructor is now private
and all Threads are constructed via a new Create()
static factory method.
2. util/thread-pool.h's ThreadPool requires that Init()
be called after the ThreadPool is constructed.
3. To propagate the Status, Threads cannot be created in
constructors, so this is moved to initialization methods
that can return Status.
4. Threads now use unique_ptr's for management in all
cases. Threads cannot be used as stack-allocated local
variables or direct declarations in classes.

Query execution code paths will now handle the error:
1. If the scan node fails to spawn any scanner thread,
it will abort the query.
2. Failing to spawn a fragment instance from the query
state in StartFInstances() will correctly report the error
to the coordinator and tear down the query.

Testing:
This introduces the parameter thread_creation_fault_injection,
which will cause Thread::Create() calls in eligible
locations to fail randomly roughly 1% of the time.
Quite a few locations of Thread::Create() and
ThreadPool::Init() are necessary for startup and cannot
be eligible. However, all the locations used for query
execution are marked as eligible and governed by this
parameter. The code was tested by setting this parameter
to true and running queries to verify that queries either
run to completion with the correct result or fail with
appropriate status.

Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
---
M be/src/benchmarks/thread-create-benchmark.cc
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/catalog/catalogd-main.cc
M be/src/common/global-flags.cc
M be/src/common/init.cc
M be/src/common/init.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/rpc/TAcceptQueueServer.cpp
M be/src/rpc/auth-provider.h
M be/src/rpc/authentication.cc
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-server.cc
M be/src/rpc/thrift-server.h
M be/src/rpc/thrift-thread.cc
M be/src/rpc/thrift-thread.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/disk-io-mgr-handle-cache.h
M be/src/runtime/disk-io-mgr-handle-cache.inline.h
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/parallel-executor.cc
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/service/child-query.cc
M be/src/service/child-query.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/impalad-main.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/statestore/statestored-main.cc
M be/src/testutil/in-process-servers.cc
M be/src/testutil/in-process-servers.h
M be/src/util/thread-pool-test.cc
M be/src/util/thread-pool.h
M be/src/util/thread.cc
M be/src/util/thread.h
M common/thrift/generate_error_codes.py
55 files changed, 491 insertions(+), 239 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/7730/11
-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-09-05 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..


Patch Set 10:

(1 comment)

Also rebased.

http://gerrit.cloudera.org:8080/#/c/7730/8/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS8, Line 332: ery 
> My test shows that the fragments that are not on the coordinator node will 
Moved the call to ReportExecStatusAux to after all fragment instances are 
prepared and verified that this works. The code to do this is ugly (because it 
relies on calling ReportExecStatusAux using the failed fis), and it will 
require IMPALA-5892 to become cleaner (where we can pass fis=nullptr to 
ReportExecStatusAux).


-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-09-05 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7730/8/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS8, Line 332: true
> Ah right. So I think the assumption is that the coordinator may not be able
My test shows that the fragments that are not on the coordinator node will 
exit. (This test is the equivalent of  instances_started=false with complete 
RPC failure.) However, if the message doesn't get to the coordinator, then the 
coordinator fragments will hang (seemingly indefinitely). I'm using a simple 
query, so it could be that this is not representative.


-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-09-05 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7730/8/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS8, Line 332: true
> What's the consequence of using instances_started=false?  Did you verify th
instances_started only impacts the case when there are 3 repeated RPC failures 
to the coordinator. I will run a test to see what happens.


-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-09-05 Thread Joe McDonnell (Code Review)
Hello Sailesh Mukil, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7730

to look at the new patch set (#10).

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..

IMPALA-5750: Catch exceptions from boost thread creation

The boost thread constructor will throw boost::thread_resource_error
if it is unable to spawn a thread on the system
(e.g. due to a ulimit). This uncaught exception crashes
Impala. Systems with a large number of nodes and threads
are hitting this limit.

This change catches the exception from the thread
constructor and converts it to a Status. This requires
several changes:
1. util/thread.h's Thread constructor is now private
and all Threads are constructed via a new Create()
static factory method.
2. util/thread-pool.h's ThreadPool requires that Init()
be called after the ThreadPool is constructed.
3. To propagate the Status, Threads cannot be created in
constructors, so this is moved to initialization methods
that can return Status.
4. Threads now use unique_ptr's for management in all
cases. Threads cannot be used as stack-allocated local
variables or direct declarations in classes.

Query execution code paths will now handle the error:
1. If the scan node fails to spawn any scanner thread,
it will abort the query.
2. Failing to spawn a fragment instance from the query
state in StartFInstances() will correctly report the error
to the coordinator and tear down the query.

Testing:
This introduces the parameter thread_creation_fault_injection,
which will cause Thread::Create() calls in eligible
locations to fail randomly roughly 1% of the time.
Quite a few locations of Thread::Create() and
ThreadPool::Init() are necessary for startup and cannot
be eligible. However, all the locations used for query
execution are marked as eligible and governed by this
parameter. The code was tested by setting this parameter
to true and running queries to verify that queries either
run to completion with the correct result or fail with
appropriate status.

Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
---
M be/src/benchmarks/thread-create-benchmark.cc
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/catalog/catalogd-main.cc
M be/src/common/global-flags.cc
M be/src/common/init.cc
M be/src/common/init.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/rpc/TAcceptQueueServer.cpp
M be/src/rpc/auth-provider.h
M be/src/rpc/authentication.cc
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-server.cc
M be/src/rpc/thrift-server.h
M be/src/rpc/thrift-thread.cc
M be/src/rpc/thrift-thread.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/disk-io-mgr-handle-cache.h
M be/src/runtime/disk-io-mgr-handle-cache.inline.h
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/parallel-executor.cc
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/service/child-query.cc
M be/src/service/child-query.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/impalad-main.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/statestore/statestored-main.cc
M be/src/testutil/in-process-servers.cc
M be/src/testutil/in-process-servers.h
M be/src/util/thread-pool-test.cc
M be/src/util/thread-pool.h
M be/src/util/thread.cc
M be/src/util/thread.h
M common/thrift/generate_error_codes.py
55 files changed, 480 insertions(+), 238 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/7730/10
-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-09-05 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..


Patch Set 8:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7730/8/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

Line 226: if (fis->IsPrepared()) {
> should this check that the prepare status is ok? if Prepare() failed, then 
The first thing that Prepare() does is allocate the RuntimeState. It has a 
comment saying not to return before that. See 
https://github.com/apache/incubator-impala/blob/master/be/src/runtime/fragment-instance-state.cc#L119
We also check whether fis->profile() is null. So, I think this should be safe.

When IMPALA-5892 goes in, I will just use fis==nullptr. That will make this 
part of the change unnecessary, as nothing needs to send status for an 
unprepared fragment instance.


Line 328:   
ExecEnv::GetInstance()->query_exec_mgr()->ReleaseQueryState(this);
> this is a little weird since ReleaseQueryState() can delete this. Except it
Added a comment.


PS8, Line 332: true
> passing instances_started=true can lead to a deadlock if the RPC send fails
Good point, changed this to instances_started=false. If the RPC to the 
coordinator fails, I think there are other ways to hang, but this is clearly 
wrong. I'll think about whether there is a better way to handle this case.


PS8, Line 336: update fis_map_
> nit: seems excessive
Done


http://gerrit.cloudera.org:8080/#/c/7730/8/be/src/runtime/thread-resource-mgr.h
File be/src/runtime/thread-resource-mgr.h:

Line 288:   if (!in_callback) InvokeCallbacks();
> As we discussed in person, I don't think it's safe even in that case genera
I went with option #1. I added a comment with a warning to 
thread-resource-mgr.h and noted why we were using skip_callbacks in 
kudu-scan-node.cc and hdfs-scan-node.cc.


-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5892: Allow reporting status independent of fragment instance

2017-09-05 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded a new patch set (#2).

Change subject: IMPALA-5892: Allow reporting status independent of fragment 
instance
..

IMPALA-5892: Allow reporting status independent of fragment instance

Queries can hit an error that is not specific to a
particular fragment instance. For example, QueryState::StartFInstances()
calls DescriptorTbl::Create() before any fragment instances
start. This location has no reason to report status via a
particular fragment, and there is currently no way to report
status otherwise. This leads to a query hang, because the
status is never propagated back to the coordinator.

This adds the ability to report status that is not associated
with a particular fragment instance. By reporting status,
the coordinator will now correctly abort the query in the
case of the QueryState::StartFInstances() scenario described.

Change-Id: I4cd98022f1d62a999c7c80ff5474fa8d069eb12c
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/query-state.cc
M common/thrift/ImpalaInternalService.thrift
6 files changed, 79 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/7943/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7943
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4cd98022f1d62a999c7c80ff5474fa8d069eb12c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 


[Impala-ASF-CR] IMPALA-5892: Allow reporting status independent of fragment instance

2017-09-05 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5892: Allow reporting status independent of fragment 
instance
..


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/7943/1/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

PS1, Line 219: *failed_instance_id = failed_instance_id_;
> see header question -- this is why it seems we should require that the call
Changed the semantic so that the caller must specify both or neither.


PS1, Line 309: independently
> independent of an instance
Rewrote this comment.


PS1, Line 310: incorporated
> incorporated to where?  and what is "here" referring to, given that you jus
Rewrote this comment.


PS1, Line 312: cumulative_status
> that name confused me because it's doesn't seem like this is the cumulative
Changed this to "overall_backend_status" and rewrote the comment.


http://gerrit.cloudera.org:8080/#/c/7943/1/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

Line 95:   /// (with not specific fragment responsible).
> nit: s/not/no/?
Done


PS1, Line 100: has_failed_instance == false
> what does that mean? do you mean '*has_failed_instance == false' or somethi
Changed the semantics so that the caller must specify both or neither. Also 
updated the comment to make the semantics clear.


http://gerrit.cloudera.org:8080/#/c/7943/1/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

PS1, Line 375:   Status UpdateStatus(const Status& status, const TUniqueId& 
failed_fragment,
 :   const std::string& instance_hostname,
 :   bool has_failed_instance = true) WARN_UNUSED_RESULT;
> If we keep this one method, let's be careful with our use of default parame
Changed the ordering and made the arguments required.


http://gerrit.cloudera.org:8080/#/c/7943/1/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

PS1, Line 628: This incorporates status from
 :   // TFragmentInstanceExecStatus as well as errors that occur 
independently
> given today's implementation it doesn't matter, but from a protocol standpo
Updated the comment to be clear about the priorities. We don't currently report 
multiple errors, but I specified what I think is a reasonable semantic for that 
case.


-- 
To view, visit http://gerrit.cloudera.org:8080/7943
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4cd98022f1d62a999c7c80ff5474fa8d069eb12c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-09-01 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..


Patch Set 8:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/7730/8/be/src/catalog/catalog-server.h
File be/src/catalog/catalog-server.h:

PS8, Line 39: class TGetAllCatalogObjectsResponse;
> nit: Not your change, but no longer required.
Done


http://gerrit.cloudera.org:8080/#/c/7730/7/be/src/exec/kudu-scan-node.cc
File be/src/exec/kudu-scan-node.cc:

PS7, Line 240: SetDoneInternal();
> We previously didn't call materialized_row_batches_->Shutdown() here, but n
This previously called Shutdown() when the last scanner thread is done.

I looked into this, and there doesn't seem to be any reason to wait. It is safe 
to call Shutdown() when the other scanner threads are still around. We do that 
for the HDFS version of this. We also do it if we reach a limit. I looked at 
the different methods on the queue, and I don't see any problem.


http://gerrit.cloudera.org:8080/#/c/7730/8/be/src/rpc/auth-provider.h
File be/src/rpc/auth-provider.h:

PS8, Line 22: #include 
> No longer needed.
Done


http://gerrit.cloudera.org:8080/#/c/7730/8/be/src/rpc/thrift-thread.cc
File be/src/rpc/thrift-thread.cc:

PS8, Line 40: throw atc::SystemResourceException("Thread::Create() failed");
> is it worth incorporating the status msg into the exception message?
I think that makes sense. The status has the category and thread name, which 
could be useful to us.


http://gerrit.cloudera.org:8080/#/c/7730/8/be/src/runtime/disk-io-mgr-handle-cache.h
File be/src/runtime/disk-io-mgr-handle-cache.h:

Line 28: #include "common/hdfs.h"
> nit: Include what you use, status.h (it's already being pulled in through o
Done


http://gerrit.cloudera.org:8080/#/c/7730/8/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

PS8, Line 341: admission_controller_->Init()
> The admission controller dequeue thread starts a little later than before n
I think it is ok. ExecEnv::StartServices() is called from impalad-main.cc soon 
after constructing the ExecEnv and before starting the backend or beeswax or 
hs2 servers. There should be nothing to admit.

One thing I noticed is that StartServices() is not called from FeSupport. I 
think that should be ok, but I'm checking.


http://gerrit.cloudera.org:8080/#/c/7730/8/be/src/runtime/thread-resource-mgr.h
File be/src/runtime/thread-resource-mgr.h:

Line 288:   if (!in_callback) InvokeCallbacks();
> This explains why we need to skip the callbacks. But why is it safe to do s
You're right. This is only safe from a callback if a thread did 
TryAcquireThreadToken and then wants to return it without using it. This 
correct usage is equivalent to just passing on picking up a thread token. If 
the callback did a release of an unrelated thread, then that would be a 
problem, because we won't find a replacement for that thread. There is no real 
way to tell if something has called TryAcquireThreadToken. We don't give it an 
object or anything.

I can augment the comment to detail that it must be used in a very targeted 
way, or there could be separate functions.


http://gerrit.cloudera.org:8080/#/c/7730/8/be/src/service/child-query.h
File be/src/service/child-query.h:

PS8, Line 22: 
> boost/thread/mutex.hpp
Done


http://gerrit.cloudera.org:8080/#/c/7730/8/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

Line 30: 
> nit: include what you use: status.h
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-09-01 Thread Joe McDonnell (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7730

to look at the new patch set (#9).

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..

IMPALA-5750: Catch exceptions from boost thread creation

The boost thread constructor will throw boost::thread_resource_error
if it is unable to spawn a thread on the system
(e.g. due to a ulimit). This uncaught exception crashes
Impala. Systems with a large number of nodes and threads
are hitting this limit.

This change catches the exception from the thread
constructor and converts it to a Status. This requires
several changes:
1. util/thread.h's Thread constructor is now private
and all Threads are constructed via a new Create()
static factory method.
2. util/thread-pool.h's ThreadPool requires that Init()
be called after the ThreadPool is constructed.
3. To propagate the Status, Threads cannot be created in
constructors, so this is moved to initialization methods
that can return Status.
4. Threads now use unique_ptr's for management in all
cases. Threads cannot be used as stack-allocated local
variables or direct declarations in classes.

Query execution code paths will now handle the error:
1. If the scan node fails to spawn any scanner thread,
it will abort the query.
2. Failing to spawn a fragment instance from the query
state in StartFInstances() will correctly report the error
to the coordinator and tear down the query.

Testing:
This introduces the parameter thread_creation_fault_injection,
which will cause Thread::Create() calls in eligible
locations to fail randomly roughly 1% of the time.
Quite a few locations of Thread::Create() and
ThreadPool::Init() are necessary for startup and cannot
be eligible. However, all the locations used for query
execution are marked as eligible and governed by this
parameter. The code was tested by setting this parameter
to true and running queries to verify that queries either
run to completion with the correct result or fail with
appropriate status.

Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
---
M be/src/benchmarks/thread-create-benchmark.cc
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/catalog/catalogd-main.cc
M be/src/common/global-flags.cc
M be/src/common/init.cc
M be/src/common/init.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/rpc/TAcceptQueueServer.cpp
M be/src/rpc/auth-provider.h
M be/src/rpc/authentication.cc
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-server.cc
M be/src/rpc/thrift-server.h
M be/src/rpc/thrift-thread.cc
M be/src/rpc/thrift-thread.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/disk-io-mgr-handle-cache.h
M be/src/runtime/disk-io-mgr-handle-cache.inline.h
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/parallel-executor.cc
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/service/child-query.cc
M be/src/service/child-query.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/impalad-main.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/statestore/statestored-main.cc
M be/src/testutil/in-process-servers.cc
M be/src/testutil/in-process-servers.h
M be/src/util/thread-pool-test.cc
M be/src/util/thread-pool.h
M be/src/util/thread.cc
M be/src/util/thread.h
M common/thrift/generate_error_codes.py
55 files changed, 471 insertions(+), 238 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/7730/9
-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5892: Allow reporting status independent of fragment instance

2017-09-01 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5892: Allow reporting status independent of fragment 
instance
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7943/1/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

PS1, Line 375:   Status UpdateStatus(const Status& status, const TUniqueId& 
failed_fragment,
 :   const std::string& instance_hostname,
 :   bool has_failed_instance = true) WARN_UNUSED_RESULT;
Might be cleaner to add two functions UpdateFragmentStatus(status, 
failed_fragment, instance_hostname) and UpdateGeneralStatus(status, 
instance_hostname).


-- 
To view, visit http://gerrit.cloudera.org:8080/7943
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4cd98022f1d62a999c7c80ff5474fa8d069eb12c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5892: Allow reporting status independent of fragment instance

2017-09-01 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7943

Change subject: IMPALA-5892: Allow reporting status independent of fragment 
instance
..

IMPALA-5892: Allow reporting status independent of fragment instance

Queries can hit an error that is not specific to a
particular fragment instance. For example, QueryState::StartFInstances()
calls DescriptorTbl::Create() before any fragment instances
start. This location has no reason to report status via a
particular fragment, and there is currently no way to report
status otherwise. This leads to a query hang, because the
status is never propagated back to the coordinator.

This adds the ability to report status that is not associated
with a particular fragment instance. By reporting status,
the coordinator will now correctly abort the query in the
case of the QueryState::StartFInstances() scenario described.

Change-Id: I4cd98022f1d62a999c7c80ff5474fa8d069eb12c
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/query-state.cc
M common/thrift/ImpalaInternalService.thrift
6 files changed, 58 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/7943/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7943
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I4cd98022f1d62a999c7c80ff5474fa8d069eb12c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-09-01 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7730/7/be/src/exec/kudu-scan-node.cc
File be/src/exec/kudu-scan-node.cc:

PS7, Line 173: scanner_threads_.AddThread(move(t));
> There may be another race here where if the KuduScanNode is closed before w
Here is my understanding:
SetDone() in Close() gets lock_ before calling SetDoneInternal(), so it holds 
lock_ regardless of whether done_ is true. The thread in ThreadAvailableCb() 
also gets lock_ before reading done_. So, they agree about done_ and either the 
thread in Close waits for the thread in ThreadAvailableCb() to add the thread 
and will need to join with it, or the thread in ThreadAvailableCb() will see 
done_==true and bail. 

I might be missing something.


-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-09-01 Thread Joe McDonnell (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7730

to look at the new patch set (#8).

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..

IMPALA-5750: Catch exceptions from boost thread creation

The boost thread constructor will throw boost::thread_resource_error
if it is unable to spawn a thread on the system
(e.g. due to a ulimit). This uncaught exception crashes
Impala. Systems with a large number of nodes and threads
are hitting this limit.

This change catches the exception from the thread
constructor and converts it to a Status. This requires
several changes:
1. util/thread.h's Thread constructor is now private
and all Threads are constructed via a new Create()
static factory method.
2. util/thread-pool.h's ThreadPool requires that Init()
be called after the ThreadPool is constructed.
3. To propagate the Status, Threads cannot be created in
constructors, so this is moved to initialization methods
that can return Status.
4. Threads now use unique_ptr's for management in all
cases. Threads cannot be used as stack-allocated local
variables or direct declarations in classes.

Query execution code paths will now handle the error:
1. If the scan node fails to spawn any scanner thread,
it will abort the query.
2. Failing to spawn a fragment instance from the query
state in StartFInstances() will correctly report the error
to the coordinator and tear down the query.

Testing:
This introduces the parameter thread_creation_fault_injection,
which will cause Thread::Create() calls in eligible
locations to fail randomly roughly 1% of the time.
Quite a few locations of Thread::Create() and
ThreadPool::Init() are necessary for startup and cannot
be eligible. However, all the locations used for query
execution are marked as eligible and governed by this
parameter. The code was tested by setting this parameter
to true and running queries to verify that queries either
run to completion with the correct result or fail with
appropriate status.

Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
---
M be/src/benchmarks/thread-create-benchmark.cc
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/catalog/catalogd-main.cc
M be/src/common/global-flags.cc
M be/src/common/init.cc
M be/src/common/init.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/rpc/TAcceptQueueServer.cpp
M be/src/rpc/auth-provider.h
M be/src/rpc/authentication.cc
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-server.cc
M be/src/rpc/thrift-server.h
M be/src/rpc/thrift-thread.cc
M be/src/rpc/thrift-thread.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/disk-io-mgr-handle-cache.h
M be/src/runtime/disk-io-mgr-handle-cache.inline.h
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/parallel-executor.cc
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/service/child-query.cc
M be/src/service/child-query.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/impalad-main.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/statestore/statestored-main.cc
M be/src/testutil/in-process-servers.cc
M be/src/testutil/in-process-servers.h
M be/src/util/thread-pool-test.cc
M be/src/util/thread-pool.h
M be/src/util/thread.cc
M be/src/util/thread.h
M common/thrift/generate_error_codes.py
55 files changed, 465 insertions(+), 235 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/7730/8
-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-08-31 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..


Patch Set 7:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7730/7/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

PS7, Line 371: COUNTER_ADD(num_scanner_threads_started_counter_, 1);
> The problem with moving this here is when the node is under heavy stress, t
Yes, this could be off. I suppose there is no way around it.


http://gerrit.cloudera.org:8080/#/c/7730/7/be/src/exec/kudu-scan-node.cc
File be/src/exec/kudu-scan-node.cc:

PS7, Line 171: ++num_active_scanners_;
> This can cause quite a few races. It can race with L220, L244 and L245.
num_active_scanners_ is protected by lock_. We get lock_ at the top of this 
loop and the other locations get lock_.


http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

PS6, Line 59:   status = Thread::Create("query-exec-mgr",
:   Substitute("start-query-finstances-$0", PrintId(query_id)),
:   ::StartQueryHelper, this, qs, , true);
> Done
In testing, I found that this also needs 
qs->ReleaseInitialReservationRefcount(), which was taken in Init(). Added.


http://gerrit.cloudera.org:8080/#/c/7730/7/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS7, Line 335: // Fragment instance successfully started
 : // update fis_map_
 : fis_map_.emplace(fis->instance_id(), fis);
 : // update fragment_map_
 : vector& fis_list = 
fragment_map_[instance_ctx.fragment_idx];
 : fis_list.push_back(fis);
> Is it safe to update the map with the Fragment instance state after already
My thought process is something like this:
fis is a local variable. Both fis_map_ and fragment_map_ are private variables 
on QueryState. None of these variables are referenced outside this QueryState 
except through APIs protected by the promise. It looks like the fragment 
instance doesn't care about using the functions that would access these 
variables.

Worth considering if there is any way around it, but I think it is safe.


http://gerrit.cloudera.org:8080/#/c/7730/7/be/src/util/thread.cc
File be/src/util/thread.cc:

PS7, Line 303: rand()
> Not to be pedantic, but a rand() without seeding the PRNG first, will cause
I was thinking it might make sense for us to do a single srand() at startup. I 
will look into the appropriate place to do this.


-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-08-31 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..


Patch Set 6:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/exec/kudu-scan-node.cc
File be/src/exec/kudu-scan-node.cc:

PS6, Line 159: Status status;
 : status =
> nit: combine these lines
Done


PS6, Line 163: num_scanner_threads_started_counter_
> here and other scan node: I guess we undo this rather than just waiting unt
I checked, and this is only used as a statistic and for making the names of the 
threads unique. I changed this to increment after successful thread create 
(here and for hdfs-scan-node.cc). (A side effect is that the thread indexes 
start from zero.)


http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

PS6, Line 59:   RETURN_IF_ERROR(Thread::Create("query-exec-mgr",
:   Substitute("start-query-finstances-$0", PrintId(query_id)),
:   ::StartQueryHelper, this, qs, , true));
> Yes, that looks like a bug. I will fix in the next upload.
Done


http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS6, Line 199: // If this fragment instance is not prepared, it must be 
reporting an error.
 : DCHECK(fis->IsPrepared() || !status.ok());
> If fis==null case doesn't report the error, then that's a bug we need to fi
Looking into a fix. I will also do fault injection at that point to see what 
happens. The two might be slightly different, because in that case no fragment 
instances have been started.


PS6, Line 331:   // Remove fis from fis_map_ and fragment_map_
 :   fis_map_.erase(fis->instance_id());
 :   fis_list.pop_back();
> why don't we just do this on the success path? is it to try to be consisten
I checked and all other uses of these two things check that the 
instances_preapred_promise_ is set. So, I moved it to the success path, since 
that is safe.


http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/runtime/thread-resource-mgr.h
File be/src/runtime/thread-resource-mgr.h:

PS6, Line 115: bool in_callback = false);
> this seems complicated. why is it needed now?
ReleaseThreadToken calls InvokeCallbacks. The two locations where we use this 
(hdfs-scan-node.cc and kudu-scan-node.cc) are in functions that are registered 
as callbacks. If we try to release inside the callback, this is mutual 
recursion.


http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/service/child-query.h
File be/src/service/child-query.h:

Line 159:   Status ExecAsync(std::vector&& child_queries);
> WARN_UNUSED_RESULT (or are we relying on the Status annotation now?)
Added WARN_UNUSED_RESULT.


http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

PS6, Line 598: Thread::Create("query-exec-state", "wait-thread",
 :   ::Wait, this, _thread_);
> Good point. I will add it to the list of locations that are eligible for fa
Done


http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/service/impala-beeswax-server.cc
File be/src/service/impala-beeswax-server.cc:

PS6, Line 71:   Status status;
:   status = request_state->WaitAsync();
> nit: combine
Done


http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/util/thread.cc
File be/src/util/thread.cc:

Line 304: if ((nanos % 100) == 1) {
> Good point, will make that change.
Changed to use rand()


Line 327:   thread->swap(t);
> *thread = move(t) seems clearer
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-08-31 Thread Joe McDonnell (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7730

to look at the new patch set (#7).

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..

IMPALA-5750: Catch exceptions from boost thread creation

The boost thread constructor will throw boost::thread_resource_error
if it is unable to spawn a thread on the system
(e.g. due to a ulimit). This uncaught exception crashes
Impala. Systems with a large number of nodes and threads
are hitting this limit.

This change catches the exception from the thread
constructor and converts it to a Status. This requires
several changes:
1. util/thread.h's Thread constructor is now private
and all Threads are constructed via a new Create()
static factory method.
2. util/thread-pool.h's ThreadPool requires that Init()
be called after the ThreadPool is constructed.
3. To propagate the Status, Threads cannot be created in
constructors, so this is moved to initialization methods
that can return Status.
4. Threads now use unique_ptr's for management in all
cases. Threads cannot be used as stack-allocated local
variables or direct declarations in classes.

Query execution code paths will now handle the error:
1. If the scan node fails to spawn any scanner thread,
it will abort the query.
2. Failing to spawn a fragment instance from the query
state in StartFInstances() will correctly report the error
to the coordinator and tear down the query.

Testing:
This introduces the parameter thread_creation_fault_injection,
which will cause Thread::Create() calls in eligible
locations to fail randomly roughly 1% of the time.
Quite a few locations of Thread::Create() and
ThreadPool::Init() are necessary for startup and cannot
be eligible. However, all the locations used for query
execution are marked as eligible and governed by this
parameter. The code was tested by setting this parameter
to true and running queries to verify that queries either
run to completion with the correct result or fail with
appropriate status.

Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
---
M be/src/benchmarks/thread-create-benchmark.cc
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/catalog/catalogd-main.cc
M be/src/common/global-flags.cc
M be/src/common/init.cc
M be/src/common/init.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/rpc/TAcceptQueueServer.cpp
M be/src/rpc/auth-provider.h
M be/src/rpc/authentication.cc
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-server.cc
M be/src/rpc/thrift-server.h
M be/src/rpc/thrift-thread.cc
M be/src/rpc/thrift-thread.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/disk-io-mgr-handle-cache.h
M be/src/runtime/disk-io-mgr-handle-cache.inline.h
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/parallel-executor.cc
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/service/child-query.cc
M be/src/service/child-query.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/impalad-main.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/statestore/statestored-main.cc
M be/src/testutil/in-process-servers.cc
M be/src/testutil/in-process-servers.h
M be/src/util/thread-pool-test.cc
M be/src/util/thread-pool.h
M be/src/util/thread.cc
M be/src/util/thread.h
M common/thrift/generate_error_codes.py
55 files changed, 464 insertions(+), 235 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/7730/7
-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-08-30 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..


Patch Set 6:

(9 comments)

Still going through the comments, but I thought I'd put up some quick replies.

http://gerrit.cloudera.org:8080/#/c/7730/6//COMMIT_MSG
Commit Message:

Line 28: variables or direct declarations in classes.
> any reason Thread and ThreadPool don't follow the same pattern? i.e. both u
ThreadPool uses the constructor+init pattern because we subclass it for the 
CallableThreadPool. Subclassing gets awkward with the static create pattern. If 
someone forgets to run Init(), they find out when they offer a piece of work to 
the ThreadPool.

For Thread, I think it would be feasible to use the constructor+init pattern. 
The difference is that if someone forgets an Init(), they might be expecting a 
thread and it just isn't there.


http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

PS6, Line 363: pool->ReleaseThreadToken(false, true);
> Do you need to drop the lock before calling this? Based on the comment from
Yes, that is what in_callback is for. ReleaseThreadToken can call these thread 
availability callbacks, so we could end up back in this code. The 
in_callback=true prevents that.


http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

PS6, Line 59:   RETURN_IF_ERROR(Thread::Create("query-exec-mgr",
:   Substitute("start-query-finstances-$0", PrintId(query_id)),
:   ::StartQueryHelper, this, qs, , true));
> Shouldn't we call ReleaseQueryState(qs); if the thread failed to start here
Yes, that looks like a bug. I will fix in the next upload.


http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS6, Line 199: // If this fragment instance is not prepared, it must be 
reporting an error.
 : DCHECK(fis->IsPrepared() || !status.ok());
> Is this necessarily true?
The race might exist. I'm looking at the code to check.

The old DCHECK is stricter than the new one. Right now, we require that fis is 
nullptr or fis->IsPrepared(). I'm carving out an exception for failure during 
thread spawn when the fis won't have a chance to be prepared.


PS6, Line 199: // If this fragment instance is not prepared, it must be 
reporting an error.
 : DCHECK(fis->IsPrepared() || !status.ok());
> I don't really understand this DCHECK.  oh, because we pass 'fis' in the ne
In testing, I noticed that fis==nullptr doesn't actually convey the status. If 
I use fis==nullptr, the coordinator never gets notified of the status and the 
query hangs. When I went through ReportExecStatusAux, I noticed that apart from 
some DCHECKs, we don't use the status argument unless fis is non-null. 
TReportExecStatusParams doesn't really have a place for status apart from in 
the TFragmentInstanceExecStatus's.


PS6, Line 328: prepare_status
> why is this called "prepare_status"?  the old 'prepare_status' was the stat
The idea is to share the codepath below, which is waiting for the fragment 
instance states to be prepared. To incorporate the thread create status into 
the instances_prepared_promise_, the status needs to be available outside the 
loop.


http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

PS6, Line 598: Thread::Create("query-exec-state", "wait-thread",
 :   ::Wait, this, _thread_);
> Why not inject the thread failures here as well? I think it's an important 
Good point. I will add it to the list of locations that are eligible for fault 
injection. I will be doing more test runs as I go, so this will now be tested.


http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/util/thread.cc
File be/src/util/thread.cc:

Line 304: if ((nanos % 100) == 1) {
> on some platforms, the monotonic clock may not actually have nanosecond res
Good point, will make that change.


http://gerrit.cloudera.org:8080/#/c/7730/6/be/src/util/thread.h
File be/src/util/thread.h:

PS6, Line 158: thread
> nit: 'thread'
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-08-29 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7730/5/be/src/util/thread.cc
File be/src/util/thread.cc:

Line 48: DEFINE_bool(thread_creation_fault_injection, false, "Causes calls to 
Thread::Create() "
> Nice!
That makes sense. It is good for it to be close to the other fault injection 
options. I made this only for DEBUG builds. I think that is sufficient for now.


-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-08-29 Thread Joe McDonnell (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7730

to look at the new patch set (#6).

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..

IMPALA-5750: Catch exceptions from boost thread creation

The boost thread constructor will throw boost::thread_resource_error
if it is unable to spawn a thread on the system
(e.g. due to a ulimit). This uncaught exception crashes
Impala. Systems with a large number of nodes and threads
are hitting this limit.

This change catches the exception from the thread
constructor and converts it to a Status. This requires
several changes:
1. util/thread.h's Thread constructor is now private
and all Threads are constructed via a new Create()
static factory method.
2. util/thread-pool.h's ThreadPool requires that Init()
be called after the ThreadPool is constructed.
3. To propagate the Status, Threads cannot be created in
constructors, so this is moved to initialization methods
that can return Status.
4. Threads now use unique_ptr's for management in all
cases. Threads cannot be used as stack-allocated local
variables or direct declarations in classes.

Query execution code paths will now handle the error:
1. If the scan node fails to spawn any scanner thread,
it will abort the query.
2. Failing to spawn a fragment instance from the query
state in StartFInstances() will correctly report the error
to the coordinator and tear down the query.

Testing:
This introduces the parameter thread_creation_fault_injection,
which will cause Thread::Create() calls in eligible
locations to fail randomly roughly 1% of the time.
Quite a few locations of Thread::Create() and
ThreadPool::Init() are necessary for startup and cannot
be eligible. However, all the locations used for query
execution are marked as eligible and governed by this
parameter. The code was tested by setting this parameter
to true and running queries to verify that queries either
run to completion with the correct result or fail with
appropriate status.

Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
---
M be/src/benchmarks/thread-create-benchmark.cc
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/catalog/catalogd-main.cc
M be/src/common/global-flags.cc
M be/src/common/init.cc
M be/src/common/init.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/rpc/TAcceptQueueServer.cpp
M be/src/rpc/auth-provider.h
M be/src/rpc/authentication.cc
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-server.cc
M be/src/rpc/thrift-server.h
M be/src/rpc/thrift-thread.cc
M be/src/rpc/thrift-thread.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/disk-io-mgr-handle-cache.h
M be/src/runtime/disk-io-mgr-handle-cache.inline.h
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/parallel-executor.cc
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/service/child-query.cc
M be/src/service/child-query.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/impalad-main.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/statestore/statestored-main.cc
M be/src/testutil/in-process-servers.cc
M be/src/testutil/in-process-servers.h
M be/src/util/thread-pool-test.cc
M be/src/util/thread-pool.h
M be/src/util/thread.cc
M be/src/util/thread.h
M common/thrift/generate_error_codes.py
55 files changed, 460 insertions(+), 227 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/7730/6
-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-08-28 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded a new patch set (#5).

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..

IMPALA-5750: Catch exceptions from boost thread creation

The boost thread constructor will throw boost::thread_resource_error
if it is unable to spawn a thread on the system
(e.g. due to a ulimit). This uncaught exception crashes
Impala. Systems with a large number of nodes and threads
are hitting this limit.

This change catches the exception from the thread
constructor and converts it to a Status. This requires
several changes:
1. util/thread.h's Thread constructor is now private
and all Threads are constructed via a new Create()
static factory method.
2. util/thread-pool.h's ThreadPool requires that Init()
be called after the ThreadPool is constructed.
3. To propagate the Status, Threads cannot be created in
constructors, so this is moved to initialization methods
that can return Status.
4. Threads now use unique_ptr's for management in all
cases. Threads cannot be used as stack-allocated local
variables or direct declarations in classes.

Query execution code paths will now handle the error:
1. If the scan node fails to spawn any scanner thread,
it will abort the query.
2. Failing to spawn a fragment instance from the query
state in StartFInstances() will correctly report the error
to the coordinator and tear down the query.

Testing:
This introduces the parameter thread_creation_fault_injection,
which will cause Thread::Create() calls in eligible
locations to fail randomly roughly 1% of the time.
Quite a few locations of Thread::Create() and
ThreadPool::Init() are necessary for startup and cannot
be eligible. However, all the locations used for query
execution are marked as eligible and governed by this
parameter. The code was tested by setting this parameter
to true and running queries to verify that queries either
run to completion with the correct result or fail with
appropriate status.

Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
---
M be/src/benchmarks/thread-create-benchmark.cc
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/catalog/catalogd-main.cc
M be/src/common/init.cc
M be/src/common/init.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/rpc/TAcceptQueueServer.cpp
M be/src/rpc/auth-provider.h
M be/src/rpc/authentication.cc
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-server.cc
M be/src/rpc/thrift-server.h
M be/src/rpc/thrift-thread.cc
M be/src/rpc/thrift-thread.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/disk-io-mgr-handle-cache.h
M be/src/runtime/disk-io-mgr-handle-cache.inline.h
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/parallel-executor.cc
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/service/child-query.cc
M be/src/service/child-query.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/impalad-main.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/statestore/statestored-main.cc
M be/src/testutil/in-process-servers.cc
M be/src/testutil/in-process-servers.h
M be/src/util/thread-pool-test.cc
M be/src/util/thread-pool.h
M be/src/util/thread.cc
M be/src/util/thread.h
M common/thrift/generate_error_codes.py
54 files changed, 454 insertions(+), 227 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/7730/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-08-28 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/7730/3//COMMIT_MSG
Commit Message:

Line 28: variables or direct declarations in classes.
> Yeah, it does seem like there's an important distinction there.
I labeled the locations that are eligible for fault injection, and then added a 
parameter to turn on thread creation fault injection. This should make testing 
much easier.


http://gerrit.cloudera.org:8080/#/c/7730/3/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

Line 361:   COUNTER_ADD(_scanner_thread_counter_, -1);
> Oh I see. Seems ok. I'm a little skeptical that it's worth trying to recove
There are arguments in either direction. Given that we think this should be 
rare, I think it makes sense to go ahead and abort the query. I changed this 
code and kudu-scan-node.cc to do that.


http://gerrit.cloudera.org:8080/#/c/7730/4/be/src/exec/kudu-scan-node.cc
File be/src/exec/kudu-scan-node.cc:

Line 164:   UndoGetNextScanToken(token);
> In this case it does seem simpler to fail the query, since it we mess this 
Changed this to fail the query.


http://gerrit.cloudera.org:8080/#/c/7730/4/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

Line 330:   // Either get this to work or remove before merge.
> Still relevant?
Removed


Line 350: Status instance_status = entry.second->WaitForPrepare();
> I think we should preserve the postcondition that all fragment instances ha
That makes sense. One thing that I noticed when making this change is that in 
some circumstances the fragment instance will get into an RPC call before the 
query starts cancelling. In this case, the RPC call needs to timeout for the 
query to complete its cancel. The query does eventually end and nothing crashes.


http://gerrit.cloudera.org:8080/#/c/7730/3/be/src/util/thread-pool.h
File be/src/util/thread-pool.h:

Line 61: 
> Yeah I think that would make most sense to me - reduces the number of diffe
Fixed this by doing Shutdown() + Join() in the error condition.


http://gerrit.cloudera.org:8080/#/c/7730/3/be/src/util/thread.cc
File be/src/util/thread.cc:

Line 301:   unique_ptr t(new Thread(category, name));
> Sounds like a good idea to me.
Added.


-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-08-25 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded a new patch set (#4).

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..

IMPALA-5750: Catch exceptions from boost thread creation

The boost thread constructor will throw boost::thread_resource_error
if it is unable to spawn a thread on the system
(e.g. due to a ulimit). This uncaught exception crashes
Impala. Systems with a large number of nodes and threads
are hitting this limit.

This change catches the exception from the thread
constructor and converts it to a Status. This requires
several changes:
1. util/thread.h's Thread constructor is now private
and all Threads are constructed via a new Create()
static factory method.
2. util/thread-pool.h's ThreadPool requires that Init()
be called after the ThreadPool is constructed.
3. To propagate the Status, Threads cannot be created in
constructors, so this is moved to initialization methods
that can return Status.
4. Threads now use unique_ptr's for management in all
cases. Threads cannot be used as stack-allocated local
variables or direct declarations in classes.

Query execution code paths will now handle the error:
1. Non-MT scan nodes, such as HDFS scan node and Kudu
scan node require at least one scanner thread to
make progress. This patch will abort the query if
the scan node cannot obtain at least one scanner thread.
However, if the scan node has at least one scanner
thread, the query will continue.
2. Failing to spawn a fragment instance from the query
state in StartFInstances() will correctly report the error
to the coordinator and tear down the query.

Testing:
Thread::Create() has an optional argument to randomly
inject thread creation errors at any single point in
the code. It causes roughly 1% of the calls to fail.
The code was tested by setting this option in different
code points and verifying that the queries either
run to completion with the correct result or fail with
the appropriate status.

Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
---
M be/src/benchmarks/thread-create-benchmark.cc
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/catalog/catalogd-main.cc
M be/src/common/init.cc
M be/src/common/init.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/kudu-scan-node-base.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/rpc/TAcceptQueueServer.cpp
M be/src/rpc/auth-provider.h
M be/src/rpc/authentication.cc
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-server.cc
M be/src/rpc/thrift-server.h
M be/src/rpc/thrift-thread.cc
M be/src/rpc/thrift-thread.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/disk-io-mgr-handle-cache.h
M be/src/runtime/disk-io-mgr-handle-cache.inline.h
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/parallel-executor.cc
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/thread-resource-mgr.h
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/service/child-query.cc
M be/src/service/child-query.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/impalad-main.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/statestore/statestored-main.cc
M be/src/testutil/in-process-servers.cc
M be/src/testutil/in-process-servers.h
M be/src/util/thread-pool-test.cc
M be/src/util/thread-pool.h
M be/src/util/thread.cc
M be/src/util/thread.h
M common/thrift/generate_error_codes.py
56 files changed, 444 insertions(+), 225 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/7730/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-08-25 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..


Patch Set 3:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/7730/3//COMMIT_MSG
Commit Message:

Line 28: variables or direct declarations in classes.
> For testing, it might be worth adding a stress startup option to make threa
I have done hand testing on a variety of codepaths. I'm looking into automating 
the testing. It is useful to make a distinction between the thread creates that 
are necessary for startup and/or instance life and those that are only needed 
for query success. I think tagging the query locations can lead to a repeatable 
test: either the queries return the right answer or they return this error.


http://gerrit.cloudera.org:8080/#/c/7730/3/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

Line 361:   COUNTER_ADD(num_scanner_threads_started_counter_, -1);
> It seems ok to omit fixing up num_scanner_threads_started_counter_. I don't
I'm ok with either way. My thought is that in cases where the query doesn't 
abort we may want to know how many threads actually ran.


http://gerrit.cloudera.org:8080/#/c/7730/3/be/src/exec/kudu-scan-node.cc
File be/src/exec/kudu-scan-node.cc:

Line 154: ++num_active_scanners_;
> This is protected by lock_ so we could just wait until the thread is create
Done


http://gerrit.cloudera.org:8080/#/c/7730/3/be/src/rpc/thrift-thread.cc
File be/src/rpc/thrift-thread.cc:

Line 37:   if (!status.ok()) throw 
atc::SystemResourceException("Thread::Create() failed");
> This could use some explanation, since it's an unusual idiom in the code. I
Added a comment for this, but it is a bit unclear to me how thrift handles the 
exception. It seems like it gets sent back to the client, but I'm not 100% sure.


http://gerrit.cloudera.org:8080/#/c/7730/3/be/src/service/impala-hs2-server.cc
File be/src/service/impala-hs2-server.cc:

Line 469:   HS2_RETURN_ERROR(return_val, status.GetDetail(), 
SQLSTATE_GENERAL_ERROR);
> It looks like we have identical error-handling code in three places. May be
Changed to gotos and shared code.


PS3, Line 476: (void) 
> discard_result
Done


http://gerrit.cloudera.org:8080/#/c/7730/3/be/src/util/thread-pool.h
File be/src/util/thread-pool.h:

Line 61:   /// error spawning the threads.
> How do things get cleaned up on an error? Does Shutdown() handle that case?
When we get an error, only successfully started threads are in the threads_ 
ThreadGroup. ~ThreadPool calls Shutdown() and Join(), so we know that the 
threads will be cleaned up eventually. The owner of the ThreadPool might also 
call Shutdown() and Join(), and this is fine, because both of those functions 
are idempotent. The only thing the owner can do wrong is to Join() without 
calling Shutdown(), which makes no sense.

It might make sense to have the error case automatically call Shutdown() + 
Join().


http://gerrit.cloudera.org:8080/#/c/7730/3/be/src/util/thread.cc
File be/src/util/thread.cc:

Line 301: return Status(e.what());
> Might be worth adding a status code to common/thrift/generate_error_codes.p
Added this. I don't think the category or name would be useful to the customer, 
but if they would be useful to us, we can add it to the error message.


http://gerrit.cloudera.org:8080/#/c/7730/3/be/src/util/thread.h
File be/src/util/thread.h:

PS3, Line 69: std::unique_ptr&
> we'd normally use pointers for output arguments - https://google.github.io/
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5352: Age out unused file handles from the cache

2017-08-22 Thread Joe McDonnell (Code Review)
Hello Impala Public Jenkins, Matthew Jacobs, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7640

to look at the new patch set (#10).

Change subject: IMPALA-5352: Age out unused file handles from the cache
..

IMPALA-5352: Age out unused file handles from the cache

Currently, a file handle in the file handle cache will
only be evicted if the cache reaches its capacity. This
means that file handles can be retained for an indefinite
amount of time. This is true even for files that have
been deleted, replaced, or modified. Since a file handle
maintains a file descriptor for local files, this can
prevent the disk space from being freed. Additionally,
unused file handles are wasted memory.

This adds code to evict file handles that have been
unused for longer than a specified threshold. A thread
periodically checks the file handle cache to see if
any file handle should be evicted. The threshold is
specified by 'unused_file_handle_timeout_sec'; it
defaults to 6 hours.

This adds a test to custom_cluster/test_hdfs_fd_caching.py
to verify the eviction behavior.

Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
---
M be/src/runtime/disk-io-mgr-handle-cache.h
M be/src/runtime/disk-io-mgr-handle-cache.inline.h
M be/src/runtime/disk-io-mgr.cc
M tests/custom_cluster/test_hdfs_fd_caching.py
4 files changed, 172 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/7640/10
-- 
To view, visit http://gerrit.cloudera.org:8080/7640
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5352: Age out unused file handles from the cache

2017-08-22 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5352: Age out unused file handles from the cache
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7640/9/be/src/runtime/disk-io-mgr-handle-cache.inline.h
File be/src/runtime/disk-io-mgr-handle-cache.inline.h:

PS9, Line 190:   bool timed_out = true;
 :   // This Get() will time out until shutdown, when the promise 
is set.
 :   while (timed_out) {
> this is weird. normally I'm not in favor of while (true), but in this case 
Yes, that is better. Changed it.


-- 
To view, visit http://gerrit.cloudera.org:8080/7640
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5352: Age out unused file handles from the cache

2017-08-22 Thread Joe McDonnell (Code Review)
Hello Impala Public Jenkins, Matthew Jacobs, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7640

to look at the new patch set (#9).

Change subject: IMPALA-5352: Age out unused file handles from the cache
..

IMPALA-5352: Age out unused file handles from the cache

Currently, a file handle in the file handle cache will
only be evicted if the cache reaches its capacity. This
means that file handles can be retained for an indefinite
amount of time. This is true even for files that have
been deleted, replaced, or modified. Since a file handle
maintains a file descriptor for local files, this can
prevent the disk space from being freed. Additionally,
unused file handles are wasted memory.

This adds code to evict file handles that have been
unused for longer than a specified threshold. A thread
periodically checks the file handle cache to see if
any file handle should be evicted. The threshold is
specified by 'unused_file_handle_timeout_sec'; it
defaults to 6 hours.

This adds a test to custom_cluster/test_hdfs_fd_caching.py
to verify the eviction behavior.

Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
---
M be/src/runtime/disk-io-mgr-handle-cache.h
M be/src/runtime/disk-io-mgr-handle-cache.inline.h
M be/src/runtime/disk-io-mgr.cc
M tests/custom_cluster/test_hdfs_fd_caching.py
4 files changed, 171 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/7640/9
-- 
To view, visit http://gerrit.cloudera.org:8080/7640
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5352: Age out unused file handles from the cache

2017-08-22 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5352: Age out unused file handles from the cache
..


Patch Set 8:

Fixed the run time of disk-io-mgr-test by changing to an interruptible wait.

-- 
To view, visit http://gerrit.cloudera.org:8080/7640
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5352: Age out unused file handles from the cache

2017-08-21 Thread Joe McDonnell (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7640

to look at the new patch set (#5).

Change subject: IMPALA-5352: Age out unused file handles from the cache
..

IMPALA-5352: Age out unused file handles from the cache

Currently, a file handle in the file handle cache will
only be evicted if the cache reaches its capacity. This
means that file handles can be retained for an indefinite
amount of time. This is true even for files that have
been deleted, replaced, or modified. Since a file handle
maintains a file descriptor for local files, this can
prevent the disk space from being freed. Additionally,
unused file handles are wasted memory.

This adds code to evict file handles that have been
unused for longer than a specified threshold. A thread
periodically checks the file handle cache to see if
any file handle should be evicted. The threshold is
specified by 'unused_file_handle_timeout_sec'; it
defaults to 6 hours.

This adds a test to custom_cluster/test_hdfs_fd_caching.py
to verify the eviction behavior.

Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
---
M be/src/runtime/disk-io-mgr-handle-cache.h
M be/src/runtime/disk-io-mgr-handle-cache.inline.h
M be/src/runtime/disk-io-mgr.cc
M tests/custom_cluster/test_hdfs_fd_caching.py
4 files changed, 169 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/7640/5
-- 
To view, visit http://gerrit.cloudera.org:8080/7640
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5352: Age out unused file handles from the cache

2017-08-21 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5352: Age out unused file handles from the cache
..


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7640/4/be/src/runtime/disk-io-mgr-handle-cache.h
File be/src/runtime/disk-io-mgr-handle-cache.h:

Line 94:   ~FileHandleCache();
> Worth mentioning this is only used for backend tests.
Done


PS4, Line 187: true
> SHUTDOWN_TRUE.
Done


http://gerrit.cloudera.org:8080/#/c/7640/4/tests/custom_cluster/test_hdfs_fd_caching.py
File tests/custom_cluster/test_hdfs_fd_caching.py:

Line 63:   def run_fd_caching_test(self, vector, caching_expected, 
cache_capacity, \
> nit: line continuation shouldn't be needed inside parens
Done


PS4, Line 113: (
> nit: unnecessary parens
Good point. Fixed.


PS4, Line 150: ):
> nit: unnecessary parens
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7640
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-08-21 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded a new patch set (#3).

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..

IMPALA-5750: Catch exceptions from boost thread creation

The boost thread constructor will throw boost::thread_resource_error
if it is unable to spawn a thread on the system
(e.g. due to a ulimit). This uncaught exception crashes
Impala. Systems with a large number of nodes and threads
are hitting this limit.

This change catches the exception from the thread
constructor and converts it to a Status. This requires
several changes:
1. util/thread.h's Thread constructor is now private
and all Threads are constructed via a new Create()
static factory method.
2. util/thread-pool.h's ThreadPool requires that Init()
be called after the ThreadPool is constructed.
3. To propagate the Status, Threads cannot be created in
constructors, so this is moved to initialization methods
that can return Status.
4. Threads now use unique_ptr's for management in all
cases. Threads cannot be used as stack-allocated local
variables or direct declarations in classes.

Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
---
M be/src/benchmarks/thread-create-benchmark.cc
M be/src/catalog/catalog-server.cc
M be/src/catalog/catalog-server.h
M be/src/catalog/catalogd-main.cc
M be/src/common/init.cc
M be/src/common/init.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/kudu-scan-node.cc
M be/src/rpc/TAcceptQueueServer.cpp
M be/src/rpc/auth-provider.h
M be/src/rpc/authentication.cc
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-server.cc
M be/src/rpc/thrift-server.h
M be/src/rpc/thrift-thread.cc
M be/src/rpc/thrift-thread.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/parallel-executor.cc
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/service/child-query.cc
M be/src/service/child-query.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/impalad-main.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/statestore/statestored-main.cc
M be/src/testutil/in-process-servers.cc
M be/src/testutil/in-process-servers.h
M be/src/util/thread-pool-test.cc
M be/src/util/thread-pool.h
M be/src/util/thread.cc
M be/src/util/thread.h
47 files changed, 297 insertions(+), 149 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/7730/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-08-21 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..


Patch Set 1:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/7730/1//COMMIT_MSG
Commit Message:

Line 38: 3. The Thread::Create and ThreadPool::Create have
> Would unique_ptr do the job? I think you will need to support unique_ptr an
Changed to unique_ptr


http://gerrit.cloudera.org:8080/#/c/7730/2//COMMIT_MSG
Commit Message:

Line 18: 1. util/thread.h's Thread constructor is now private
> I think it's a matter of taste, but in some ways I prefer the constructor +
Yes, the constructor + Init() also makes subclassing easier. I switched 
ThreadPool over to this approach. I think it makes more sense for that. Thread 
seems like either would work, so I'm sticking with a static factory method for 
now.


Line 34: 1. QueryState::StartFInstances() needs appropriate
> One option is to defer the tricky cases til a follow-up patch and bring dow
I did that for query-state.cc, but I think that is probably the most common 
place for new threads to hit this exception, so it would be really nice to get 
it working.


http://gerrit.cloudera.org:8080/#/c/7730/1/be/src/exec/blocking-join-node.cc
File be/src/exec/blocking-join-node.cc:

Line 197: 
RETURN_IF_ERROR(Thread::Create(FragmentInstanceState::FINST_THREAD_GROUP_NAME,
> Do we need to release the thread token on this error path?
Good point, this definitely needs to release the thread token. Fixed.


http://gerrit.cloudera.org:8080/#/c/7730/2/be/src/rpc/TAcceptQueueServer.cpp
File be/src/rpc/TAcceptQueueServer.cpp:

Line 226: stop_ = true;
> Should we throw an exception here? What's the behavior when it continues ex
Setting stop_=true means we will not execute the loop and instead will do 
shutdown cleanup in the "if (stop_)" block and exit the function. Exiting the 
function stops the server.

At various points in this file, we catch exceptions, so I think we don't want 
an exception.


http://gerrit.cloudera.org:8080/#/c/7730/1/be/src/runtime/parallel-executor.cc
File be/src/runtime/parallel-executor.cc:

Line 39: RETURN_IF_ERROR(Thread::Create("parallel-executor", ss.str(),
> If we created some threads I think we still need to call JoinAll() to avoid
Yes, that definitely needs to happen. Those threads reference stack variables, 
so it would be very dangerous to just leave. Changed to do the JoinAll().


http://gerrit.cloudera.org:8080/#/c/7730/1/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

Line 339: // TODO: This error handling needs work.
> I think worst-case we could do a LOG(FATAL) for now and come back to fix th
Changed to LOG(FATAL) for the moment, but I am still looking into getting this 
to work.


http://gerrit.cloudera.org:8080/#/c/7730/1/be/src/service/impala-beeswax-server.cc
File be/src/service/impala-beeswax-server.cc:

PS1, Line 74: (void) U
> should use discard_result() once you rebase onto my change (GCC 7 does not 
Done


http://gerrit.cloudera.org:8080/#/c/7730/1/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

Line 404:   boost::scoped_ptr 
subscriber_topic_update_threadpool_;
> long lines
Done


http://gerrit.cloudera.org:8080/#/c/7730/2/be/src/statestore/statestored-main.cc
File be/src/statestore/statestored-main.cc:

Line 71:   ABORT_IF_ERROR(RegisterMemoryMetrics(metrics.get(), false, nullptr, 
nullptr));
> You also need to do:
Done


PS2, Line 72: StartMemoryMaintenanceThread();
> ABORT_IF_ERROR(StartMemoryMaintenanceThread());
Done


http://gerrit.cloudera.org:8080/#/c/7730/1/be/src/util/thread-pool.h
File be/src/util/thread-pool.h:

Line 123:   Status CreateThreads(const std::string& group, const std::string& 
thread_prefix,
> Should add WARN_UNUSED_RESULT to make sure it's not dropped anywhere.
Done


Line 185:   static Status Create(const std::string& group, const std::string& 
thread_prefix,
> Should add WARN_UNUSED_RESULT to make sure it's not dropped anywhere.
Done


http://gerrit.cloudera.org:8080/#/c/7730/2/be/src/util/thread-pool.h
File be/src/util/thread-pool.h:

Line 46:   ///  -- work_function: the function to run every time an item is 
consumed from the queue
> Update comment to include 'tpool' arg.
Changed approach on ThreadPool to make it follow the constructor + Init() 
pattern, so this change is gone.


PS2, Line 184: /// TODO: add comment
> Comment?
Removed


PS2, Line 188: DCHECK(pool == nullptr);
 : boost::scoped_ptr local_tpool;
 : local_tpool.reset(new CallableThreadPool(queue_size, 
::Worker));
 : RETURN_IF_ERROR(local_tpool->CreateThreads(group, 
thread_prefix, num_threads));
 : pool.swap(local_tpool);
 : return Status::OK();
> Why not just call the parent's Create() here?
Switching to the constructor + Init() pattern eliminated the need for this 
code. Now 

[Impala-ASF-CR] IMPALA-5352: Age out unused file handles from the cache

2017-08-20 Thread Joe McDonnell (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7640

to look at the new patch set (#4).

Change subject: IMPALA-5352: Age out unused file handles from the cache
..

IMPALA-5352: Age out unused file handles from the cache

Currently, a file handle in the file handle cache will
only be evicted if the cache reaches its capacity. This
means that file handles can be retained for an indefinite
amount of time. This is true even for files that have
been deleted, replaced, or modified. Since a file handle
maintains a file descriptor for local files, this can
prevent the disk space from being freed. Additionally,
unused file handles are wasted memory.

This adds code to evict file handles that have been
unused for longer than a specified threshold. A thread
periodically checks the file handle cache to see if
any file handle should be evicted. The threshold is
specified by 'unused_file_handle_timeout_sec'; it
defaults to 6 hours.

This adds a test to custom_cluster/test_hdfs_fd_caching.py
to verify the eviction behavior.

Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
---
M be/src/runtime/disk-io-mgr-handle-cache.h
M be/src/runtime/disk-io-mgr-handle-cache.inline.h
M be/src/runtime/disk-io-mgr.cc
M tests/custom_cluster/test_hdfs_fd_caching.py
4 files changed, 167 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/7640/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7640
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5352: Age out unused file handles from the cache

2017-08-20 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5352: Age out unused file handles from the cache
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7640/3/be/src/runtime/disk-io-mgr-handle-cache.h
File be/src/runtime/disk-io-mgr-handle-cache.h:

PS3, Line 137:/// Move constructor
 : FileHandleEntry(FileHandleEntry&& entry)
 : : fh(std::move(entry.fh)), lru_entry(entry.lru_entry) {
 :   DCHECK(!entry.in_use);
 : }
> Is this necessary? I think the compiler should generate a move constructor 
Good point, it isn't needed. Removed.


-- 
To view, visit http://gerrit.cloudera.org:8080/7640
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5352: Age out unused file handles from the cache

2017-08-19 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5352: Age out unused file handles from the cache
..


Patch Set 2:

(12 comments)

Also rebased to the latest

http://gerrit.cloudera.org:8080/#/c/7640/2//COMMIT_MSG
Commit Message:

Line 25: This adds a test to custom_cluster/test_hdfs_fd_caching.py
> worth thinking about testing this on a cluster where there is a huge burst 
Since this is happening every second, the cache should have a limited number of 
file handles to clean up on a single partition in a single iteration.


http://gerrit.cloudera.org:8080/#/c/7640/2/be/src/runtime/disk-io-mgr-handle-cache.h
File be/src/runtime/disk-io-mgr-handle-cache.h:

PS2, Line 97:  Returns an error if thread creation fails.
> now it's void :)
Done


http://gerrit.cloudera.org:8080/#/c/7640/2/be/src/runtime/disk-io-mgr-handle-cache.inline.h
File be/src/runtime/disk-io-mgr-handle-cache.inline.h:

Line 49:   shut_down_.Store(SHUTDOWN_FALSE);
> nit: I think we can just initialize this in the initialiser list.
Done


Line 67:   if (eviction_thread_.get() != nullptr) {
> nit: conditional fits on one line.
Done


PS2, Line 64: template 
: FileHandleCache::~FileHandleCache() {
:   shut_down_.Store(SHUTDOWN_TRUE);
:   if (eviction_thread_.get() != nullptr) {
: eviction_thread_->Join();
:   }
: }
> this never actually runs because this has process-lifetime, right?
Since this is part of the DiskIoMgr object, disk-io-mgr-test.cc covers this. 
Some of the tests construct a DiskIoMgr with a smart pointer and later reset 
that pointer.


PS2, Line 116: // This emplace_hint requires constructing a pair in the 
map. Emplacing an
 : // element with a multiple argument constructor (such as 
FileHandleEntry)
 : // introduces ambiguity between the key's arguments and the 
value's arguments.
 : // To resolve the ambiguity, use a piecewise construction 
and separate the
 : // arguments for the key from the value.
 : typename MapType::iterator new_it = 
p.cache.emplace_hint(range.second,
 : std::piecewise_construct, std::forward_as_tuple(*fname),
 : std::forward_as_tuple(new_fh, p.lru_list));
> Yeah I'm wondering if there's a simpler way to express it.
Changed this to use a move constructor. This is not performance critical. We 
can revisit this later if it bothers us.


PS2, Line 196: FileHandleCache::FileHandleCachePartition& p
> how about moving this function into FileHandleCachePartition, then calling 
I added detailed comments about the lru list manipulations, but I don't think 
putting it in a separate method makes sense. I looked at the function 
signatures and they would need a lot of explanation in the function comments, 
so I would rather have the comments inline. My main thought is that this code 
is one coherent whole, and anyone doing maintenance or improvement on it is 
going to have to understand every line. I'm reluctant to add code structures to 
hide anything.

Right now, FileHandleCachePartition is purely for memory layout. My general 
feeling is that adding methods to it and turning FileHandleCachePartition into 
a full object is overkill. It moves code around, but it doesn't really change 
anything.


PS2, Line 197: uint64_t now = MonotonicSeconds();
> maybe this should be determined in EvictHandlesLoop() before you start taki
This is called every second. I don't think there needs to be any consistency 
between partitions.


PS2, Line 198: valid
> valid is a bit vague
Changed to "allowed"


PS2, Line 199: 0
> hitting this branch seems extremely unlikely, only if the user sets the tim
Changed the non-eviction test to set the flag to a very large value.


PS2, Line 202: lru_entry = p.lru_list.front();
 : typename MapType::iterator evict_it = lru_entry.map_entry;
 : uint64_t lru_oldest_timestamp
> it'd be nice to use the same naming convention for these vars since they're
Done


http://gerrit.cloudera.org:8080/#/c/7640/2/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

Line 112: DEFINE_uint64(unused_file_handle_timeout, 43200, "Maximum time, in 
seconds, that an "
> I do like having the units in the name, would you mind appending _sec ?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7640
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5352: Age out unused file handles from the cache

2017-08-19 Thread Joe McDonnell (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7640

to look at the new patch set (#3).

Change subject: IMPALA-5352: Age out unused file handles from the cache
..

IMPALA-5352: Age out unused file handles from the cache

Currently, a file handle in the file handle cache will
only be evicted if the cache reaches its capacity. This
means that file handles can be retained for an indefinite
amount of time. This is true even for files that have
been deleted, replaced, or modified. Since a file handle
maintains a file descriptor for local files, this can
prevent the disk space from being freed. Additionally,
unused file handles are wasted memory.

This adds code to evict file handles that have been
unused for longer than a specified threshold. A thread
periodically checks the file handle cache to see if
any file handle should be evicted. The threshold is
specified by 'unused_file_handle_timeout_sec'; it
defaults to 6 hours.

This adds a test to custom_cluster/test_hdfs_fd_caching.py
to verify the eviction behavior.

Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
---
M be/src/runtime/disk-io-mgr-handle-cache.h
M be/src/runtime/disk-io-mgr-handle-cache.inline.h
M be/src/runtime/disk-io-mgr.cc
M tests/custom_cluster/test_hdfs_fd_caching.py
4 files changed, 173 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/7640/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7640
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-08-18 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded a new patch set (#2).

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..

IMPALA-5750: Catch exceptions from boost thread creation

The boost thread constructor will throw
boost::thread_resource_error if it is unable to spawn
a thread on the system (e.g. due to a ulimit). This
uncaught exception crashes Impala. Systems with a large
number of nodes and threads are hitting this limit.

This change catches the exception from the thread
constructor and converts it to a Status. This requires
several changes:
1. util/thread.h's Thread constructor is now private
and all Threads are constructed via a new Create()
static factory method.
2. util/thread-pool.h's ThreadPool constructor is now
private and all ThreadPools are constructed via a
new Create() static factory method.
3. To propagate the Status, Threads and ThreadPools
cannot be created in constructors, so this is moved to
initialization methods that can return Status.
4. Since Threads and ThreadPools need to be passed
as arguments to the Create method, this eliminates
the ability to use them as stack-allocated local
variables or direct declarations in classes. These
have been replaced with scoped_ptr's.

Remaining TODO:
1. QueryState::StartFInstances() needs appropriate
cleanup if thread create fails. This is intricate.
2. There is some duplicate code between ThreadPool
and CallableThreadPool to try to eliminate.
3. The Thread::Create and ThreadPool::Create have
both pointer and scoped_ptr versions. It would be
nice to only have one version.

Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
---
M be/src/benchmarks/thread-create-benchmark.cc
M be/src/catalog/catalog-server.cc
M be/src/common/init.cc
M be/src/common/init.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/kudu-scan-node.cc
M be/src/rpc/TAcceptQueueServer.cpp
M be/src/rpc/authentication.cc
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-server.cc
M be/src/rpc/thrift-thread.cc
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/parallel-executor.cc
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/scheduling/admission-controller.cc
M be/src/service/child-query.cc
M be/src/service/child-query.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/statestore/statestored-main.cc
M be/src/testutil/in-process-servers.cc
M be/src/util/hdfs-bulk-ops.cc
M be/src/util/hdfs-bulk-ops.h
M be/src/util/thread-pool-test.cc
M be/src/util/thread-pool.h
M be/src/util/thread.cc
M be/src/util/thread.h
38 files changed, 371 insertions(+), 177 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/7730/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation

2017-08-18 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7730

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
..

IMPALA-5750: Catch exceptions from boost thread creation

The boost thread constructor will throw
boost::thread_resource_error if it is unable to spawn
a thread on the system (e.g. due to a ulimit). This
uncaught exception crashes Impala. Several customers
are seeing this.

This change catches the exception from the thread
constructor and converts it to a Status. This requires
several changes:
1. util/thread.h's Thread constructor is now private
and all Threads are constructed via a new Create()
static factory method.
2. util/thread-pool.h's ThreadPool constructor is now
private and all ThreadPools are constructed via a
new Create() static factory method.
3. To propagate the Status, Threads and ThreadPools
cannot be created in constructors, so this is moved to
initialization methods that can return Status.
4. Since Threads and ThreadPools need to be passed
as arguments to the Create method, this eliminates
the ability to use them as stack-allocated local
variables or direct declarations in classes. These
have been replaced with scoped_ptr's.

Remaining TODO:
1. QueryState::StartFInstances() needs appropriate
cleanup if thread create fails. This is intricate.
2. There is some duplicate code between ThreadPool
and CallableThreadPool to try to eliminate.
3. The Thread::Create and ThreadPool::Create have
both pointer and scoped_ptr versions. It would be
nice to only have one version.

Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
---
M be/src/benchmarks/thread-create-benchmark.cc
M be/src/catalog/catalog-server.cc
M be/src/common/init.cc
M be/src/common/init.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/kudu-scan-node.cc
M be/src/rpc/TAcceptQueueServer.cpp
M be/src/rpc/authentication.cc
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-server.cc
M be/src/rpc/thrift-thread.cc
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/parallel-executor.cc
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-state.cc
M be/src/scheduling/admission-controller.cc
M be/src/service/child-query.cc
M be/src/service/child-query.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.cc
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/statestore/statestored-main.cc
M be/src/testutil/in-process-servers.cc
M be/src/util/hdfs-bulk-ops.cc
M be/src/util/hdfs-bulk-ops.h
M be/src/util/thread-pool-test.cc
M be/src/util/thread-pool.h
M be/src/util/thread.cc
M be/src/util/thread.h
38 files changed, 371 insertions(+), 177 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/7730/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-5352: Age out unused file handles from the cache

2017-08-17 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded a new patch set (#2).

Change subject: IMPALA-5352: Age out unused file handles from the cache
..

IMPALA-5352: Age out unused file handles from the cache

Currently, a file handle in the file handle cache will
only be evicted if the cache reaches its capacity. This
means that file handles can be retained for an indefinite
amount of time. This is true even for files that have
been deleted, replaced, or modified. Since a file handle
maintains a file descriptor for local files, this can
prevent the disk space from being freed. Additionally,
unused file handles are wasted memory.

This adds code to evict file handles that have been
unused for longer than a specified threshold. A thread
periodically checks the file handle cache to see if
any file handle should be evicted. The threshold is
specified by 'unused_file_handle_timeout'; it
defaults to 6 hours.

This adds a test to custom_cluster/test_hdfs_fd_caching.py
to verify the eviction behavior.

Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
---
M be/src/runtime/disk-io-mgr-handle-cache.h
M be/src/runtime/disk-io-mgr-handle-cache.inline.h
M be/src/runtime/disk-io-mgr.cc
M tests/custom_cluster/test_hdfs_fd_caching.py
4 files changed, 151 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/7640/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7640
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5598: Fix excessive dumping in MemLimitExceeded

2017-08-15 Thread Joe McDonnell (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7597

to look at the new patch set (#4).

Change subject: IMPALA-5598: Fix excessive dumping in MemLimitExceeded
..

IMPALA-5598: Fix excessive dumping in MemLimitExceeded

ExecQueryFInstances RPC timeouts in stress tests were
traced to excessive dumping in MemLimitExceeded() and
LogUsage(). The QueryState::Init() hits the process
memory limit, and this causes MemLimitExceeded to dump
the process memory tracker. On the stress test, this
involves dumping hundreds of queries and all the
structures underneath. This is expensive and the
contention between threads accessing these structures
causes RPC timeouts.

This adds the ability to the limit the recursive depth
when dumping memory trackers. It also modifies the
dumping in MemLimitExceeded() to have the following
semantics:
1. The query memory tracker is always logged in full
if it exists.
2. The process memory tracker is logged if the query
memory tracker doesn't exist or if the process memory
limit is being hit. The process memory tracker is
limited to dumping only two levels of children.

Other uses of LogUsage() are not limited (e.g. /memz
and the query memory page on the web UI).

A stress test run with the process memory tracker
limited to two levels showed no RPC timeouts.

Change-Id: Ib151c3894d4c43b508779e6809d77e2f8af89cf2
---
M be/src/exec/exec-node.cc
M be/src/exec/hash-table-test.cc
M be/src/runtime/bufferpool/reservation-tracker-test.cc
M be/src/runtime/initial-reservations.cc
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/runtime-state.cc
M be/src/service/impala-http-handler.cc
M be/src/util/default-path-handlers.cc
9 files changed, 61 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/7597/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7597
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib151c3894d4c43b508779e6809d77e2f8af89cf2
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5598: Fix excessive dumping in MemLimitExceeded

2017-08-15 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5598: Fix excessive dumping in MemLimitExceeded
..


Patch Set 3:

(2 comments)

Also rebased to the latest.

http://gerrit.cloudera.org:8080/#/c/7597/3/be/src/runtime/mem-tracker.cc
File be/src/runtime/mem-tracker.cc:

PS3, Line 353: logged_query_tracker
> rather than introducing another control variable, why not just 'state == nu
Done


http://gerrit.cloudera.org:8080/#/c/7597/3/be/src/runtime/mem-tracker.h
File be/src/runtime/mem-tracker.h:

PS3, Line 338: summary of the queries
> does this mean it it dumps the query-level tracker but nothing below? if so
Yes, it has a single-line summary of the query tracker. I updated the comment 
to indicate this.


-- 
To view, visit http://gerrit.cloudera.org:8080/7597
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib151c3894d4c43b508779e6809d77e2f8af89cf2
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5598: Fix excessive dumping in MemLimitExceeded

2017-08-14 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5598: Fix excessive dumping in MemLimitExceeded
..


Patch Set 2:

(1 comment)

Fix long line and fix a backend test I had missed.

http://gerrit.cloudera.org:8080/#/c/7597/2/be/src/exec/exec-node.cc
File be/src/exec/exec-node.cc:

Line 218:<< 
state->instance_mem_tracker()->LogUsage(MemTracker::UNLIMITED_DEPTH);
> Long line
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7597
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib151c3894d4c43b508779e6809d77e2f8af89cf2
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5598: Fix excessive dumping in MemLimitExceeded

2017-08-14 Thread Joe McDonnell (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/7597

to look at the new patch set (#3).

Change subject: IMPALA-5598: Fix excessive dumping in MemLimitExceeded
..

IMPALA-5598: Fix excessive dumping in MemLimitExceeded

ExecQueryFInstances RPC timeouts in stress tests were
traced to excessive dumping in MemLimitExceeded() and
LogUsage(). The QueryState::Init() hits the process
memory limit, and this causes MemLimitExceeded to dump
the process memory tracker. On the stress test, this
involves dumping hundreds of queries and all the
structures underneath. This is expensive and the
contention between threads accessing these structures
causes RPC timeouts.

This adds the ability to the limit the recursive depth
when dumping memory trackers. It also modifies the
dumping in MemLimitExceeded() to have the following
semantics:
1. The query memory tracker is always logged in full
if it exists.
2. The process memory tracker is logged if the query
memory tracker doesn't exist or if the process memory
limit is being hit. The process memory tracker is
limited to dumping only two levels of children.

Other uses of LogUsage() are not limited (e.g. /memz
and the query memory page on the web UI).

A stress test run with the process memory tracker
limited to two levels showed no RPC timeouts.

Change-Id: Ib151c3894d4c43b508779e6809d77e2f8af89cf2
---
M be/src/exec/exec-node.cc
M be/src/exec/hash-table-test.cc
M be/src/runtime/bufferpool/reservation-tracker-test.cc
M be/src/runtime/initial-reservations.cc
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/runtime-state.cc
M be/src/service/impala-http-handler.cc
M be/src/util/default-path-handlers.cc
9 files changed, 62 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/7597/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7597
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib151c3894d4c43b508779e6809d77e2f8af89cf2
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5598: Fix excessive dumping in MemLimitExceeded

2017-08-14 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded a new patch set (#2).

Change subject: IMPALA-5598: Fix excessive dumping in MemLimitExceeded
..

IMPALA-5598: Fix excessive dumping in MemLimitExceeded

ExecQueryFInstances RPC timeouts in stress tests were
traced to excessive dumping in MemLimitExceeded() and
LogUsage(). The QueryState::Init() hits the process
memory limit, and this causes MemLimitExceeded to dump
the process memory tracker. On the stress test, this
involves dumping hundreds of queries and all the
structures underneath. This is expensive and the
contention between threads accessing these structures
causes RPC timeouts.

This adds the ability to the limit the recursive depth
when dumping memory trackers. It also modifies the
dumping in MemLimitExceeded() to have the following
semantics:
1. The query memory tracker is always logged in full
if it exists.
2. The process memory tracker is logged if the query
memory tracker doesn't exist or if the process memory
limit is being hit. The process memory tracker is
limited to dumping only two levels of children.

Other uses of LogUsage() are not limited (e.g. /memz
and the query memory page on the web UI).

A stress test run with the process memory tracker
limited to two levels showed no RPC timeouts.

Change-Id: Ib151c3894d4c43b508779e6809d77e2f8af89cf2
---
M be/src/exec/exec-node.cc
M be/src/exec/hash-table-test.cc
M be/src/runtime/initial-reservations.cc
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
M be/src/runtime/runtime-state.cc
M be/src/service/impala-http-handler.cc
M be/src/util/default-path-handlers.cc
8 files changed, 61 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/7597/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7597
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib151c3894d4c43b508779e6809d77e2f8af89cf2
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-5598: Fix excessive dumping in MemLimitExceeded

2017-08-14 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5598: Fix excessive dumping in MemLimitExceeded
..


Patch Set 1:

(3 comments)

Rebased all the way.

http://gerrit.cloudera.org:8080/#/c/7597/1/be/src/runtime/mem-tracker.cc
File be/src/runtime/mem-tracker.cc:

PS1, Line 341: (process_capacity < failed_allocation_size)
> nit: unnecessary parens.
Done


PS1, Line 344: 1
> +1 I was wondering about this as well. The pool tracker is probably not goi
Changed this to two levels. Stress tests show that this isn't a problem. I also 
added constants for the code to use.


http://gerrit.cloudera.org:8080/#/c/7597/1/be/src/runtime/mem-tracker.h
File be/src/runtime/mem-tracker.h:

PS1, Line 324:   int max_recursive_levels = INT_MAX
> Maybe we should make it a required argument to force new callsites to think
Changed this to a required argument.


-- 
To view, visit http://gerrit.cloudera.org:8080/7597
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib151c3894d4c43b508779e6809d77e2f8af89cf2
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5352: Age out unused file handles from the cache

2017-08-10 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7640

Change subject: IMPALA-5352: Age out unused file handles from the cache
..

IMPALA-5352: Age out unused file handles from the cache

Currently, a file handle in the file handle cache will
only be evicted if the cache reaches its capacity. This
means that file handles can be retained for an indefinite
amount of time. This is true even for files that have
been deleted, replaced, or modified. Since a file handle
maintains a file descriptor for local files, this can
prevent the disk space from being freed. Additionally,
unused file handles are wasted memory.

This adds code to evict file handles that have been
unused for longer than a specified threshold. A thread
periodically checks the file handle cache to see if
any file handle should be evicted. The threshold is
specified by 'unused_file_handle_timeout_mins'; it
defaults to 6 hours.

This adds a test to custom_cluster/test_hdfs_fd_caching.py
to verify the eviction behavior.

Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
---
M be/src/runtime/disk-io-mgr-handle-cache.h
M be/src/runtime/disk-io-mgr-handle-cache.inline.h
M be/src/runtime/disk-io-mgr.cc
M tests/custom_cluster/test_hdfs_fd_caching.py
4 files changed, 138 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/7640/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7640
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Iefe04b3e2e22123ecb8b3e494934c93dfb29682e
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-5158,IMPALA-5236: account for unused buffer pool reservations

2017-08-08 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5158,IMPALA-5236: account for unused buffer pool 
reservations
..


Patch Set 6: Code-Review+1

This looks good to me.

-- 
To view, visit http://gerrit.cloudera.org:8080/7380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idb1fa3110dc893321f9f4e8ced6b7ede12194dad
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5158,IMPALA-5236: account for unused buffer pool reservations

2017-08-08 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5158,IMPALA-5236: account for unused buffer pool 
reservations
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7380/5/be/src/runtime/mem-tracker.h
File be/src/runtime/mem-tracker.h:

PS5, Line 266: uint64_t val = consumption_metric_->value();
 : if (negate_consumption_metric_) val = -val;
val needs to be signed.


-- 
To view, visit http://gerrit.cloudera.org:8080/7380
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idb1fa3110dc893321f9f4e8ced6b7ede12194dad
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5598: Fix excessive dumping in MemLimitExceeded

2017-08-04 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7597

Change subject: IMPALA-5598: Fix excessive dumping in MemLimitExceeded
..

IMPALA-5598: Fix excessive dumping in MemLimitExceeded

ExecQueryFInstances RPC timeouts in stress tests were
traced to excessive dumping in MemLimitExceeded() and
LogUsage(). The QueryState::Init() hits the process
memory limit, and this causes MemLimitExceeded to dump
the process memory tracker. On the stress test, this
involves dumping hundreds of queries and all the
structures underneath. This is expensive and the
contention between threads accessing these structures
causes RPC timeouts.

This adds the ability to the limit the recursive depth
when dumping memory trackers. It also modifies the
dumping in MemLimitExceeded() to have the following
semantics:
1. The query memory tracker is always logged in full
if it exists.
2. The process memory tracker is logged if the query
memory tracker doesn't exist or if the process memory
limit is being hit. The process memory tracker is
limited to dumping only its immediate children.

Other uses of LogUsage() are not limited (e.g. /memz
and the query memory page on the web UI).

A stress test run with the process memory tracker
limited to dumping its immediate children showed no
RPC timeouts.

Change-Id: Ib151c3894d4c43b508779e6809d77e2f8af89cf2
---
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
2 files changed, 39 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/97/7597/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7597
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib151c3894d4c43b508779e6809d77e2f8af89cf2
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 


[Impala-ASF-CR] IMPALA-1575: Yield admission control resources at query end

2017-08-02 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change.

Change subject: IMPALA-1575: Yield admission control resources at query end
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7079/3/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS3, Line 82:   if (desc_tbl_ != nullptr) desc_tbl_->ReleaseResources();
This may need to be in ~QueryState(). I'm investigating how desc_tbl_ is used. 
So far, testing hasn't revealed any issue with having this here.


-- 
To view, visit http://gerrit.cloudera.org:8080/7079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia5003d017b3142a160bacf7e3569ff26026b1700
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


  1   2   3   >