[Impala-ASF-CR] IMPALA-4543: Properly escape ignored tests subdirectories.

2016-11-29 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-4543: Properly escape ignored tests subdirectories.
..


Patch Set 1:

(2 comments)

Just wondering why we started hitting that ? Why isn't it showing up in other 
Jenkins jobs ?

http://gerrit.cloudera.org:8080/#/c/5242/1/tests/run-tests.py
File tests/run-tests.py:

PS1, Line 68: print >> sys.stderr, "run_tests arguments:", args
sys.stderr.write('Unexpected exception with pytest %s' % args)


PS1, Line 144: '\''
Is this supposed to be \' instead ?


-- 
To view, visit http://gerrit.cloudera.org:8080/5242
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I006eb559ec5f5b5b0379997fab945116dfc7e8d7
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

2016-11-29 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
..


Patch Set 13: Code-Review+1

Thanks for the review Alex. Carrying +1.

-- 
To view, visit http://gerrit.cloudera.org:8080/5148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

2016-11-29 Thread Bharath Vissapragada (Code Review)
Hello Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5148

to look at the new patch set (#13).

Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
..

IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

This patch improves the block metadata loading (locations and disk
storage IDs) for partitioned and un-partitioned tables in the Catalog
server.

Without this patch:
--
We loop through each and every file in the table/partition directories
and call getFileBlockLocations() on it to obtain the block metadata.
This results in large number of RPC calls to the Namenode, especially
for tables with large no. of files/partitions.

With this patch:
---
We move the block metadata querying to use listStatus() call which
accepts a directory as input and fetches the 'BlockLocation' objects
for every file recursively in that directory. This improves the
metadata loading in the following ways.

- For non-partitioned tables, we query all the BlockLocations in a
single RPC call in the base table directory and load the corresponding
disk IDs.

- For partitioned tables, we query the BlockLocations for all the
partitions residing under the base table directories in a single RPC
and then load every partition with non-default partition directory
separately.

- REFRESH on a table reloads the block metadata from scratch for
every data file every time. So it can be used as a replacement for
invalidate in situations like HDFS block rebalancing which needs
block metadata update.

Also, this patch does away with VolumeIds returned by the HDFS NN
and uses the new StorageIDs returned by the BlockLocation class.
These StorageIDs are UUID strings and hence are mapped to a
per-node 0-based index as expected by the backend. In the upcoming
versions of Hadoop APIs, getFileBlockStorageLocations() is deprecated
and instead the listStatus() returns BlockLocations with storage IDs
embedded. This patch makes use of this improvement to reduce an
additional RPC to the NN to fetch the storage locations.

Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
---
A fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
5 files changed, 379 insertions(+), 345 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/5148/13
-- 
To view, visit http://gerrit.cloudera.org:8080/5148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Mostafa Mokhtar 


[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

2016-11-29 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
..


Patch Set 12:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/5148/12//COMMIT_MSG
Commit Message:

Line 15: We loop throuh each and every file in the table/partition directories
> through
Done


Line 17: This results in large no. of RPC calls to namenode, especially with
> no. -> number
Done


Line 35: 
> mention the behavior change of REFRESH
Done


http://gerrit.cloudera.org:8080/#/c/5148/12/fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java
File fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java:

Line 31:  * - To maintain consistent mapping across all the table instances so 
that the disk thread
> ... a consistent mapping ... so that the assignment of scan ranges to I/O t
Done


http://gerrit.cloudera.org:8080/#/c/5148/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 288:* Queries the filesystem to load the file block metadata (e.g. DFS 
blocks) for the
> Suggest rephrasing:
Done


Line 296:*   and enumerate all its blocks and their corresponding hosts and 
disk IDs.
> remove part about disk ids, I think that's the next point
Done


Line 778: LOG.debug("partsByPath size: " + partsByPath.size());
> check log lvl
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-29 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..


Patch Set 4: Code-Review+1

(7 comments)

http://gerrit.cloudera.org:8080/#/c/5250/4/be/src/runtime/plan-fragment-executor.cc
File be/src/runtime/plan-fragment-executor.cc:

PS4, Line 294: if (!status.ok()) {
 :
nit: one line


PS4, Line 298: opened_promise_.Get()
here it seems clearer to return status.


PS4, Line 331: RETURN_IF_ERROR(opened_promise_.Get());
I'm still of the opinion that this logic is better placed in FragmentExecState, 
e.g.:

  if (fragment->Open().OK()) {
fragment->Exec();
  }
  fragment->Close();

and then here you could just DCHECK(opened_promise.IsSet() && 
opened_promise_.Get().OK()); which is a simpler lifecycle invariant. That's in 
keeping with how the PFE lifecycle is managed now, and it seems to make sense 
to have all that logic in one place.

For example, we don't have the same logic in Open() wrt prepare_promise_, which 
we should do for consistency with this method as implemented here.


PS4, Line 447: that
the


Line 489: 
how about DCHECK(!report_thread_active_) here? I'd like to get some guarantee 
that FragmentComplete() was called in all executions it should have been before 
Close().


http://gerrit.cloudera.org:8080/#/c/5250/3/be/src/runtime/plan-fragment-executor.h
File be/src/runtime/plan-fragment-executor.h:

PS3, Line 66: setting that flag to 0 disables periodic reporting altogether
> Isn't that what the next sentence is saying? Or am I misunderstanding your 
Oh yeah. I was thrown off by 'disables periodic reporting altogether', which 
while correct was confusing.


http://gerrit.cloudera.org:8080/#/c/5250/4/be/src/runtime/plan-fragment-executor.h
File be/src/runtime/plan-fragment-executor.h:

PS4, Line 249:  
Nit: extra space.


-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

2016-11-29 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
..


Patch Set 12: Code-Review+1

(7 comments)

I'm pretty happy with the changes. Let's let Dimitris take a look.

http://gerrit.cloudera.org:8080/#/c/5148/12//COMMIT_MSG
Commit Message:

Line 15: We loop throuh each and every file in the table/partition directories
through


Line 17: This results in large no. of RPC calls to namenode, especially with
no. -> number

to the NameNode

especially for large tables


Line 35: 
mention the behavior change of REFRESH


http://gerrit.cloudera.org:8080/#/c/5148/12/fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java
File fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java:

Line 31:  * - To maintain consistent mapping across all the table instances so 
that the disk thread
... a consistent mapping ... so that the assignment of scan ranges to I/O 
threads is balanced and consistent for all scans on the same host.


http://gerrit.cloudera.org:8080/#/c/5148/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 288:* Queries the filesystem to load the file block metadata (e.g. DFS 
blocks) for the
Suggest rephrasing:

Drops and re-loads the block metadata for all partitions in 'partsByPath' whose 
location is under the given 'dirPath'. It involves the following steps:
...


Line 296:*   and enumerate all its blocks and their corresponding hosts and 
disk IDs.
remove part about disk ids, I think that's the next point


Line 778: LOG.debug("partsByPath size: " + partsByPath.size());
check log lvl


-- 
To view, visit http://gerrit.cloudera.org:8080/5148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables

2016-11-29 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables
..


IMPALA-2890: Support ALTER TABLE statements for Kudu tables

With this commit, we add support for additional ALTER TABLE statements
against Kudu tables. The new supported ALTER TABLE operations for Kudu are:
- ADD/DROP range partitions. Syntax:
ALTER TABLE  ADD [IF NOT EXISTS] RANGE 
ALTER TABLE  DROP [IF EXISTS] RANGE 
- ADD/DROP/RENAME column. Syntax:
ALTER TABLE  ADD COLUMNS (col_spec, [col_spec, ...])
ALTER TABLE  DROP COLUMN 
ALTER TABLE  CHANGE COLUMN   
- Rename Kudu table using the 'kudu.table_name' table property. Example:
  ALTER TABLE  SET TBLPROPERTY ('kudu.tbl_name'=''),
  will change the underlying Kudu table name to .
- Renaming the HMS/Catalog table entry of a Kudu table is supported using the
  existing ALTER TABLE  RENAME TO  syntax.

Not supported:
- ALTER TABLE  REPLACE COLUMNS

Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896
Reviewed-on: http://gerrit.cloudera.org:8080/5136
Reviewed-by: Dimitris Tsirogiannis 
Tested-by: Internal Jenkins
---
M common/thrift/JniCatalog.thrift
M fe/src/main/cup/sql-parser.cup
A 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddReplaceColsStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableChangeColStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetFileFormatStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetLocationStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/DistributeParam.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu_alter.test
M tests/query_test/test_kudu.py
20 files changed, 964 insertions(+), 58 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Dimitris Tsirogiannis: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5136
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Internal Jenkins


[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables

2016-11-29 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables
..


Patch Set 10: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5136
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Internal Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

2016-11-29 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has uploaded a new patch set (#12).

Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
..

IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

This patch improves the block metadata loading (locations and disk
storage IDs) for partitioned and un-partitioned tables in the Catalog
server.

Without this patch:
--
We loop throuh each and every file in the table/partition directories
and call getFileBlockLocations() on it to obtain the block metadata.
This results in large no. of RPC calls to namenode, especially with
tables with large no. of files/partitions.

With this patch:
---
We move the block metadata querying to use listStatus() call which
accepts a directory as input and fetches the 'BlockLocation' objects
for every file recursively in that directory. This improves the
metadata loading in the following ways.

- For non-partitioned tables, we query all the BlockLocations in a
single RPC call in the base table directory and load the corresponding
disk IDs.

- For partitioned tables, we query the BlockLocations for all the
partitions residing under the base table directories in a single RPC
and then load every partition with non-default partition directory
separately.

Also, this patch does away with VolumeIds returned by the HDFS NN
and uses the new StorageIDs returned by the BlockLocation class.
These StorageIDs are UUID strings and hence are mapped to a
per-node 0-based index as expected by the backend. In the upcoming
versions of Hadoop APIs, getFileBlockStorageLocations() is deprecated
and instead the listStatus() returns BlockLocations with storage IDs
embedded. This patch makes use of this improvement to reduce an
additional RPC to the NN to fetch the storage locations.

Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
---
A fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
5 files changed, 380 insertions(+), 344 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/5148/12
-- 
To view, visit http://gerrit.cloudera.org:8080/5148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Mostafa Mokhtar 


[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

2016-11-29 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
..


Patch Set 11:

(40 comments)

http://gerrit.cloudera.org:8080/#/c/5148/11/fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java
File fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java:

Line 29:  * ids. This is internally implemented as a static object to share the 
storageIDs across
> Several sentences Starting with "This", sometimes not clear what "This" ref
Done


Line 44: // sequential 0-based integer disk id used by the hdfs scanners. 
This assumes that
> used by the BE scanners.
Done


Line 46: private ConcurrentHashMap storageIdToInt =
> storageUuidToDiskId
Done


Line 50: // with the corresponding latest 0-based integer ID. Given a 
storage UUID, we first
> with -> to
Done


Line 51: // look in 'storageIdtoInt' map if it already exists, else we 
generate a new ID for
> no need for this second sentence, that's what we do in getDiskId()
Done


Line 64:   if (Strings.isNullOrEmpty(storageUuid)) return diskId;
> let's move this check to the caller and require the storageUuid to be non-n
Done


Line 68:   // across the cluster, it is possible that we have a good hit 
rate.
> it is possible -> we expect to have a good hit rate.
Done


Line 72: // At this point, it could be possible due to race, that 
storageId might've been
> // Mapping might have been added by another thread that entered the synchro
Done


Line 76: // No mapping exists, create a new disk ID for 'storageId'
> stoprageUuid
Done


Line 80:   // Host hasn't been populated yet. Start with a 0-based id.
> // First disk id of this host.
Done


http://gerrit.cloudera.org:8080/#/c/5148/11/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 289:* given directory 'dirPath'. Only blocks corresponding to paths 
from 'partsByPath'
> Mention that the block metadata is effectively dropped and re-created.
Done


Line 307:   RemoteIterator fileStatusIter = 
fs.listFiles(dirPath, true);
> move right before the while loop
Done


Line 308:   // Reset the current state of partitions under dirPath from 
perPartitionFileDescMap_
> Reset -> Clear
Done


Line 310:   for (Path partDir: partsByPath.keySet()) {
> Iterate over the entrySet(), you are using an unnecessary get(partDir) belo
Done


Line 321: String fileName = fileStatus.getPath().getName();
> move closer to where it's used, i.e. right above L336
Done


Line 322: if (!FileSystemUtil.isValidPartitionFile(fileStatus)) 
continue;
> isValidDataFile()
Done


Line 323: // At this point, file status points to a valid file under 
the directory. Now we
> // Find the partition that this file belongs (if any).
Done


Line 325: Path partPathDir = fileStatus.getPath().getParent();
> move closer to where it's used, i.e. L357
It is used in the next 2,3 lines? Moved partPathDirName to line before the loop.


Line 330: // Skip if this file doesn't correspond to any of the 
existing partitions.
> Skip if this file does not belong to any known partition.
Done


Line 332:   LOG.debug("File " + fileStatus.getPath().toString() + " 
doesn't correspond " +
> add if to check log level
Done


Line 333:   " to a valid partition. Skipping metadata load for this 
file.");
> to a known partition
Done


Line 361: 
Preconditions.checkState(perPartitionFileDescMap_.containsKey(partPathDirName));
> avoid duplicate lookup with get() + Preconditions.checkNotNull()
Done


Line 363: // Update the partitions metadata corresponding to this file 
descriptor
> partitions' metadata that this file belongs to.
Done


Line 371: LOG.warn("Unknown disk id count for filesystem " + fs + ":" + 
unknownDiskIdCount);
> add if to check log level
Done


Line 374:   throw new RuntimeException("Error loading block md for 
directory "
> block metadata
Done


Line 380:* Loads the disk IDs for BlockLocation 'location' and its 
corresponding file block.
> Comment should mention the storage UUID to disk id mapping.
Done


Line 386: String[] hosts;
> Do you know if these are the unique hosts? I'm wondering if storageIds and 
Earlier code had a if block for this. Per my understanding HDFS doesn't place 
more than one copy of the same block on the same machine. Still, I think its 
good to have it for debugging. I'm adding it with an error message.


Line 394: for (int i =0; i < storageIds.length; ++i) {
> int i = 0; (space after "=")
Done


Line 410: for (Path partPath: partsByPath.keySet()) {
> iterate over Map.Entry, you are doing a get() in the next line
Done


Line 412:   // For each valid file in the partPath, synthesize the block 
metadata.
> For each data file in the 

[Impala-ASF-CR] IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading tpcds.store sales

2016-11-29 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-4482: Use ALTER TABLE / RECOVER PARTITIONS when loading 
tpcds.store_sales
..


Patch Set 3:

Any further thoughts on this?

-- 
To view, visit http://gerrit.cloudera.org:8080/5177
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iaae97d1d44201aeeacacdd39adbae35753512950
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Harrison Sheinblatt 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4554: fix projection of nested collections with mt dop > 0

2016-11-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4554: fix projection of nested collections with mt_dop > 0
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5270/1/testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test
File testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test:

Line 27: 
> remove
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5270
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I42e72eae8dfa78f7d53708eb8f2f0da8b780692d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4554: fix projection of nested collections with mt dop > 0

2016-11-29 Thread Tim Armstrong (Code Review)
Hello Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5270

to look at the new patch set (#2).

Change subject: IMPALA-4554: fix projection of nested collections with mt_dop > 0
..

IMPALA-4554: fix projection of nested collections with mt_dop > 0

Change-Id: I42e72eae8dfa78f7d53708eb8f2f0da8b780692d
---
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/unnest-node.cc
M testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test
3 files changed, 21 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/5270/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5270
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I42e72eae8dfa78f7d53708eb8f2f0da8b780692d
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 


[Impala-ASF-CR] IMPALA-4554: fix projection of nested collections with mt dop > 0

2016-11-29 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4554: fix projection of nested collections with mt_dop > 0
..


Patch Set 1: Code-Review+2

(1 comment)

Perfect, thank you!

http://gerrit.cloudera.org:8080/#/c/5270/1/testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test
File testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test:

Line 27: 
remove


-- 
To view, visit http://gerrit.cloudera.org:8080/5270
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I42e72eae8dfa78f7d53708eb8f2f0da8b780692d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Alex Behm 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4554: fix projection of nested collections with mt dop > 0

2016-11-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5270

Change subject: IMPALA-4554: fix projection of nested collections with mt_dop > 0
..

IMPALA-4554: fix projection of nested collections with mt_dop > 0

Change-Id: I42e72eae8dfa78f7d53708eb8f2f0da8b780692d
---
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/unnest-node.cc
M testdata/workloads/functional-query/queries/QueryTest/mt-dop-parquet.test
3 files changed, 22 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/70/5270/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5270
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I42e72eae8dfa78f7d53708eb8f2f0da8b780692d
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

2016-11-29 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
..


Patch Set 11:

(40 comments)

http://gerrit.cloudera.org:8080/#/c/5148/11/fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java
File fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java:

Line 29:  * ids. This is internally implemented as a static object to share the 
storageIDs across
Several sentences Starting with "This", sometimes not clear what "This" 
references.

Would be nice to list the reasons in bullet point, something like this.

The storage UUID to disk id is global (shared across tables) for the following 
reasons:
1. Mention consistent scan range to disk thread assignment
2. Mention memory benefit
3. any more?


Line 44: // sequential 0-based integer disk id used by the hdfs scanners. 
This assumes that
used by the BE scanners.


Line 46: private ConcurrentHashMap storageIdToInt =
storageUuidToDiskId


Line 50: // with the corresponding latest 0-based integer ID. Given a 
storage UUID, we first
with -> to


Line 51: // look in 'storageIdtoInt' map if it already exists, else we 
generate a new ID for
no need for this second sentence, that's what we do in getDiskId()


Line 64:   if (Strings.isNullOrEmpty(storageUuid)) return diskId;
let's move this check to the caller and require the storageUuid to be non-null 
and non-empty in here


Line 68:   // across the cluster, it is possible that we have a good hit 
rate.
it is possible -> we expect to have a good hit rate.


Line 72: // At this point, it could be possible due to race, that 
storageId might've been
// Mapping might have been added by another thread that entered the 
synchronized block first.


Line 76: // No mapping exists, create a new disk ID for 'storageId'
stoprageUuid


Line 80:   // Host hasn't been populated yet. Start with a 0-based id.
// First disk id of this host.


http://gerrit.cloudera.org:8080/#/c/5148/11/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 289:* given directory 'dirPath'. Only blocks corresponding to paths 
from 'partsByPath'
Mention that the block metadata is effectively dropped and re-created.

Please provide a brief summary of the steps involved in this function. 
something like:
1. Clear block metadata of partitions
2. Call HDFS to list all files
3. etc.


Line 307:   RemoteIterator fileStatusIter = 
fs.listFiles(dirPath, true);
move right before the while loop


Line 308:   // Reset the current state of partitions under dirPath from 
perPartitionFileDescMap_
Reset -> Clear


Line 310:   for (Path partDir: partsByPath.keySet()) {
Iterate over the entrySet(), you are using an unnecessary get(partDir) below.


Line 321: String fileName = fileStatus.getPath().getName();
move closer to where it's used, i.e. right above L336


Line 322: if (!FileSystemUtil.isValidPartitionFile(fileStatus)) 
continue;
isValidDataFile()


Line 323: // At this point, file status points to a valid file under 
the directory. Now we
// Find the partition that this file belongs (if any).


Line 325: Path partPathDir = fileStatus.getPath().getParent();
move closer to where it's used, i.e. L357


Line 330: // Skip if this file doesn't correspond to any of the 
existing partitions.
Skip if this file does not belong to any known partition.


Line 332:   LOG.debug("File " + fileStatus.getPath().toString() + " 
doesn't correspond " +
add if to check log level


Line 333:   " to a valid partition. Skipping metadata load for this 
file.");
to a known partition


Line 361: 
Preconditions.checkState(perPartitionFileDescMap_.containsKey(partPathDirName));
avoid duplicate lookup with get() + Preconditions.checkNotNull()


Line 363: // Update the partitions metadata corresponding to this file 
descriptor
partitions' metadata that this file belongs to.


Line 371: LOG.warn("Unknown disk id count for filesystem " + fs + ":" + 
unknownDiskIdCount);
add if to check log level


Line 374:   throw new RuntimeException("Error loading block md for 
directory "
block metadata


Line 380:* Loads the disk IDs for BlockLocation 'location' and its 
corresponding file block.
Comment should mention the storage UUID to disk id mapping.


Line 386: String[] hosts;
Do you know if these are the unique hosts? I'm wondering if storageIds and 
hosts are guaranteed to have the same length.


Line 394: for (int i =0; i < storageIds.length; ++i) {
int i = 0; (space after "=")


Line 410: for (Path partPath: partsByPath.keySet()) {
iterate over Map.Entry, you are doing a get() in the next line


Line 412:   // For each valid file in the partPath, synthesize the block 
metadata.
For each data 

[Impala-ASF-CR] IMPALA-4450: qgen: use string concatenation operator for postgres queries

2016-11-29 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4450: qgen: use string concatenation operator for 
postgres queries
..


Patch Set 4: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/5034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I149b695889addfd7df4ca5f40dc991456da51687
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs from query model

2016-11-29 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4343,IMPALA-4354: qgen: model INSERTs; write INSERTs 
from query model
..


Patch Set 5:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/5162/5/tests/comparison/model_translator.py
File tests/comparison/model_translator.py:

Line 105:   def _write_query(self, query):
Where does _write_query get used?


PS5, Line 430: is not
How about changing the order of if and else to get rid of the not:
if insert_clause.column_permutations is None:


http://gerrit.cloudera.org:8080/#/c/5162/5/tests/comparison/query.py
File tests/comparison/query.py:

PS5, Line 35: self.parent
It's a good idea to add a comment explaining what each variable means.


Line 40:   def table_exprs(self):
I think it's kind of strange to have both @abstractproperty and the 
implementation. Is this a common pattern?


Line 50: return table_exprs
return self.with_clause.table_exprs if self.with_clause else []


PS5, Line 61: Query
I think the word query is not the best choice any more. Let's call this 
SelectStatement?


Line 84: 
why a blank line?


PS5, Line 661: column_permutations is an optional sequence of Column objects
Can you explain in more detail what this is? (e.g. is it a list? what are the 
element types?)


-- 
To view, visit http://gerrit.cloudera.org:8080/5162
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I38e24da78c49e908449b35f0a6276ebe4236ddba
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Michael Brown 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4553: ntpd must be synchronized for kudu to start.

2016-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4553: ntpd must be synchronized for kudu to start.
..


Patch Set 3: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I371e01e21246a8c0ece98ca7d4bf6761615127b4
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4553: ntpd must be synchronized for kudu to start.

2016-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4553: ntpd must be synchronized for kudu to start.
..


IMPALA-4553: ntpd must be synchronized for kudu to start.

When ntpd is not synchronized, kudu initialization fails on the master
node:

F1129 16:37:28.969956 15230 master_main.cc:68] Check failed:
_s.ok() Bad status: Service unavailable: Cannot initialize clock:
Error reading clock. Clock considered unsynchronized

Change-Id: I371e01e21246a8c0ece98ca7d4bf6761615127b4
Reviewed-on: http://gerrit.cloudera.org:8080/5258
Reviewed-by: Jim Apple 
Tested-by: Impala Public Jenkins
---
M testdata/cluster/admin
1 file changed, 7 insertions(+), 0 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Jim Apple: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I371e01e21246a8c0ece98ca7d4bf6761615127b4
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-29 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..


Patch Set 4:

Henry, I tried doing some of the refactoring we discussed, but they don't seem 
like clear improvements to me because of how intertwined FES and PFE currently 
are.  For example, FES::ReportStatusCb() accesses PFE state directly and calls 
PFE methods like PFE::Cancel(), and so executing that code from either inside 
or after PFE::Close() seems confusing at best.

I do agree further cleanup here between FES and PFE classes is needed, but I 
think without a more wholistic cleanup we don't get anywhere better.  And I 
think the cleanup needs to be driven by the needs of MT or we will just redo it 
shortly.  So, I'd like to move forward with the fix in current form, which I 
think is still a nice improvement over the existing code since we get rid of 
some unnecessary thread-shared state and complexity.  Okay with you?

-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4397 addendum: remove stray semicolon

2016-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4397 addendum: remove stray semicolon
..


Patch Set 1:

Build started: http://35.164.73.121:8080/job/gerrit-verify-dryrun/54/

-- 
To view, visit http://gerrit.cloudera.org:8080/5261
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie16e403ae0eb657f0614a1a90b0556f9f1d1056e
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4550: Fix CastExpr analysis for substituted slots

2016-11-29 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4550: Fix CastExpr analysis for substituted slots
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/5267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7f29cdc359558fad6df455b8eec0e0eaed00e996
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4397 addendum: remove stray semicolon

2016-11-29 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4397 addendum: remove stray semicolon
..


Patch Set 1:

Failed because https://issues.cloudera.org/browse/IMPALA-4557

-- 
To view, visit http://gerrit.cloudera.org:8080/5261
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie16e403ae0eb657f0614a1a90b0556f9f1d1056e
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4550: Fix CastExpr analysis for substituted slots

2016-11-29 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4550: Fix CastExpr analysis for substituted slots
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5267/2/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

Line 2572: where  (cast(a.string_col as String) > 'a');
> string (lowercase)
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7f29cdc359558fad6df455b8eec0e0eaed00e996
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4550: Fix CastExpr analysis for substituted slots

2016-11-29 Thread Lars Volker (Code Review)
Hello Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5267

to look at the new patch set (#3).

Change subject: IMPALA-4550: Fix CastExpr analysis for substituted slots
..

IMPALA-4550: Fix CastExpr analysis for substituted slots

During slot substitution, the type of the child of a CastExpr can
change. If the previous child type matched the CastExpr, then the cast
was flagged as noOp_. During substitution and subsequent re-analysis
the noOp_ flag was not revisited so that no cast was performed, even
after it had become necessary.

The fix is to always set noOp_ to the correct value in
CastExpr.analyze().

Change-Id: I7f29cdc359558fad6df455b8eec0e0eaed00e996
---
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
2 files changed, 16 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/5267/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f29cdc359558fad6df455b8eec0e0eaed00e996
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-4550: Fix CastExpr analysis for substituted slots

2016-11-29 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4550: Fix CastExpr analysis for substituted slots
..


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5267/2/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

Line 2572: where  (cast(a.string_col as String) > 'a');
string (lowercase)


-- 
To view, visit http://gerrit.cloudera.org:8080/5267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7f29cdc359558fad6df455b8eec0e0eaed00e996
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4550: Fix CastExpr analysis for substituted slots

2016-11-29 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-4550: Fix CastExpr analysis for substituted slots
..


Patch Set 1:

(5 comments)

Thanks for the review. Please see PS2.

http://gerrit.cloudera.org:8080/#/c/5267/1//COMMIT_MSG
Commit Message:

Line 15: The fix is to set the noOp_ flag to the correct value in all cases.
> The fix is to always set noOp_ to the correct value in CastExpr.analyze().
Done


http://gerrit.cloudera.org:8080/#/c/5267/1/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

Line 2568: # explicit cast.
> a no-op explicit cast.
Done


Line 2569: SELECT a.id
> Add /* + straight_join */ to make sure the join order is fixed. A change in
Done


Line 2570: FROM   functional.alltypes a
> use alltypestiny, no need to do this "large" join
Done


Line 2572: WHERE  (Cast(a.string_col as String) > 'a');
> nit: use consistent casing for SQL keywords (everything lowercase like in t
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7f29cdc359558fad6df455b8eec0e0eaed00e996
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4550: Fix CastExpr analysis for substituted slots

2016-11-29 Thread Lars Volker (Code Review)
Hello Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5267

to look at the new patch set (#2).

Change subject: IMPALA-4550: Fix CastExpr analysis for substituted slots
..

IMPALA-4550: Fix CastExpr analysis for substituted slots

During slot substitution, the type of the child of a CastExpr can
change. If the previous child type matched the CastExpr, then the cast
was flagged as noOp_. During substitution and subsequent re-analysis
the noOp_ flag was not revisited so that no cast was performed, even
after it had become necessary.

The fix is to always set noOp_ to the correct value in
CastExpr.analyze().

Change-Id: I7f29cdc359558fad6df455b8eec0e0eaed00e996
---
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
2 files changed, 16 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/5267/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7f29cdc359558fad6df455b8eec0e0eaed00e996
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 


[Impala-ASF-CR] IMPALA-4397 addendum: remove stray semicolon

2016-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4397 addendum: remove stray semicolon
..


Patch Set 1: Verified-1

Build failed: http://35.164.73.121:8080/job/gerrit-verify-dryrun/52/

-- 
To view, visit http://gerrit.cloudera.org:8080/5261
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie16e403ae0eb657f0614a1a90b0556f9f1d1056e
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4357: Fix DROP TABLE to pass analysis if the table fails to load

2016-11-29 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded a new patch set (#2).

Change subject: IMPALA-4357: Fix DROP TABLE to pass analysis if the table fails 
to load
..

IMPALA-4357: Fix DROP TABLE to pass analysis if the table fails to load

If a table fails to load, eg. because it was deleted externally from
Kudu, we should still allow 'DROP TABLE' to pass analysis. Otherwise,
you may be unable to drop tables that are in a bad state.

Testing:
- Updates existing Kudu tests to reflect the new behavior, and fixes
a couple of problems with those tests that were causing them to pass
spuriously (as well as fixing the same problem with another test in
the file while I'm here).

Change-Id: I6b41fc3c0e95508ab67f1d420b033b02ec75a5da
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/PartitionSpecBase.java
M tests/query_test/test_kudu.py
4 files changed, 30 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/5144/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5144
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6b41fc3c0e95508ab67f1d420b033b02ec75a5da
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-4550: Fix CastExpr analysis for substituted slots

2016-11-29 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4550: Fix CastExpr analysis for substituted slots
..


Patch Set 1: Code-Review+2

(5 comments)

Nice!

http://gerrit.cloudera.org:8080/#/c/5267/1//COMMIT_MSG
Commit Message:

Line 15: The fix is to set the noOp_ flag to the correct value in all cases.
The fix is to always set noOp_ to the correct value in CastExpr.analyze().


http://gerrit.cloudera.org:8080/#/c/5267/1/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

Line 2568: # explicit cast.
a no-op explicit cast.


Line 2569: SELECT a.id
Add /* + straight_join */ to make sure the join order is fixed. A change in 
join order would make us lose coverage.


Line 2570: FROM   functional.alltypes a
use alltypestiny, no need to do this "large" join


Line 2572: WHERE  (Cast(a.string_col as String) > 'a');
nit: use consistent casing for SQL keywords (everything lowercase like in the 
rest of this file)


-- 
To view, visit http://gerrit.cloudera.org:8080/5267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7f29cdc359558fad6df455b8eec0e0eaed00e996
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Alex Behm 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4550: Fix CastExpr analysis for substituted slots

2016-11-29 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5267

Change subject: IMPALA-4550: Fix CastExpr analysis for substituted slots
..

IMPALA-4550: Fix CastExpr analysis for substituted slots

During slot substitution, the type of the child of a CastExpr can
change. If the previous child type matched the CastExpr, then the cast
was flagged as noOp_. During substitution and subsequent re-analysis
the noOp_ flag was not revisited so that no cast was performed, even
after it had become necessary.

The fix is to set the noOp_ flag to the correct value in all cases.

Change-Id: I7f29cdc359558fad6df455b8eec0e0eaed00e996
---
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
2 files changed, 16 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/5267/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5267
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I7f29cdc359558fad6df455b8eec0e0eaed00e996
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] IMPALA-4363: Add Parquet timestamp validation

2016-11-29 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has uploaded a new patch set (#9).

Change subject: IMPALA-4363: Add Parquet timestamp validation
..

IMPALA-4363: Add Parquet timestamp validation

Before this patch, we would simply read the INT96 Parquet timestamp
representation and assume that it's valid. However, not all bit
permutations represent a valid timestamp. One of the boost functions
raised an exception (that we didn't catch) when passed an invalid
boost date object, which resulted in a crash. This patch fixes
problem by validating that the date falls into 1400.. year
range and time falls into 00:00:00..23:59:59.9 range as we
are scanning Parquet.

Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/runtime/timestamp-value.h
M common/thrift/generate_error_codes.py
M testdata/data/README
A testdata/data/out_of_range_timestamp.parquet
M testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test
A 
testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-abort-on-error.test
A 
testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-continue-on-error.test
M 
testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test
M testdata/workloads/functional-query/queries/QueryTest/parquet.test
M tests/common/impala_test_suite.py
M tests/query_test/test_scanners.py
13 files changed, 150 insertions(+), 36 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/4968/9
-- 
To view, visit http://gerrit.cloudera.org:8080/4968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4363: Add Parquet timestamp validation

2016-11-29 Thread Taras Bobrovytsky (Code Review)
Hello Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4968

to look at the new patch set (#9).

Change subject: IMPALA-4363: Add Parquet timestamp validation
..

IMPALA-4363: Add Parquet timestamp validation

Before this patch, we would simply read the INT96 Parquet timestamp
representation and assume that it's valid. However, not all bit
permutations represent a valid timestamp. One of the boost functions
raised an exception (that we didn't catch) when passed an invalid
boost date object, which resulted in a crash. This patch fixes
problem by validating that the date falls into 1400.. year
range and time falls into 00:00:00..23:59:59.9 range as we
are scanning Parquet.

Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/runtime/timestamp-value.h
M common/thrift/generate_error_codes.py
M testdata/data/README
A testdata/data/out_of_range_timestamp.parquet
M testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test
A 
testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-abort-on-error.test
A 
testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-continue-on-error.test
M 
testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test
M testdata/workloads/functional-query/queries/QueryTest/parquet.test
M tests/common/impala_test_suite.py
M tests/query_test/test_scanners.py
13 files changed, 150 insertions(+), 36 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/4968/9
-- 
To view, visit http://gerrit.cloudera.org:8080/4968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-3398: Rework Impala documentation to be non-Cloudera-specific

2016-11-29 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-3398: Rework Impala documentation to be 
non-Cloudera-specific
..


Patch Set 2:

(13 comments)

> (12 comments)
 > 
 > Because I'm advocating re-doing the automated portions of this
 > change at a later time, I don't want to promote this patch set
 > as-is. Is it practical to say "abandon" for the code review but
 > still come back and pick out bits and pieces (e.g. the changes in
 > impala_jdbc.xml) to become part of a later code review?

Technically, yes - "Abandon"ed patches remain just as visible and can be 
resurrected with one click.

However, the usual way to do things is to just address the comments and upload 
a new "Patch Set", as I have done.

http://gerrit.cloudera.org:8080/#/c/5239/2//COMMIT_MSG
Commit Message:

Line 7: IMPALA-3398: Rework Impala documentation to be non-Cloduera-specific
> "Cloudera"
Done


PS2, Line 16: sed -i 's/ rev=""//g' $FILENAME
> Let's leave this step out, for a couple of reasons:
Done


http://gerrit.cloudera.org:8080/#/c/5239/2/docs/impala_html.ditaval
File docs/impala_html.ditaval:

PS2, Line 6: 
> That's fine. We will need to make the same addition in the Cloudera doc bui
SGTM.

Keep in mind that this gerrit instance, though hosted at cloudera.org, is for 
Apache Impala, so whatever choices Cloudera makes about their git repo is not 
really in-scope here.


http://gerrit.cloudera.org:8080/#/c/5239/2/docs/topics/impala_alter_table.xml
File docs/topics/impala_alter_table.xml:

PS2, Line 25:   ALTER TABLE 
statement
> For the  tags, maybe we could change those to audience="HTML" so
I think that's orthogonal to this change. It's usually best if patches address 
a single item, along with maybe some other trivial ones, not big ones like "Are 
index terms included".

As such, I'd suggest it go in another commit and we keep this one focused on 
excising "Cloudera"/"CDH"/"Cloudera Manager".


PS2, Line 220: 
> The reason I'm so liberal with release numbers and JIRA numbers with this r
SGTM


PS2, Line 406: 
> Here's a case where perhaps there is a corresponding IMPALA- JIRA number we
Done


http://gerrit.cloudera.org:8080/#/c/5239/2/docs/topics/impala_conversion_functions.xml
File docs/topics/impala_conversion_functions.xml:

PS2, Line 30:   Although in , the 
SHOW FUNCTIONS output for
> Just checking that this was a change introduced by Jim. This is a case wher
Yes, it was.


PS2, Line 443: 
> These castto*() functions might be intended to be permanently for internal 
I think that's an issue for a future patch, since it's a question of docs 
correctness, not of ASF/Cloudera separation.


PS2, Line 619: 
> Hmm, this might have a corresponding IMPALA- JIRA number that should go in 
I'd call this future work.


http://gerrit.cloudera.org:8080/#/c/5239/2/docs/topics/impala_jdbc.xml
File docs/topics/impala_jdbc.xml:

PS2, Line 82: In Impala 2.0 and later, you must use the Hive 0.13 JDBC 
driver.  If you are
: already using JDBC applications with an earlier Impala 
release, you must update
: your JDBC driver, because the Hive 0.12 driver that was 
formerly the only choice
: is not compatible with Impala
> Urgh. "Must" seems like too strong a statement. I wonder if the right thing
Removed "must"s.

I'd say Apache Impala's docs should talk about open-source JDBC drivers and 
vendor's docs should talk about their own JDBC drivers.


PS2, Line 120:   To get the JAR files, install the Hive JDBC driver on 
each host in the cluster that will run
> It's worth checking if there's a generic Hadoop or Hive URL we could refer 
Couldn't find one; added TODO


PS2, Line 139:   hive-common-X.XX.X.jar
 :   hive-jdbc-X.XX.X.jar
 :   hive-metastore-X.XX.X.jar
 :   hive-servi
> We should confirm that these filenames are accurate if someone goes the gen
I poked around on Google and it appears so.


PS2, Line 161:   https://github.com/onefoursix/Cloudera-Impala-JDBC-Example; 
scope="external" format="html">this
> Ironic that there's a reference to "Cloudera Impala" that perhaps we could 
indeed!


-- 
To view, visit http://gerrit.cloudera.org:8080/5239
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3f63fb309e0617d7fe014231bb0ab0ad67c8474
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Anonymous Coward #250
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: John Russell 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3398: Rework Impala documentation to be non-Cloudera-specific

2016-11-29 Thread Jim Apple (Code Review)
Jim Apple has uploaded a new patch set (#3).

Change subject: IMPALA-3398: Rework Impala documentation to be 
non-Cloudera-specific
..

IMPALA-3398: Rework Impala documentation to be non-Cloudera-specific

This fix is a prototype for discussion. The changes were partially
automated through:

replace 'audience="Cloudera"' 'audience="Hidden"' -- $FILENAME

sed -ri 's/(rev="[^"]*)(CDH-[0-9]+)([^"]*")/\1\3/g' $FILENAME

sed -ri 's/(rev="[^"]*) "/\1"/g' $FILENAME

sed -ri 's/(rev="[^"]* ) /\1/g' $FILENAME

Change-Id: Ib3f63fb309e0617d7fe014231bb0ab0ad67c8474
---
M docs/impala_html.ditaval
M docs/impala_pdf.ditaval
M docs/topics/impala_alter_table.xml
M docs/topics/impala_conversion_functions.xml
M docs/topics/impala_jdbc.xml
M docs/topics/impala_operators.xml
6 files changed, 93 insertions(+), 157 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/5239/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5239
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib3f63fb309e0617d7fe014231bb0ab0ad67c8474
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Anonymous Coward #250
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: John Russell 


[Impala-ASF-CR] IMPALA-4363: Add Parquet timestamp validation

2016-11-29 Thread Taras Bobrovytsky (Code Review)
Hello Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4968

to look at the new patch set (#9).

Change subject: IMPALA-4363: Add Parquet timestamp validation
..

IMPALA-4363: Add Parquet timestamp validation

Before this patch, we would simply read the INT96 Parquet timestamp
representation and assume that it's valid. However, not all bit
permutations represent a valid timestamp. One of the boost functions
raised an exception (that we didn't catch) when passed an invalid
boost date object, which resulted in a crash. This patch fixes
problem by validating that the date falls into 1400.. year
range and time falls into 00:00:00..23:59:59.9 range as we
are scanning Parquet.

Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/runtime/timestamp-value.h
M common/thrift/generate_error_codes.py
M testdata/data/README
A testdata/data/out_of_range_timestamp.parquet
M testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test
A 
testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-abort-on-error.test
A 
testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-continue-on-error.test
M 
testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test
M testdata/workloads/functional-query/queries/QueryTest/parquet.test
M tests/common/impala_test_suite.py
M tests/query_test/test_scanners.py
13 files changed, 152 insertions(+), 36 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/4968/9
-- 
To view, visit http://gerrit.cloudera.org:8080/4968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4550: Fix CastExpr analysis for substituted slots

2016-11-29 Thread Lars Volker (Code Review)
Lars Volker has abandoned this change.

Change subject: IMPALA-4550: Fix CastExpr analysis for substituted slots
..


Abandoned

Needs rebase.

-- 
To view, visit http://gerrit.cloudera.org:8080/5265
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: Id8841fe7fd9106041e9b68970adf265d2aa22be2
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] IMPALA-4550: Fix CastExpr analysis for substituted slots

2016-11-29 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5265

Change subject: IMPALA-4550: Fix CastExpr analysis for substituted slots
..

IMPALA-4550: Fix CastExpr analysis for substituted slots

During slot substitution, the type of the child of a CastExpr can
change. If the previous child type matched the CastExpr, then the cast
was flagged as noOp_. During substitution and subsequent re-analysis
the noOp_ flag was not revisited so that no cast was performed, even
after it had become necessary.

The fix is to set the noOp_ flag to the correct value in all cases.

Change-Id: Id8841fe7fd9106041e9b68970adf265d2aa22be2
---
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
2 files changed, 16 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/5265/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5265
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Id8841fe7fd9106041e9b68970adf265d2aa22be2
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker 


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-29 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5250/3/be/src/runtime/plan-fragment-executor.cc
File be/src/runtime/plan-fragment-executor.cc:

PS3, Line 421: id PlanFragmentExecutor::Sen
nit: This is in 2 places now. Here and ReportStatusCb().


-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4363: Add Parquet timestamp validation

2016-11-29 Thread Taras Bobrovytsky (Code Review)
Hello Alex Behm,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4968

to look at the new patch set (#9).

Change subject: IMPALA-4363: Add Parquet timestamp validation
..

IMPALA-4363: Add Parquet timestamp validation

Before this patch, we would simply read the INT96 Parquet timestamp
representation and assume that it's valid. However, not all bit
permutations represent a valid timestamp. One of the boost functions
raised an exception (that we didn't catch) when passed an invalid
boost date object, which resulted in a crash. This patch fixes
problem by validating that the date falls into 1400.. year
range and time falls into 00:00:00..23:59:59.9 range as we
are scanning Parquet.

Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf
---
M be/src/exec/parquet-column-readers.cc
M be/src/exec/parquet-column-readers.h
M be/src/runtime/timestamp-value.h
M common/thrift/generate_error_codes.py
M testdata/data/README
A testdata/data/out_of_range_timestamp.parquet
M testdata/workloads/functional-query/queries/QueryTest/grant_revoke.test
A 
testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-abort-on-error.test
A 
testdata/workloads/functional-query/queries/QueryTest/out-of-range-timestamp-continue-on-error.test
M 
testdata/workloads/functional-query/queries/QueryTest/parquet-resolution-by-name.test
M testdata/workloads/functional-query/queries/QueryTest/parquet.test
M tests/common/impala_test_suite.py
M tests/query_test/test_scanners.py
13 files changed, 152 insertions(+), 36 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/4968/9
-- 
To view, visit http://gerrit.cloudera.org:8080/4968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

2016-11-29 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has uploaded a new patch set (#11).

Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
..

IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

This patch improves the block metadata loading (locations and disk
storage IDs) for partitioned and un-partitioned tables in the Catalog
server.

Without this patch:
--
We loop throuh each and every file in the table/partition directories
and call getFileBlockLocations() on it to obtain the block metadata.
This results in large no. of RPC calls to namenode, especially with
tables with large no. of files/partitions.

With this patch:
---
We move the block metadata querying to use listStatus() call which
accepts a directory as input and fetches the 'BlockLocation' objects
for every file recursively in that directory. This improves the
metadata loading in the following ways.

- For non-partitioned tables, we query all the BlockLocations in a
single RPC call in the base table directory and load the corresponding
disk IDs.

- For partitioned tables, we query the BlockLocations for all the
partitions residing under the base table directories in a single RPC
and then load every partition with non-default partition directory
separately.

Also, this patch does away with VolumeIds returned by the HDFS NN
and uses the new StorageIDs returned by the BlockLocation class.
These StorageIDs are UUID strings and hence are mapped to a
per-node 0-based index as expected by the backend. In the upcoming
versions of Hadoop APIs, getFileBlockStorageLocations() is deprecated
and instead the listStatus() returns BlockLocations with storage IDs
embedded. This patch makes use of this improvement to reduce an
additional RPC to the NN to fetch the storage locations.

Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
---
A fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
4 files changed, 358 insertions(+), 341 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/5148/11
-- 
To view, visit http://gerrit.cloudera.org:8080/5148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Mostafa Mokhtar 


[Impala-ASF-CR] IMPALA-4363: Add Parquet timestamp validation

2016-11-29 Thread Taras Bobrovytsky (Code Review)
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4363: Add Parquet timestamp validation
..


Patch Set 8:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4968/8/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

Line 473: ErrorMsg* msg = NULL;
> see comment below. this pointer is never freed.  what's the objection to pu
Ok, I agree, it's better to put the logic inside ValidateSlot().


Line 484:   // This point is reached if slot validation succeeds, but slot 
conversion fails.
> but don't we need to validate the converted value? i.e. the timestamp might
Convert slot should never return an invalid value. I tried it manually and it 
works correctly.


Line 603: *msg = new ErrorMsg(TErrorCode::PARQUET_TIMESTAMP_OUT_OF_RANGE,
> this memory is leaked.
This is not a problem any more.


http://gerrit.cloudera.org:8080/#/c/4968/7/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

PS7, Line 160: g buffer. The size of the buffer should be a
> What I'm asking is: don't some days legitimately have more than NUMBER_OF_N
Yes, that's true, some days can have an extra second: 
https://en.wikipedia.org/wiki/Leap_second So far there hasn't been a case where 
a day has 2 leap seconds, but it's not impossible in the future.

If we do this, the way seconds are printed to screen would be incorrect, Impala 
would print "24:00:00" instead of "23:59:60".

Also, we would need to be able to cast string as timestamp that looks like 
"23:59:60" if we want to support leap seconds.

I'm not really sure what's the right solution here. Should we just increase the 
constant by 1?

By the way, even Oracle doesn't handle leap seconds properly (the number of 
seconds can't be greater that 59): 
http://stackoverflow.com/questions/31136211/how-to-handle-leap-seconds-in-oracle


-- 
To view, visit http://gerrit.cloudera.org:8080/4968
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9988449aa0dc0f39fabb91ce6cce0dd8a06e8bcf
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Taras Bobrovytsky 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-29 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..


Patch Set 3:

(1 comment)

Patchset 4 addresses the comments that don't involve refactoring, in case we 
want to use this option. I'll try some minor refactoring in the next patchset.

http://gerrit.cloudera.org:8080/#/c/5250/3/be/src/runtime/plan-fragment-executor.h
File be/src/runtime/plan-fragment-executor.h:

PS3, Line 66: setting that flag to 0 disables periodic reporting altogether
> Update comment to say that at least one report (the final one) will always 
Isn't that what the next sentence is saying? Or am I misunderstanding your 
comment?


-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-29 Thread Dan Hecht (Code Review)
Dan Hecht has uploaded a new patch set (#4).

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..

IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

The PlanFragmentExecutor has some state shared between the main
execution thread and the periodic reporting thread that isn't
synchronized properly.  IMPALA-4504 describes one such problem, and that
bug was introduced in an attempt to fix another similar race.

Let's just simplify all of this and remove this shared state.  Instead,
the profile thread will always be responsible for sending periodic
'!done' messages, and the main execution thread will always be
responsible for sending the final 'done' message (after joining the
periodic thread).

This will allow for even more simplification, in particular the
interaction between FragementExecState and PlanFragementExecutor, but
I'm not doing that now as to avoid more conflicts with the MT work.

Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
---
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
2 files changed, 41 insertions(+), 96 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/4
-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 


[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

2016-11-29 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
..


Patch Set 9:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/5148/9/fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java
File fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java:

Line 38: private static DiskIdMapper INSTANCE = new DiskIdMapper();
> just make it public
Done


http://gerrit.cloudera.org:8080/#/c/5148/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 403:   for (List partLists: partsByPath.values()) {
> It seems simpler to move this logic into the while loop above and update th
Done


Line 417:   loadDiskIds(perFsFileBlocks);
> I think we can get rid of this function and move the logic into the while l
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3398: Rework Impala documentation to be non-Cloduera-specific

2016-11-29 Thread John Russell (Code Review)
John Russell has posted comments on this change.

Change subject: IMPALA-3398: Rework Impala documentation to be 
non-Cloduera-specific
..


Patch Set 2:

(12 comments)

Because I'm advocating re-doing the automated portions of this change at a 
later time, I don't want to promote this patch set as-is. Is it practical to 
say "abandon" for the code review but still come back and pick out bits and 
pieces (e.g. the changes in impala_jdbc.xml) to become part of a later code 
review?

http://gerrit.cloudera.org:8080/#/c/5239/2//COMMIT_MSG
Commit Message:

PS2, Line 16: sed -i 's/ rev=""//g' $FILENAME
Let's leave this step out, for a couple of reasons:
- There's no harm to have rev="" with an empty attribute value.
- There are some cases where rev="CDH-xyz" could also be annotated with the 
IMPALA- number, so I think it would be useful to leave behind the empty 
attribute as a reminder that this was a change due to bug fix or feature 
addition.
- Sometimes for new feature docs, I add rev="" when I don't have the JIRA 
number handy and come back and fill it in later.


http://gerrit.cloudera.org:8080/#/c/5239/2/docs/impala_html.ditaval
File docs/impala_html.ditaval:

PS2, Line 6: 
That's fine. We will need to make the same addition in the Cloudera doc build 
so that the attribute name works transparently in both contexts.


http://gerrit.cloudera.org:8080/#/c/5239/2/docs/topics/impala_alter_table.xml
File docs/topics/impala_alter_table.xml:

PS2, Line 25:   ALTER TABLE 
statement
For the  tags, maybe we could change those to audience="HTML" so 
they were visible in the PDF. The Cloudera doc team has always been resistant 
to using index entries but I do have them (just hidden) all through the doc 
source. We should do a trial and see if enabling them produces a decent-looking 
index in the PDF.


PS2, Line 220: 
The reason I'm so liberal with release numbers and JIRA numbers with this rev= 
attribute is in hopes of automating an "errata" mechanism in future. I.e. "what 
changed in such-and-such a release, what changed due to such-and-such an 
issue". I have an 80% working prototype that I should demo for you guys, to 
judge should that be something that becomes part of the upstream docs or 
continue development on it as a Cloudera-only extension.


PS2, Line 406: 
Here's a case where perhaps there is a corresponding IMPALA- JIRA number we 
should dig up and include in the rev= attribute. That's why I would say to 
leave an empty rev="" behind, so as not to lose that aspect of the history.


http://gerrit.cloudera.org:8080/#/c/5239/2/docs/topics/impala_conversion_functions.xml
File docs/topics/impala_conversion_functions.xml:

PS2, Line 30:   Although in , the 
SHOW FUNCTIONS output for
Just checking that this was a change introduced by Jim. This is a case where we 
could re-eveluate whether to take out that note, if the situation it refers to 
is no longer applicable. (I haven't confirmed in this particular case yet.)


PS2, Line 443: 
These castto*() functions might be intended to be permanently for internal 
Impala use only, in which case we could remove the corresponding info entirely.


PS2, Line 619: 
Hmm, this might have a corresponding IMPALA- JIRA number that should go in the 
rev=.


http://gerrit.cloudera.org:8080/#/c/5239/2/docs/topics/impala_jdbc.xml
File docs/topics/impala_jdbc.xml:

PS2, Line 82: In Impala 2.0 and later, you must use the Hive 0.13 JDBC 
driver.  If you are
: already using JDBC applications with an earlier Impala 
release, you must update
: your JDBC driver, because the Hive 0.12 driver that was 
formerly the only choice
: is not compatible with Impala
Urgh. "Must" seems like too strong a statement. I wonder if the right thing to 
do here is to say you definitely can use the Hive driver, and maybe there's a 
vendor-specific driver. Because so many times in the past we've had to warn 
some people away from the Hive driver for performance reasons, and direct other 
people to the Hive driver to work around some bug in the Cloudera driver.


PS2, Line 120:   To get the JAR files, install the Hive JDBC driver on 
each host in the cluster that will run
It's worth checking if there's a generic Hadoop or Hive URL we could refer to 
here.


PS2, Line 139:   hive-common-X.XX.X.jar
 :   hive-jdbc-X.XX.X.jar
 :   hive-metastore-X.XX.X.jar
 :   hive-servi
We should confirm that these filenames are accurate if someone goes the generic 
route when installing the Hive driver.


PS2, Line 161:   https://github.com/onefoursix/Cloudera-Impala-JDBC-Example; 
scope="external" format="html">this
Ironic that there's a reference to "Cloudera Impala" that perhaps we could 
never actually get rid of.


-- 
To view, visit http://gerrit.cloudera.org:8080/5239
To unsubscribe, visit 

[Impala-ASF-CR] Fix E2E test infrastructure to handle missing exceptions correctly

2016-11-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Fix E2E test infrastructure to handle missing exceptions 
correctly
..


Patch Set 2: Code-Review+2

We should do an exhaustive run before merging.

-- 
To view, visit http://gerrit.cloudera.org:8080/5260
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d83c5db59e8a239e4e70694a1e625af6f21419c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4512: Add a script that builds Impala on stock Ubuntu 14.04.

2016-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4512: Add a script that builds Impala on stock Ubuntu 
14.04.
..


IMPALA-4512: Add a script that builds Impala on stock Ubuntu 14.04.

This is a simpler alternative to bootstrap_development.sh - it
acquires enough dependencies to build, but does not attempt to load
the test data or even build the tests. This is sometimes a lightweight
testing method used by Apache PPMC members who are voting on a release
of an incubating project.

Change-Id: If34e398052a61dfda9825b1cf3a918eb61736048
Reviewed-on: http://gerrit.cloudera.org:8080/5154
Reviewed-by: Jim Apple 
Tested-by: Impala Public Jenkins
---
M README.md
A bin/bootstrap_build.sh
2 files changed, 39 insertions(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Verified
  Jim Apple: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: If34e398052a61dfda9825b1cf3a918eb61736048
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4512: Add a script that builds Impala on stock Ubuntu 14.04.

2016-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4512: Add a script that builds Impala on stock Ubuntu 
14.04.
..


Patch Set 5: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If34e398052a61dfda9825b1cf3a918eb61736048
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Fix E2E test infrastructure to handle missing exceptions correctly

2016-11-29 Thread Michael Ho (Code Review)
Michael Ho has uploaded a new patch set (#2).

Change subject: Fix E2E test infrastructure to handle missing exceptions 
correctly
..

Fix E2E test infrastructure to handle missing exceptions correctly

This change fixes a bug in the E2E infrastructure that handles
the case when an expected exception wasn't thrown. The code was
expecting that test_section['CATCH'] to be a string but in
reality it's a list of strings. It also clarifies the error
message about the missing exception. This change also enforces
that the CATCH subsection in tests cannot be empty.

Change-Id: I7d83c5db59e8a239e4e70694a1e625af6f21419c
---
M tests/common/impala_test_suite.py
M tests/util/test_file_parser.py
2 files changed, 7 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/5260/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5260
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7d83c5db59e8a239e4e70694a1e625af6f21419c
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3875: Thrift threaded server hang in some cases

2016-11-29 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5263

Change subject: IMPALA-3875: Thrift threaded server hang in some cases
..

IMPALA-3875: Thrift threaded server hang in some cases

We use socket timeouts for our backend connections. We set these
timeouts only after we've open()'d the connection, which ideally
should be fine. However, our TSaslTransport stack does read()'s and
write()'s over the network on an open(), which means that on a secure
cluster we send and recieve non-TCP-handshake packets on open(). This
is because the current code tries to establish a SASL handshake during
open().

If for any reason the peer server does not respond to the read()'s
during the open() call (after connect() is successful), the client
will wait on read() indefinitely. This patch sets the socket timeout
before we call open(), so that the read()'s and write()'s during the
open() are subject to the timeout as well.

We should also consider making a larger change where this SASL
handshake does not take place during an open(), but instead after the
open() call is completed, so as to have the open() semantics be the
same for both secure and insecure clusters.

Change-Id: I6c8f91a88f723e0e58e81bb385c5a8f190021868
---
M be/src/runtime/client-cache.cc
1 file changed, 5 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/5263/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5263
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I6c8f91a88f723e0e58e81bb385c5a8f190021868
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4553: ntpd must be synchronized for kudu to start.

2016-11-29 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4553: ntpd must be synchronized for kudu to start.
..


Patch Set 3: Code-Review+2

carry Tim's

-- 
To view, visit http://gerrit.cloudera.org:8080/5258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I371e01e21246a8c0ece98ca7d4bf6761615127b4
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[Impala-ASF-CR] Start a docs build system.

2016-11-29 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: Start a docs build system.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5238/2/docs/Makefile
File docs/Makefile:

> High-level question: would it make sense to use CMake for consistency? It w
I considered it, but I thought for a build this primitive it would be best to 
start with the simpler tool. My perception is that Make is a simpler tool, and 
we already expect users to have it (or Ninja) installed.

What do you think?


-- 
To view, visit http://gerrit.cloudera.org:8080/5238
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9551b75268cb7cb29a58367a3ef03b127dccbfca
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Anonymous Coward #250
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4553: ntpd must be synchronized for kudu to start.

2016-11-29 Thread Jim Apple (Code Review)
Hello Todd Lipcon, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5258

to look at the new patch set (#3).

Change subject: IMPALA-4553: ntpd must be synchronized for kudu to start.
..

IMPALA-4553: ntpd must be synchronized for kudu to start.

When ntpd is not synchronized, kudu initialization fails on the master
node:

F1129 16:37:28.969956 15230 master_main.cc:68] Check failed:
_s.ok() Bad status: Service unavailable: Cannot initialize clock:
Error reading clock. Clock considered unsynchronized

Change-Id: I371e01e21246a8c0ece98ca7d4bf6761615127b4
---
M testdata/cluster/admin
1 file changed, 7 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/5258/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I371e01e21246a8c0ece98ca7d4bf6761615127b4
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-2057: Better error message for incorrect avro decimal column declaration

2016-11-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-2057: Better error message for incorrect avro decimal 
column declaration
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5255/1/fe/src/main/java/org/apache/impala/util/AvroSchemaParser.java
File fe/src/main/java/org/apache/impala/util/AvroSchemaParser.java:

Line 131: String logicalType = schema.getProp("logicalType");
The control-flow here is unnecessarily convoluted (it was before your patch 
with returning null and the fall-through of the case). I think we should clean 
it up instead of adding to the complexity. It would make more sense to:

  1. check if a logical type is present, and throw an exception if it's 
non-NULL (since BYTES requires a logical type)
  2. call getDecimalType(schema, logicalType)
  3. getDecimalType() should now either return a valid type or throw an 
exception.


-- 
To view, visit http://gerrit.cloudera.org:8080/5255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iad23706128223b6537d565471ef5d8faa91b0b5a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Start a docs build system.

2016-11-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Start a docs build system.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5238/2/docs/Makefile
File docs/Makefile:

High-level question: would it make sense to use CMake for consistency? It would 
be nice to consolidate so that people don't have to learn Make as well as CMake.


-- 
To view, visit http://gerrit.cloudera.org:8080/5238
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9551b75268cb7cb29a58367a3ef03b127dccbfca
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Anonymous Coward #250
Gerrit-Reviewer: John Russell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] Fix E2E test infrastructure to handle missing exceptions correctly

2016-11-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: Fix E2E test infrastructure to handle missing exceptions 
correctly
..


Patch Set 1:

(1 comment)

Did you do an exhaustive test run? Just to be sure that this doesn't break any 
tests.

http://gerrit.cloudera.org:8080/#/c/5260/1/tests/util/test_file_parser.py
File tests/util/test_file_parser.py:

Line 229:   assert len(exception_str.strip()) != 0, "Empty exception 
string."
Nit: to be consistent with how we check for empty strings above, we can just 
use: 

assert exception_str.strip(), "Empty exception string"


-- 
To view, visit http://gerrit.cloudera.org:8080/5260
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7d83c5db59e8a239e4e70694a1e625af6f21419c
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4397 addendum: remove stray semicolon

2016-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4397 addendum: remove stray semicolon
..


Patch Set 1:

Build started: http://35.164.73.121:8080/job/gerrit-verify-dryrun/52/

-- 
To view, visit http://gerrit.cloudera.org:8080/5261
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie16e403ae0eb657f0614a1a90b0556f9f1d1056e
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4397 addendum: remove stray semicolon

2016-11-29 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5261

Change subject: IMPALA-4397 addendum: remove stray semicolon
..

IMPALA-4397 addendum: remove stray semicolon

Change-Id: Ie16e403ae0eb657f0614a1a90b0556f9f1d1056e
---
M be/src/codegen/mcjit-mem-mgr.h
1 file changed, 1 insertion(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/5261/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5261
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie16e403ae0eb657f0614a1a90b0556f9f1d1056e
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

2016-11-29 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change.

Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
..


Patch Set 9:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/5148/9/fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java
File fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java:

Line 38: private static DiskIdMapper INSTANCE = new DiskIdMapper();
just make it public


http://gerrit.cloudera.org:8080/#/c/5148/9/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

Line 403:   for (List partLists: partsByPath.values()) {
It seems simpler to move this logic into the while loop above and update the 
partition files and global stats incrementally. It will probably also be faster 
(look at partition.getSize()), but my main motivation is to make the code 
simpler to understand.

Since we now all the info we need in every LocatedFileStatus it seems we can 
map that file to its corresponding partition(s), and incrementally update all 
stats and auxiliary data structures.


Line 417:   loadDiskIds(perFsFileBlocks);
I think we can get rid of this function and move the logic into the while loop 
above. We can get rid of the classes FsKey and FileBlocksInfo.

Same reason as above: Let's make this code simpler.


-- 
To view, visit http://gerrit.cloudera.org:8080/5148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4553: ntpd must be synchronized for kudu to start.

2016-11-29 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4553: ntpd must be synchronized for kudu to start.
..


Patch Set 2: Code-Review+1

Carry Todd's +1

-- 
To view, visit http://gerrit.cloudera.org:8080/5258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I371e01e21246a8c0ece98ca7d4bf6761615127b4
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4553: ntpd must be synchronized for kudu to start.

2016-11-29 Thread Jim Apple (Code Review)
Hello Todd Lipcon,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5258

to look at the new patch set (#2).

Change subject: IMPALA-4553: ntpd must be synchronized for kudu to start.
..

IMPALA-4553: ntpd must be synchronized for kudu to start.

When ntpd is not synchronized, kudu initialization fails on the master
node:

F1129 16:37:28.969956 15230 master_main.cc:68] Check failed:
_s.ok() Bad status: Service unavailable: Cannot initialize clock:
Error reading clock. Clock considered unsynchronized

Change-Id: I371e01e21246a8c0ece98ca7d4bf6761615127b4
---
M testdata/cluster/admin
1 file changed, 7 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/5258/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I371e01e21246a8c0ece98ca7d4bf6761615127b4
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-4512: Add a script that builds Impala on stock Ubuntu 14.04.

2016-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4512: Add a script that builds Impala on stock Ubuntu 
14.04.
..


Patch Set 5:

Build started: http://35.164.73.121:8080/job/gerrit-verify-dryrun/51/

-- 
To view, visit http://gerrit.cloudera.org:8080/5154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If34e398052a61dfda9825b1cf3a918eb61736048
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4512: Add a script that builds Impala on stock Ubuntu 14.04.

2016-11-29 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: IMPALA-4512: Add a script that builds Impala on stock Ubuntu 
14.04.
..


Patch Set 5: Code-Review+2

rebase carry

-- 
To view, visit http://gerrit.cloudera.org:8080/5154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If34e398052a61dfda9825b1cf3a918eb61736048
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2890: Support ALTER TABLE statements for Kudu tables

2016-11-29 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-2890: Support ALTER TABLE statements for Kudu tables
..


Patch Set 10: Code-Review+2

Rebase. Carry +2

-- 
To view, visit http://gerrit.cloudera.org:8080/5136
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I04bc87e04e05da5cc03edec79d13cedfd2012896
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Internal Jenkins
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4512: Add a script that builds Impala on stock Ubuntu 14.04.

2016-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4512: Add a script that builds Impala on stock Ubuntu 
14.04.
..


Patch Set 4:

Build started: http://35.164.73.121:8080/job/gerrit-verify-dryrun/50/

-- 
To view, visit http://gerrit.cloudera.org:8080/5154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If34e398052a61dfda9825b1cf3a918eb61736048
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4527: Columns in Kudu tables created from Impala default to "NULL"

2016-11-29 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5259

Change subject: IMPALA-4527: Columns in Kudu tables created from Impala default 
to "NULL"
..

IMPALA-4527: Columns in Kudu tables created from Impala default to "NULL"

This commit reverts the behavior introduced by IMPALA-3719 which used
the Kudu default behavior for column nullability if none was specified
in the CREATE TABLE statement. With this commit, non-key columns of Kudu
tables that are created from Impala are by default nullable unless
specified otherwise.

Change-Id: I950d9a9c64e3851e11a641573617790b340ece94
---
M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_stats.test
4 files changed, 50 insertions(+), 43 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/5259/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5259
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I950d9a9c64e3851e11a641573617790b340ece94
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dimitris Tsirogiannis 


[Impala-ASF-CR] IMPALA-3398: Rework Impala documentation to be non-Cloduera-specific

2016-11-29 Thread Jim Apple (Code Review)
Jim Apple has uploaded a new patch set (#2).

Change subject: IMPALA-3398: Rework Impala documentation to be 
non-Cloduera-specific
..

IMPALA-3398: Rework Impala documentation to be non-Cloduera-specific

This fix is a prototype for discussion. The changes were partially
automated through:

replace 'audience="Cloudera"' 'audience="Hidden"' -- $FILENAME

sed -ri 's/(rev="[^"]*)(CDH-[0-9]+)([^"]*")/\1\3/g' $FILENAME

sed -i 's/ rev=""//g' $FILENAME

sed -ri 's/(rev="[^"]*) "/\1"/g' $FILENAME

sed -ri 's/(rev="[^"]* ) /\1/g' $FILENAME

Change-Id: Ib3f63fb309e0617d7fe014231bb0ab0ad67c8474
---
M docs/impala_html.ditaval
M docs/impala_pdf.ditaval
M docs/topics/impala_alter_table.xml
M docs/topics/impala_conversion_functions.xml
M docs/topics/impala_jdbc.xml
M docs/topics/impala_operators.xml
6 files changed, 93 insertions(+), 157 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/5239/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5239
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib3f63fb309e0617d7fe014231bb0ab0ad67c8474
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Anonymous Coward #250
Gerrit-Reviewer: Greg Rahn 
Gerrit-Reviewer: John Russell 


[Impala-ASF-CR] IMPALA-4553: ntpd must be synchronized for kudu to start.

2016-11-29 Thread Jim Apple (Code Review)
Jim Apple has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5258

Change subject: IMPALA-4553: ntpd must be synchronized for kudu to start.
..

IMPALA-4553: ntpd must be synchronized for kudu to start.

When ntpd is not synchronized, kudu initialization fails on the master
node:

F1129 16:37:28.969956 15230 master_main.cc:68] Check failed:
_s.ok() Bad status: Service unavailable: Cannot initialize clock:
Error reading clock. Clock considered unsynchronized

Change-Id: I371e01e21246a8c0ece98ca7d4bf6761615127b4
---
M testdata/cluster/admin
1 file changed, 5 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/58/5258/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5258
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I371e01e21246a8c0ece98ca7d4bf6761615127b4
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 


[Impala-ASF-CR] IMPALA-4000: Restricted Sentry authorization for Kudu Tables

2016-11-29 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4000: Restricted Sentry authorization for Kudu Tables
..


IMPALA-4000: Restricted Sentry authorization for Kudu Tables

At this time, there is no comprehensive way of enforcing a Sentry
authorization policy against tables stored in Kudu. The following
behavior was implemented in this patch:
- Only the ALL privilege level can be granted to Kudu tables.
  Finer-grained levels such as only SELECT or only INSERT are not
  supported.
- Column level permissions on Kudu tables are not supported.
- Only users with ALL privileges on SERVER may create external Kudu
  tables.

Change-Id: I183f08ad8ce80deee011a6b90ad67b9cefc0452c
Reviewed-on: http://gerrit.cloudera.org:8080/5047
Reviewed-by: Taras Bobrovytsky 
Tested-by: Internal Jenkins
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java
M fe/src/main/java/org/apache/impala/authorization/PrivilegeRequestBuilder.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeAuthStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/AuthorizationTest.java
6 files changed, 58 insertions(+), 2 deletions(-)

Approvals:
  Taras Bobrovytsky: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/5047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I183f08ad8ce80deee011a6b90ad67b9cefc0452c
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Taras Bobrovytsky 


[Impala-ASF-CR] IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

2016-11-29 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has uploaded a new patch set (#9).

Change subject: IMPALA-4172/IMPALA-3653: Improvements to block metadata loading
..

IMPALA-4172/IMPALA-3653: Improvements to block metadata loading

This patch improves the block metadata loading (locations and disk
storage IDs) for partitioned and un-partitioned tables in the Catalog
server.

Without this patch:
--
We loop throuh each and every file in the table/partition directories
and call getFileBlockLocations() on it to obtain the block metadata.
This results in large no. of RPC calls to namenode, especially with
tables with large no. of files/partitions.

With this patch:
---
We move the block metadata querying to use listStatus() call which
accepts a directory as input and fetches the 'BlockLocation' objects
for every file recursively in that directory. This improves the
metadata loading in the following ways.

- For non-partitioned tables, we query all the BlockLocations in a
single RPC call in the base table directory and load the corresponding
disk IDs.

- For partitioned tables, we query the BlockLocations for all the
partitions residing under the base table directories in a single RPC
and then load every partition with non-default partition directory
separately.

Also, this patch does away with VolumeIds returned by the HDFS NN
and uses the new StorageIDs returned by the BlockLocation class.
These StorageIDs are UUID strings and hence are mapped to a
per-node 0-based index as expected by the backend. In the upcoming
versions of Hadoop APIs, getFileBlockStorageLocations() is deprecated
and instead the listStatus() returns BlockLocations with storage IDs
embedded. This patch makes use of this improvement to reduce an
additional RPC to the NN to fetch the storage locations.

Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
---
A fe/src/main/java/org/apache/impala/catalog/DiskIdMapper.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
4 files changed, 371 insertions(+), 266 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/5148/9
-- 
To view, visit http://gerrit.cloudera.org:8080/5148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie127658172e6e70dae441374530674a4ac9d5d26
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Mostafa Mokhtar 


[Impala-ASF-CR] IMPALA-4512: Add a script that builds Impala on stock Ubuntu 14.04.

2016-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4512: Add a script that builds Impala on stock Ubuntu 
14.04.
..


Patch Set 4:

Build failed: http://35.164.73.121:8080/job/gerrit-verify-dryrun/48/

-- 
To view, visit http://gerrit.cloudera.org:8080/5154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If34e398052a61dfda9825b1cf3a918eb61736048
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4542: Fix use-after-free in some BE tests

2016-11-29 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4542: Fix use-after-free in some BE tests
..


IMPALA-4542: Fix use-after-free in some BE tests

Change RuntimeState::ReleaseResources() to use cached ExecEnv pointer,
rather than singleton.

Change-Id: I5302d665756183289edf227588da1c094e197584
Testing: Ran affected BE tests under ASAN.
Reviewed-on: http://gerrit.cloudera.org:8080/5243
Reviewed-by: Tim Armstrong 
Tested-by: Internal Jenkins
---
M be/src/runtime/runtime-state.cc
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Internal Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5243
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I5302d665756183289edf227588da1c094e197584
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4542: Fix use-after-free in some BE tests

2016-11-29 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4542: Fix use-after-free in some BE tests
..


Patch Set 1: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5243
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5302d665756183289edf227588da1c094e197584
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] Start a docs build system.

2016-11-29 Thread Jim Apple (Code Review)
Jim Apple has uploaded a new patch set (#2).

Change subject: Start a docs build system.
..

Start a docs build system.

The docs can be built by running "make" from the docs directory. This
does not hook into buildall.sh for now, as users who run buildall.sh
do not usually edit docs/.

Change-Id: I9551b75268cb7cb29a58367a3ef03b127dccbfca
---
A docs/.gitignore
A docs/Makefile
2 files changed, 30 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/5238/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5238
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9551b75268cb7cb29a58367a3ef03b127dccbfca
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Anonymous Coward #250
Gerrit-Reviewer: John Russell 


[Impala-ASF-CR] IMPALA-4512: Add a script that builds Impala on stock Ubuntu 14.04.

2016-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4512: Add a script that builds Impala on stock Ubuntu 
14.04.
..


Patch Set 4:

Build failed: http://35.164.73.121:8080/job/gerrit-verify-dryrun/47/

-- 
To view, visit http://gerrit.cloudera.org:8080/5154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If34e398052a61dfda9825b1cf3a918eb61736048
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4512: Add a script that builds Impala on stock Ubuntu 14.04.

2016-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4512: Add a script that builds Impala on stock Ubuntu 
14.04.
..


Patch Set 4:

Build started: http://35.164.73.121:8080/job/gerrit-verify-dryrun/47/

-- 
To view, visit http://gerrit.cloudera.org:8080/5154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If34e398052a61dfda9825b1cf3a918eb61736048
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4431: Add audit event log control mechanism to prevent disk overflow

2016-11-29 Thread Donghui Xu (Code Review)
Donghui Xu has posted comments on this change.

Change subject: IMPALA-4431: Add audit event log control mechanism to prevent 
disk  overflow
..


Patch Set 11:

Audit event log has no link file, so there is no link file switch problem. 
we don't need to limit the minimum number of log file to 2. It can be 1. It is 
0 means no limit.

-- 
To view, visit http://gerrit.cloudera.org:8080/4971
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c3229cbdb6275f969c15258c9ccab6efeb24368
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Donghui Xu 
Gerrit-Reviewer: Donghui Xu 
Gerrit-Reviewer: Lars Volker 
Gerrit-HasComments: No


[Impala-ASF-CR] Add a build flag for the undefined behavior sanitizer, aka "ubsan".

2016-11-29 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change.

Change subject: Add a build flag for the undefined behavior sanitizer, aka 
"ubsan".
..


Patch Set 2:

> Are any of these overflows legitimate use of signed overflows? 
 > Assuming not, shouldn't we just fix the code that can lead to
 > signed overflows? Otherwise, while the behavior would become
 > defined, it's still not correct.

What is your definition of a legitimate use of a signed overflow?

Several are intentional and demonstrate an understanding of two's complement 
arithmetic.

-- 
To view, visit http://gerrit.cloudera.org:8080/5082
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I88c7234bd7c5eb7404490a0913d90470c10835e7
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2057: Better error message for incorrect avro decimal column declaration

2016-11-29 Thread anujphadke (Code Review)
anujphadke has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5255

Change subject: IMPALA-2057: Better error message for incorrect avro decimal 
column declaration
..

IMPALA-2057: Better error message for incorrect avro decimal column
declaration

Adding a better error message when logical type is specified at a wrong
level or is not not specified in an avro decimal column declaration.

Change-Id: Iad23706128223b6537d565471ef5d8faa91b0b5a
---
M fe/src/main/java/org/apache/impala/util/AvroSchemaParser.java
1 file changed, 6 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/5255/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Iad23706128223b6537d565471ef5d8faa91b0b5a
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke 


[Impala-ASF-CR] IMPALA-4512: Add a script that builds Impala on stock Ubuntu 14.04.

2016-11-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-4512: Add a script that builds Impala on stock Ubuntu 
14.04.
..


Patch Set 4:

Build failed: http://35.164.73.121:8080/job/gerrit-verify-dryrun/46/

-- 
To view, visit http://gerrit.cloudera.org:8080/5154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If34e398052a61dfda9825b1cf3a918eb61736048
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Jim Apple 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No