[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio

2019-05-03 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13215 )

Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and 
partitio
..


Patch Set 5:

(7 comments)

Thanks for your reviews. submit patch 6

http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG@7
PS3, Line 7: IMPALA-8438: Store WriteId and valid writeId list for table and 
partitio
> nit: missing n
Done


http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG@10
PS3, Line 10: fucntions
> nit, spell-check
Done


http://gerrit.cloudera.org:8080/#/c/13215/3/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/13215/3/common/thrift/CatalogObjects.thrift@483
PS3, Line 483: 4: optional string valid_write_ids
> Instead of storing this in string format, do you think it would be better t
Keep the formatted String we can easily use hive provided functions to convert 
and use it.


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@237
PS3, Line 237: getHighestWriteId
> Do you think its better to throw UnsupportedOperationException in these met
One of the reason of adding these functions is to avoid run time exceptions for 
Hive 2 client, it will be convert to these values when catch 
UnSupportedException anyway.


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@258
PS3, Line 258:   public static long getMajorVersion() {
> +1
Done


http://gerrit.cloudera.org:8080/#/c/13215/4/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/13215/4/fe/src/main/java/org/apache/impala/catalog/Table.java@281
PS4, Line 281: String tblFullName = getFullName();
> tab used for whitespace
Done


http://gerrit.cloudera.org:8080/#/c/13215/5/fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java
File fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java:

http://gerrit.cloudera.org:8080/#/c/13215/5/fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java@233
PS5, Line 233: // ToDo this assumes the acid tables have been created.
 : // They will be available after IMPALA-8439 is checked in.
 : testLoadAcidTables("select * from 
acid.insert_only_no_partitions");
 : testLoadAcidTables("select * from 
acid.insert_only_with_partitions");
> These tables should have deterministic writeIds, so the test could except a
The major purpose of the test is to test we can get writeIds from hive. We do 
not test the correctness of hive code. And I think the writeid list could be 
very dynamic which can cause flaky tests.



--
To view, visit http://gerrit.cloudera.org:8080/13215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54
Gerrit-Change-Number: 13215
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Fri, 03 May 2019 21:35:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio

2019-05-03 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13215 )

Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and 
partitio
..


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG@7
PS3, Line 7: IMPALA-8438: Store WriteId and valid writeId list for table and 
partitio
nit: missing n


http://gerrit.cloudera.org:8080/#/c/13215/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13215/5//COMMIT_MSG@7
PS5, Line 7: IMPALA-8438: Store WriteId and valid writeId list for table and 
partitio
nit: missing n


http://gerrit.cloudera.org:8080/#/c/13215/5/fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java
File fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java:

http://gerrit.cloudera.org:8080/#/c/13215/5/fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java@233
PS5, Line 233: // ToDo this assumes the acid tables have been created.
 : // They will be available after IMPALA-8439 is checked in.
 : testLoadAcidTables("select * from 
acid.insert_only_no_partitions");
 : testLoadAcidTables("select * from 
acid.insert_only_with_partitions");
These tables should have deterministic writeIds, so the test could except a 
given validWriteIds string.



--
To view, visit http://gerrit.cloudera.org:8080/13215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54
Gerrit-Change-Number: 13215
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Fri, 03 May 2019 21:11:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio

2019-05-03 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13215 )

Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and 
partitio
..


Patch Set 5:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG@7
PS3, Line 7: partitio
nit, spell-check


http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG@10
PS3, Line 10: fucntions
nit, spell-check


http://gerrit.cloudera.org:8080/#/c/13215/3/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/13215/3/common/thrift/CatalogObjects.thrift@483
PS3, Line 483: 4: optional string valid_write_ids
Instead of storing this in string format, do you think it would be better to 
parse the string and store these as individual fields? If we do that, can we 
get rid of storing the table_name since it is already available in THdfsTable?


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@237
PS3, Line 237: getHighestWriteId
Do you think its better to throw UnsupportedOperationException in these methods?


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@258
PS3, Line 258:   public static long getMajorVersion() {
> can we rename this to getMajorVersion or something since it's not the full
+1



--
To view, visit http://gerrit.cloudera.org:8080/13215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54
Gerrit-Change-Number: 13215
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Fri, 03 May 2019 20:43:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio

2019-05-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13215 )

Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and 
partitio
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3064/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54
Gerrit-Change-Number: 13215
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Fri, 03 May 2019 20:09:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio

2019-05-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13215 )

Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and 
partitio
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3063/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54
Gerrit-Change-Number: 13215
Gerrit-PatchSet: 4
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Fri, 03 May 2019 20:01:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio

2019-05-03 Thread Yongzhi Chen (Code Review)
Hello Vihang Karajgaonkar, Sudhanshu Arora, Todd Lipcon, Csaba Ringhofer, 
Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13215

to look at the new patch set (#5).

Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and 
partitio
..

IMPALA-8438: Store WriteId and valid writeId list for table and partitio

This happens when tables load metadata from HMS.
Add MetastoreShim fucntions to support HMS3 only functions.
Add validwriteIdlists to query profile through timeline.
Tests:
Manually tests HMS2 and HMS3, using log files to check
Unit tests against HMS3

ToDo:
WriteId and valid writeIds can be fetched in other time, need
more study on that.

Profile example:
Query Compilation: 5s057ms
   - Metadata load started: 63.006ms (63.006ms)
   - Metadata load finished. loaded-tables=2/2...: 4s801ms (4s738ms)
   - Loaded ValidWriteIdLists:
   acid.insert_only_no_partitions:6:9223372036854775807::
   acid.insert_only_with_partitions:3:9223372036854775807::
 : 4s921ms (120.580ms)
   - Analysis finished: 4s929ms (8.013ms)
Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54
---
M common/thrift/CatalogObjects.thrift
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/FeTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java
M fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java
14 files changed, 262 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/13215/5
--
To view, visit http://gerrit.cloudera.org:8080/13215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54
Gerrit-Change-Number: 13215
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio

2019-05-03 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13215 )

Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and 
partitio
..


Patch Set 4:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG@12
PS3, Line 12: Tests:
> I think if we add some of this info to the profile or to EXPLAIN output we
Done


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@238
PS3, Line 238: return -1L;
> nit: please use capital L so this doesn't look like -11. (same below)
Done


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@258
PS3, Line 258:   public static long getMajorVersion() {
> can we rename this to getMajorVersion or something since it's not the full
Done


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@374
PS3, Line 374:   * @return the list of valid write IDs for the table in a string
> I had to trace my way through Hive code to understand what this meant. My u
We do not have use case, will remove it.


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@378
PS3, Line 378: lidWriteIdLi
> can we use Optional here?
writeId is removed


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@236
PS3, Line 236: StringBuilder validIdsBuf = new StringBuilder("Loaded 
ValidWriteIdLists: ");
> How about putting this into the query timeline event marked above? This way
Done


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@285
PS3, Line 285:* Returns the set of tables whose metadata needs to be loaded 
for the analysis of the
> this won't work with LocalCatalog mode enabled. Probably best to just remov
Removed the function


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@617
PS3, Line 617:   // The last committed write ID which modified this partition.
> can you be more clear here? eg "The last committed write ID which modified
Done


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@618
PS3, Line 618:   //-1 means writeId_ is irrelevant(not supported).
> -1L
Done


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/Table.java@120
PS3, Line 120:   // Valid write id list for this table
> The way this got implemented, this is just a high watermark write id, which
The high water mark can help us get a list of valid writeIDs When apply on a 
ValidWriteIdList
/**
   * Find out if a range of write ids are valid.  Note that valid may have 
different meanings
   * for different implementations, as some will only want to see committed 
transactions and some
   * both committed and aborted.
   * @param minWriteId minimum write ID to look for, inclusive
   * @param maxWriteId maximum write ID to look for, inclusive
   * @return Indicate whether none, some, or all of these transactions are 
valid.
   */
  RangeResponse isWriteIdRangeValid(long minWriteId, long maxWriteId);

But the maxWriteId may comes from transaction manager?
I will remove the writeId for table by following your suggestion as we can 
easily get from stored ValidWriteIdList


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/Table.java@285
PS3, Line 285:   writeIds = MetastoreShim.fetchValidWriteIds(client, 
tblFullName);
> per note elsewhere, I don't think we have a good use for this
Done


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/Table.java@289
PS3, Line 289: }
> getFullName()
Done



[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio

2019-05-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13215 )

Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and 
partitio
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13215/4/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/13215/4/fe/src/main/java/org/apache/impala/catalog/Table.java@281
PS4, Line 281:  String tblFullName = getFullName();
tab used for whitespace



--
To view, visit http://gerrit.cloudera.org:8080/13215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54
Gerrit-Change-Number: 13215
Gerrit-PatchSet: 4
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Fri, 03 May 2019 19:20:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio

2019-05-03 Thread Yongzhi Chen (Code Review)
Hello Vihang Karajgaonkar, Sudhanshu Arora, Todd Lipcon, Csaba Ringhofer, 
Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13215

to look at the new patch set (#4).

Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and 
partitio
..

IMPALA-8438: Store WriteId and valid writeId list for table and partitio

This happens when tables load metadata from HMS.
Add MetastoreShim fucntions to support HMS3 only functions.
Add validwriteIdlists to query profile through timeline.
Tests:
Manually tests HMS2 and HMS3, using log files to check
Unit tests against HMS3

ToDo:
WriteId and valid writeIds can be fetched in other time, need
more study on that.

Profile example:
Query Compilation: 5s057ms
   - Metadata load started: 63.006ms (63.006ms)
   - Metadata load finished. loaded-tables=2/2...: 4s801ms (4s738ms)
   - Loaded ValidWriteIdLists:
   acid.insert_only_no_partitions:6:9223372036854775807::
   acid.insert_only_with_partitions:3:9223372036854775807::
 : 4s921ms (120.580ms)
   - Analysis finished: 4s929ms (8.013ms)
Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54
---
M common/thrift/CatalogObjects.thrift
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/FeTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java
M fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java
14 files changed, 262 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/13215/4
--
To view, visit http://gerrit.cloudera.org:8080/13215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54
Gerrit-Change-Number: 13215
Gerrit-PatchSet: 4
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio

2019-05-03 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13215 )

Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and 
partitio
..


Patch Set 3:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13215/3//COMMIT_MSG@12
PS3, Line 12: Tests:
I think if we add some of this info to the profile or to EXPLAIN output we 
could write some better end-to-end tests that actually modify a table and see 
that we pick up the new write ids on REFRESH


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@238
PS3, Line 238: return -1l;
nit: please use capital L so this doesn't look like -11. (same below)


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@258
PS3, Line 258:   public static long getShimVersion() {
can we rename this to getMajorVersion or something since it's not the full 
version number?


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@374
PS3, Line 374:   * @param wrtId the write id to get the corresponding txn
I had to trace my way through Hive code to understand what this meant. My 
understanding is the following:

- if we pass no writeId, we'll get the current *latest* state of writes for a 
table
- if we pass a write ID, that's assuming that we are actually a writer on this 
table that has already been allocated an ID, and it'll translate our writeID 
back to a transaction list, and see which prior writes to the table were 
committed when our write started?

Do we have any use case for the latter yet? If not, maybe we should not expose 
it.


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@378
PS3, Line 378: long writeId
can we use Optional here?


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@236
PS3, Line 236:   //Keep the log for HMS 3 related dev.
How about putting this into the query timeline event marked above? This way it 
will show up in profiles, which is moderately useful.


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@285
PS3, Line 285: HdfsPartition hPartition = (HdfsPartition)part;
this won't work with LocalCatalog mode enabled. Probably best to just remove 
this, and instead add the relevant metadata to teh output of 'SHOW PARTITIONS' 
if you think it's useful for debugging.


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@617
PS3, Line 617:   // For Acid table.
can you be more clear here? eg "The last committed write ID which modified this 
partition."


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@618
PS3, Line 618:   private long writeId_ = -1l;
-1L


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/Table.java@120
PS3, Line 120:   // Last commited write id for this table.
The way this got implemented, this is just a high watermark write id, which 
isn't necessarily the most recently committed. Not sure we have a strong reason 
to cache this.


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/Table.java@127
PS3, Line 127: validWrtIds_
rename to 'validWriteIds_' (no need to abbreviate things)


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/Table.java@285
PS3, Line 285:* @param wrtId the write id to get the corresponding txn
per note elsewhere, I don't think we have a good use for this


http://gerrit.cloudera.org:8080/#/c/13215/3/fe/src/main/java/org/apache/impala/catalog/Table.java@289
PS3, Line 289: if (LOG.isTraceEnabled()) LOG.trace("Get valid writeIds for 
table: 

[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio

2019-05-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13215 )

Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and 
partitio
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3048/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54
Gerrit-Change-Number: 13215
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Fri, 03 May 2019 00:00:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio

2019-05-02 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13215 )

Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and 
partitio
..


Patch Set 3:

(20 comments)

Patch 3 addressed review issues.

http://gerrit.cloudera.org:8080/#/c/13215/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13215/2//COMMIT_MSG@7
PS2, Line 7:
> nit: we generally don't add a . to the commit message title
Done


http://gerrit.cloudera.org:8080/#/c/13215/2//COMMIT_MSG@8
PS2, Line 8:
> nit: leave an empty line after the title in the commit message
Done


http://gerrit.cloudera.org:8080/#/c/13215/2//COMMIT_MSG@9
PS2, Line 9:
> nit: functions
Done


http://gerrit.cloudera.org:8080/#/c/13215/2//COMMIT_MSG@10
PS2, Line 10: Add MetastoreShim fucntions to support HMS3 only functions.
> nit: extra line before Tests
Done


http://gerrit.cloudera.org:8080/#/c/13215/2/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/13215/2/common/thrift/CatalogObjects.thrift@307
PS2, Line 307:   // For acid table, store last committed write id
> nit: capital F + ending .
Done


http://gerrit.cloudera.org:8080/#/c/13215/2/common/thrift/CatalogObjects.thrift@479
PS2, Line 479:   // Set iff this is an acid table. The valid write ids list
> Can you add more information about the format, e.g. an example?
Done


http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@371
PS2, Line 371:   * Get valid write ids for the acid table
 :   * @param client the client to access HMS
 :   * @param tableFullName the name for the table
 :   * @param wrtId the write id to get the corresponding txn
 :   * @return the list of valid write IDs for the table in a string
> nit: try to be consistent with ending .
Done


http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@388
PS2, Line 388:   /**
 :* Get highest writeId in a given validwriteidlist
 :* @param validWriteIds ValidWriteIdList object in String
 :* @return highest writeId
 :*/
> nit: try to be consistent with ending .
Done


http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@235
PS2, Line 235:  for (FeTable iTbl : loadedTbls_.values()) {
> optional: the logging could be moved to a separate function like LogWriteId
Done


http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@240
PS2, Line 240: }
> nit: missing space?
Done


http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@245
PS2, Line 245: t
> nit: extra space?
Done


http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java
File fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java:

http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java@172
PS2, Line 172:
> nit: . not needed
Done


http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@617
PS2, Line 617: For Acid table
> nit: consistency: "For acid tables."
Done


http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@1023
PS2, Line 1023: if (thriftPartition.isSetWrite_id()) {
> nit: we generally add braces if the "if" doesn't fit to one line
Done


http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@1025
PS2, Line 1025: }
> nit: braces or one line
Done


http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@1069
PS2, Line 1069:
> @Override
Done


http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/13215/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@924
PS2, Line 924: //TODO put the writeID load here for now.
 : loadValidWriteIdList(client);
> Can you 

[Impala-ASF-CR] IMPALA-8438: Store WriteId and valid writeId list for table and partitio

2019-05-02 Thread Yongzhi Chen (Code Review)
Hello Vihang Karajgaonkar, Sudhanshu Arora, Todd Lipcon, Csaba Ringhofer, 
Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13215

to look at the new patch set (#3).

Change subject: IMPALA-8438: Store WriteId and valid writeId list for table and 
partitio
..

IMPALA-8438: Store WriteId and valid writeId list for table and partitio

This happens when tables load metadata from HMS.
Add MetastoreShim fucntions to support HMS3 only functions.

Tests:
Manually tests HMS2 and HMS3, using log files to check
Unit tests against HMS3

ToDo:
WriteId and valid writeIds can be fetched in other time, need
more study on that.
Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54
---
M common/thrift/CatalogObjects.thrift
M fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/FeTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java
M fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java
14 files changed, 292 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/13215/3
--
To view, visit http://gerrit.cloudera.org:8080/13215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54
Gerrit-Change-Number: 13215
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sudhanshu Arora 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar