Ashwani Raina has posted comments on this change. ( http://gerrit.cloudera.org:8080/20270 )
Change subject: KUDU-3461 [client] Avoid impala crash by returning error if invalid tablet id found ...................................................................... Patch Set 11: (21 comments) http://gerrit.cloudera.org:8080/#/c/20270/8//COMMIT_MSG Commit Message: PS8: > Yes, please -- it would be great. I suspect a similar problem might be the The handling of error falls in code area of LookUp and PickLeader that happens as part of finding the tablet location. That should be common for both read and write. At the outset, it seems that fix will work for both situations. But, it would be worth investigating this and verify. I have raised a jira ticket to investigate this - https://issues.apache.org/jira/browse/KUDU-3506 http://gerrit.cloudera.org:8080/#/c/20270/8//COMMIT_MSG@31 PS8, Line 31: The last statement i.e. #8 causes impalad (connected to impala-shell) to crash : With this change, last statement query fails and throws "Status::InvalidArgument() > Is this still so as of PS10? It is still valid though we are checking against error code now instead of error string. Changed it to "kInvalidArgument" to make it aligned with the test steps. http://gerrit.cloudera.org:8080/#/c/20270/8//COMMIT_MSG@36 PS8, Line 36: expect it to return > ditto Added error code "kInvalidArgument" However, the retry after this step succeeds as we have cleared the stale metacache entry for the tablet. Added this info as well. http://gerrit.cloudera.org:8080/#/c/20270/10/src/kudu/client/batcher.cc File src/kudu/client/batcher.cc: http://gerrit.cloudera.org:8080/#/c/20270/10/src/kudu/client/batcher.cc@506 PS10, Line 506: DML op > nit: DML Done http://gerrit.cloudera.org:8080/#/c/20270/8/src/kudu/client/client-test.cc File src/kudu/client/client-test.cc: http://gerrit.cloudera.org:8080/#/c/20270/8/src/kudu/client/client-test.cc@522 PS8, Line 522: ws); > I'd think _both_ the check for the Status type and the error message are ne I think it would make more sense to change this function signature when the moment calls for it. It would be too early to decide how this helper method will prove out to be in future. However, I am ok with retaining the error message check as it was before along with error code check for a robust verification. If we plan to add more scenarios in future, we could make necessary changes of returning the strings and having that extra error string check in callers instead of here. http://gerrit.cloudera.org:8080/#/c/20270/10/src/kudu/client/client-test.cc File src/kudu/client/client-test.cc: http://gerrit.cloudera.org:8080/#/c/20270/10/src/kudu/client/client-test.cc@193 PS10, Line 193: > nit: please keep these lines from 140 to 193 to be alphabetically ordered Done http://gerrit.cloudera.org:8080/#/c/20270/10/src/kudu/client/client-test.cc@521 PS10, Line 521: ASSERT_FALSE(overflow); > Does it make sense to make sure all the rows failed to be written? It may be ok with current error case but it may change depending on usage of this method in different error situations. For now, we could do as you suggested. http://gerrit.cloudera.org:8080/#/c/20270/10/src/kudu/client/client-test.cc@10216 PS10, Line 10216: 's exi > nit: DML Done http://gerrit.cloudera.org:8080/#/c/20270/10/src/kudu/client/client-test.cc@10226 PS10, Line 10226: recursion) is disabled > Pass the parameter by reference, not by value? Done http://gerrit.cloudera.org:8080/#/c/20270/10/src/kudu/client/client-test.cc@10257 PS10, Line 10257: > nit: the DDL client ? Done http://gerrit.cloudera.org:8080/#/c/20270/10/src/kudu/client/client-test.cc@10266 PS10, Line 10266: ASSERT_OK(alterer->Alter()); > nit: could you add a small blurb to point to the fact that the tablet backi Done http://gerrit.cloudera.org:8080/#/c/20270/10/src/kudu/client/client-test.cc@10277 PS10, Line 10277: > nit: the Status::Code enumeration and all its elements (e.g., kInvalidArgum Done http://gerrit.cloudera.org:8080/#/c/20270/10/src/kudu/client/client-test.cc@10288 PS10, Line 10288: > What happens if not enabling the thread-safe way of death tests? Is it cru It is the recommended way to mitigate the risks of testing in a multithreaded environment. If this is not enabled, the test gets stuck after reporting the following: ++ Death tests use fork(), which is unsafe particularly in a threaded context. For this test, Google Test detected 186 threads. See https://github.com/google/googletest/blob/master/docs/advanced.md#death-tests-and-threads for more explanation and suggested solutions, especially if this is the last message you see before your test times out. ++ The test eventually times out. I have added the details as comment. http://gerrit.cloudera.org:8080/#/c/20270/10/src/kudu/client/meta_cache.cc File src/kudu/client/meta_cache.cc: http://gerrit.cloudera.org:8080/#/c/20270/10/src/kudu/client/meta_cache.cc@542 PS10, Line 542: // First check the local metacache for tablet > Code style for here and below: please use C++-style comments (i.e. // ...), Old habits die hard :-) Changed. http://gerrit.cloudera.org:8080/#/c/20270/10/src/kudu/client/meta_cache.cc@544 PS10, Line 544: Status fastpath_status = meta_cache_->FastLookupTabletByKey(table_, : tablet_->partition().begin(), : MetaCache::L > nit: seems like an incorrect indentation? Done http://gerrit.cloudera.org:8080/#/c/20270/10/src/kudu/client/meta_cache.cc@621 PS10, Line 621: > nit: remove the extra space? Done http://gerrit.cloudera.org:8080/#/c/20270/10/src/kudu/client/meta_cache.cc@621 PS10, Line 621: > missing $1 placeholder? Done http://gerrit.cloudera.org:8080/#/c/20270/10/src/kudu/client/meta_cache.cc@1291 PS10, Line 1291: : Status MetaCache::FastLookupTabletByKey(const KuduTable* table, : PartitionKey partition_key, : MetaCache::LookupType lookup_type, : scoped_refptr<RemoteTablet>* remote_tablet) { : > Why to introduce an extra layer of indirection for DoFastPathLookup(), espe DoFastPathLookup is a private method inside MetaCache and we are calling this method from MetaCacheServerPicker scope. FastLookupTabletByKey is a public interface to make use of DoFastPathLookup. This is inline with all the other methods(public) that are being used in MetaCacheServerPicker scope. http://gerrit.cloudera.org:8080/#/c/20270/10/src/kudu/client/meta_cache.cc@1315 PS10, Line 1315: > Maybe, introduce a static/constexp literal and use it to avoid duplicating Done http://gerrit.cloudera.org:8080/#/c/20270/10/src/kudu/client/meta_cache.cc@1476 PS10, Line 1476: > nit: RPC Done http://gerrit.cloudera.org:8080/#/c/20270/10/src/kudu/client/meta_cache.cc@1476 PS10, Line 1476: > missing $1 placeholder? Done -- To view, visit http://gerrit.cloudera.org:8080/20270 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia09cf6fb1b1d10f1ad13a62b5c863bcd1e3ab26a Gerrit-Change-Number: 20270 Gerrit-PatchSet: 11 Gerrit-Owner: Ashwani Raina <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Ashwani Raina <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Wang Xixu <[email protected]> Gerrit-Comment-Date: Mon, 28 Aug 2023 20:06:12 +0000 Gerrit-HasComments: Yes
