Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11249 )

Change subject: KUDU-2469 pt 2: fail replicas on CFile corruption
......................................................................


Patch Set 9:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/bloomfile.cc
File src/kudu/cfile/bloomfile.cc:

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/bloomfile.cc@257
PS9, Line 257:     return Status::Corruption("bloom file is compressed 
(compression not supported)",
> I suppose the type of corruption is interesting.
I'm also wondering why compression or a missing value index are grounds for 
corruption. They seem more like NotSupported to me (i.e backwards incompatible 
on-disk state that is being loaded by an older version of Kudu). Does it really 
make sense to fail these replicas? What happens if reader_->HandleCorruption() 
isn't called; does the server crash?


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/cfile_reader.cc
File src/kudu/cfile/cfile_reader.cc:

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/cfile/cfile_reader.cc@234
PS9, Line 234:       .CloneAndPrepend("failed to parse CFile pre-header");
Shouldn't the CloneAndPrepend be conditioned on !s.OK()?


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/fs/io_context.h
File src/kudu/fs/io_context.h:

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/fs/io_context.h@30
PS9, Line 30: user
s/user/module/ based on the text in the examples.


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/integration-tests/disk_failure-itest.cc
File src/kudu/integration-tests/disk_failure-itest.cc:

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/integration-tests/disk_failure-itest.cc@217
PS9, Line 217:     ::testing::Values(ErrorType::CFILE_CORRUPTION));
Shouldn't this parameterize on DISK_FAILURE too?


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/integration-tests/disk_failure-itest.cc@240
PS9, Line 240:   // Wait a bit for some errors to occur.
Can we ASSERT_EVENTUALLY() on something here, to ensure that we only run the 
ClusterVerifier when an error has actually occurred? Is there a concern that 
rereplication may kick in and finish too quickly for us to observe an error in 
the test? Maybe we can look at a metric?


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/integration-tests/disk_failure-itest.cc@250
PS9, Line 250:   // Set up a workload that does both reads and writes.
But you're setting the number of write threads to 0...


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/integration-tests/disk_failure-itest.cc@272
PS9, Line 272:   // Wait a bit for some errors to occur.
Same feedback about waiting for an error here.


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/tablet/deltafile.cc
File src/kudu/tablet/deltafile.cc:

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/tablet/deltafile.cc@278
PS9, Line 278:     return Status::Corruption("file does not have a value 
index!");
Again, seems more like NotSupported than Corruption.


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/tablet/deltafile.cc@291
PS9, Line 291:     return Status::Corruption("missing delta stats from the 
delta file metadata");
Same here.


http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/tserver/tablet_server-test.cc
File src/kudu/tserver/tablet_server-test.cc:

http://gerrit.cloudera.org:8080/#/c/11249/9/src/kudu/tserver/tablet_server-test.cc@1676
PS9, Line 1676:     auto log_req = MakeScopedCleanup([&] {
              :       LOG(INFO) << SecureDebugString(req);
              :     });
Does SCOPED_TRACE not work in here?



--
To view, visit http://gerrit.cloudera.org:8080/11249
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I63d541443bc68c83fd0ca6d51315143fee04d50f
Gerrit-Change-Number: 11249
Gerrit-PatchSet: 9
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Thu, 30 Aug 2018 17:58:18 +0000
Gerrit-HasComments: Yes

Reply via email to