Mike Percy has posted comments on this change.

Change subject: Allow for reserving disk space for non-Kudu processes

Patch Set 5:


File src/kudu/consensus/log.cc:

Line 111: DEFINE_int64(log_dir_reserved_bytes, 0,
> Nit: I think this would be more clear as fs_wal_dir_reserved_bytes, as it w
Oh, that's the one I had intended to mirror. Done.

Line 113: TAG_FLAG(log_dir_reserved_bytes, runtime);
> Hmm, I'm not sure if we should allow changing these values at runtime. If s
That's ok. They are already "soft" in terms of enforcement. In any case, this 
one will cause a crash.

I am using runtime modification in the integration tests.

File src/kudu/fs/block_manager-test.cc:

PS5, Line 951: We use a COARSE clock so we need to let a
             :   // little bit of time pass so we get at least one unit of time 
greater than
             :   // before when we call MonoTime::Now() again.
> Let's switch to FINE clock so we won't have this limitation.
I was hoping using COARSE would help keep the code paths fast but I agree it's 
probably overkill.

File src/kudu/fs/log_block_manager.cc:

Line 64: DEFINE_int32(full_disk_cache_seconds, 30,
> Nit: please prefix with log_block_manager. It's super long, yes, but it's t

Line 70: DEFINE_int64(data_dirs_reserved_bytes, 0,
> Please rename to fs_data_dirs_reserved_bytes (see my comment on log_dir_res

Line 96:                       "Number of unavailable log block containers",
> Nit: Number of Unavailable Block Containers (to be consistent with the exis

Line 98:                       "Number of non-full Block Containers that are 
under root paths that "
> Nit: "Number of non-full log block containers that are under root paths who

Line 265:   RWFile* data_file() const { return data_file_.get(); }
> We're only using this for the filename. Could you make narrow accessor that

Line 267:   int64_t preallocated_offset() const { return preallocated_offset_; }
> Where is this used?
Looks unused. Removed

Line 273:   Env* env() const { return block_manager_->env(); }
> Is this ever used?
Nope, also removed

Line 312:   int64_t preallocated_offset_ = 0;
> Huh, didn't know C++11 allowed this. Could you make the same change for tot

Line 1230:   // Indexes of root paths that are below their reserved space 
> Can we defer this work to the case where GetAvailableContainer() has return
I had to change the implementation because of a race I discovered in the cache 
expiration logic, so if we prepopulate this cache then it has to happen here. 
The race caused an infinite loop when the cache expiration was set to 0.

Line 1250:         return Status::IOError("Unable to allocate block: All data 
directories are full. "
> We should add ENOSPC to this status, right?

Line 1292:                                                      
> Preallocate() may not actually preallocate FLAGS_log_container_preallocate_
Yes, preallocation sounds a little bit broken.

However, what you're describing further on is a "hard" limit, as opposed to a 
"soft" limit, which is what I've implemented. A hard limit is much harder (ha 
ha) to implement because the failure cases are broad. We have already allocated 
a block. Now we're applying backpressure on writes. Doing a reservation at 
block allocation time, which is what this patch does, is much easier because 
block manager API clients don't even know that it's happening.

In short, if we want to do a hard limit, it's a whole 'nother thing, and could 
reasonably be implemented with a whole 'nother set of gflags. Also, if you hit 
a hard limit, the most reasonable thing to do is crash.

That said, unless I missed your point (it's certainly possible), I don't think 
it's particularly bad to overshoot on verifying sufficient space here with 
FLAGS_log_container_preallocate_bytes since even if we intended to preallocate 
less space than that, we're close to the limit and it's reasonable to stop 
writing to this disk.

Line 1413:     MonoTime now = MonoTime::Now(MonoTime::COARSE);
> FINE here too. And this can be done outside the lock.

Line 1432:   } // Release lock_.
> Nit: FWIW, I think this is implied by the scope (i.e. don't really see the 

Line 1758:   if (expires->ComesBefore(MonoTime::Now(MonoTime::COARSE))) return 
false; // Expired.
> Use FINE here and below; that's what we default to when introducing timeout

File src/kudu/fs/log_block_manager.h:

Line 156:   // Returns true if the given 'root_path' has been marked full and 
'now' is
> Nit: 'now' (with single quotes) suggests a function argument, perhaps "the 
Oh. I changed the signature since the last review and didn't update the 
comment. Fixed.

PS5, Line 158: Note that this cache uses an "exact match" on the root path 
name, so
             :   // marking /a/b full will not cause /a/b/c to be considered 
> This is true, but how is it relevant? The LBM uses only exact matches on ro

Line 160:   bool IsRootFull(const std::string& root_path, MonoTime* expires = 
nullptr) const;
> Nit: expires should actually be expires_out, to be consistent with the defi

Line 366:   FullDiskCache full_disk_cache_;
> Nit: add a comment.

File src/kudu/util/env_util-test.cc:

PS5, Line 34: class TestEnvUtil: public KuduTest {
            : };
> Nit: you can just remove this and use TEST_F(KuduTest, ...) below.
I don't like that, it's effectively using a global namespace. However I don't 
want to just use TEST(TestDiskSpaceCheck) because I am changing gflags and 
KuduTest uses a FlagSaver internally. I think it's fine as-is. However the name 
should actually be EnvUtilTest...

File src/kudu/util/env_util.cc:

Line 23: #include <sys/statvfs.h>
> Don't need this anymore.

Line 46: TAG_FLAG(disk_reserved_prefixes_with_bytes_free_for_testing, runtime);
> Why does this and the above flag need to be changeable at runtime? AFAICT y
I'm changing them in the integration tests.

To view, visit http://gerrit.cloudera.org:8080/3135
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifd0451d4dbddc1783019a53302de0263080939c7
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-HasComments: Yes

Reply via email to