On Thu, Jan 9, 2020, at 16:39, Jose Garcia Sancio wrote:
> Thanks Colin,
>
> LGTM in general. The Linux documentation (
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/filesystems/proc.txt?id=HEAD#n1644)
> defines these metrics as
>
> > read_bytes
> > ----------
> >
> > I/O counter: bytes read
> > Attempt to count the number of bytes which this process really did cause
> > to be fetched from the storage layer. Done at the submit_bio() level, so
> > it is accurate for block-backed filesystems. <please add status regarding
> > NFS and CIFS at a later time>
> >
> >
> > write_bytes
> > -----------
> >
> > I/O counter: bytes written
> > Attempt to count the number of bytes which this process caused to be sent
> > to the storage layer. This is done at page-dirtying time.
> >
>
> It looks like there is also another metric (cancelled_write_bytes) that
> affects the value reported in written_bytes. Do we want to take that into
> account when reporting the JMX metric
> kafka.server:type=KafkaServer,name=DiskWriteBytes?
>
> > cancelled_write_bytes
> > ---------------------
> >
> > The big inaccuracy here is truncate. If a process writes 1MB to a file
> > and then deletes the file, it will in fact perform no writeout. But it
> > will have been accounted as having caused 1MB of write.
> > In other words: The number of bytes which this process caused to not
> > happen, by truncating pagecache. A task can cause "negative" IO too. If
> > this task truncates some dirty pagecache, some IO which another task has
> > been accounted for (in its write_bytes) will not be happening. We _could_
> > just subtract that from the truncating task's write_bytes, but there is
> > information loss in doing that.

Hi Jose,

That's a good point, which I had overlooked! I think we should just
subtract out the cancelled_write_bytes, since it doesn't reflect actual I/O
that was done. I don't think the "cancelled" number typically gets that
big, but there isn't really a reason to count bytes which we intended to
write out but then never did. I added a discussion of this to the KIP.

best,
Colin

>
>
>
> On Mon, Jan 6, 2020 at 5:28 PM Colin McCabe <cmcc...@apache.org> wrote:
>
> > On Tue, Dec 10, 2019, at 11:10, Magnus Edenhill wrote:
> > > Hi Colin,
> > >
> >
> > Hi Magnus,
> >
> > Thanks for taking a look.
> >
> > > aren't those counters (ever increasing), rather than gauges
> > > (fluctuating)?
> >
> > Since this is in the Kafka broker, we're using Yammer. This might be
> > confusing, but Yammer's concept of a "counter" is not actually monotonic.
> > It can decrease as well as increase.
> >
> > In general Yammer counters require you to call inc(amount) or dec(amount)
> > on them. This doesn't match up with what we need to do here, which is to
> > (essentially) make a callback into the kernel by reading from /proc.
> >
> > The counter/gauge dichotomy doesn't affect the JMX, (I think?), so it's
> > really kind of an implementation detail.
> >
> > >
> > > You also mention CPU usage as a side note, you could use getrusage(2)'s
> > > ru_utime (user) and ru_stime (sys)
> > > to allow the broker to monitor its own CPU usage.
> > >
> >
> > Interesting idea. It might be better to save that for a future KIP,
> > though, to avoid scope creep.
> >
> > best,
> > Colin
> >
> > > /Magnus
> > >
> > > Den tis 10 dec. 2019 kl 19:33 skrev Colin McCabe <cmcc...@apache.org>:
> > >
> > > > Hi all,
> > > >
> > > > I wrote KIP about adding support for exposing disk read and write
> > > > metrics. Check it out here:
> > > >
> > > > https://cwiki.apache.org/confluence/x/sotSC
> > > >
> > > > best,
> > > > Colin
> > > >
> > >
> >
>
>
> --
> -Jose
>
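
For illustration only, the subtraction discussed in this thread (write_bytes minus cancelled_write_bytes, read from /proc) could be sketched roughly as follows. This is a minimal Python sketch, not the broker's actual JVM implementation. The function names are hypothetical, and clamping the result at zero is an assumption made here, since the kernel documentation quoted above notes that a task can cause "negative" IO.

```python
from pathlib import Path


def parse_proc_io(text):
    """Parse the 'key: value' lines of a /proc/<pid>/io file into a dict of ints."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        # Skip anything that is not a plain 'name: non-negative integer' line.
        if sep and value.isdigit():
            fields[key.strip()] = int(value)
    return fields


def disk_read_bytes(fields):
    """Bytes this process caused to be fetched from the storage layer."""
    return fields["read_bytes"]


def disk_write_bytes(fields):
    """write_bytes with cancelled_write_bytes subtracted out.

    Assumption (not from the thread): clamp at zero, because truncating
    another task's dirty pagecache could otherwise make the value negative.
    """
    return max(0, fields["write_bytes"] - fields["cancelled_write_bytes"])


def read_self_io(path="/proc/self/io"):
    """Read and parse the I/O counters of the calling process (Linux only)."""
    return parse_proc_io(Path(path).read_text())
```

On a Linux host, disk_write_bytes(read_self_io()) would return the adjusted value for the calling process. Because the value is recomputed from /proc on every read, it fits a gauge-style callback better than a counter that has to be driven with inc(amount) or dec(amount).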