Valdis, what a valuable answer. It opened my eyes. I didn't take the most
important thing into account: caches only help on a cache hit!

I'll try using a disk in memory (residing on a tmpfs mount) to improve
this. Good idea!

Thank you so much for sharing this with me!!!

Regards

On 05/07/2018 10:21 PM, <valdis.kletni...@vt.edu> wrote:

On Thu, 05 Jul 2018 19:30:22 -0300, "Daniel." said:

> Sometimes we have a machine that we work on that is really, really slow
> when doing I/O. I know that the kernel will use memory to avoid doing
> I/O, and that it is somewhat conservative about keeping too much data in
> volatile memory, where it could be lost on a power failure. My question
> is: how do I do the opposite and avoid I/O as much as possible, no
> matter the risks?

You're trying to solve a problem that isn't the one you have....

The way the kernel avoids I/O is that when a read or write is done, it
keeps a copy in memory in case another request uses that same data again.
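
One way to watch this happening (my addition, not from Valdis's mail) is
to read the page-cache counters out of /proc/meminfo; the Cached, Dirty
and Writeback fields are standard on Linux. A minimal Python sketch:

    # Sketch: watch the page cache and dirty pages via /proc/meminfo.
    def meminfo(fields=("Cached", "Dirty", "Writeback")):
        values = {}
        with open("/proc/meminfo") as f:
            for line in f:
                key, rest = line.split(":", 1)
                if key in fields:
                    values[key] = rest.strip()   # e.g. "123456 kB"
        return values

    if __name__ == "__main__":
        print(meminfo())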

On most filesystems, a userspace write doesn't go directly to disk - it
just goes into the in-memory cache as a "dirty" page, and gets written out
to disk later by a separate kernel thread.  In other words, unless your
system has gone off the deep end into thrashing, writes to disk generally
won't block.
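
To make that concrete (my own illustration, not part of the original
mail): a buffered write() normally returns as soon as the data is sitting
in the page cache as dirty pages, and it's an explicit fsync() that
actually waits for the disk. A Python sketch, with /tmp/writeback-demo as
a made-up test file:

    import os, time

    path = "/tmp/writeback-demo"          # made-up test file
    data = b"x" * (64 * 1024 * 1024)      # 64 MiB of junk

    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    t0 = time.monotonic()
    os.write(fd, data)                    # lands in the cache as dirty pages
    t1 = time.monotonic()                 # (ignoring partial writes for brevity)
    os.fsync(fd)                          # forces the dirty pages out to disk
    t2 = time.monotonic()
    os.close(fd)
    print(f"write(): {t1 - t0:.3f}s  fsync(): {t2 - t1:.3f}s")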

Meanwhile, if a userspace read finds the data in the cache, it will just
return the data, and again not block. Usually, the only time a disk I/O
will block is if it does a read that is *not* in the in-memory cache
already.

The end result is that the effectiveness of the cache depends on what
percentage of the reads are already in there.  And now the bad news...
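
One way to see that difference (my own sketch, not from the original mail)
is to time the same read twice, evicting the file's cached pages with
posix_fadvise(POSIX_FADV_DONTNEED) in between; /var/tmp/some-big-file is a
placeholder for any large file:

    import os, time

    def timed_read(path):
        t0 = time.monotonic()
        with open(path, "rb") as f:
            while f.read(1 << 20):        # read in 1 MiB chunks
                pass
        return time.monotonic() - t0

    path = "/var/tmp/some-big-file"       # placeholder: any large file

    # Ask the kernel to drop this file's cached (clean) pages first.
    with open(path, "rb") as f:
        os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)

    print("cold read:", timed_read(path)) # cache miss -> real disk I/O
    print("warm read:", timed_read(path)) # cache hit  -> served from memory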


> I'm using a virtual machine to test some ansible playbooks; the machine
> is just a testing environment, so it will be created again and again and
> again. (And again). The playbook generates a lot of I/O, from yum
> installs and other commands that inspect ISO images to create
> repositories,  ... it

Ansible is famous for generating *lots* of disk reads (for example,
'lineinfile' will usually end up reading most/all of the file, especially
if the expected line isn't in there). And if you're testing against a
blank system, the line probably isn't in there, so you have to read the
whole file...  And how many times do you have more than one 'lineinfile'
that hits the same file? Because that's the only time the in-memory cache
will help: the second time the file is referenced.  And I'll bet you that
reading ISO images to create repositories generates a lot of non-cacheable
data - almost guaranteed unless you read the same ISO image more than
once.  Similarly for yum installs - each RPM will only be read once,
clogging the in-memory cache.
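
To make the lineinfile point concrete (my own simplified stand-in for the
access pattern, not Ansible's actual code): a presence check like this
only stops early when the line exists, so on a freshly built VM every such
task costs a full read of the file, and the cache only pays off if a later
task touches the same file:

    # Rough stand-in for a lineinfile-style presence check.
    def line_present(path, wanted):
        with open(path, "r") as f:
            for line in f:                 # stops early only if the line exists
                if line.rstrip("\n") == wanted:
                    return True
        return False                       # absent line => whole file was read

    # Example: on a blank system this line usually isn't there, so the
    # entire file gets read; only a second check on the same file is a hit.
    print(line_present("/etc/ssh/sshd_config", "PermitRootLogin no"))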


> Anyway. The idea is that the flushing thread kicks in as soon as
> possible and that blocking happens as late as possible, so that I keep
> the disks working but avoid blocking on I/O.

Unfortunately, that's not how it works.  If you want to avoid blocking,
you want to maximize the cache hits (which is unfortunately *very*
difficult on a system install or ansible run).
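
For what it's worth (my addition, not Valdis's): the knobs that question
is reaching for are the vm.dirty_* sysctls - dirty_background_ratio
controls when the background flusher threads start writing, and
dirty_ratio controls when writing processes themselves start blocking. A
small Python sketch that just prints the current values (changing them
needs root, e.g. via sysctl):

    # Sketch: show the writeback knobs the quoted question is asking about.
    # dirty_background_ratio -> flusher threads start writing ("early" limit)
    # dirty_ratio            -> writing processes start blocking ("late" limit)
    KNOBS = [
        "dirty_background_ratio",
        "dirty_ratio",
        "dirty_expire_centisecs",
        "dirty_writeback_centisecs",
    ]

    for knob in KNOBS:
        with open(f"/proc/sys/vm/{knob}") as f:
            print(f"vm.{knob} = {f.read().strip()}")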

You might be able to generate some win by using a pre-populated tmpfs
and/or an SSD for the VM's disk.
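
A pre-populated tmpfs can be as simple as staging the ISOs and repo data
somewhere memory-backed before the run. A sketch assuming /dev/shm is
tmpfs (the default on most distributions) and /srv/isos is a made-up
source directory:

    import shutil
    from pathlib import Path

    # Sketch: stage the ISOs / repo data on a memory-backed filesystem
    # before the playbook runs, so later reads are served from RAM.
    src = Path("/srv/isos")        # made-up source path
    dst = Path("/dev/shm/isos")    # /dev/shm is tmpfs on most distros

    if not dst.exists():
        shutil.copytree(src, dst)

    print(f"point the playbook's ISO/repo paths at {dst}")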

And you may want to look at more drastic solutions - for instance, using
an NFS mount from the hypervisor machine as the source for your ISOs and
repositories.  Under some conditions, that can be faster than the VM doing
I/O that then has to be handled by the hypervisor, adding to the overhead.
(This sort of thing is an *old* trick, dating back to Sun systems in the
3/50 class that had 4M of memory.  It was faster to put the swap space on
a SCSI Fujitsu Eagle disk attached to a 3/280 server, accessed over NFS
over Ethernet, than to use the much slower 100M "shoebox" IDE drive that
could be directly attached to a 3/50.)
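
If you try the NFS route, the mount itself is the only moving part. A
sketch (mine, not from the original mail) wrapping the ordinary
mount -t nfs syntax, where hypervisor:/export/isos and /mnt/isos are
made-up names; it needs root:

    import subprocess

    # Sketch: mount an NFS export from the hypervisor as the ISO/repo source.
    export = "hypervisor:/export/isos"   # made-up export
    mountpoint = "/mnt/isos"             # made-up mount point

    subprocess.run(["mkdir", "-p", mountpoint], check=True)
    subprocess.run(["mount", "-t", "nfs", export, mountpoint], check=True)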