Simon Wilkinson wrote:

Yep, this is what's happening in the trace Achim provided, too. Every 4k
we write the chunk. I'm not sure how that's possible unless something is
closing the file a lot, or the cache is full of stuff we can't kick out.


Actually, it's entirely possible. Here's how it all goes wrong...

When the cache is full, every call to write results in us attempting to
empty the cache. On Linux the page cache means that we only call write
once for each 4k chunk. However, our attempts to empty the cache are a
little pathetic. We just attempt to store all of the chunks of the file
currently being written back to the fileserver. If it's a new file there
is only one such chunk - the one that we are currently writing. Chunks
are much larger than pages, and when a chunk is dirty we flush the whole
thing to the server, which is why we see repeated writes of the same
data. The process goes something like this:

*) Write page at 0k, dirties first chunk of file.
*) Discover cache is full, flush first chunk (0->1024k) to the file server
*) Write page at 4k, dirties first chunk of file
*) Cache is still full, flush first chunk to file server
*) Write page at 8k, dirties first chunk of file

... and so on.
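
To get a feel for how much extra traffic the sequence above can generate, here is a rough model in C. It is only a sketch under stated assumptions, not OpenAFS code: the function name and the constants are hypothetical, the 4k page and 1024k chunk sizes are taken from the example above, and it assumes each flush sends only the bytes written so far in the current chunk.

```c
#include <assert.h>

/* Sizes from the example above; real cache managers may be configured differently. */
#define SKETCH_PAGE_SIZE  (4ULL * 1024ULL)
#define SKETCH_CHUNK_SIZE (1024ULL * 1024ULL)

/*
 * Hypothetical model: count the bytes sent to the file server while a new
 * file of file_size bytes is written sequentially, one page at a time,
 * with a cache that stays full. Every page write dirties the chunk that
 * contains it, and the full cache then forces a flush of that chunk.
 */
unsigned long long bytes_flushed_full_cache(unsigned long long file_size)
{
    unsigned long long sent = 0;
    unsigned long long offset;
    unsigned long long end;
    unsigned long long chunk_start;

    for (offset = 0; offset < file_size; offset += SKETCH_PAGE_SIZE) {
        /* End of the page just written, clipped to the end of the file. */
        end = offset + SKETCH_PAGE_SIZE;
        if (end > file_size)
            end = file_size;

        /* Start of the chunk that this page belongs to. */
        chunk_start = (offset / SKETCH_CHUNK_SIZE) * SKETCH_CHUNK_SIZE;

        /* Assumption: the flush resends everything written so far in this chunk. */
        sent += end - chunk_start;
    }
    return sent;
}
```

Under these assumptions, calling bytes_flushed_full_cache(1024 * 1024) models writing a single 1024k chunk: 256 page writes, each followed by a flush of a progressively larger prefix of the chunk, so far more data goes over the wire than the file contains.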

The problem is that we don't make good decisions when we decide to flush
the cache. However, any change to flush items which are less active will
be a behaviour change - in particular, on a multi-user system it would
mean that one user could break write-on-close for other users simply by
filling the cache.

The problem here is that afs_DoPartialWrite is called with each write.
Normally it returns without doing anything, but if the percentage of
dirty chunks is too high it triggers a background store. However, this
can happen multiple times before the background job starts executing.
Therefore, in AFS/OSD I introduced a new flag bit, CStoring, which is
switched on when the background task is submitted and switched off when
it is done. During that time no new background stores are scheduled for
this file.
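
The guard described above might look roughly like the following. This is a minimal single-threaded sketch and not the AFS/OSD code: only the flag name CStoring comes from the message; the struct, the function names, the bit value and the threshold parameter are all hypothetical, and a real cache manager would need to take the appropriate lock around the flag test and update.

```c
#include <assert.h>

/* Hypothetical bit value; only the name CStoring comes from the message. */
#define CStoring 0x1u

/* Hypothetical stand-in for the per-file cache state. */
struct vcache_sketch {
    unsigned int flags;         /* state bits, including CStoring */
    int          stores_queued; /* background stores submitted so far */
};

/*
 * Called on every write. Returns 1 if a background store was queued,
 * 0 if nothing was done (too few dirty chunks, or a store is already
 * pending for this file).
 */
int maybe_queue_store(struct vcache_sketch *vc, int dirty_pct, int threshold_pct)
{
    if (dirty_pct <= threshold_pct)
        return 0;               /* common case: nothing to do */
    if (vc->flags & CStoring)
        return 0;               /* a store is already pending or running */

    vc->flags |= CStoring;      /* switched on when the task is submitted */
    vc->stores_queued++;        /* stands in for queuing the real background job */
    return 1;
}

/* Called by the background job once the store has finished. */
void store_done(struct vcache_sketch *vc)
{
    vc->flags &= ~CStoring;     /* switched off when the task is done */
}
```

With this guard in place, repeated calls to maybe_queue_store() between submission and store_done() queue nothing further, so at most one background store per file is outstanding at a time.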

Hartmut

Cheers,

Simon.

_______________________________________________
OpenAFS-info mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-info


--
-----------------------------------------------------------------
Hartmut Reuter                  e-mail          [email protected]
                                phone            +49-89-3299-1328
                                fax              +49-89-3299-1301
RZG (Rechenzentrum Garching)    web    http://www.rzg.mpg.de/~hwr
Computing Center of the Max-Planck-Gesellschaft (MPG) and the
Institut fuer Plasmaphysik (IPP)
-----------------------------------------------------------------
