:
: I've heard from both of you that you think the other is wrong. This isn't
:very helpful, however, in finding the correct solution. What I'd like to hear
:from both of you is the reasons why swap is better as a device, or not. There
:seems to be some unstated architectural philosophy that needs to be stated
:before any informed decision can be made about what is the right direction to
:go in.
:
:-DG
:
:David Greenman
:Co-founder/Principal Architect, The FreeBSD Project - http://www.freebsd.org
Ok, I'll buy that. Though I will note that VN has been broken for over
a month - I found that out about half a month ago and posted the fact,
and if Poul is so interested in fixing it he could easily have done so
then. I'm the one who has spent all the time tracking it down, patching,
and testing it, and frankly that should count for a hellofalot more then
nil. Poul frankly does not have a leg to stand on to demand that I work
on a 'better' solution when I didn't cause the problem in the first place,
nor does he have any right to pollute the VM code to end-run around the
issue. It would be tragic if Poul's behavior were allowed to conclude
the matter.
That said, here is my position, split into a 'history' and my 'reasoning':
Previously we had /dev/drum, which didn't work, and the
SWAP device was layered around a device and swap I/O was vectored through
the device subsystem.
One month ago, after discussions, Peter Wemm removed the userland frontend
for the swap device, /dev/drum, and at the same time thought it would be
a good idea to collapse the VOP_STRATEGY routines (that ran through the
SWAP dev_t) into direct calls to swstrategy(). At the time I was
ambivalent but since the change seemed to clear out a bunch of cruft I
went with it.
However, this broke the VN device. I found this out about half a month
later (around half a month ago) but at the time was hip deep in NFS work
for the 3.4 release and could not address the issue. Now that the NFS
work is finished I revisited the now broken VN device to determine what
went wrong. It turns out that the VN device indirectly runs through the
swap device using several generic filesystem buffer related subsystems.
When the swap dev_t was removed, the result was a panic.
-
The VN device uses vm_pager_strategy() to implement swap backing store.
vm_pager_strategy() is a generic mechanism that I introduced into the VM
system a number of months ago for two reasons:
(1) As a sounding board for giving the vm_pager_*() functions a
strategy entry point, to see how well it would work.
(2) To implement the swap backing store feature for the VN device.
vm_pager_strategy() in turn looks at the pager ID and does a table lookup
call through to the swap_pager_strategy() function. This function takes
a VM object and a *generic* bp. While the function itself is
swap-specific, the design of the code is supposed to be as generic as
possible.
vm_pager_strategy() may need to cut up the bp into subrequests to deal
with clustering issues. It uses the generic 'chainbuf' routines to
manage the asynchronous I/O.
The 'chainbuf' routines are designed to be as generic as possible.. they
take BP's. They do *NOT* and are not intended to be special cased in
any way. This is my design for these routines, and even though they
are currently used only by the swap_pager_strategy() code I believe the
generic nature of the routines must be maintained.
I/O sequencing for the swap-backed VN device is thus:
VN->vm_pager_strategy->chainbuf->STRATEGY.
I advocate that the best solution is to simply undo part of the original
commit and restore SWAP's dev_t which was removed a month ago. Poul's
solution is considerably more complicated (the patch he offered will not
work in all cases, and I've already corrected all his other misstatements
that I do not really feel it appropriate to continue to correct his
misstatements when all he does with those corrections is use them as
a poolboard to make his next set of incorrect statements!).
I will tell you why Poul's patch will not work, but I am NOT going to
tell Poul because, damn it, he has no right to spend 5 minutes working
up a shoddy patch and then try to dictate to me that it should be used
over my well tested 8-hours-of-work patch!
Poul's patch will not work because flushchainbuf() is not the only chain
routine that flushes the buffers. getchainbuf() and waitchainbuf() may
*also* flush the buffers if there are already too many chain buffers
built up in the linked list. Poul only addressed flushchainbuf().
He obviously didn't bother to read the code carefully and he just
as obviously doesn't understand it. And not understanding it I can't
imagine how he can possibly believes he understasnds the concepts
behind it.
Poul's patch also pollutes the chainbuf code which is supposed to be
*GENERIC*.
-
Poul has demonstrated over and over again (if you read the thread) that
he understands not a single issue regarding this whole chain of events.
He advocates polluting my carefully designed and implemented VM code,
both the vm_pager_strategy() design and the chainbuf design, in order
to avoid creating a dev_t for the swap device.
I strongly believe that the VN/PAGER/CHAINBUFIO code sequence must be
maintained without such pollution. Since swap really is a device - it has
a fully functional strategy entry point and has logical separation from
the VM swap subsystem by taking on the responsibility for managing the
block devices mounted as swap - I see no reason why we can't implement
a simple dev_t for it. Giving it a dev_t allows us to use all the
standard filesystem buffer cache I/O routines with it, and we've
already demonstrated that it is useful to do so in the VN swap-backing
device. There could easily be other situations that come up in the
future that may also take advantage of this.
I will also point out that only a month ago the SWAP device *WAS* a
dev_t. It was removed. That removal was a mistake because nobody
(including me) realized that we just blew up VN. It's as simple
as that. Do we back out a small portion of the patch to make VN work
again, or do we spend hours or days rewriting an unrelated subsystem
to work around the loss of the dev_t?
-Matt
Matthew Dillon
<[EMAIL PROTECTED]>
To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message