Julian Elischer wrote:
Actually there have been times when I did want to mmap a datastream..
I think a datastream mapped into a user buffer-space is one of the
possible 0-copy methods people sometimes mention.
This is ugly. There are prettier ways of doing it.
-- Terry
Julian Elischer wrote:
You can mmap() devices and you can mmap files..
you cannot mmap FIFOs or sockets.
for this reason I think that devices are still well represented by
vnodes. If we merged vnodes and vm objects,
then if devices were not vnodes, how would you represent
a vm area
:
:Julian Elischer wrote:
: Actually there have been times when I did want to mmap a datastream..
: I think a datastream mapped into a user buffer-space is one of the
: possible 0-copy methods people sometimes mention.
:
:This is ugly. There are prettier ways of doing it.
:
:-- Terry
:
:Julian Elischer wrote:
: You can mmap() devices and you can mmap files..
:
: you cannot mmap FIFOs or sockets.
:
: for this reason I think that devices are still well represented by
: vnodes. If we merged vnodes and vm objects,
: then if devices were not vnodes, how would you represent
: a
[ ... merging vnode and vm_object_t ... ]
Kirk McKusick wrote:
Every vnode in the system has an associated object. Every object
backed by a file (e.g., everything but anonymous objects) has an
associated vnode. So, the performance of one is pretty tied to the
performance of the other. Matt
Poul-Henning Kamp wrote:
In message [EMAIL PROTECTED], Kirk
McKusick writes:
Every vnode in the system has an associated object.
No: device vnodes don't...
I think the correct solution to that is to move devices away from
vnodes and into the fdesc layer, just like FIFOs and sockets.
Matt Dillon wrote:
This is all preliminary. The question is whether we can
cover enough bases for this to be viable.
Here is a proposed struct file. Make f_data opaque (or
more opaque), add f_object, extend fileops (see next
structure), Added f_vopflags to
:I think we need to remember that we do not always have a
:backing object, nor is a backing object always desirable.
:
:The performance of an mmap'ed file, or swap-backed anonymous
:region is _significantly_ below that of unbacked objects.
:
:-- Terry
This is not true, Terry. There is no
In message [EMAIL PROTECTED], Kirk McKusick writes:
Every vnode in the system has an associated object.
No: device vnodes don't...
I think the correct solution to that is to move devices away from
vnodes and into the fdesc layer, just like FIFOs and sockets.
--
Poul-Henning Kamp | UNIX
Dear Matt,
:
:Well, if that's the case, yank all uses of v_id from the nfs code,
:I'll do the namecache and vnodes can be deleted to the joy
of our users...
:
If you can yank v_id out from the kern/vfs_cache code, I
will make similar
fixes to the NFS code. I am not
On Wed, 18 Apr 2001, Poul-Henning Kamp wrote:
In message [EMAIL PROTECTED], Kirk McKusick writes:
Every vnode in the system has an associated object.
No: device vnodes don't...
I think the correct solution to that is to move devices away from vnodes
and into the fdesc layer, just
On Wed, 18 Apr 2001, Robert Watson wrote:
On Wed, 18 Apr 2001, Poul-Henning Kamp wrote:
address spaces, can be opened/closed and retain a seeking position, can be
This is what I get for sending messages in the morning after staying up
late -- needless to say, you can ignore the "retain a
In message [EMAIL PROTECTED], Robert Watson writes:
On Wed, 18 Apr 2001, Poul-Henning Kamp wrote:
In message [EMAIL PROTECTED], Kirk McKusick writes:
Every vnode in the system has an associated object.
No: device vnodes don't...
I think the correct solution to that is to move devices
On Wed, 18 Apr 2001, Poul-Henning Kamp wrote:
The vnode is our abstraction for objects that have
address spaces, can be opened/closed and retain a seeking position, can be
mapped, have protections, etc, etc.
This is simply not correct Robert, UNIX::sockets also have many of those
On Wed, 18 Apr 2001, Poul-Henning Kamp wrote:
I have not examined the full details of doing the shift yet, but it is
my impression that it actually will reduce the amount of code
duplication and special casing.
..
The only places we will need new magic is
open, which needs to fix
If this will get rid of or clean up the specfs garbage, then I'm all
for it. I would love to see a 'clean' fileops based device interface.
-Matt
:I have not examined the full details of doing the shift yet, but it is
:my impression that
In message [EMAIL PROTECTED], Matt Dillon writes:
If this will get rid of or clean up the specfs garbage, then I'm all
for it. I would love to see a 'clean' fileops based device interface.
specfs, aliased vnodes, you name it...
I think aliased vnodes are the single strongest
Robert Watson wrote:
On Wed, 18 Apr 2001, Poul-Henning Kamp wrote:
As I indicated in my follow-up mail, the statement about seeking was
incorrect, that is a property of the open file structure; I believe the
remainder still holds true. When was the last time you tried mmap'ing or
seeking
:You can mmap() devices and you can mmap files..
:
:you cannot mmap FIFOs or sockets.
:
:for this reason I think that devices are still well represented by
:vnodes. If we merged vnodes and vm objects,
:then if devices were not vnodes, how would you represent
:a vm area that maps a device?
:
:--
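The mmap() distinction Julian draws can be checked from userland. The following is a minimal sketch, not taken from the thread: the helper names try_mmap, mmap_file_result and mmap_pipe_result are hypothetical, and the exact errno for the pipe case is left open because it can differ between systems (ENODEV is common).

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Try to map one page of fd read-only.
 * Returns 0 on success, or the errno value reported by mmap(). */
static int try_mmap(int fd)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);

    if (p == MAP_FAILED)
        return errno;
    munmap(p, len);
    return 0;
}

/* Map a one-page regular temporary file.
 * Returns try_mmap()'s result, or -1 if the setup itself failed. */
int mmap_file_result(void)
{
    char path[] = "/tmp/mmapdemoXXXXXX";
    int fd = mkstemp(path);
    int result;

    if (fd < 0)
        return -1;
    unlink(path);                      /* file lives on until close() */
    if (ftruncate(fd, (off_t)sysconf(_SC_PAGESIZE)) != 0) {
        close(fd);
        return -1;
    }
    result = try_mmap(fd);
    close(fd);
    return result;
}

/* Attempt the same mapping on the read end of a pipe (a FIFO-like
 * descriptor). Returns try_mmap()'s result, or -1 on setup failure. */
int mmap_pipe_result(void)
{
    int fds[2];
    int result;

    if (pipe(fds) != 0)
        return -1;
    result = try_mmap(fds[0]);
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

On a typical system mmap_file_result() returns 0, while mmap_pipe_result() returns a nonzero errno, which matches the split described above: files can be mapped, FIFOs cannot.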
Poul-Henning Kamp wrote:
In message [EMAIL PROTECTED], Matt Dillon writes:
If this will get rid of or clean up the specfs garbage, then I'm all
for it. I would love to see a 'clean' fileops based device interface.
specfs, aliased vnodes, you name it...
I think the aliased
On Wed, 18 Apr 2001, Julian Elischer wrote:
Poul-Henning Kamp wrote:
In message [EMAIL PROTECTED], Matt Dillon writes:
If this will get rid of or clean up the specfs garbage, then I'm all
for it. I would love to see a 'clean' fileops based device interface.
specfs,
:
:Great. Then we have aliased file pointers...
:that's not a great improvement..
:
:You'd still have to have 'per instance' storage somewhere,
:so that the opened devices could have different permissions, and still
:have them point to common data. so you still need
:aliases, except now it's not
On Wed, 18 Apr 2001, Matt Dillon wrote:
If a device or file can be mmap()'d, then the VM Object acts as the
cache layer for the object. We would in fact be able to remove nearly
*ALL* the caching crap from *ALL* the filesystem code. Filesystem
code would be responsible for
:Does this give you a cache coherence problem if the file system itself
:invokes data writes on files? Consider the UFS quota and extended
:attribute cases: here, the file system will invoke VOP_WRITE() on its
:vnodes to avoid understanding file system internals, so you can have such
:operations
In message [EMAIL PROTECTED], Julian Elischer writes:
If we merged vnodes and vm objects,
then if devices were not vnodes, how would you represent
a vm area that maps a device?
You would use a VM object of course, but it would be a special
kind of VM object, just like today...
--
Poul-Henning
In message [EMAIL PROTECTED], Matt Dillon writes:
:You can mmap() devices and you can mmap files..
:
:you cannot mmap FIFOs or sockets.
:
:for this reason I think that devices are still well represented by
:vnodes. If we merged vnodes and vm objects,
:then if devices were not vnodes, how would
In message [EMAIL PROTECTED], Matt Dillon writes:
Actually, all this talk does imply that VM objects should be independent
of vnodes. Devices may need to mmap (requiring a VM object), but
don't need all the baggage of a vnode. Julian is absolutely correct
there.
Well, you
This is all preliminary. The question is whether we can cover enough
bases for this to be viable.
Here is a proposed struct file. Make f_data opaque (or more opaque),
add f_object, extend fileops (see next structure), Added f_vopflags
to indicate the presence of a vnode
(oops, I forgot to add fo_truncate() to the fileops)
-Matt
On Wed, 18 Apr 2001, Poul-Henning Kamp wrote:
In message [EMAIL PROTECTED], Matt Dillon writes:
If this will get rid of or clean up the specfs garbage, then I'm all
for it. I would love to see a 'clean' fileops based device interface.
specfs, aliased vnodes, you name it...
I
Poul-Henning Kamp wrote:
In message [EMAIL PROTECTED], Matt Dillon writes:
Actually, all this talk does imply that VM objects should be independent
of vnodes. Devices may need to mmap (requiring a VM object), but
don't need all the baggage of a vnode. Julian is absolutely
On Mon, 16 Apr 2001 04:02:34 -0700,
Alfred Perlstein [EMAIL PROTECTED] said:
Alfred I'm also wondering why you can't track the number of
Alfred nodes that ought to be cleaned, well, you do, but it doesn't
Alfred look like it's used:
Alfred + numcachehv--;
Alfred +
In message [EMAIL PROTECTED], Kirk McKusick writes:
I am still of the opinion that merging VM objects and vnodes would
be a good idea. Although it would touch a huge number of lines of
code, when the dust settled, it would simplify some nasty bits of
the system.
When I first heard you say this
On Mon, 16 Apr 2001, Kirk McKusick wrote:
I am still of the opinion that merging VM objects and vnodes would be a
good idea. Although it would touch a huge number of lines of code, when
the dust settled, it would simplify some nasty bits of the system. This
merger is really independent of
:I'm interested in this idea, although profess a gaping blind spot in
:expertise in the area of the VM system. However, one of the aspects of
:our VFS that has always concerned me is that use of a single vnode
:simplelock funnels most of the relevant (and performance-sensitive) calls.
:The
:When I first heard you say this I thought you were off your rocker,
:but gradually I have come to think that you may be right.
:
:I think the task will be easier if we get the vnode/buf relationship
:untangled a bit first.
:
:It may also pay off to take vnodes out of disk operations entirely
In message [EMAIL PROTECTED], Matt Dillon writes:
:When I first heard you say this I thought you were off your rocker,
:but gradually I have come to think that you may be right.
:
:I think the task will be easier if we get the vnode/buf relationship
:untangled a bit first.
:
:It may also pay off
:I don't think NFS relies on vnodes never being freed.
:
:It does, in some cases NFS stashes a vnode pointer and the v_id
:value away, and some time later tries to use that pair to try to
:refind the vnode again. If you free vnodes, it will still think
:the pointer is a vnode and if junk
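The "pointer plus v_id" trick described above can be modelled in a few lines. This is a sketch of the general technique only, not the NFS or VFS code: struct toy_vnode, struct stashed_ref and the three helpers are hypothetical names; only the field name v_id comes from the discussion.

```c
#include <stdbool.h>

/* Toy stand-in for a vnode: only the generation counter matters here. */
struct toy_vnode {
    unsigned long v_id;
};

/* What a client stashes away: the pointer and the v_id seen at that time. */
struct stashed_ref {
    struct toy_vnode *vp;
    unsigned long v_id;
};

/* Remember a vnode together with its current generation. */
struct stashed_ref stash_ref(struct toy_vnode *vp)
{
    struct stashed_ref ref = { vp, vp->v_id };
    return ref;
}

/* Recycling the slot for another file bumps the generation. */
void recycle_vnode(struct toy_vnode *vp)
{
    vp->v_id++;
}

/* Re-find check: note that it dereferences ref->vp. That is only safe
 * if the memory behind vp is still a vnode slot, i.e. vnodes are never
 * returned to the general allocator. */
bool ref_still_valid(const struct stashed_ref *ref)
{
    return ref->vp->v_id == ref->v_id;
}
```

The comparison in ref_still_valid() is what breaks once vnodes can be freed: the stashed pointer would then be read as if it were still a vnode, and whatever happens to sit at that address would be compared against the saved v_id.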
In message [EMAIL PROTECTED], Matt Dillon writes:
:I don't think NFS relies on vnodes never being freed.
:
:It does, in some cases NFS stashes a vnode pointer and the v_id
:value away, and some time later tries to use that pair to try to
:refind the vnode again. If you free vnodes, it will
:
:In message [EMAIL PROTECTED], Matt Dillon writes:
::I don't think NFS relies on vnodes never being freed.
::
::It does, in some cases NFS stashes a vnode pointer and the v_id
::value away, and some time later tries to use that pair to try to
::refind the vnode again. If you free vnodes, it
* Poul-Henning Kamp [EMAIL PROTECTED] [010417 10:56] wrote:
In message [EMAIL PROTECTED], Matt Dillon writes:
:I don't think NFS relies on vnodes never being freed.
:
:It does, in some cases NFS stashes a vnode pointer and the v_id
:value away, and some time later tries to use that pair
In message [EMAIL PROTECTED], Alfred Perlstein writes:
I thought vnodes were in stable storage?
They are, that's the point Matt is not seeing yet.
Note that I really don't care for using stable storage as a hack
to deal with this sort of thing.
Well, I have to admit that it is a pretty smart
:In message [EMAIL PROTECTED], Alfred Perlstein writes:
:
:I thought vnodes were in stable storage?
:
:They are, that's the point Matt is not seeing yet.
I know vnodes are in stable storage. I'm just saying that NFS
is the least of your worries in trying to change that.
:Note that I really don't care for using stable storage as a hack
:to deal with this sort of thing.
:
:Well, I have to admit that it is a pretty smart way of dealing with
:it for remote operations, but the trouble is that it prevents us from
:ever lowering their number again.
:
:If Matt can
In message [EMAIL PROTECTED], Matt Dillon writes:
:In message [EMAIL PROTECTED], Alfred Perlstein writes:
:
:I thought vnodes were in stable storage?
:
:They are, that's the point Matt is not seeing yet.
I know vnodes are in stable storage. I'm just saying that NFS
is the least of your
:
:In message [EMAIL PROTECTED], Matt Dillon writes:
:
::In message [EMAIL PROTECTED], Alfred Perlstein writes:
::
::I thought vnodes were in stable storage?
::
::They are, that's the point Matt is not seeing yet.
:
:I know vnodes are in stable storage. I'm just saying that NFS
:is the
In message [EMAIL PROTECTED], Matt Dillon writes:
:
:In message [EMAIL PROTECTED], Matt Dillon writes:
:
::In message [EMAIL PROTECTED], Alfred Perlstein writes:
::
::I thought vnodes were in stable storage?
::
::They are, that's the point Matt is not seeing yet.
:
:I know vnodes are in
:reference to me. I'm not even sure why they bother to check v_id.
:The vp reference from an nfsnode is a hard reference.
:
:
:Well, if that's the case, yank all uses of v_id from the nfs code,
:I'll do the namecache and vnodes can be deleted to the joy of our users...
:
:--
Subject: Re: vm balance
On Mon, 16 Apr 2001, Kirk McKusick wrote:
I am still of the opinion that merging VM objects and vnodes would be a
good idea. Although it would touch a huge number of lines of code, when
the dust settled, it would simplify some nasty bits of the system
On Fri, 13 Apr 2001 20:08:57 +0900,
Seigo Tanimura tanimura said:
Alfred Are these changes planned for integration?
Seigo Yes, but not very soon as there are a few kinds of work that should
Seigo be done.
Seigo One is that a directory vnode may be held as the working directory of
Seigo a
* Seigo Tanimura [EMAIL PROTECTED] [010416 03:25] wrote:
On Fri, 13 Apr 2001 20:08:57 +0900,
Seigo Tanimura tanimura said:
Alfred Are these changes planned for integration?
Seigo Yes, but not very soon as there are a few kinds of work that should
Seigo be done.
Seigo One is that a
In message [EMAIL PROTECTED], Seigo Tanimura writes:
Those pieces of work were done over the last weekend, and the patch at
Seigo http://people.FreeBSD.org/~tanimura/patches/vnrecycle.diff
has been updated and is now ready to commit.
I'm a bit worried about the amount of work done in the
* Seigo Tanimura [EMAIL PROTECTED] [010416 03:25] wrote:
On Fri, 13 Apr 2001 20:08:57 +0900,
Seigo Tanimura tanimura said:
Alfred Are these changes planned for integration?
Seigo Yes, but not very soon as there are a few kinds of work that should
Seigo be done.
Seigo One is that a
On Mon, 16 Apr 2001 12:36:03 +0200,
Poul-Henning Kamp [EMAIL PROTECTED] said:
Poul-Henning In message [EMAIL PROTECTED], Seigo Tanimura writes:
Those pieces of work were done over the last weekend, and the patch at
Seigo
In message [EMAIL PROTECTED], Seigo Tanimura writes:
Poul-Henning I'm a bit worried about the amount of work done in the
Poul-Henning cache_purgeleafdirs(), considering how often it is called,
Poul-Henning Have you measured the performance impact of this to be an
Poul-Henning insignificant
In message [EMAIL PROTECTED], Seigo Tanimura writes:
Seigo http://people.FreeBSD.org/~tanimura/patches/vnrecycle.diff
has been updated and is now ready to commit.
Ok, I ran a "cvs update ; make buildworld" here with and without
your patch.
without:
2049.846u 1077.358s 41:29.65 125.6%
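For readers unfamiliar with the timing line above: assuming it is in the csh/tcsh time format (user seconds, system seconds, elapsed minutes:seconds, CPU percentage), the last field is just (user + system) / elapsed. A small sketch of that arithmetic, with the hypothetical helper name cpu_percent:

```c
/* CPU utilisation as a percentage, in the style of the last field of a
 * csh-style "time" line: (user + system) / elapsed wall-clock time. */
double cpu_percent(double user_s, double sys_s,
                   int elapsed_min, double elapsed_sec)
{
    double elapsed = elapsed_min * 60.0 + elapsed_sec;

    return 100.0 * (user_s + sys_s) / elapsed;
}
```

Called as cpu_percent(2049.846, 1077.358, 41, 29.65), this gives roughly 125.6, consistent with the reported figure; a value above 100% simply means more than one CPU was kept busy on average during the build.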
Date: Tue, 10 Apr 2001 22:14:28 -0700
From: Julian Elischer [EMAIL PROTECTED]
To: Rik van Riel [EMAIL PROTECTED]
CC: Matt Dillon [EMAIL PROTECTED], David Xu [EMAIL PROTECTED],
[EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: vm balance
Rik van Riel wrote
On Thu, Apr 12, 2001 at 02:24:36PM -0700, a little birdie told me
that Matt Dillon remarked
Without vmiodirenable turned on, any directory exceeding
vfs.maxmallocbufspace becomes extremely expensive to work with
O(N * diskIO). With vmiodirenable turned on huge directories
On Sat, Apr 14, 2001 at 09:34:26AM -0500, Matthew D. Fuller wrote:
On Thu, Apr 12, 2001 at 02:24:36PM -0700, a little birdie told me
that Matt Dillon remarked
Without vmiodirenable turned on, any directory exceeding
vfs.maxmallocbufspace becomes extremely expensive to work with
:Speaking of vmiodirenable, what are the issues with it that it's not
:enabled by default? ISTR that it's been in a while, and most people
:pointed at it have reported success with it, and it seems to have solved
:problems here and there for a number of people. What's keeping it from
:the
In message [EMAIL PROTECTED], Matt Dillon writes:
::scalability.
::
::Uhm, that is actually not true.
::
::We keep namecache entries around as long as we can use them, and that
::generally means that recreating them is a rather expensive operation,
::involving creation of vnode and very
On Thu, 12 Apr 2001 22:50:50 +0200,
Poul-Henning Kamp [EMAIL PROTECTED] said:
Poul-Henning We keep namecache entries around as long as we can use them, and that
Poul-Henning generally means that recreating them is a rather expensive operation,
Poul-Henning involving creation of vnode and very
* Seigo Tanimura [EMAIL PROTECTED] [010413 02:39] wrote:
On Thu, 12 Apr 2001 22:50:50 +0200,
Poul-Henning Kamp [EMAIL PROTECTED] said:
Poul-Henning We keep namecache entries around as long as we can use them, and that
Poul-Henning generally means that recreating them is a rather expensive
On Fri, 13 Apr 2001 02:58:07 -0700,
Alfred Perlstein [EMAIL PROTECTED] said:
Alfred * Seigo Tanimura [EMAIL PROTECTED] [010413 02:39] wrote:
On Thu, 12 Apr 2001 22:50:50 +0200,
Poul-Henning Kamp [EMAIL PROTECTED] said:
Poul-Henning We keep namecache entries around as long as we can use
On Tue, 10 Apr 2001, Matt Dillon wrote:
It's randomness that will kill performance. You know the old saying
about caches: They only work if you get cache hits, otherwise
they only slow things down.
I wonder ... how does FreeBSD handle negative directory entries?
That is, /bin/sh
:
:On Tue, 10 Apr 2001, Matt Dillon wrote:
:
:It's randomness that will kill performance. You know the old saying
:about caches: They only work if you get cache hits, otherwise
:they only slow things down.
:
:I wonder ... how does FreeBSD handle negative directory entries?
:
:That
In message [EMAIL PROTECTED], Matt Dillon writes:
:
:On Tue, 10 Apr 2001, Matt Dillon wrote:
:
:It's randomness that will kill performance. You know the old saying
:about caches: They only work if you get cache hits, otherwise
:they only slow things down.
:
:I wonder ... how does
:You should also know that negative entries, since they have no
:objects to "hang from" and consequently would clog up the name-cache,
:are limited by the sysctl:
: debug.ncnegfactor: 16
:which means that max 1/16 of the name cache entries can be negative
:entries. You can monitor the
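The 1/16 limit described above amounts to a single comparison. This is a minimal sketch of that ratio check under stated assumptions, not a copy of the kernel's code: the function name and the parameter names numneg and numcache are chosen here for illustration; only the factor 16 and the sysctl name debug.ncnegfactor come from the message.

```c
#include <stdbool.h>

/* True when negative entries exceed 1/ncnegfactor of all name cache
 * entries, i.e. when one of them should be evicted before adding more. */
bool negative_entries_over_limit(unsigned long numneg,
                                 unsigned long numcache,
                                 unsigned long ncnegfactor)
{
    return numneg * ncnegfactor > numcache;
}
```

With a factor of 16 and 100 cached names, 6 negative entries are still within the limit (6 * 16 = 96), while 7 are not (7 * 16 = 112).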
On Thu, 12 Apr 2001, Matt Dillon wrote:
Again, keep in mind that the namei cache is strictly throw-away,
This seems to be the main difference between Linux and FreeBSD.
In Linux, open files directly refer to an entry in the dentry
(and inode) cache, so we really need to have dynamically
In message [EMAIL PROTECTED], Matt Dillon writes:
Again, keep in mind that the namei cache is strictly throw-away, but
entries can often be reconstituted later by the filesystem without I/O
due to the VM Page cache (and/or buffer cache depending on
vfs.vmiodirenable). So as with
:
:In message [EMAIL PROTECTED], Matt Dillon writes:
:
:Again, keep in mind that the namei cache is strictly throw-away, but
:entries can often be reconstituted later by the filesystem without I/O
:due to the VM Page cache (and/or buffer cache depending on
:vfs.vmiodirenable).
In message [EMAIL PROTECTED], Matt Dillon writes:
:
:In message [EMAIL PROTECTED], Matt Dillon writes:
:
:Again, keep in mind that the namei cache is strictly throw-away, but
:entries can often be reconstituted later by the filesystem without I/O
:due to the VM Page cache (and/or
::scalability.
::
::Uhm, that is actually not true.
::
::We keep namecache entries around as long as we can use them, and that
::generally means that recreating them is a rather expensive operation,
::involving creation of vnode and very likely a vm object again.
:
:The vnode cache is a
I heard NetBSD has implemented a FreeBSD-like VM; it also implemented
VM balancing in a recent version of NetBSD. Some parameters like TEXT,
DATA and anonymous memory space can be tuned. Is there anyone doing
such work on FreeBSD, or has FreeBSD already implemented it?
--
David Xu
:I heard NetBSD has implemented a FreeBSD-like VM; it also implemented
:VM balancing in a recent version of NetBSD. Some parameters like TEXT,
:DATA and anonymous memory space can be tuned. Is there anyone doing
:such work on FreeBSD, or has FreeBSD already implemented it?
:
:--
:David Xu
FreeBSD implements a very sophisticated VM balancing algorithm. Nobody's
complaining about it so I don't think we need to really change it. Most
of the other UNIXes, including Linux, are actually playing catch-up to
FreeBSD's VM design.
I remember hearing/viewing a
On Tue, 10 Apr 2001, Matt Dillon wrote:
:I heard NetBSD has implemented a FreeBSD-like VM; it also implemented
:VM balancing in a recent version of NetBSD. Some parameters like TEXT,
:DATA and anonymous memory space can be tuned. Is there anyone doing
:such work on FreeBSD, or has FreeBSD
:In the balancing part, definitely. FreeBSD seems to be the only
:system that has the balancing right. I'm planning on integrating
:some of the balancing tactics into Linux for the 2.5 kernel, but
:I'm not sure how to integrate the inode and dentry cache into the
:balancing scheme ...
:I'm
On Tue, 10 Apr 2001, Matt Dillon wrote:
:I'm curious about the other things though ... FreeBSD still seems
:to have the early 90's abstraction layer from Mach and the vnode
:cache doesn't seem to grow and shrink dynamically (which can be a
:big win for systems with lots of metadata
It's randomness that will kill performance. You know the old saying
about caches: They only work if you get cache hits, otherwise
they only slow things down.
-Matt
:Which is ok if there isn't too much activity with these data
Rik van Riel wrote:
I'm curious about the other things though ... FreeBSD still seems
to have the early 90's abstraction layer from Mach and the vnode
cache doesn't seem to grow and shrink dynamically (which can be a
big win for systems with lots of metadata activity).
So while it's