Re: Networked filesystems vs backing_dev_info

2007-10-28 Thread Petr Vandrovec

Peter Zijlstra wrote:

> On Sat, 2007-10-27 at 23:30 +0200, Peter Zijlstra wrote:
> 
> > So in short, stick a struct backing_dev_info into whatever represents a
> > client, initialize it using bdi_init(), destroy using bdi_destroy().
> 
> Oh, and the most important point, make your fresh I_NEW inodes point to
> this bdi struct.
> 
> > Mark it congested once you have 50 (or more) outstanding requests, clear
> > congestion when you drop below 50.
> > 
> > and you should be set.


Thanks.  Unfortunately I do not think that NCPFS will switch to
backing_dev_info - it uses the pagecache only for symlinks and directories,
and even if it did use the pagecache, most servers refuse concurrent
requests even when TCP is used as a transport, so there can be only one
request in flight...


Petr

P.S.:  And if anyone wants to step in as ncpfs maintainer, feel free.  I
have not seen a NetWare server for over a year now...




Re: Networked filesystems vs backing_dev_info

2007-10-27 Thread Peter Zijlstra
On Sat, 2007-10-27 at 23:30 +0200, Peter Zijlstra wrote:
> On Sat, 2007-10-27 at 16:02 -0500, Steve French wrote:
> > On 10/27/07, Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > >
> > > I had me a little look at bdi usage in networked filesystems.
> > >
> > >  NFS, CIFS, (smbfs), AFS, CODA and NCP
> > >
> > > And of those, NFS is the only one that I could find that creates
> > > backing_dev_info structures. The rest seems to fall back to
> > > default_backing_dev_info.
> > >
> > > With my recent per bdi dirty limit patches the bdi has become more
> > > important than it has been in the past. While falling back to the
> > > default_backing_dev_info isn't wrong per-se, it isn't right either.
> > >
> > > Could I implore the various maintainers to look into this issue for
> > > their respective filesystem. I'll try and come up with some patches to
> > > address this, but feel free to beat me to it.
> > 
> > I would like to understand more about your patches to see what bdi
> > values makes sense for CIFS and how to report possible congestion back
> > to the page manager. 
> 
> So, what my recent patches do is carve up the total writeback cache
> size, or dirty page limit as we call it, proportionally to a BDIs
> writeout speed. So a fast device gets more than a slow device, but will
> not starve it.
> 
> However, for this to work, each device, or remote backing store in the
> case of networked filesystems, need to have a BDI.
> 
> >   I had been thinking about setting bdi->ra_pages
> > so that we do more sensible readahead and writebehind - better
> > matching what is possible over the network and what the server
> > prefers.  
> 
> Well, you'd first have to create backing_dev_info instances before
> setting that value :-)
> 
> >   SMB/CIFS Servers typically allow a maximum of 50 requests
> > in parallel at one time from one client (although this is adjustable
> > for some).
> 
> That seems like a perfect point to set congestion.
> 
> So in short, stick a struct backing_dev_info into whatever represents a
> client, initialize it using bdi_init(), destroy using bdi_destroy().

Oh, and the most important point, make your fresh I_NEW inodes point to
this bdi struct.

> Mark it congested once you have 50 (or more) outstanding requests, clear
> congestion when you drop below 50.
> 
> and you should be set.
> 
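
As a concrete, purely illustrative sketch of that recipe: struct
my_client, its fields and MY_CONGESTION_THRESH below are invented names,
not code from NFS, CIFS or any other filesystem in this thread; only the
bdi_init()/bdi_destroy()/set_bdi_congested() calls are the real API.

/*
 * Minimal sketch: one backing_dev_info per remote backing store,
 * initialized and destroyed with the client, with congestion driven
 * by the number of outstanding requests.
 */
#include <linux/fs.h>
#include <linux/backing-dev.h>

#define MY_CONGESTION_THRESH	50	/* e.g. the SMB/CIFS per-client cap */

struct my_client {
	struct backing_dev_info	bdi;		/* per-client bdi */
	atomic_t		outstanding;	/* requests in flight */
	/* ... transport state ... */
};

static int my_client_init(struct my_client *clnt)
{
	atomic_set(&clnt->outstanding, 0);
	return bdi_init(&clnt->bdi);	/* pairs with bdi_destroy() below */
}

static void my_client_release(struct my_client *clnt)
{
	bdi_destroy(&clnt->bdi);
}

/* Point each fresh I_NEW inode at the per-client bdi. */
static void my_set_inode_bdi(struct inode *inode, struct my_client *clnt)
{
	inode->i_mapping->backing_dev_info = &clnt->bdi;
}

/* Call around request submission and completion. */
static void my_request_start(struct my_client *clnt)
{
	if (atomic_inc_return(&clnt->outstanding) >= MY_CONGESTION_THRESH)
		set_bdi_congested(&clnt->bdi, WRITE);
}

static void my_request_done(struct my_client *clnt)
{
	if (atomic_dec_return(&clnt->outstanding) < MY_CONGESTION_THRESH)
		clear_bdi_congested(&clnt->bdi, WRITE);
}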



Re: Networked filesystems vs backing_dev_info

2007-10-27 Thread Peter Zijlstra
On Sat, 2007-10-27 at 16:02 -0500, Steve French wrote:
> On 10/27/07, Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > I had me a little look at bdi usage in networked filesystems.
> >
> >  NFS, CIFS, (smbfs), AFS, CODA and NCP
> >
> > And of those, NFS is the only one that I could find that creates
> > backing_dev_info structures. The rest seems to fall back to
> > default_backing_dev_info.
> >
> > With my recent per bdi dirty limit patches the bdi has become more
> > important than it has been in the past. While falling back to the
> > default_backing_dev_info isn't wrong per-se, it isn't right either.
> >
> > Could I implore the various maintainers to look into this issue for
> > their respective filesystem. I'll try and come up with some patches to
> > address this, but feel free to beat me to it.
> 
> I would like to understand more about your patches to see what bdi
> values makes sense for CIFS and how to report possible congestion back
> to the page manager. 

So, what my recent patches do is carve up the total writeback cache
size, or dirty page limit as we call it, proportionally to a BDI's
writeout speed. So a fast device gets more than a slow device, but will
not starve it.

However, for this to work, each device, or remote backing store in the
case of networked filesystems, needs to have a BDI.
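
Schematically, each BDI's share works out to something like the sketch
below; the real implementation uses floating proportions
(lib/proportions.c) rather than this naive formula, and the function
name is made up.

/* Naive illustration of the proportional carve-up; overflow and the
 * decaying "recent writeout" bookkeeping are ignored here. */
static unsigned long my_bdi_dirty_limit(unsigned long total_limit,
					unsigned long bdi_writeout,
					unsigned long total_writeout)
{
	if (!total_writeout)
		return total_limit;	/* no writeback history yet */
	return total_limit * bdi_writeout / total_writeout;
}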

>   I had been thinking about setting bdi->ra_pages
> so that we do more sensible readahead and writebehind - better
> matching what is possible over the network and what the server
> prefers.  

Well, you'd first have to create backing_dev_info instances before
setting that value :-)
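
For example (hypothetical helper, not CIFS code - loosely modelled on
how NFS derives readahead from its negotiated transfer size):

#include <linux/mm.h>
#include <linux/backing-dev.h>

/* Size readahead from the transfer size negotiated with the server. */
static void my_tune_readahead(struct backing_dev_info *bdi,
			      unsigned long xfer_bytes)
{
	unsigned long pages = xfer_bytes >> PAGE_SHIFT;

	if (pages > bdi->ra_pages)	/* only ever grow the default */
		bdi->ra_pages = pages;
}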

>   SMB/CIFS Servers typically allow a maximum of 50 requests
> in parallel at one time from one client (although this is adjustable
> for some).

That seems like a perfect point to set congestion.

So in short, stick a struct backing_dev_info into whatever represents a
client, initialize it using bdi_init(), destroy using bdi_destroy().

Mark it congested once you have 50 (or more) outstanding requests, clear
congestion when you drop below 50.

and you should be set.





Re: Networked filesystems vs backing_dev_info

2007-10-27 Thread Steve French
On 10/27/07, Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I had me a little look at bdi usage in networked filesystems.
>
>  NFS, CIFS, (smbfs), AFS, CODA and NCP
>
> And of those, NFS is the only one that I could find that creates
> backing_dev_info structures. The rest seems to fall back to
> default_backing_dev_info.
>
> With my recent per bdi dirty limit patches the bdi has become more
> important than it has been in the past. While falling back to the
> default_backing_dev_info isn't wrong per-se, it isn't right either.
>
> Could I implore the various maintainers to look into this issue for
> their respective filesystem. I'll try and come up with some patches to
> address this, but feel free to beat me to it.

I would like to understand more about your patches to see what bdi
values make sense for CIFS and how to report possible congestion back
to the page manager.  I had been thinking about setting bdi->ra_pages
so that we do more sensible readahead and writebehind - better
matching what is possible over the network and what the server
prefers.  SMB/CIFS servers typically allow a maximum of 50 requests
in parallel at one time from one client (although this is adjustable
for some).  The CIFS client prefers to do writes of 14 pages (an iovec
of 56K) at a time (although many servers can efficiently handle
multiple of these 56K writes in parallel).  With minor changes CIFS
could handle even larger writes (to just under 64K for Windows and
just under 128K for Samba - the current CIFS Unix Extensions allow
servers to negotiate much larger writes, but lacking a "receivepage"
equivalent, Samba does not currently support writes larger than 128K).
Ideally, to improve large file copy utilization, I would like to see
3-10 writes of 56K (or larger in the future) in parallel.  The read
path is harder since we only do 16K reads to Windows and Samba, but
we need to increase the number of these that are done in parallel
on the same inode.  There is a large Google Summer of Code patch for
this which needs more review.
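
(Back-of-the-envelope arithmetic for the numbers above, assuming 4K
pages; the constants are the ones quoted in this mail, not values read
out of cifs.ko:)

#include <stdio.h>

int main(void)
{
	const unsigned int page_size    = 4096;
	const unsigned int write_pages  = 14;			/* one CIFS write */
	const unsigned int write_bytes  = write_pages * page_size;	/* 56K */
	const unsigned int server_slots = 50;			/* per-client cap */

	printf("one write       : %u KiB\n", write_bytes / 1024);
	printf("3 in parallel   : %u KiB in flight\n", 3 * write_bytes / 1024);
	printf("10 in parallel  : %u KiB in flight\n", 10 * write_bytes / 1024);
	printf("server slot cap : %u requests (up to %u KiB of 56K writes)\n",
	       server_slots, server_slots * write_bytes / 1024);
	return 0;
}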


-- 
Thanks,

Steve


Re: Networked filesystems vs backing_dev_info

2007-10-27 Thread Jan Harkes
On Sat, Oct 27, 2007 at 11:34:26AM +0200, Peter Zijlstra wrote:
> I had me a little look at bdi usage in networked filesystems.
> 
>  NFS, CIFS, (smbfs), AFS, CODA and NCP
> 
> And of those, NFS is the only one that I could find that creates
> backing_dev_info structures. The rest seems to fall back to
> default_backing_dev_info.

While a file is open in Coda, we associate the open file handle with a
local cache file. All read and write operations are redirected to this
local file and we even redirect inode->i_mapping. Actual reads and
writes are completely handled by the underlying file system. We send the
new file contents back to the servers only after all local references
have been released (last-close semantics).

As a result, there is no need for backing_dev_info structures in Coda;
if any congestion control is needed, it will be handled by the underlying
file system where our locally cached copies are stored.
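
(Illustration only - this is not the actual fs/coda code, and the helper
name is made up - but the redirection amounts to something like:)

#include <linux/fs.h>

/* On open: point the Coda inode's page cache at the local cache file,
 * so reads/writes (and writeback) are handled by the underlying fs. */
static void my_redirect_mapping(struct inode *coda_inode,
				struct file *cache_file)
{
	coda_inode->i_mapping = cache_file->f_mapping;
}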

Jan


Re: Networked filesystems vs backing_dev_info

2007-10-27 Thread Peter Zijlstra

On Sat, 2007-10-27 at 11:22 -0400, Jan Harkes wrote:
> On Sat, Oct 27, 2007 at 11:34:26AM +0200, Peter Zijlstra wrote:
> > I had me a little look at bdi usage in networked filesystems.
> > 
> >  NFS, CIFS, (smbfs), AFS, CODA and NCP
> > 
> > And of those, NFS is the only one that I could find that creates
> > backing_dev_info structures. The rest seems to fall back to
> > default_backing_dev_info.
> 
> While a file is opened in Coda we associate the open file handle with a
> local cache file. All read and write operations are redirected to this
> local file and we even redirect inode->i_mapping. Actual reads and
> writes are completely handled by the underlying file system. We send the
> new file contents back to the servers only after all local references
> have been released (last-close semantics).
> 
> As a result, there is no need for backing_dev_info structures in Coda,
> if any congestion control is needed it will be handled by the underlying
> file system where our locally cached copies are stored.

Ok, that works. Thanks for this explanation!
