Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-18 Thread Thies C. Arntzen

On Thu, May 17, 2001 at 11:10:57PM -0700, Rasmus Lerdorf wrote:
  Perhaps automatic detection could be an option?  if (filesize > X)
  blockread else mmap?  It seems like the most intuitive way to implement
  it...
  
  
   Sounds a bit magical.  Why not just a block_readfile() function?
 
 
  Mainly the bloat factor, we already have a large core, imho, functions
  shouldn't be added unless there are no workarounds.  Also, it requires a
  little too much thought, into what sizes are good for mmap() and what
  sizes are good for block read's (it also requires knowledge of mmap(),
  because many people might automatically assume that block_read would
  always be faster).  I'm pretty sure if we polled php-general and php-qa
  (the more knowledgeable user bases), most people wouldn't really
  understand what mmap does, or what it is for or when it is beneficial to
  use it.
 
  As for magical, well a bit, but good magic and internal magic (not
  syntactical magic).  I'd assume that most systems have a certain point
  where mmap is no longer more beneficial than reading a file by chunks.
  If we can find a reasonable number (or have a user specify that in a
  configuration option if really necessary), it saves the user the trouble
  of thinking about something which is pretty low-level and it reduces
  bloat.  I don't really see a downside to this magic.
 
 But, the issue here isn't one of which is faster.  The issue here is one
 of memory usage.  If you have a 600M iso image that you decide to
 readfile() for a download page of some sort, then you are going to end up
 with a 600M httpd process.  And soon you will have lots of those as more
 people hit the page.

mmap will not increase the size of your process as it doesn't
call sbrk(). 

 
 So to be truly magical here, PHP would have to check the amount of spare
 RAM on the system, divide that by MaxClients and set that as the largest
 filesize to mmap() because anything larger could result in the box going
 into swap.
 
 I obviously don't think such a check is feasible.  The only real question
 here is whether to add a user configurable max-mmap setting or to add a
 second function that never mmaps.

question is: do we really need this mmap stuff at all?

with readfile we should easily be able to saturate a
pipe of any size using just read and write.
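
For illustration, the plain read/write loop referred to here might look
roughly like the following in C.  This is a minimal sketch, not the code
in file.c: the function name copy_fd_blockwise and the 8 KB buffer size
are assumptions made for the example.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: copy everything from in_fd to out_fd through a
 * fixed-size buffer.  Returns the number of bytes copied, or -1 on error. */
long copy_fd_blockwise(int in_fd, int out_fd)
{
    char buf[8192];            /* assumed block size */
    long total = 0;

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n == 0)
            break;             /* end of input */
        if (n < 0) {
            if (errno == EINTR)
                continue;      /* interrupted, retry the read */
            return -1;
        }

        /* write() may accept fewer bytes than requested, so loop. */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
    return total;
}
```

Memory use stays constant at the size of the buffer, whatever the size
of the file.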

tc

 
 -Rasmus
 
 
 -- 
 PHP Development Mailing List http://www.php.net/
 To unsubscribe, e-mail: [EMAIL PROTECTED]
 For additional commands, e-mail: [EMAIL PROTECTED]
 To contact the list administrators, e-mail: [EMAIL PROTECTED]
 





Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-18 Thread Sterling Hughes

Rasmus Lerdorf wrote:

Perhaps automatic detection could be an option?  if (filesize > X)
blockread else mmap?  It seems like the most intuitive way to implement
it...


Sounds a bit magical.  Why not just a block_readfile() function?


Mainly the bloat factor, we already have a large core, imho, functions
shouldn't be added unless there are no workarounds.  Also, it requires a
little too much thought, into what sizes are good for mmap() and what
sizes are good for block read's (it also requires knowledge of mmap(),
because many people might automatically assume that block_read would
always be faster).  I'm pretty sure if we polled php-general and php-qa
(the more knowledgeable user bases), most people wouldn't really
understand what mmap does, or what it is for or when it is beneficial to
use it.

As for magical, well a bit, but good magic and internal magic (not
syntactical magic).  I'd assume that most systems have a certain point
where mmap is no longer more beneficial than reading a file by chunks.
If we can find a reasonable number (or have a user specify that in a
configuration option if really necessary), it saves the user the trouble
of thinking about something which is pretty low-level and it reduces
bloat.  I don't really see a downside to this magic.

 
 But, the issue here isn't one of which is faster.  The issue here is one
 of memory usage.  If you have a 600M iso image that you decide to
 readfile() for a download page of some sort, then you are going to end up
 with a 600M httpd process.  And soon you will have lots of those as more
 people hit the page.



 

 So to be truly magical here, PHP would have to check the amount of spare
 RAM on the system, divide that by MaxClients and set that as the largest
 filesize to mmap() because anything larger could result in the box going
 into swap.



point taken :)

 
 I obviously don't think such a check is feasible.  The only real question
 here is whether to add a user configurable max-mmap setting or to add a
 second function that never mmaps.
 


+1 for the configuration option.

-Sterling






Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-18 Thread Andi Gutmans

At 10:40 PM 5/17/2001 -0700, Rasmus Lerdorf wrote:
   True.  But I guess my main issue is still that the behaviour changes
   radically based on a hidden configure check (ie. whether mmap is there
   or not) and that ensuring a block-by-block read in user space is
   inefficient for huge files.
  
 
 
  good point...  hrrmmm
 
  it seems like this is an option that should be available (somehow), yet
  I don't really like adding another option to the function, as it
  requires too much smarts on the behalf of the user (at what point does
  mmap() slow things down instead of speed them up, what is mmap(), etc.)

Well, we do check and only do the mmap() for files larger than the block
size.

  Perhaps automatic detection could be an option?  if (filesize > X)
  blockread else mmap?  It seems like the most intuitive way to implement
  it...

Sounds a bit magical.  Why not just a block_readfile() function?

Why is it magical? It's an internal optimization and the developer will 
never know.
I wouldn't add another function but make it a bit smarter: mmap() only small 
files and use fread() (the C version) for the rest.

Andi






Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-18 Thread Andi Gutmans

At 11:10 PM 5/17/2001 -0700, Rasmus Lerdorf wrote:
  Perhaps automatic detection could be an option?  if (filesize > X)
  blockread else mmap?  It seems like the most intuitive way to implement
  it...
  
  
   Sounds a bit magical.  Why not just a block_readfile() function?
 
 
  Mainly the bloat factor, we already have a large core, imho, functions
  shouldn't be added unless there are no workarounds.  Also, it requires a
  little too much thought, into what sizes are good for mmap() and what
  sizes are good for block read's (it also requires knowledge of mmap(),
  because many people might automatically assume that block_read would
  always be faster).  I'm pretty sure if we polled php-general and php-qa
  (the more knowledgeable user bases), most people wouldn't really
  understand what mmap does, or what it is for or when it is beneficial to
  use it.
 
  As for magical, well a bit, but good magic and internal magic (not
  syntactical magic).  I'd assume that most systems have a certain point
  where mmap is no longer more beneficial than reading a file by chunks.
  If we can find a reasonable number (or have a user specify that in a
  configuration option if really necessary), it saves the user the trouble
  of thinking about something which is pretty low-level and it reduces
  bloat.  I don't really see a downside to this magic.

But, the issue here isn't one of which is faster.  The issue here is one
of memory usage.  If you have a 600M iso image that you decide to
readfile() for a download page of some sort, then you are going to end up
with a 600M httpd process.  And soon you will have lots of those as more
people hit the page.

So to be truly magical here, PHP would have to check the amount of spare
RAM on the system, divide that by MaxClients and set that as the largest
filesize to mmap() because anything larger could result in the box going
into swap.

I obviously don't think such a check is feasible.  The only real question
here is whether to add a user configurable max-mmap setting or to add a
second function that never mmaps.

I think we are getting carried away here. Why start bloating with 
configuration options and possible new functions? It's not as if the 
developer needs great control over this.
I'd either nuke mmap() completely and use regular file functions (it's 
usually not a big loss and I don't think it's a big deal) or take an 
arbitrary number which we think should be considered a large file 
(something like 256KB) and use mmap() only for smaller files.
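
A minimal sketch of the second option, assuming a hypothetical helper
choose_strategy() and taking the "something like 256KB" figure as the
cutoff.  None of these names come from file.c, and the right cutoff is
exactly what is under discussion.

```c
/* Hypothetical cutoff: files larger than this are read block by block. */
#define MMAP_MAX_BYTES (256L * 1024L)

enum passthru_strategy { PASSTHRU_BLOCK_READ, PASSTHRU_MMAP };

/* Decide how to pass a file through.  Sockets, pipes and other
 * non-regular files are never mapped, and neither are empty files. */
enum passthru_strategy choose_strategy(int is_regular_file, long long size)
{
    if (!is_regular_file)
        return PASSTHRU_BLOCK_READ;
    if (size <= 0 || size > MMAP_MAX_BYTES)
        return PASSTHRU_BLOCK_READ;
    return PASSTHRU_MMAP;
}
```

The caller would typically fill in both arguments from a single fstat()
call on the open descriptor.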

Andi






Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-18 Thread Jani Taskinen

On Fri, 18 May 2001, Andi Gutmans wrote:

  of thinking about something which is pretty low-level and it reduces
  bloat.  I don't really see a downside to this magic.

But, the issue here isn't one of which is faster.  The issue here is one
of memory usage.  If you have a 600M iso image that you decide to
readfile() for a download page of some sort, then you are going to end up
with a 600M httpd process.  And soon you will have lots of those as more
people hit the page.

So to be truly magical here, PHP would have to check the amount of spare
RAM on the system, divide that by MaxClients and set that as the largest
filesize to mmap() because anything larger could result in the box going
into swap.

I obviously don't think such a check is feasible.  The only real question
here is whether to add a user configurable max-mmap setting or to add a
second function that never mmaps.

I think we are getting carried away here. Why start bloating with
configuration options and possible new functions? It's not as if the
developer needs great control over this.
I'd either nuke mmap() completely and use regular file functions (it's
usually not a big loss and I don't think it's a big deal) or take an
arbitrary number which we think should be considered a large file
(something like 256KB) and use mmap() only for smaller files.


I'm for nuking the mmap() from readfile().
readfile() is supposed to be used also on remote files and
IIRC mmap() is meant to be used only with regular files.

Do ftell()/fstat() work for remote files?

--Jani










Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-18 Thread Jani Taskinen


And then I find the magical 'if(!issock) {' line.. :)
But still I'd rather nuke mmap.  How fast is it anyway
compared to reading/writing in chunks?

--Jani


On Fri, 18 May 2001, Jani Taskinen wrote:

On Fri, 18 May 2001, Andi Gutmans wrote:

  of thinking about something which is pretty low-level and it reduces
  bloat.  I don't really see a downside to this magic.

But, the issue here isn't one of which is faster.  The issue here is one
of memory usage.  If you have a 600M iso image that you decide to
readfile() for a download page of some sort, then you are going to end up
with a 600M httpd process.  And soon you will have lots of those as more
people hit the page.

So to be truly magical here, PHP would have to check the amount of spare
RAM on the system, divide that by MaxClients and set that as the largest
filesize to mmap() because anything larger could result in the box going
into swap.

I obviously don't think such a check is feasible.  The only real question
here is whether to add a user configurable max-mmap setting or to add a
second function that never mmaps.

I think we are getting carried away here. Why start bloating with
configuration options and possible new functions? It's not as if the
developer needs great control over this.
I'd either nuke mmap() completely and use regular file functions (it's
usually not a big loss and I don't think it's a big deal) or take an
arbitrary number which we think should be considered a large file
(something like 256KB) and use mmap() only for smaller files.


I'm for nuking the mmap() from readfile().
readfile() is supposed to be used also on remote files and
IIRC mmap() is meant to be used only with regular files.

Do ftell()/fstat() work for remote files?

--Jani













Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-18 Thread Sterling Hughes

Jani Taskinen wrote:

 On Fri, 18 May 2001, Andi Gutmans wrote:
 
 
of thinking about something which is pretty low-level and it reduces
bloat.  I don't really see a downside to this magic.

But, the issue here isn't one of which is faster.  The issue here is one
of memory usage.  If you have a 600M iso image that you decide to
readfile() for a download page of some sort, then you are going to end up
with a 600M httpd process.  And soon you will have lots of those as more
people hit the page.

So to be truly magical here, PHP would have to check the amount of spare
RAM on the system, divide that by MaxClients and set that as the largest
filesize to mmap() because anything larger could result in the box going
into swap.

I obviously don't think such a check is feasible.  The only real question
here is whether to add a user configurable max-mmap setting or to add a
second function that never mmaps.

I think we are getting carried away here. Why start bloating with
configuration options and possible new functions? It's not as if the
developer needs great control over this.
I'd either nuke mmap() completely and use regular file functions (it's
usually not a big loss and I don't think it's a big deal) or take an
arbitrary number which we think should be considered a large file
(something like 256KB) and use mmap() only for smaller files.

 
 
 I'm for nuking the mmap() from readfile().
 readfile() is supposed to be used also on remote files and
 IIRC mmap() is meant to be used only with regular files.
 
 Do ftell()/fstat() work for remote files?
 


Huh?

If it's a remote URL, or mmap() isn't found, then it's not used; otherwise 
it is.  There's no difference as far as compatibility is concerned: 
using mmap() when it's available is simply an optimization.

-Sterling






Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-18 Thread Jani Taskinen

On Thu, 17 May 2001, Sterling Hughes wrote:

If it's a remote URL, or mmap() isn't found, then it's not used, otherwise
it is.  There's no difference as far as compatibility is concerned,

Yeah, I noticed that just after sending the email. :)

using mmap() when it's available is simply an optimization.

How much does it really optimize? And what's the use of
any optimization if the web server dies because of it? :)

--Jani








Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-18 Thread Andi Gutmans

At 09:19 AM 5/18/2001 +0200, Jani Taskinen wrote:

And then I find the magical 'if(!issock) {' line.. :)
But still I'd rather nuke mmap..HOW fast is it anyway
compared to reading/writing in chunks?

I think it probably isn't really faster (at least not noticeably) because 
we are writing to the network anyway, which I think is our bottleneck. Also, 
it is not necessarily faster on all systems (it is system dependent).
I agree with you and think we should just nuke it from there. There is no 
good reason I can think of which justifies it in that code.

Andi






Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-18 Thread Jani Taskinen

On Fri, 18 May 2001, Andi Gutmans wrote:

At 09:19 AM 5/18/2001 +0200, Jani Taskinen wrote:

And then I find the magical 'if(!issock) {' line.. :)
But still I'd rather nuke mmap..HOW fast is it anyway
compared to reading/writing in chunks?

I think it probably isn't really faster (at least not noticeably) because
we are anyway writing to network which I think is our bottleneck. Also it
is not necessarily faster on all systems (it is system dependent).
I agree with you and think we should just nuke it from there. There is no
good reason I can think of which justifies it in that code.

Agreed. Rather have a new function that uses mmap() ie. mmap_readfile()
or whatever the name would be for it.

I'd like to hear Sascha's reasoning for adding mmap() in the first place.
Maybe he had something special in his mind?

date: 1999/09/11 18:15:39;  author: sas;  state: Exp;  lines: +57 -19
optimize fpassthru/readfile to use mmap instead of fread
which especially increases speed on large files.

--Jani







Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-18 Thread Andi Gutmans

At 09:29 AM 5/18/2001 +0200, Jani Taskinen wrote:
On Fri, 18 May 2001, Andi Gutmans wrote:

 At 09:19 AM 5/18/2001 +0200, Jani Taskinen wrote:
 
 And then I find the magical 'if(!issock) {' line.. :)
 But still I'd rather nuke mmap..HOW fast is it anyway
 compared to reading/writing in chunks?
 
 I think it probably isn't really faster (at least not noticeably) because
 we are anyway writing to network which I think is our bottleneck. Also it
 is not necessarily faster on all systems (it is system dependent).
 I agree with you and think we should just nuke it from there. There is no
 good reason I can think of which justifies it in that code.

Agreed. Rather have a new function that uses mmap() ie. mmap_readfile()
or whatever the name would be for it.

There is no need for a new function IMO.


I'd like to hear Sascha's reasoning for adding mmap() in the first place.
Maybe he had something special in his mind?

date: 1999/09/11 18:15:39;  author: sas;  state: Exp;  lines: +57 -19
optimize fpassthru/readfile to use mmap instead of fread
which especially increases speed on large files.

OK but I really don't think it's such a big deal. Especially with the 
slower network pipe it is hard for me to believe that it really makes a 
performance difference on most systems.

Andi






Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-18 Thread Wez Furlong

On 2001-05-18 05:43:50, Rasmus Lerdorf [EMAIL PROTECTED] wrote:
 Obviously the mmap will be faster, but if as in bug #10701, someone is

Ignore my last post regarding this bug...

--Wez.







Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-18 Thread Sascha Schumann

 Obviously the mmap will be faster, but if as in bug #10701, someone is
 adding headers or doing something else to really large files, things are
 going to break.

There seem to be some misconceptions about what we are really
doing.  We map a shared(*1), read-only copy of the file into
our address space, we don't allocate any memory, we don't
operate on the mmap'ed area, and this does not change when
you add headers or do something else to really large files.

The process of mapping does not cause a read of the whole
file at once nor does it allocate memory for the whole file.
When pages are accessed, a memory fault is generated and the
data is fetched from the disk.  When there is not enough free
physical RAM for storing the contents of a new page, old
pages get thrown away (they are read-only, so there is no
reason to swap them out).  (*2)
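
As a rough illustration of the approach described above (not the actual
php_passthru_fd code), a shared, read-only mapping that is handed
straight to write() could look like this in C.  The function name
send_path_mmap is hypothetical, and error handling is kept to a minimum.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map the regular file at `path` read-only and
 * write its contents to out_fd.  Returns bytes written, or -1 on error. */
long send_path_mmap(const char *path, int out_fd)
{
    struct stat st;
    long result = -1;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0 || !S_ISREG(st.st_mode)) {
        close(fd);             /* only regular files are mapped */
        return -1;
    }
    if (st.st_size == 0) {
        close(fd);             /* a zero-length mapping is an error */
        return 0;
    }

    size_t len = (size_t)st.st_size;
    /* No heap memory is allocated here; pages are faulted in on access. */
    char *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                 /* the mapping stays valid after close() */
    if (map == MAP_FAILED)
        return -1;

    size_t off = 0;
    while (off < len) {
        ssize_t w = write(out_fd, map + off, len - off);
        if (w < 0) {
            if (errno == EINTR)
                continue;      /* interrupted, retry the write */
            break;
        }
        off += (size_t)w;
    }
    if (off == len)
        result = (long)off;

    munmap(map, len);
    return result;
}
```

The sketch never copies the file into a user-space buffer; the kernel
decides which pages are resident at any given moment.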

For delivering a 600MB ISO image, the simple read/write
approach with a 2KB buffer will cause about 600,000
context switches.  The mmap implementation will need less
than 10.  This can significantly decrease the load on busy
web-servers.
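
The 600,000 figure can be checked with simple arithmetic.  The sketch
below counts system calls (each one a transition into the kernel and
back), assuming exactly one write() per read() and ignoring the final
zero-length read; the function name rw_syscalls is made up for the
example.

```c
/* Number of read() plus write() calls needed to copy file_size bytes
 * through a buffer of buf_size bytes, assuming one write() per read(). */
unsigned long long rw_syscalls(unsigned long long file_size,
                               unsigned long long buf_size)
{
    if (buf_size == 0)
        return 0;              /* avoid dividing by zero */

    /* Round up: a partial last chunk still costs a read and a write. */
    unsigned long long chunks = (file_size + buf_size - 1) / buf_size;
    return 2 * chunks;
}
```

For 600 * 1024 * 1024 bytes and a 2048-byte buffer this gives 307,200
chunks, so 614,400 calls, which matches the "about 600,000" above.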

By leveraging the power of the underlying OS's buffer cache,
we enable the OS to handle the dirty aspects of writing huge
amounts of disk data to the network.  Most modern OSes have
certain optimizations to deal with that task in the most
efficient way (e.g. zero-copy, sendfile()).  The simple
read/write approach circumvents all possible optimizations by
OS designers.

I just ran a quick test with a 400MB ISO image, Apache 1.3
CVS, PHP 4.0 CVS on a system with 256MB RAM.  The results of
http_load are below.

40 fetches, 20 max parallel, 1.68437e+10 bytes, in 361.079 seconds
4.21093e+08 mean bytes/connection
0.110779 fetches/sec, 4.66484e+07 bytes/sec
msecs/connect: 18.0746 mean, 692.648 max, 0.072 min
msecs/first-response: 1116.32 mean, 3546.76 max, 48.349 min


*1 There is a bug in the current code, as we should be using
   MAP_SHARED.  This might be contributing to what the user
   is describing in #10701.

*2 Some Linux 2.4.x trees seem to be broken in that respect and
   don't free pages quickly enough (or not at all).  This
   causes the system to freeze.  Linux 2.2 works as expected.
   I experienced this effect on 2.4.4-ac1 (TUX patch).

- Sascha                                     Experience IRCG
  http://schumann.cx/                        http://schumann.cx/ircg






Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-18 Thread Rasmus Lerdorf

 There seem to be some misconceptions about what we are really
 doing.  We map a shared(*1), read-only copy of the file into
 our address space, we don't allocate any memory, we don't
 operate on the mmap'ed area, and this does not change when
 you add headers or do something else to really large files.

Yes, I didn't understand the underlying nature of mmap correctly.  After
Thies' message yesterday I did some reading and agree with you.  The mmap
approach is good provided the OS doesn't screw up along the way.

-Rasmus






Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-18 Thread Sascha Schumann

 *2 Some Linux 2.4.x trees seem to be broken in that respect and
don't free pages quickly enough (or not at all).  This
causes the system to freeze.  Linux 2.2 works as expected.
I experienced this effect on 2.4.4-ac1 (TUX patch).

As an addition to this, page aging works again in -ac10.

Furthermore, here are some references for interested parties.

McKusick et al, The Design and Implementation of the 4.4BSD
Operating System, 5.4 Per-Process Resources, 5.12 Page
Replacement

Mauro et al, Solaris Internals, 5.3.8 Page Faults in Address
Spaces, 5.4.1 The vnode Segment: seg_vn, 5.8 The Page Scanner

- Sascha                                     Experience IRCG
  http://schumann.cx/                        http://schumann.cx/ircg






Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-18 Thread Jason Greene


 Mauro et al, Solaris Internals, 5.3.8 Page Faults in Address
 Spaces, 5.4.1 The vnode Segment: seg_vn, 5.8 The Page Scanner

This is an excellent reference, definitely a favorite of mine.

-Jason







Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-17 Thread Sterling Hughes

Rasmus Lerdorf wrote:

 If a system has mmap() a readfile() will mmap the entire file to memory
 and then dump that while without mmap it will read it one block at a time.
 That's a significant memory difference and one that may not be expected.
 
 Obviously the mmap will be faster, but if as in bug #10701, someone is
 adding headers or doing something else to really large files, things are
 going to break.  Since readfile() already has an optional argument I
 think the right approach is a separate function that turns the mmap off
 and reads the file block by block.  fpassthru() doesn't have an optional
 arg, so we could toggle it there for that function.
 
 Comments?
 


Well, out of the solutions, I think the optional argument to fpassthru() 
would be the best.  However, why not, as you stated in response to the 
bug report, read the file with a custom function and output it, 
therefore avoiding the mmap() as well?

-Sterling






Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-17 Thread Rasmus Lerdorf

  If a system has mmap() a readfile() will mmap the entire file to memory
  and then dump that while without mmap it will read it one block at a time.
  That's a significant memory difference and one that may not be expected.
 
  Obviously the mmap will be faster, but if as in bug #10701, someone is
  adding headers or doing something else to really large files, things are
  going to break.  Since readfile() already has an optional argument I
  think the right approach is a separate function that turns the mmap off
  and reads the file block by block.  fpassthru() doesn't have an optional
  arg, so we could toggle it there for that function.
 
  Comments?
 


 Well, out of the solutions, I think the optional argument to fpassthru()
 would be the best.  However, why not, as you stated in response to the
 bug report, read the file with a custom function and output it,
 therefore avoiding the mmap() as well?

Well, the one problem with that is that it can be very inefficient because
it stops reading when it hits a newline and you end up always reading
partial buffers if the file has newlines in it.  Having a function that
quickly reads/dumps a file block by block would make this much more
efficient.
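
The difference can be sketched with C stdio, whose fgets() and fread()
stand in here for the PHP functions of the same name (an analogy, not a
measurement of PHP itself).  The helper names and the 4096-byte buffer
are assumptions for the example.

```c
#include <stdio.h>

#define BLOCK_SIZE 4096        /* assumed buffer size */

/* Count how many fgets() calls return data: each call stops at a
 * newline, so most of the buffer usually goes unused. */
size_t count_line_reads(FILE *fp)
{
    char buf[BLOCK_SIZE];
    size_t calls = 0;

    while (fgets(buf, sizeof buf, fp) != NULL)
        calls++;
    return calls;
}

/* Count how many fread() calls return data: each call fills the
 * buffer regardless of newlines, until the input runs out. */
size_t count_block_reads(FILE *fp)
{
    char buf[BLOCK_SIZE];
    size_t calls = 0;

    while (fread(buf, 1, sizeof buf, fp) > 0)
        calls++;
    return calls;
}
```

For a file of many short lines the first count grows with the number of
lines, while the second grows only with the file size divided by the
buffer size.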

-Rasmus






Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-17 Thread Sterling Hughes

Rasmus Lerdorf wrote:

If a system has mmap() a readfile() will mmap the entire file to memory
and then dump that while without mmap it will read it one block at a time.
That's a significant memory difference and one that may not be expected.

Obviously the mmap will be faster, but if as in bug #10701, someone is
adding headers or doing something else to really large files, things are
going to break.  Since readfile() already has an optional argument I
think the right approach is a separate function that turns the mmap off
and reads the file block by block.  fpassthru() doesn't have an optional
arg, so we could toggle it there for that function.

Comments?



Well, out of the solutions, I think the optional argument to fpassthru()
would be the best.  However, why not, as you stated in response to the
bug report, read the file with a custom function and output it,
therefore avoiding the mmap() as well?

 
 Well, the one problem with that is that it can be very inefficient because
 it stops reading when it hits a newline and you end up always reading
 partial buffers if the file has newlines in it.  Having a function that
 quickly reads/dumps a file block by block would make this much more
 efficient.
 


fread() should handle this, no?

-Sterling







Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-17 Thread Rasmus Lerdorf

 If a system has mmap() a readfile() will mmap the entire file to memory
 and then dump that while without mmap it will read it one block at a time.
 That's a significant memory difference and one that may not be expected.
 
 Obviously the mmap will be faster, but if as in bug #10701, someone is
 adding headers or doing something else to really large files, things are
 going to break.  Since readfile() already has an optional argument I
 think the right approach is a separate function that turns the mmap off
 and reads the file block by block.  fpassthru() doesn't have an optional
 arg, so we could toggle it there for that function.
 
 Comments?
 
 
 
 Well, out of the solutions, I think the optional argument to fpassthru()
 would be the best.  However, why not, as you stated in response to the
 bug report, read the file with a custom function and output it,
 therefore avoiding the mmap() as well?
 
 
  Well, the one problem with that is that it can be very inefficient because
  it stops reading when it hits a newline and you end up always reading
  partial buffers if the file has newlines in it.  Having a function that
  quickly reads/dumps a file block by block would make this much more
  efficient.
 

 fread() should handle this, no?

True.  But I guess my main issue is still that the behaviour changes
radically based on a hidden configure check (ie. whether mmap is there
or not) and that ensuring a block-by-block read in user space is
inefficient for huge files.

-Rasmus






Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-17 Thread Sterling Hughes


fread() should handle this, no?

 
 True.  But I guess my main issue is still that the behaviour changes
 radically based on a hidden configure check (ie. whether mmap is there
 or not) and that ensuring a block-by-block read in user space is
 inefficient for huge files.



good point...  hrrmmm

it seems like this is an option that should be available (somehow), yet 
I don't really like adding another option to the function, as it 
requires too much smarts on the behalf of the user (at what point does 
mmap() slow things down instead of speed them up, what is mmap(), etc.)

Perhaps automatic detection could be an option?  if (filesize > X) 
blockread else mmap?  It seems like the most intuitive way to implement 
it...

I also wrestled with a configure option use_mmap (which might be good in 
general), however, that really doesn't solve the problem at hand.

-Sterling







Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-17 Thread Rasmus Lerdorf

  True.  But I guess my main issue is still that the behaviour changes
  radically based on a hidden configure check (ie. whether mmap is there
  or not) and that ensuring a block-by-block read in user space is
  inefficient for huge files.
 


 good point...  hrrmmm

 it seems like this is an option that should be available (somehow), yet
 I don't really like adding another option to the function, as it
 requires too much smarts on the behalf of the user (at what point does
 mmap() slow things down instead of speed them up, what is mmap(), etc.)

Well, we do check and only do the mmap() for files larger than the block
size.

 Perhaps automatic detection could be an option?  if (filesize > X)
 blockread else mmap?  It seems like the most intuitive way to implement
 it...

Sounds a bit magical.  Why not just a block_readfile() function?

-Rasmus






Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?

2001-05-17 Thread Sterling Hughes

 
Perhaps automatic detection could be an option?  if (filesize > X)
blockread else mmap?  It seems like the most intuitive way to implement
it...

 
 Sounds a bit magical.  Why not just a block_readfile() function?


Mainly the bloat factor: we already have a large core and, imho, functions 
shouldn't be added unless there are no workarounds.  Also, it requires a 
little too much thought about what sizes are good for mmap() and what 
sizes are good for block reads (it also requires knowledge of mmap(), 
because many people might automatically assume that block_read would 
always be faster).  I'm pretty sure if we polled php-general and php-qa 
(the more knowledgeable user bases), most people wouldn't really 
understand what mmap does, what it is for, or when it is beneficial to 
use it.

As for magical, well a bit, but good magic and internal magic (not 
syntactical magic).  I'd assume that most systems have a certain point 
where mmap is no longer more beneficial than reading a file by chunks. 
If we can find a reasonable number (or have a user specify that in a 
configuration option if really necessary), it saves the user the trouble 
of thinking about something which is pretty low-level and it reduces 
bloat.  I don't really see a downside to this magic.

-Sterling





