Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
On Thu, May 17, 2001 at 11:10:57PM -0700, Rasmus Lerdorf wrote:
> > > > Perhaps automatic detection could be option? if (filesize X) blockread else mmap? It seems like the most intuitive way to implement it...

> > > Sounds a bit magical. Why not just a block_readfile() function?

> > Mainly the bloat factor: we already have a large core and, imho, functions shouldn't be added unless there are no workarounds. Also, it requires a little too much thought about what sizes are good for mmap() and what sizes are good for block reads (it also requires knowledge of mmap(), because many people might automatically assume that block_read would always be faster). I'm pretty sure if we polled php-general and php-qa (the more knowledgeable user bases), most people wouldn't really understand what mmap does, what it is for, or when it is beneficial to use it.

> > As for magical, well, a bit, but good magic and internal magic (not syntactical magic). I'd assume that most systems have a certain point where mmap is no longer more beneficial than reading a file by chunks. If we can find a reasonable number (or have a user specify that in a configuration option if really necessary), it saves the user the trouble of thinking about something which is pretty low-level and it reduces bloat. I don't really see a downside to this magic.

> But the issue here isn't one of which is faster. The issue here is one of memory usage. If you have a 600M iso image that you decide to readfile() for a download page of some sort, then you are going to end up with a 600M httpd process. And soon you will have lots of those as more people hit the page.

mmap will not increase the size of your process as it doesn't call sbrk().

> So to be truly magical here, PHP would have to check the amount of spare RAM on the system, divide that by MaxClients and set that as the largest filesize to mmap(), because anything larger could result in the box going into swap. I obviously don't think such a check is feasible. The only real question here is whether to add a user configurable max-mmap setting or to add a second function that never mmaps.

question is: do we really need this mmap stuff at all? with readfile we should easily be able to saturate a pipe of any size using just read and write.

tc

> -Rasmus

-- 
PHP Development Mailing List http://www.php.net/
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
To contact the list administrators, e-mail: [EMAIL PROTECTED]
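For readers who want to see what "just read and write" amounts to, here is a minimal C sketch of a block-by-block copy between two file descriptors. It is not the code in file.c; the function name copy_fd_blockwise and the 8 KB cap on the buffer are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy everything from in_fd to out_fd using a
 * fixed-size buffer. Returns the number of bytes copied, or -1 on error.
 * Memory use stays constant no matter how large the input is. */
long copy_fd_blockwise(int in_fd, int out_fd, size_t buf_size)
{
    char buf[8192];
    long total = 0;

    assert(buf_size > 0 && buf_size <= sizeof buf);

    for (;;) {
        ssize_t n = read(in_fd, buf, buf_size);
        if (n == 0)
            break;                  /* end of input */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal, retry */
            return -1;
        }

        /* write() may accept fewer bytes than requested, so loop. */
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
    return total;
}
```

Each buffer costs one read() and at least one write(), so the number of system calls grows linearly with file size while the process size does not.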
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
Rasmus Lerdorf wrote:
> > > > Perhaps automatic detection could be option? if (filesize X) blockread else mmap? It seems like the most intuitive way to implement it...

> > > Sounds a bit magical. Why not just a block_readfile() function?

> > Mainly the bloat factor: we already have a large core and, imho, functions shouldn't be added unless there are no workarounds. Also, it requires a little too much thought about what sizes are good for mmap() and what sizes are good for block reads (it also requires knowledge of mmap(), because many people might automatically assume that block_read would always be faster). I'm pretty sure if we polled php-general and php-qa (the more knowledgeable user bases), most people wouldn't really understand what mmap does, what it is for, or when it is beneficial to use it.

> > As for magical, well, a bit, but good magic and internal magic (not syntactical magic). I'd assume that most systems have a certain point where mmap is no longer more beneficial than reading a file by chunks. If we can find a reasonable number (or have a user specify that in a configuration option if really necessary), it saves the user the trouble of thinking about something which is pretty low-level and it reduces bloat. I don't really see a downside to this magic.

> But the issue here isn't one of which is faster. The issue here is one of memory usage. If you have a 600M iso image that you decide to readfile() for a download page of some sort, then you are going to end up with a 600M httpd process. And soon you will have lots of those as more people hit the page.

> So to be truly magical here, PHP would have to check the amount of spare RAM on the system, divide that by MaxClients and set that as the largest filesize to mmap(), because anything larger could result in the box going into swap.

point taken :)

> I obviously don't think such a check is feasible. The only real question here is whether to add a user configurable max-mmap setting or to add a second function that never mmaps.

+1 for the configuration option.

-Sterling
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
At 10:40 PM 5/17/2001 -0700, Rasmus Lerdorf wrote:
> > > True. But I guess my main issue is still that the behaviour changes radically based on a hidden configure check (ie. whether mmap is there or not) and that ensuring a block-by-block read in user space is inefficient for huge files.

> > good point... hrrmmm it seems like this is an option that should be available (somehow), yet I don't really like adding another option to the function, as it requires too much smarts on the behalf of the user (at what point does mmap() slow things down instead of speed them up, what is mmap(), etc.)

> Well, we do check and only do the mmap() for files larger than the block size.

> > Perhaps automatic detection could be option? if (filesize X) blockread else mmap? It seems like the most intuitive way to implement it...

> Sounds a bit magical. Why not just a block_readfile() function?

Why is it magical? It's an internal optimization and the developer will never know. I wouldn't add another file but make it a bit smarter: only mmap() small files and use fread() (the C version) for the rest.

Andi
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
At 11:10 PM 5/17/2001 -0700, Rasmus Lerdorf wrote:
> > > > Perhaps automatic detection could be option? if (filesize X) blockread else mmap? It seems like the most intuitive way to implement it...

> > > Sounds a bit magical. Why not just a block_readfile() function?

> > Mainly the bloat factor: we already have a large core and, imho, functions shouldn't be added unless there are no workarounds. Also, it requires a little too much thought about what sizes are good for mmap() and what sizes are good for block reads (it also requires knowledge of mmap(), because many people might automatically assume that block_read would always be faster). I'm pretty sure if we polled php-general and php-qa (the more knowledgeable user bases), most people wouldn't really understand what mmap does, what it is for, or when it is beneficial to use it.

> > As for magical, well, a bit, but good magic and internal magic (not syntactical magic). I'd assume that most systems have a certain point where mmap is no longer more beneficial than reading a file by chunks. If we can find a reasonable number (or have a user specify that in a configuration option if really necessary), it saves the user the trouble of thinking about something which is pretty low-level and it reduces bloat. I don't really see a downside to this magic.

> But the issue here isn't one of which is faster. The issue here is one of memory usage. If you have a 600M iso image that you decide to readfile() for a download page of some sort, then you are going to end up with a 600M httpd process. And soon you will have lots of those as more people hit the page.

> So to be truly magical here, PHP would have to check the amount of spare RAM on the system, divide that by MaxClients and set that as the largest filesize to mmap(), because anything larger could result in the box going into swap. I obviously don't think such a check is feasible. The only real question here is whether to add a user configurable max-mmap setting or to add a second function that never mmaps.

I think we are getting carried away here. Why start bloating with configuration options and possible new functions? It's not as if the developer needs great control over this. I'd either nuke mmap() completely and use regular file functions (it's usually not a big loss and I don't think it's a big deal) or take an arbitrary number which we think should be considered a large file (something like 256KB) and use mmap() only for smaller files.

Andi
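The second option in this message, a fixed cutoff below which mmap() is used, can be sketched as a small decision function. This is only an illustration under stated assumptions: the names choose_strategy, MMAP_MAX_SIZE and the enum are hypothetical, the 256KB value is the "something like" figure from the message, and the choice to treat a file of exactly the cutoff size as large is arbitrary.

```c
#include <assert.h>

/* Hypothetical cutoff: the "something like 256KB" figure from the message. */
#define MMAP_MAX_SIZE (256LL * 1024)

enum passthru_strategy { PASSTHRU_BLOCK_READ, PASSTHRU_MMAP };

/* Decide how to send a file. is_regular_file is the caller's answer to
 * "is this a plain local file?" (sockets and remote streams are not). */
enum passthru_strategy choose_strategy(long long file_size,
                                       int is_regular_file,
                                       long long max_mmap)
{
    assert(file_size >= 0 && max_mmap >= 0);

    if (!is_regular_file)
        return PASSTHRU_BLOCK_READ;   /* only regular files are mapped here */
    if (file_size == 0)
        return PASSTHRU_BLOCK_READ;   /* a zero-length mapping is an error */
    if (file_size >= max_mmap)
        return PASSTHRU_BLOCK_READ;   /* "large" file: read in chunks */
    return PASSTHRU_MMAP;
}
```

Passing the limit as a parameter keeps the sketch neutral on the other open question in the thread, whether the number should be hard-coded or come from a configuration setting.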
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
On Fri, 18 May 2001, Andi Gutmans wrote:
> > > of thinking about something which is pretty low-level and it reduces bloat. I don't really see a downside to this magic.

> > But the issue here isn't one of which is faster. The issue here is one of memory usage. If you have a 600M iso image that you decide to readfile() for a download page of some sort, then you are going to end up with a 600M httpd process. And soon you will have lots of those as more people hit the page.

> > So to be truly magical here, PHP would have to check the amount of spare RAM on the system, divide that by MaxClients and set that as the largest filesize to mmap(), because anything larger could result in the box going into swap. I obviously don't think such a check is feasible. The only real question here is whether to add a user configurable max-mmap setting or to add a second function that never mmaps.

> I think we are getting carried away here. Why start bloating with configuration options and possible new functions? It's not as if the developer needs great control over this. I'd either nuke mmap() completely and use regular file functions (it's usually not a big loss and I don't think it's a big deal) or take an arbitrary number which we think should be considered a large file (something like 256KB) and use mmap() only for smaller files.

I'm for nuking the mmap() from readfile(). readfile() is supposed to be used also on remote files and IIRC mmap() is meant to be used only with regular files. Do ftell()/fstat() work for remote files?

--Jani
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
And then I find the magical 'if(!issock) {' line.. :)

But still I'd rather nuke mmap.. HOW fast is it anyway compared to reading/writing in chunks?

--Jani

On Fri, 18 May 2001, Jani Taskinen wrote:
> On Fri, 18 May 2001, Andi Gutmans wrote:
> > > > of thinking about something which is pretty low-level and it reduces bloat. I don't really see a downside to this magic.

> > > But the issue here isn't one of which is faster. The issue here is one of memory usage. If you have a 600M iso image that you decide to readfile() for a download page of some sort, then you are going to end up with a 600M httpd process. And soon you will have lots of those as more people hit the page.

> > > So to be truly magical here, PHP would have to check the amount of spare RAM on the system, divide that by MaxClients and set that as the largest filesize to mmap(), because anything larger could result in the box going into swap. I obviously don't think such a check is feasible. The only real question here is whether to add a user configurable max-mmap setting or to add a second function that never mmaps.

> > I think we are getting carried away here. Why start bloating with configuration options and possible new functions? It's not as if the developer needs great control over this. I'd either nuke mmap() completely and use regular file functions (it's usually not a big loss and I don't think it's a big deal) or take an arbitrary number which we think should be considered a large file (something like 256KB) and use mmap() only for smaller files.

> I'm for nuking the mmap() from readfile(). readfile() is supposed to be used also on remote files and IIRC mmap() is meant to be used only with regular files. Do ftell()/fstat() work for remote files?
>
> --Jani
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
Jani Taskinen wrote:
> On Fri, 18 May 2001, Andi Gutmans wrote:
> > > > of thinking about something which is pretty low-level and it reduces bloat. I don't really see a downside to this magic.

> > > But the issue here isn't one of which is faster. The issue here is one of memory usage. If you have a 600M iso image that you decide to readfile() for a download page of some sort, then you are going to end up with a 600M httpd process. And soon you will have lots of those as more people hit the page.

> > > So to be truly magical here, PHP would have to check the amount of spare RAM on the system, divide that by MaxClients and set that as the largest filesize to mmap(), because anything larger could result in the box going into swap. I obviously don't think such a check is feasible. The only real question here is whether to add a user configurable max-mmap setting or to add a second function that never mmaps.

> > I think we are getting carried away here. Why start bloating with configuration options and possible new functions? It's not as if the developer needs great control over this. I'd either nuke mmap() completely and use regular file functions (it's usually not a big loss and I don't think it's a big deal) or take an arbitrary number which we think should be considered a large file (something like 256KB) and use mmap() only for smaller files.

> I'm for nuking the mmap() from readfile(). readfile() is supposed to be used also on remote files and IIRC mmap() is meant to be used only with regular files. Do ftell()/fstat() work for remote files?

Huh? If it's a remote URL, or mmap() isn't found, then it's not used; otherwise it is. There's no difference as far as compatibility is concerned; using mmap() when it's available is simply an optimization.

-Sterling
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
On Thu, 17 May 2001, Sterling Hughes wrote:
> If it's a remote URL, or mmap() isn't found, then it's not used; otherwise it is. There's no difference as far as compatibility is concerned,

Yeah, I noticed that just after sending the email. :)

> using mmap() when it's available is simply an optimization.

How much does it really optimize? And what's the use of any optimization if the web server dies because of it? :)

--Jani
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
At 09:19 AM 5/18/2001 +0200, Jani Taskinen wrote:
> And then I find the magical 'if(!issock) {' line.. :)
>
> But still I'd rather nuke mmap.. HOW fast is it anyway compared to reading/writing in chunks?

I think it probably isn't really faster (at least not noticeably) because we are writing to the network anyway, which I think is our bottleneck. Also, it is not necessarily faster on all systems (it is system dependent). I agree with you and think we should just nuke it from there. There is no good reason I can think of which justifies it in that code.

Andi
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
On Fri, 18 May 2001, Andi Gutmans wrote:
> At 09:19 AM 5/18/2001 +0200, Jani Taskinen wrote:
> > And then I find the magical 'if(!issock) {' line.. :)
> >
> > But still I'd rather nuke mmap.. HOW fast is it anyway compared to reading/writing in chunks?

> I think it probably isn't really faster (at least not noticeably) because we are writing to the network anyway, which I think is our bottleneck. Also, it is not necessarily faster on all systems (it is system dependent). I agree with you and think we should just nuke it from there. There is no good reason I can think of which justifies it in that code.

Agreed. Rather have a new function that uses mmap(), ie. mmap_readfile() or whatever the name would be for it.

I'd like to hear Sascha's reasoning for adding mmap() in the first place. Maybe he had something special in his mind?

    date: 1999/09/11 18:15:39; author: sas; state: Exp; lines: +57 -19
    optimize fpassthru/readfile to use mmap instead of fread which
    especially increases speed on large files.

--Jani
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
At 09:29 AM 5/18/2001 +0200, Jani Taskinen wrote:
> On Fri, 18 May 2001, Andi Gutmans wrote:
> > At 09:19 AM 5/18/2001 +0200, Jani Taskinen wrote:
> > > And then I find the magical 'if(!issock) {' line.. :)
> > >
> > > But still I'd rather nuke mmap.. HOW fast is it anyway compared to reading/writing in chunks?

> > I think it probably isn't really faster (at least not noticeably) because we are writing to the network anyway, which I think is our bottleneck. Also, it is not necessarily faster on all systems (it is system dependent). I agree with you and think we should just nuke it from there. There is no good reason I can think of which justifies it in that code.

> Agreed. Rather have a new function that uses mmap(), ie. mmap_readfile() or whatever the name would be for it.

There is no need for a new function IMO.

> I'd like to hear Sascha's reasoning for adding mmap() in the first place. Maybe he had something special in his mind?
>
>     date: 1999/09/11 18:15:39; author: sas; state: Exp; lines: +57 -19
>     optimize fpassthru/readfile to use mmap instead of fread which
>     especially increases speed on large files.

OK, but I really don't think it's such a big deal. Especially with the slower network pipe, it is hard for me to believe that it really makes a performance difference on most systems.

Andi
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
On 2001-05-18 05:43:50, Rasmus Lerdorf [EMAIL PROTECTED] wrote:
> Obviously the mmap will be faster, but if as in bug #10701, someone is

Ignore my last post regarding this bug...

--Wez.
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
> Obviously the mmap will be faster, but if as in bug #10701, someone is adding headers or doing something else to really large files, things are going to break.

There seem to be some misconceptions about what we are really doing.

We map a shared(*1), read-only copy of the file into our address space, we don't allocate any memory, we don't operate on the mmap'ed area, and this does not change when you add headers or do something else to really large files.

The process of mapping does not cause a read of the whole file at once, nor does it allocate memory for the whole file. When pages are accessed, a memory fault is generated and the data is fetched from the disk. When there is not enough free physical RAM for storing the contents of a new page, old pages get thrown away (they are read-only, so there is no reason to swap them out). (*2)

For delivering a 600MB ISO image, the simple read/write approach with a 2KB buffer will cause about 600,000 context switches. The mmap implementation will need fewer than 10. This can significantly decrease the load on busy web-servers.

By leveraging the power of the underlying OS's buffer cache, we enable the OS to handle the dirty aspects of writing huge amounts of disk data to the network. Most modern OSes have certain optimizations to deal with that task in the most efficient way (i.e. zero-copy, sendfile()). The simple read/write approach circumvents all possible optimizations by OS designers.

I just ran a quick test with a 400MB ISO image, Apache 1.3 CVS, PHP 4.0 CVS on a system with 256MB RAM. The results of http_load are below.

    40 fetches, 20 max parallel, 1.68437e+10 bytes, in 361.079 seconds
    4.21093e+08 mean bytes/connection
    0.110779 fetches/sec, 4.66484e+07 bytes/sec
    msecs/connect: 18.0746 mean, 692.648 max, 0.072 min
    msecs/first-response: 1116.32 mean, 3546.76 max, 48.349 min

*1 There is a bug in the current code, as we should be using MAP_SHARED. This might be contributing to what the user is describing in #10701.

*2 Some Linux 2.4.x trees seem to be broken in that respect and don't free pages quickly enough (or not at all). This causes the system to freeze. Linux 2.2 works as expected. I experienced this effect on 2.4.4-ac1 (TUX patch).

- Sascha
Experience IRCG
http://schumann.cx/ http://schumann.cx/ircg
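The mapping described in this message can be sketched in a few lines of C. This is a hedged illustration, not the code from file.c: the function name passthru_mmap and its path-based signature are assumptions, and error handling is kept minimal. It maps the file read-only with MAP_SHARED, as footnote *1 says the code should, and hands the whole mapping to write().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send a regular file to out_fd through a shared,
 * read-only mapping. Returns the number of bytes sent, or -1 if the file
 * cannot be mapped (the caller would then fall back to read/write). */
long passthru_mmap(const char *path, int out_fd)
{
    struct stat st;
    long result = -1;
    int fd;

    assert(path != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Only regular, non-empty files are mapped in this sketch. */
    if (fstat(fd, &st) == 0 && S_ISREG(st.st_mode) && st.st_size > 0) {
        size_t len = (size_t)st.st_size;
        /* No heap allocation happens here: pages are faulted in from the
         * buffer cache on access and can be dropped again under pressure. */
        void *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);

        if (p != MAP_FAILED) {
            const char *map = p;
            size_t off = 0;

            while (off < len) {
                ssize_t w = write(out_fd, map + off, len - off);
                if (w < 0) {
                    if (errno == EINTR)
                        continue;   /* interrupted, retry */
                    break;          /* real error: give up */
                }
                off += (size_t)w;
            }
            if (off == len)
                result = (long)len;
            munmap(p, len);
        }
    }
    close(fd);
    return result;
}
```

On the numbers quoted above: 600 MB divided into 2 KB buffers is 307,200 buffers, and each needs one read() and one write(), which gives roughly 614,400 system calls. That is presumably where the figure of about 600,000 comes from. The mapped version needs one mmap(), one munmap() and as few write() calls as the output descriptor allows.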
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
> There seem to be some misconceptions about what we are really doing.
>
> We map a shared(*1), read-only copy of the file into our address space, we don't allocate any memory, we don't operate on the mmap'ed area, and this does not change when you add headers or do something else to really large files.

Yes, I didn't understand the underlying nature of mmap correctly. After Thies' message yesterday I did some reading and agree with you. The mmap approach is good provided the OS doesn't screw up along the way.

-Rasmus
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
> *2 Some Linux 2.4.x trees seem to be broken in that respect and don't free pages quickly enough (or not at all). This causes the system to freeze. Linux 2.2 works as expected. I experienced this effect on 2.4.4-ac1 (TUX patch).

As an addition to this, page aging works again in -ac10.

Furthermore, here are some references for interested parties.

McKusick et al, The Design and Implementation of the 4.4BSD Operating System, 5.4 Per-Process Resources, 5.12 Page Replacement

Mauro et al, Solaris Internals, 5.3.8 Page Faults in Address Spaces, 5.4.1 The vnode Segment: seg_vn, 5.8 The Page Scanner

- Sascha
Experience IRCG
http://schumann.cx/ http://schumann.cx/ircg
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
> Mauro et al, Solaris Internals, 5.3.8 Page Faults in Address Spaces, 5.4.1 The vnode Segment: seg_vn, 5.8 The Page Scanner

This is an excellent reference, definitely a favorite of mine.

-Jason
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
Rasmus Lerdorf wrote:
> If a system has mmap() a readfile() will mmap the entire file to memory and then dump that, while without mmap it will read it one block at a time. That's a significant memory difference and one that may not be expected. Obviously the mmap will be faster, but if as in bug #10701, someone is adding headers or doing something else to really large files, things are going to break.

> Since readfile() already has an optional argument I think the right approach is a separate function that turns the mmap off and reads the file block by block. fpassthru() doesn't have an optional arg, so we could toggle it there for that function.

> Comments?

Well, out of the solutions, I think the optional argument to fpassthru() would be the best. However, why not, as you stated in response to the bug report, read the file with a custom function and output it, therefore avoiding the mmap() as well?

-Sterling
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
> > If a system has mmap() a readfile() will mmap the entire file to memory and then dump that, while without mmap it will read it one block at a time. That's a significant memory difference and one that may not be expected. Obviously the mmap will be faster, but if as in bug #10701, someone is adding headers or doing something else to really large files, things are going to break.

> > Since readfile() already has an optional argument I think the right approach is a separate function that turns the mmap off and reads the file block by block. fpassthru() doesn't have an optional arg, so we could toggle it there for that function.

> > Comments?

> Well, out of the solutions, I think the optional argument to fpassthru() would be the best. However, why not, as you stated in response to the bug report, read the file with a custom function and output it, therefore avoiding the mmap() as well?

Well, the one problem with that is that it can be very inefficient because it stops reading when it hits a newline and you end up always reading partial buffers if the file has newlines in it. Having a function that quickly reads/dumps a file block by block would make this much more efficient.

-Rasmus
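The partial-buffer effect described here is easy to see by counting how many reads it takes to get through the same data. The sketch below is illustrative only: it uses C's fgets() as a stand-in for a line-oriented read (the message does not name the function it has in mind) and fread() for the block read; the names count_line_reads and count_block_reads are made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Count the reads that return data when each read stops at a newline. */
size_t count_line_reads(FILE *fp, size_t buf_size)
{
    char buf[4096];
    size_t calls = 0;

    assert(buf_size >= 2 && buf_size <= sizeof buf);

    /* fgets() returns as soon as it sees '\n', even if the buffer is
     * mostly empty, so short lines mean many small reads. */
    while (fgets(buf, (int)buf_size, fp) != NULL)
        calls++;
    return calls;
}

/* Count the reads that return data when each read fills the buffer. */
size_t count_block_reads(FILE *fp, size_t buf_size)
{
    char buf[4096];
    size_t calls = 0;

    assert(buf_size >= 1 && buf_size <= sizeof buf);

    /* fread() ignores newlines and only comes back short at end of file. */
    while (fread(buf, 1, buf_size, fp) > 0)
        calls++;
    return calls;
}
```

For a 9-byte file holding three short lines and an 8-byte buffer, the line-based loop returns data three times and the block loop twice. With a realistic buffer of several kilobytes and a text file of short lines, the gap grows to one read per line versus one read per buffer.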
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
Rasmus Lerdorf wrote:
> > > If a system has mmap() a readfile() will mmap the entire file to memory and then dump that, while without mmap it will read it one block at a time. That's a significant memory difference and one that may not be expected. Obviously the mmap will be faster, but if as in bug #10701, someone is adding headers or doing something else to really large files, things are going to break.

> > > Since readfile() already has an optional argument I think the right approach is a separate function that turns the mmap off and reads the file block by block. fpassthru() doesn't have an optional arg, so we could toggle it there for that function.

> > > Comments?

> > Well, out of the solutions, I think the optional argument to fpassthru() would be the best. However, why not, as you stated in response to the bug report, read the file with a custom function and output it, therefore avoiding the mmap() as well?

> Well, the one problem with that is that it can be very inefficient because it stops reading when it hits a newline and you end up always reading partial buffers if the file has newlines in it. Having a function that quickly reads/dumps a file block by block would make this much more efficient.

fread() should handle this, no?

-Sterling
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
> > > > If a system has mmap() a readfile() will mmap the entire file to memory and then dump that, while without mmap it will read it one block at a time. That's a significant memory difference and one that may not be expected. Obviously the mmap will be faster, but if as in bug #10701, someone is adding headers or doing something else to really large files, things are going to break.

> > > > Since readfile() already has an optional argument I think the right approach is a separate function that turns the mmap off and reads the file block by block. fpassthru() doesn't have an optional arg, so we could toggle it there for that function.

> > > > Comments?

> > > Well, out of the solutions, I think the optional argument to fpassthru() would be the best. However, why not, as you stated in response to the bug report, read the file with a custom function and output it, therefore avoiding the mmap() as well?

> > Well, the one problem with that is that it can be very inefficient because it stops reading when it hits a newline and you end up always reading partial buffers if the file has newlines in it. Having a function that quickly reads/dumps a file block by block would make this much more efficient.

> fread() should handle this, no?

True. But I guess my main issue is still that the behaviour changes radically based on a hidden configure check (ie. whether mmap is there or not) and that ensuring a block-by-block read in user space is inefficient for huge files.

-Rasmus
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
> > fread() should handle this, no?

> True. But I guess my main issue is still that the behaviour changes radically based on a hidden configure check (ie. whether mmap is there or not) and that ensuring a block-by-block read in user space is inefficient for huge files.

good point... hrrmmm it seems like this is an option that should be available (somehow), yet I don't really like adding another option to the function, as it requires too much smarts on the behalf of the user (at what point does mmap() slow things down instead of speed them up, what is mmap(), etc.)

Perhaps automatic detection could be option? if (filesize X) blockread else mmap? It seems like the most intuitive way to implement it...

I also wrestled with a configure option use_mmap (which might be good in general), however, that really doesn't solve the problem at hand.

-Sterling
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
> > True. But I guess my main issue is still that the behaviour changes radically based on a hidden configure check (ie. whether mmap is there or not) and that ensuring a block-by-block read in user space is inefficient for huge files.

> good point... hrrmmm it seems like this is an option that should be available (somehow), yet I don't really like adding another option to the function, as it requires too much smarts on the behalf of the user (at what point does mmap() slow things down instead of speed them up, what is mmap(), etc.)

Well, we do check and only do the mmap() for files larger than the block size.

> Perhaps automatic detection could be option? if (filesize X) blockread else mmap? It seems like the most intuitive way to implement it...

Sounds a bit magical. Why not just a block_readfile() function?

-Rasmus
Re: [PHP-DEV] mmap in php_passthru_fd in file.c ?
> > Perhaps automatic detection could be option? if (filesize X) blockread else mmap? It seems like the most intuitive way to implement it...

> Sounds a bit magical. Why not just a block_readfile() function?

Mainly the bloat factor: we already have a large core and, imho, functions shouldn't be added unless there are no workarounds. Also, it requires a little too much thought about what sizes are good for mmap() and what sizes are good for block reads (it also requires knowledge of mmap(), because many people might automatically assume that block_read would always be faster). I'm pretty sure if we polled php-general and php-qa (the more knowledgeable user bases), most people wouldn't really understand what mmap does, what it is for, or when it is beneficial to use it.

As for magical, well, a bit, but good magic and internal magic (not syntactical magic). I'd assume that most systems have a certain point where mmap is no longer more beneficial than reading a file by chunks. If we can find a reasonable number (or have a user specify that in a configuration option if really necessary), it saves the user the trouble of thinking about something which is pretty low-level and it reduces bloat. I don't really see a downside to this magic.

-Sterling