php-general Digest 10 Jan 2011 16:22:18 -0000 Issue 7125

Topics (messages 310625 through 310626):

gzdeflate and file_get_contents memory leak?
        310625 by: Ryan Reading

Re: Validate Domain Name by Regular Express
        310626 by: tedd

Administrivia:

To subscribe to the digest, e-mail:
        [email protected]

To unsubscribe from the digest, e-mail:
        [email protected]

To post to the list, e-mail:
        [email protected]


----------------------------------------------------------------------
--- Begin Message ---
I have a download script that streams the contents of multiple files
into a zip archive before passing it on to the browser to be
downloaded.  The script uses file_get_contents() and gzdeflate() in a
loop over the files to create the archive.  Everything works
fine, except that for a large number of files the script
exceeds the PHP memory limit.  I have used
memory_get_peak_usage() to narrow down the source of the high memory
usage and found it to be the two functions above.  They are called
in a loop, and the variable containing the file data is unset() and not
referenced between calls.  The script's peak memory usage
should be a function of the single largest file
included in the archive, but it seems to be the aggregate of all
files.

Here is the pseudo-code for this loop:

header( /* specify header to indicate download */ );
foreach( $files as $file )
{
  echo zip_local_header_for($file);
  $data = file_get_contents( $file );
  $zdata = gzdeflate( $data );
  unset($data);
  unset($zdata);
}
echo zip_central_dir_for($files);

If I remove the gzdeflate() call and replace the file_get_contents()
call with an fread()-based method, the script no longer experiences
memory problems.

Is this behavior as designed for these two functions (because PHP
scripts are usually short lived)?  Is there a way to get them to
release memory?  Is there something I'm missing?   Thanks.
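
One way to keep peak memory tied to a fixed buffer size rather than to file
size is to read each file in chunks and compress incrementally. The sketch
below is only an illustration and makes several assumptions: it needs PHP 7.0
or later (deflate_init() and deflate_add() did not exist when this thread was
written), and the helper name stream_deflated_file(), the $emit callback, and
the returned array keys are all hypothetical.

```php
<?php
// Hypothetical helper: deflate one file in fixed-size chunks so that peak
// memory depends on $chunkSize, not on the size of the file.
function stream_deflated_file(string $path, callable $emit, int $chunkSize = 65536): array
{
    $in = fopen($path, 'rb');
    if ($in === false) {
        throw new RuntimeException("cannot open $path");
    }

    $deflate = deflate_init(ZLIB_ENCODING_RAW); // raw deflate, like gzdeflate()
    $crc = hash_init('crc32b');                 // same CRC-32 variant as crc32()
    $size = 0;
    $compressedSize = 0;

    // fread() returns '' at end of file and false on error; stop on either.
    while (($chunk = fread($in, $chunkSize)) !== false && $chunk !== '') {
        hash_update($crc, $chunk);
        $size += strlen($chunk);
        $out = deflate_add($deflate, $chunk, ZLIB_NO_FLUSH);
        if ($out !== '') {
            $compressedSize += strlen($out);
            $emit($out);
        }
    }
    fclose($in);

    // Flush whatever the compressor is still holding and end the stream.
    $out = deflate_add($deflate, '', ZLIB_FINISH);
    $compressedSize += strlen($out);
    $emit($out);

    return [
        'crc32' => hash_final($crc),
        'size' => $size,
        'compressed_size' => $compressedSize,
    ];
}
```

In a download script, $emit could be as simple as
function ($bytes) { echo $bytes; flush(); }. Note that the CRC and sizes are
only known after the data has been sent, so a zip writer built this way would
have to put them in a data descriptor after the file data instead of in the
local header.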

-- Ryan

--- End Message ---
--- Begin Message ---
At 12:23 PM -0500 1/9/11, Daniel Brown wrote:
> On Sun, Jan 9, 2011 at 11:58, tedd <[email protected]> wrote:
>
> > For example --
> >
> > http://xn--19g.com
> >
> > -- is square-root dot com. In all browsers except Safari, PUNYCODE is shown
> > in the address bar, but in Safari it's shown as ˆ.com
>
>     Not sure if that's a typo or an issue in translation while the
> email was being relayed through the tubes, but ˆ.com directs to
> xn--wqa.com here.
>
> --
> </Daniel P. Brown>

Daniel et al:

Translation of Unicode characters by various software programs is unpredictable -- this includes email applications.

While I can send and receive √ (square root) through my email program (Eudora), what your email program displays to you can be (as shown) something completely different. The mapping of code points (e.g., square root) to what your program displays (much like a web site) depends on how your email program works. If your email program uses the correct character set and maps it to what was actually received, then the character will be displayed correctly. If not, then things like ˆ.com happen.
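
The effect described above can be reproduced by decoding the same bytes under
two different character sets. The following is a minimal sketch, assuming the
iconv extension is available and that the receiving program wrongly assumes
Windows-1252 (CP1252); the helper name as_seen_in() is hypothetical, and the
actual character sets involved in this thread are not known.

```php
<?php
// Hypothetical helper: interpret $bytes as text in $assumedCharset and
// return the result as UTF-8, i.e. what a program making that assumption
// would display.
function as_seen_in(string $bytes, string $assumedCharset): string
{
    $shown = iconv($assumedCharset, 'UTF-8', $bytes);
    if ($shown === false) {
        throw new RuntimeException("bytes are not valid $assumedCharset");
    }
    return $shown;
}

$sqrt = "\xE2\x88\x9A"; // U+221A SQUARE ROOT, encoded as UTF-8

echo bin2hex($sqrt), "\n";              // e2889a
echo as_seen_in($sqrt, 'UTF-8'), "\n";  // √
echo as_seen_in($sqrt, 'CP1252'), "\n"; // âˆš
```

The bytes never change; only the table used to turn them into characters does.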

Unfortunately, this mapping problem has not been of great importance for most applications. As it is now, most applications work for English-speaking people and that seems good enough, or so many manufacturers think. However, as the "rest of the world" starts using applications (and logging on to the net), it will obviously become more advantageous for manufacturers to make their software work correctly for languages other than English. Apple is doing that, and last year the majority of their income came from overseas (i.e., outside the USA).

The mapping of non-English characters was the problem addressed by the IDNS WG, where I added my minor contribution circa 2000. Unfortunately, homographic issues were not resolved by the WG. However, a solution was proposed (which I called the "Fruit-loop" solution): color-code (flag) the characters in the address bar of a browser IF the URL contained a mixed Char Set. Unfortunately, that solution was not pursued, and instead browser manufacturers chose to show raw PUNYCODE, which was never intended to be seen by end users. A giant step backwards IMO.
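
For anyone who wants to check which character a raw PUNYCODE label stands for,
here is a minimal sketch of the RFC 3492 decoding procedure in plain PHP. The
function name punycode_decode() is hypothetical, the label is passed without
its "xn--" prefix, and the sketch skips overflow checks and all IDNA
validation, so treat it as an illustration rather than a production decoder.

```php
<?php
// Minimal RFC 3492 Punycode decoder for a single label (no "xn--" prefix).
// Returns the decoded Unicode code points as integers.
function punycode_decode(string $label): array
{
    $base = 36; $tMin = 1; $tMax = 26; $skew = 38; $damp = 700;
    $n = 128;   // first non-ASCII code point
    $bias = 72;
    $i = 0;
    $out = [];

    // Everything before the last "-" is literal ASCII.
    $cut = strrpos($label, '-');
    if ($cut !== false) {
        for ($p = 0; $p < $cut; $p++) {
            $out[] = ord($label[$p]);
        }
        $label = (string) substr($label, $cut + 1);
    }

    $len = strlen($label);
    $pos = 0;
    while ($pos < $len) {
        $oldI = $i;
        $w = 1;
        // Read one variable-length integer (the next delta).
        for ($k = $base; ; $k += $base) {
            if ($pos >= $len) {
                throw new InvalidArgumentException('truncated label');
            }
            $c = ord($label[$pos++]);
            if ($c >= 97 && $c <= 122) {
                $digit = $c - 97;          // a-z => 0..25
            } elseif ($c >= 65 && $c <= 90) {
                $digit = $c - 65;          // A-Z => 0..25
            } elseif ($c >= 48 && $c <= 57) {
                $digit = $c - 22;          // 0-9 => 26..35
            } else {
                throw new InvalidArgumentException('invalid digit');
            }
            $i += $digit * $w;
            $t = $k <= $bias ? $tMin : ($k >= $bias + $tMax ? $tMax : $k - $bias);
            if ($digit < $t) {
                break;
            }
            $w *= $base - $t;
        }

        // Bias adaptation, as in RFC 3492 section 6.1.
        $count = count($out) + 1;
        $delta = intdiv($i - $oldI, $oldI === 0 ? $damp : 2);
        $delta += intdiv($delta, $count);
        $k = 0;
        while ($delta > intdiv(($base - $tMin) * $tMax, 2)) {
            $delta = intdiv($delta, $base - $tMin);
            $k += $base;
        }
        $bias = $k + intdiv(($base - $tMin + 1) * $delta, $delta + $skew);

        // The delta encodes both the code point and its insert position.
        $n += intdiv($i, $count);
        $i %= $count;
        array_splice($out, $i, 0, [$n]);
        $i++;
    }
    return $out;
}

foreach (['19g', 'wqa'] as $label) {
    foreach (punycode_decode($label) as $cp) {
        printf("xn--%s -> U+%04X\n", $label, $cp);
    }
}
// xn--19g -> U+221A
// xn--wqa -> U+02C6
```

U+221A is SQUARE ROOT and U+02C6 is MODIFIER LETTER CIRCUMFLEX ACCENT, so the
two labels mentioned in this thread decode to two different characters.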

Cheers,

tedd

--
-------
http://sperling.com/

--- End Message ---
