Hi,

On Mon, Jan 7, 2013 at 6:30 PM, Nicolai Scheer <sc...@planetavent.de> wrote:

> Out of the urgent need to access files with a path longer than MAX_PATH on
> Windows, I started some research.
> At first I thought it might be a good idea to write my own stream wrapper
> extension (e.g. file_long://.....) .
>
> Before I started, I tried to find out, why those paths don't work in the
> current php code.
>
> According to [1] it is possible to use long_paths, if the path is prefixed
> correctly, e.g.
>
> \\?\
>
> for a local file path, and
>
> \\?\UNC\
>
> for a UNC path.
>
>
> I checked that fopen() and even open() in fact do work in C code with such
> long paths when using the prefix.
>
> So I bumped up MAXPATHLEN in php.h and tsrm_config_common.h to 32786 and
> recompiled a fresh php 5.3.20.
>
> Surprisingly, a php script using a long path (including the prefix) threw
> an error.
>
> Tracing that error leads to
>
> plain_wrapper.c:914 expand_filepath ->expand_filepath_ex ->virtual_file_ex
>
> These are the lines, that produce the error (tsrm_virtual_cwd.c:1255):
>
> #ifdef TSRM_WIN32
>     if (memchr(resolved_path, '*', path_length) ||
>         memchr(resolved_path, '?', path_length)) {
>         return 1;
>     }
> #endif
>
> Since there's a '?' in the string from the long path prefix the
> virtual_file_ex fails at this point.
> I did not quite understand the rationale behind this check.
> Of course, both checked characters are invalid for a regular file path.
> There seems to be some checking in tsrm_realpath_r()
> for paths like
>
> \\?\Volume{62d1c3f8-83b9-11de-b108-806e6f6e6963}\foo
>
> If I remove those memchr lines, everything magically works, e.g. fopen(),
> file_get_contents(), file_put_contents(), unlink(), rmdir(), mkdir(), etc.
>
> Only thing to do from userspace is to define the path as
>
> $path = "\\\\?\\x:\\long_stuff.......\\.....\\......\\file.txt";
>
> There are a few macros that get irritated (e.g. IS_ABSOLUTE_PATH,
> IS_UNC_PATH) by the double double backslash in the path...

Yes, we do not allow kernel paths, on purpose; see my comments below.
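
To make the failure concrete, here is a minimal sketch of the check quoted above, pulled out into a standalone function. The name has_wildcard is hypothetical and only used for illustration; the real code sits inline in virtual_file_ex, and this sketch assumes nothing about that function beyond the quoted memchr lines.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the quoted TSRM_WIN32 check.
 * Returns 1 (reject) if the first path_length bytes of resolved_path
 * contain '*' or '?', and 0 otherwise. */
static int has_wildcard(const char *resolved_path, size_t path_length)
{
    if (memchr(resolved_path, '*', path_length) ||
        memchr(resolved_path, '?', path_length)) {
        return 1;
    }
    return 0;
}
```

With this, an ordinary path such as C:\dir\file.txt passes (0), while \\?\C:\dir\file.txt is rejected (1) because of the '?' in the prefix itself, before the rest of the path is even looked at.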

> My questions here:
>
> 1. What is the rationale behind the memchr checks for ? and *? Just
> filtering invalid paths?

When a path gets resolved (symbolic links, junctions and the like),
there are many variations that need to be dealt with.

> 2. Does allowing the "\\?\" prefix to bubble through the stream wrapper
> layer (which effectively makes it usable) break anything?
> 3. If not, is it possible to include this in php 5.3. or php 5.4?

No, not even 5.5 imo, or ever :)

> It would be indeed nice if the "\\?\" prefix was not needed in userspace
> and php would do the work. But just for now I really would like to see php
> support for long paths on windows at all. To my mind the changes needed for
> the prefix workaround to function are minimally invasive. Correct me if I'm
> wrong :)

I would never expose that prefix to userland; the consequences and
the ways we would have to manage it are far too complicated for a
userland scripting language (even in C apps it is not recommended).

A better solution, which I am working on for previous php versions
(incl. 5.5, as I won't make it in time), is an extension that would
override the existing functions. The next major version (6) will support
unicode filenames, which will solve the horrible 255-char limitation.

Cheers,
--
Pierre

@pierrejoye | http://blog.thepimp.net | http://www.libgd.org

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
