Re: [PHP-DEV] SHA3 is very slow

2017-04-01 Thread Yasuo Ohgaki
Hi Sara,

On Sat, Apr 1, 2017 at 12:24 PM, Sara Golemon  wrote:

> On Fri, Mar 31, 2017 at 10:12 PM, Yasuo Ohgaki  wrote:
> > I noticed that our SHA-3 is inefficient.
> >
> Entirely possible.  Feel free to improve it. :D


I would like to, but it won't happen any time soon.
I would also like to have the SHAKE algorithms.
Perhaps hash_shake($algo, $msg, $len [, $binary = false])?
Anyone, please improve it :D

Regards,

--
Yasuo Ohgaki
yohg...@ohgaki.net


RE: [PHP-DEV] Directory separators on Windows

2017-04-01 Thread Anatol Belski


> -----Original Message-----
> From: Fleshgrinder [mailto:p...@fleshgrinder.com]
> Sent: Saturday, April 1, 2017 2:43 PM
> To: Anatol Belski ; Rasmus Schultz
> 
> Cc: PHP internals 
> Subject: Re: [PHP-DEV] Directory separators on Windows
> 
> On 4/1/2017 2:01 PM, Anatol Belski wrote:
> > 1. optionally - yes, otherwise it should do platform default
> > 2. no, this kind of operation is a pure parsing, no I/O related checks needed
> > 3. irrelevant, but can be defined
> >
> > Other points yet I'd care about
> > - result should be correct for target platform disregarding actual
> > platform, fe target Linux path Windows, or Windows path on Mac, etc.
> > - validation, particularly for reserved words and chars, also other
> > platform aspects
> > - encodings have to be respected, or UTF-8 only, to define
> > - probably should be compatible with PHP stream wrapper namespaces
> >
> >
> > Thanks
> >
> > Anatol
> >
> 
> 1. How do you envision that? If the path is `/a/b/../c` where only `/a`
> exists right now? It's unresolvable, assuming that `../` points to `/a`
> is wrong if `b/` is a symbolic link that points to `/x/y`.
> 
> 2. Here I agree, casing cannot be decided without hitting the filesystem. Some
> are case-sensitive, some insensitive, and others configurable.
> 
Basically, it is the same as your points 8, 9 and 10: it deals with the
given path itself, so no symlinks, etc. The snippet /a/b/../c is parsed
as follows

- parse up to /a/b/../
- scroll back to /a
- append the remainder, so it becomes /a/c

Similarly, /a/./b would become /a/b, and so on. It is string traversal
only; dirname() uses this approach. In general one can say that
normalization is path simplification, with no drive access like
realpath() does. For example, it lets you know whether the path itself is
correct before it comes to an actual file operation, so I/O can be skipped
otherwise.
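
The string-only traversal described above can be sketched in userland PHP.
This is a minimal illustration, not the proposed core function: the name
resolve_dot_segments() is hypothetical, only forward slashes are handled, and
the filesystem is never consulted, so symlinks are ignored by design.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: resolve "." and ".." purely as string operations.
 * Assumes "/" as the only separator and never touches the filesystem.
 */
function resolve_dot_segments(string $path): string
{
    $isAbsolute = $path !== '' && $path[0] === '/';
    $stack = [];

    foreach (explode('/', $path) as $segment) {
        if ($segment === '' || $segment === '.') {
            continue; // empty segments ("//") and "." add nothing
        }
        if ($segment === '..') {
            if ($stack !== [] && end($stack) !== '..') {
                array_pop($stack);   // "scroll back" one level
            } elseif (!$isAbsolute) {
                $stack[] = '..';     // keep leading ".." of a relative path
            }
            // For an absolute path, ".." at the root is simply dropped.
            continue;
        }
        $stack[] = $segment;
    }

    $result = implode('/', $stack);
    if ($isAbsolute) {
        return '/' . $result;
    }
    return $result === '' ? '.' : $result;
}

var_dump(resolve_dot_segments('/a/b/../c')); // string(4) "/a/c"
var_dump(resolve_dot_segments('/a/./b'));    // string(4) "/a/b"
```

Because only the string is inspected, the result for /a/b/../c is /a/c even
when b is a symlink pointing elsewhere, which is exactly the trade-off
discussed in this thread.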

> 3. Does not matter for Windows itself, it is case-insensitive.
> 
> (I continue the numbering for the points you raised.)
> 
> 4. How would we go about normalizing a Windows path to POSIX? `C:\a` is not
> necessarily the same as `/a`, or should it produce `C:/a`?
>
As mentioned in an earlier post, it might make sense to have flags to control
the behavior. Maybe a signature like

string canonicalize_path(string $path, int $flags = 0);

The function of course knows the current platform. Flags like
PATH_TARGET_WINDOWS | PATH_UNIXIFY would control the path separator behavior.
Generally, regarding paths without a drive letter: on Windows I'd strongly
advise against using them in configs, etc. because of the multiple-root issues
mentioned already. But in principle, say one has the same FS structure on
different platforms and just wants to mirror it, that would be OK with flags
like PATH_TARGET_LINUX | PATH_STRIP_DRIVE, as Linux implies forward slashes.
Or, for the reverse case, e.g. generating a path on Linux that is to be used
on Windows, the flags might contain only PATH_TARGET_WINDOWS, which would
produce backslashes as the system default. Maybe that's too much or unrelated,
and only platform targets should be provided, dunno, just a mind game for now.
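
To make the flag idea more concrete, here is a rough userland sketch. None of
this exists in PHP: the constant values are invented, the function is named
canonicalize_path_sketch() to keep it apart from the signature floated above,
and the exact meaning of each flag is one possible reading of this mail. It
only covers separator choice and drive stripping.

```php
<?php
declare(strict_types=1);

// Hypothetical flag values; the thread only names the flags.
const PATH_TARGET_WINDOWS = 1;
const PATH_TARGET_LINUX   = 2;
const PATH_UNIXIFY        = 4;
const PATH_STRIP_DRIVE    = 8;

/**
 * Sketch only: pick a separator from the flags and optionally strip "C:".
 * Assumes every "/" and "\" in the input is meant as a separator.
 */
function canonicalize_path_sketch(string $path, int $flags = 0): string
{
    if ($flags & PATH_STRIP_DRIVE) {
        // Remove a leading drive specifier such as "C:".
        $path = (string) preg_replace('~^[A-Za-z]:~', '', $path);
    }

    if ($flags & (PATH_UNIXIFY | PATH_TARGET_LINUX)) {
        $separator = '/';                 // forward slashes requested or implied
    } elseif ($flags & PATH_TARGET_WINDOWS) {
        $separator = '\\';                // Windows system default
    } else {
        $separator = DIRECTORY_SEPARATOR; // default of the current platform
    }

    return str_replace(['/', '\\'], $separator, $path);
}

// Mirror a Windows path onto Linux.
var_dump(canonicalize_path_sketch('C:\\a\\b', PATH_TARGET_LINUX | PATH_STRIP_DRIVE)); // "/a/b"
// Windows target, but with forward slashes.
var_dump(canonicalize_path_sketch('C:\\a', PATH_TARGET_WINDOWS | PATH_UNIXIFY));      // "C:/a"
```

Treating every backslash as a separator is itself an assumption; on Linux a
backslash is a legal file name character, so a real implementation would need
to know the source platform as well as the target.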

> 5. 👍
> 
> 6. I vote for UTF-8 only. We already have locale dependent filesystem
> functions, which also makes them kind of weird to use, especially in
> libraries. Another very important aspect to take care of this point is
> normalization forms. Filesystems generally store stuff as is, that means
> that we can create two files with the same name, at least by the looks of
> it, which are actually different ones. Think of `ä` which can also be
> `ä`. It is generally most advisable to stick to NFC, because that is
> also how users usually produce those chars.
> 
Yeah, UTF-8 would probably be the simplest for a cross-platform implementation.
Regarding the encoding variant, that's where more care would be needed. E.g. see
https://github.com/aws/aws-cli/issues/1639 , that's where we would care about
PATH_TARGET_MAC specific things. It is comparable to the current situation where
you want to escapeshell* something, but the result is invalid on another
platform or possibly with another shell.

> 7. 👍 just forward I'd say.
> 
> 8. Collapse multiple separators (e.g. `a//b` ~> `a/b`).
> 
> 9. Resolve self-references, unless they are leading (e.g. `a/./b` ~> `a/b` but
> `./a/b` stays `./a/b`).
> 
> 10. Trim separators from the end (e.g. `a/` ~> `a`).
> 
These last 3 points, as well as the one above, are canonicalization. Of course,
in the imaginary function it could be decoupled, like PATH_NO_CANONIC if it's
not wanted, or PATH_CANONICALIZE_ONLY to omit other conversions. It's only
about having sensible behaviors. E.g. possible other flags could be
PATH_STRIP_TRAILING_SLASH, PATH_ALLOW_RELATIVE and other fine things. But by
default, the function should do the default thing for the target platform,
based on the current platform. Thus, producing NFD for

Re: [PHP-DEV] Directory separators on Windows

2017-04-01 Thread Rasmus Schultz
10 thumbs up ;-)

But this really demonstrates how badly we need this function - I bet any
number of those points may or may not be covered by any number of
implementations in the wild.

It would be so nice to have this done "right", once and for all.


On Sat, Apr 1, 2017 at 2:42 PM, Fleshgrinder  wrote:

> On 4/1/2017 2:01 PM, Anatol Belski wrote:
> > 1. optionally - yes, otherwise it should do platform default
> > 2. no, this kind of operation is a pure parsing, no I/O related checks
> > needed
> > 3. irrelevant, but can be defined
> >
> > Other points yet I'd care about
> > - result should be correct for target platform disregarding actual
> > platform, fe target Linux path Windows, or Windows path on Mac, etc.
> > - validation, particularly for reserved words and chars, also other
> > platform aspects
> > - encodings have to be respected, or UTF-8 only, to define
> > - probably should be compatible with PHP stream wrapper namespaces
> >
> >
> > Thanks
> >
> > Anatol
> >
>
> 1. How do you envision that? If the path is `/a/b/../c` where only `/a`
> exists right now? It's unresolvable, assuming that `../` points to `/a`
> is wrong if `b/` is a symbolic link that points to `/x/y`.
>
> 2. Here I agree, casing cannot be decided without hitting the
> filesystem. Some are case-sensitive, some insensitive, and others
> configurable.
>
> 3. Does not matter for Windows itself, it is case-insensitive.
>
> (I continue the numbering for the points you raised.)
>
> 4. How would we go about normalizing a Windows path to POSIX? `C:\a` is
> not necessarily the same as `/a`, or should it produce `C:/a`?
>
> 5. 👍
>
> 6. I vote for UTF-8 only. We already have locale dependent filesystem
> functions, which also makes them kind of weird to use, especially in
> libraries. Another very important aspect to take care of this point is
> normalization forms. Filesystems generally store stuff as is, that means
> that we can create two files with the same name, at least by the looks of
> it, which are actually different ones. Think of `ä` which can also be
> `ä`. It is generally most advisable to stick to NFC, because that is
> also how users usually produce those chars.
>
> 7. 👍 just forward I'd say.
>
> 8. Collapse multiple separators (e.g. `a//b` ~> `a/b`).
>
> 9. Resolve self-references, unless they are leading (e.g. `a/./b` ~>
> `a/b` but `./a/b` stays `./a/b`).
>
> 10. Trim separators from the end (e.g. `a/` ~> `a`).
>
> --
> Richard "Fleshgrinder" Fussenegger
>


[PHP-DEV] [RFC] Prevent number_format() from returning negative zero

2017-04-01 Thread Craig Duncan
Hi internals.

Following a brief discussion on the behaviour of number_format() last year
I'd like to start discussion around an RFC to bring consistency to negative
zero.

When number_format() is passed -0 it doesn't display the negative sign,
however if it's passed something that rounds to -0 then it does display the
negative sign.
https://3v4l.org/k4roB

I believe these two operations should yield the same result. The RFC takes
the stance that the rounding version should be changed to never display -0,
but I'm open to hearing arguments to change number_format() to always
display the negative sign when it's working with a negative number.

https://wiki.php.net/rfc/number_format_negative_zero
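
Until the behaviour is settled, the sign can be stripped in userland. The
following is only a sketch of a workaround, not the change the RFC proposes
for core; the wrapper name number_format_no_negative_zero() is made up for
illustration and it only covers the default separators.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical wrapper: format a number, but never show a negative sign
 * in front of a value that displays as zero after rounding.
 */
function number_format_no_negative_zero(float $number, int $decimals = 0): string
{
    $formatted = number_format($number, $decimals);

    // If no digit 1-9 is left after rounding, the displayed value is zero,
    // so a leading minus sign carries no information and is dropped.
    if ($formatted !== '' && $formatted[0] === '-'
        && strpbrk($formatted, '123456789') === false) {
        return substr($formatted, 1);
    }

    return $formatted;
}

var_dump(number_format_no_negative_zero(-0.01));    // string(1) "0"
var_dump(number_format_no_negative_zero(-0.01, 2)); // string(5) "-0.01"
```

The check works on the formatted string rather than on the float, so it gives
the same result whether or not the underlying number_format() emits the sign.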

Thanks,
Craig


Re: [PHP-DEV] Directory separators on Windows

2017-04-01 Thread Fleshgrinder
On 4/1/2017 2:01 PM, Anatol Belski wrote:
> 1. optionally - yes, otherwise it should do platform default
> 2. no, this kind of operation is a pure parsing, no I/O related checks needed
> 3. irrelevant, but can be defined
> 
> Other points yet I'd care about
> - result should be correct for target platform disregarding actual platform, 
> fe target Linux path Windows, or Windows path on Mac, etc.
> - validation, particularly for reserved words and chars, also other platform 
> aspects
> - encodings have to be respected, or UTF-8 only, to define
> - probably should be compatible with PHP stream wrapper namespaces
> 
> 
> Thanks
> 
> Anatol
> 

1. How do you envision that? If the path is `/a/b/../c` where only `/a`
exists right now? It's unresolvable, assuming that `../` points to `/a`
is wrong if `b/` is a symbolic link that points to `/x/y`.

2. Here I agree, casing cannot be decided without hitting the
filesystem. Some are case-sensitive, some insensitive, and others
configurable.

3. Does not matter for Windows itself, it is case-insensitive.

(I continue the numbering for the points you raised.)

4. How would we go about normalizing a Windows path to POSIX? `C:\a` is
not necessarily the same as `/a`, or should it produce `C:/a`?

5. 👍

6. I vote for UTF-8 only. We already have locale dependent filesystem
functions, which also makes them kind of weird to use, especially in
libraries. Another very important aspect to take care of this point is
normalization forms. Filesystems generally store stuff as is, that means
that we can create two files with the same name, at least by the looks of
it, which are actually different ones. Think of `ä` which can also be
`ä`. It is generally most advisable to stick to NFC, because that is
also how users usually produce those chars.
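
A short sketch of the normalization-form problem, assuming PHP 7 or later for
the "\u{...}" escapes. The Normalizer class comes from the intl extension,
which may not be installed, hence the class_exists() guard.

```php
<?php
declare(strict_types=1);

$composed   = "\u{00E4}";  // "ä" as a single code point (NFC)
$decomposed = "a\u{0308}"; // "a" followed by a combining diaeresis (NFD)

// Both render the same, but the bytes differ, so a filesystem that stores
// names as-is treats them as two distinct file names.
var_dump($composed === $decomposed); // bool(false)
var_dump(bin2hex($composed));        // string(4) "c3a4"
var_dump(bin2hex($decomposed));      // string(6) "61cc88"

if (class_exists('Normalizer')) {
    // Normalizing to NFC makes the two spellings comparable.
    $nfc = Normalizer::normalize($decomposed, Normalizer::FORM_C);
    var_dump($nfc === $composed);    // bool(true)
}
```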

7. 👍 just forward I'd say.

8. Collapse multiple separators (e.g. `a//b` ~> `a/b`).

9. Resolve self-references, unless they are leading (e.g. `a/./b` ~>
`a/b` but `./a/b` stays `./a/b`).

10. Trim separators from the end (e.g. `a/` ~> `a`).
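
Points 8 to 10 can be sketched in a few lines of userland PHP. This is only
an illustration under stated assumptions: the function name tidy_separators()
is hypothetical, separators are assumed to be forward slashes already, and a
leading `//` (as in a UNC-style prefix) would be collapsed as well.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper covering points 8-10 for forward-slash paths.
 */
function tidy_separators(string $path): string
{
    // 8. Collapse runs of separators: "a//b" becomes "a/b".
    $path = (string) preg_replace('~/{2,}~', '/', $path);

    // 9. Remove self-references that are not leading. The loop handles
    //    overlapping matches such as "a/././b".
    while (strpos($path, '/./') !== false) {
        $path = str_replace('/./', '/', $path);
    }
    if (substr($path, -2) === '/.') {
        $path = substr($path, 0, -1); // "a/." becomes "a/"
    }

    // 10. Trim trailing separators, but keep a lone "/" (the root).
    if (strlen($path) > 1) {
        $path = rtrim($path, '/');
    }

    return $path;
}

var_dump(tidy_separators('a//b'));  // string(3) "a/b"
var_dump(tidy_separators('a/./b')); // string(3) "a/b"
var_dump(tidy_separators('./a/b')); // string(5) "./a/b"
var_dump(tidy_separators('a/'));    // string(1) "a"
```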

-- 
Richard "Fleshgrinder" Fussenegger

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



RE: [PHP-DEV] Directory separators on Windows

2017-04-01 Thread Anatol Belski
Hi,

> -----Original Message-----
> From: Rasmus Schultz [mailto:ras...@mindplay.dk]
> Sent: Saturday, April 1, 2017 11:13 AM
> To: Pierre Joye 
> Cc: Kris Craig ; Sara Golemon ; PHP
> internals 
> Subject: Re: [PHP-DEV] Directory separators on Windows
> 
> > Also ucfirst is useless (or any case operations)
> 
> It's not useless, if you want a normalized path on Windows, it has to
> include a drive-letter, and Windows FS isn't case-sensitive.
> 
> > Right now realpath will fail if the path does not exist
> 
> I know, that's one reason I don't use it.
> 
> It kind of solves a different problem, e.g. resolves ".." and "." elements in
> paths... as a rule, I don't ever use relative paths, but it would certainly
> be nice to have a realpath() that works for files that haven't been created
> yet.
> 
> I don't think you can simply make realpath() also normalize the path, as this
> would be a breaking change?
> 
> I guess an improved realpath() could be used internally as part of a
> normalize_path() function, but it's not enough on its own, since the real
> path will still have platform-specific directory-separators, so a
> normalize_path() function would still be useful if realpath() gets improved.
> 
> So to summarize, a normalize_path() function should:
> 
> 1. Fully normalize to an absolute path with no platform-specific separators
> 2. Have corrected case (for files/dirs that do exist.)
> 3. Have normalized (upper-case) drive-letter on Windows
> 
1. optionally - yes, otherwise it should do platform default
2. no, this kind of operation is a pure parsing, no I/O related checks needed
3. irrelevant, but can be defined

Other points I'd still care about
- the result should be correct for the target platform regardless of the actual
platform, e.g. target a Linux path on Windows, or a Windows path on Mac, etc.
- validation, particularly for reserved words and chars, also other platform
aspects
- encodings have to be respected, or UTF-8 only, to be defined
- probably should be compatible with PHP stream wrapper namespaces


Thanks

Anatol

> There's also network file-system paths on Windows with a different syntax to
> consider? I don't know much about that...
> 
> 
> On Fri, Mar 31, 2017 at 11:40 AM, Pierre Joye  wrote:
> 
> > On Fri, Mar 31, 2017 at 3:32 PM, Rasmus Schultz 
> > wrote:
> > > Well, this is the opposite of what I'm asking for, and does not
> > > address
> > the
> > > case where paths have been persisted in a file or database and the
> > > data gets accessed from different OS.
> > >
> > > I understand the reasons given for not changing this behavior in PHP
> > > itself, so maybe we could have a standard function that normalizes
> > > paths
> > to
> > > forward slashes? e.g. basically:
> > >
> > > /**
> > >  * Normalize a filesystem path.
> > >  *
> > >  * On windows systems, replaces backslashes with forward slashes
> > >  * and ensures drive-letter in upper-case.
> > >  *
> > >  * @param string $path
> > >  *
> > >  * @return string normalized path
> > >  */
> > > function normalize_path( $path ) {
> > >     $path = str_replace('\\', '/', $path);
> > >
> > >     // Upper-case the drive letter of paths like "c:/foo".
> > >     return isset($path[1]) && $path[1] === ':'
> > >         ? ucfirst($path)
> > >         : $path;
> > > }
> >
> > Also ucfirst is useless (or any case operations). realpath goes
> > further down by solving ugly things like  \\\ or // (code
> > concatenating paths without checking trailing /\).
> >
> > > At least WordPress, Drupal and probably most major CMS and
> > > frameworks
> > have
> > > this function or something equivalent. .
> >
> > Now I remember why they have to do that.
> >
> > realpath is not fully exposed in userland. virtual_file_ex should be
> > used and provide the option to validate path or not. Right now
> > realpath will fail if the path does not exist. I would suggest to
> > expose this functionality/option and that will solve the need to
> > implement such things in userland.
> >
> > ps: I discussed that long time with Dmitry and forgot to implement it,
> > I take the blame for not having that in 7.x :)
> >
> > Cheers,
> > Pierre
> >


Re: [PHP-DEV] Directory separators on Windows

2017-04-01 Thread Fleshgrinder
On 4/1/2017 1:03 PM, Anatol Belski wrote:
> " A Uniform Resource Identifier (URI) is a compact sequence of 
> characters that identifies an abstract or physical resource" they
> say. Fits perfectly with PHP streams.
> 

The problem I was referring to is not a semantic one. The problem is that
the code cannot easily distinguish between local and remote files. Of
course there are functions for that, but it would be better
expressed as part of the type system. I know that this is kind of alien
to the primitive-obsessed world of PHP, but proper type systems can
help a lot to make code simpler.

That being said, it's totally off topic here. :P

On 4/1/2017 1:03 PM, Anatol Belski wrote:
> Yeah, though that draft still ignores many Windows variants ☹
> 
> We went anyway a bit too deep in this complex matter. Probably a
> separate function is where the opinions could be joined.
> 
> Thanks
> 
> Anatol
> 

Agree, this is my last response on this here. :)

-- 
Richard "Fleshgrinder" Fussenegger




RE: [PHP-DEV] Directory separators on Windows

2017-04-01 Thread Anatol Belski


> -----Original Message-----
> From: Fleshgrinder [mailto:p...@fleshgrinder.com]
> Sent: Saturday, April 1, 2017 12:00 AM
> To: Anatol Belski ; internals@lists.php.net; Rasmus Schultz
> 
> Subject: Re: [PHP-DEV] Directory separators on Windows
> 
> On 3/31/2017 9:29 PM, Anatol Belski wrote:
> > I can only link to this 😉
> >
> > http://git.php.net/?p=php-src.git;a=commitdiff;h=ec78507bd46a05f77dbde3fa4091ab4c91e61cad
> >
> >  the new implementation was consistent but had to be reverted in 7.1
> > partially, because of BC, even the use is inappropriate. Well, still
> > normalization on Windows means having '\\' in terms of the platform
> > API used, but just as a show case. The dirname function itself is
> > based on the PHP implementation, not a platform API. But also, it
> > would produce same path with different separators on different
> > platform, if normalized.
> >
> 
> A good example that showcases that we actually could normalize to slashes,
> don't you think. :)
> 
Nope, actually the opposite. It's more an illustration of what shouldn't be
done, namely fixing in core what actually belongs in an app. BC is another
matter, though.

> Besides, I still believe that it is very wrong of PHP to treat URIs/URLs the
> same as paths. A path can be a URI, but a URI should only be a path if it has
> the `file://` scheme. The current approach just asks for remote code
> inclusion, URL fopen anyone? Different story though.
> 
" A Uniform Resource Identifier (URI) is a compact sequence of
   characters that identifies an abstract or physical resource" they say. Fits 
perfectly with PHP streams.

> On 3/31/2017 9:29 PM, Anatol Belski wrote:
> > You're right, they both are documented. What is not defined is the
> > cross platform handling. There are some documents, yes, like RFC 3986,
> > or RFC 1738 and RFC 8089 which are still in the proposed state.
> > However there is none I knew that would care about crossplatform
> > nuances in full extent. Particularly an RFC defining all the possible
> > behaviors of the file:// scheme is what were needed, I guess. Thus my
> > conclusion is to take the path of less resistance, as what is not
> > defined is not necessary good but also is not necessary broken. Yeah,
> > it is complex, and particularly in PHP historically grown, and just
> > touching the water surface might already produce some high waves.
> >
> > The functions mentioned - of course, it were up to an application to
> > decide what to use it in a particular situation, but not forcibly
> > changing the core handling. Like in the snippet above, you would have
> > currently to do dirname(realpath($path)), but that is also not
> > crossplatform and won't work on a nonexistent file. So another
> > function instead of realpath, like dirname(normalize_path($path,
> > UNIXIFY_SLASH)) were in use. The implementation might be tricky in
> > some parts, but in general doable.
> >
> > Regards
> >
> > Anatol
> >
> 
> Well, RFC 8089 has many examples in its appendix regarding Windows. It's true
> that they say that it is non-standard, however, it is how Windows deals with
> it since IE4.
> 
> https://blogs.msdn.microsoft.com/freeassociations/2005/05/19/the-bizarre-and-unhappy-story-of-file-urls/
> 
Yeah, though that draft still ignores many Windows variants ☹

We went a bit too deep into this complex matter anyway. A separate function is
probably where the opinions could be joined.

Thanks

Anatol


Re: [PHP-DEV] Directory separators on Windows

2017-04-01 Thread Fleshgrinder
On 4/1/2017 11:13 AM, Rasmus Schultz wrote:
> So to summarize, a normalize_path() function should:
> 
> 1. Fully normalize to an absolute path with no platform-specific separators
> 2. Have corrected case (for files/dirs that do exist.)
> 3. Have normalized (upper-case) drive-letter on Windows
> 
> There's also network file-system paths on Windows with a different syntax
> to consider? I don't know much about that...
> 

1. cannot be guaranteed by a normalization function, because the parts
the dots point to might not exist. Resolving them without knowing if we
are dealing with a symbolic or hard link is impossible.

UNC paths work the same as normal paths, the only difference is their
prefix (e.g. `\\ComputerName\`), in other words, they can be treated
like a schemeless URL.

Verbatim paths are not supported by PHP anyways, hence, they can be ignored.

-- 
Richard "Fleshgrinder" Fussenegger




Re: [PHP-DEV] Directory separators on Windows

2017-04-01 Thread Rasmus Schultz
> Also ucfirst is useless (or any case operations)

It's not useless, if you want a normalized path on Windows, it has to
include a drive-letter, and Windows FS isn't case-sensitive.

> Right now realpath will fail if the path does not exist

I know, that's one reason I don't use it.

It kind of solves a different problem, e.g. resolves ".." and "." elements
in paths... as a rule, I don't ever use relative paths, but it would
certainly be nice to have a realpath() that works for files that haven't
been created yet.

I don't think you can simply make realpath() also normalize the path, as
this would be a breaking change?

I guess an improved realpath() could be used internally as part of a
normalize_path() function, but it's not enough on its own, since the real
path will still have platform-specific directory-separators, so a
normalize_path() function would still be useful if realpath() gets improved.

So to summarize, a normalize_path() function should:

1. Fully normalize to an absolute path with no platform-specific separators
2. Have corrected case (for files/dirs that do exist.)
3. Have normalized (upper-case) drive-letter on Windows

There's also network file-system paths on Windows with a different syntax
to consider? I don't know much about that...


On Fri, Mar 31, 2017 at 11:40 AM, Pierre Joye  wrote:

> On Fri, Mar 31, 2017 at 3:32 PM, Rasmus Schultz 
> wrote:
> > Well, this is the opposite of what I'm asking for, and does not address
> the
> > case where paths have been persisted in a file or database and the data
> > gets accessed from different OS.
> >
> > I understand the reasons given for not changing this behavior in PHP
> > itself, so maybe we could have a standard function that normalizes paths
> to
> > forward slashes? e.g. basically:
> >
> > /**
> >  * Normalize a filesystem path.
> >  *
> >  * On windows systems, replaces backslashes with forward slashes
> >  * and ensures drive-letter in upper-case.
> >  *
> >  * @param string $path
> >  *
> >  * @return string normalized path
> >  */
> > function normalize_path( $path ) {
> >     $path = str_replace('\\', '/', $path);
> >
> >     // Upper-case the drive letter of paths like "c:/foo".
> >     return isset($path[1]) && $path[1] === ':'
> >         ? ucfirst($path)
> >         : $path;
> > }
>
> Also ucfirst is useless (or any case operations). realpath goes
> further down by solving ugly things like  \\\ or // (code
> concatenating paths without checking trailing /\).
>
> > At least WordPress, Drupal and probably most major CMS and frameworks
> have
> > this function or something equivalent. .
>
> Now I remember why they have to do that.
>
> realpath is not fully exposed in userland. virtual_file_ex should be
> used and provide the option to validate path or not. Right now
> realpath will fail if the path does not exist. I would suggest to
> expose this functionality/option and that will solve the need to
> implement such things in userland.
>
> ps: I discussed that long time with Dmitry and forgot to implement it,
> I take the blame for not having that in 7.x :)
>
> Cheers,
> Pierre
>