Re: [PHP-DEV] New website for the PHP project

2019-02-05 Thread Tom Worster
I have two suggestions, assuming you proceed roughly as outlined in your 
original post.


1. Start with /community

A new community website [4], it can be a place for people to ask 
questions and discuss php in general - no one uses IRC anymore.


and use it to build and coordinate the dev team for this new php.net 
website. If you do a good job and the conversations and culture develop 
nicely then the scope of topics can expand as you already planned. It 
could be a success, providing something I think PHP needs, even if you 
don't reach all the other goals in your project.


2. Don't use a frontend framework. There's so little stability in this 
area that you should plan to own long-term maintenance of whatever you 
use, so it's better to start with your own, picking the ideas you like 
from the work of others.




Re: [PHP-DEV] Deprecation ideas for PHP 8

2019-01-23 Thread Tom Worster

Hi George,

Iiuc, the problem you're trying to solve is that PHP offers too many 
ways to do the same thing and if there were fewer then PHP code would be 
easier to write, read and maintain.


Differences in code that make no difference to the compiler are 
differences in style.


The conventional solution is to adopt a style spec and use a linter. 
I've experienced this in Ruby, JS and PHP and it's effective.


This conventional solution causes a lot less breakage than attempting to 
mitigate the problem by changing the language.


It's annoying enough to have to deal with the promulgation of a new 
style spec and corresponding linter in your organization. Imagine if 
such a change were to be promulgated by The Rulers of PHP.


Tom

[PHP-DEV] Re: 7.3 features in announcement

2018-12-03 Thread Tom Worster
On 2 Dec 2018, at 17:50, Christoph M. Becker wrote:

> Regarding other prominent features, I think the “Flexible Heredoc and
> Nowdoc Syntaxes”[2] and the “PCRE2 migration”[3] should certainly be
> mentioned.  Also the MBString improvements[4], as well as the
> deprecations[5], and also the file related Windows improvements[6].
>
> I'm likely missing other important changes, and may overestimate some of
> those I've mentioned above.  I'm looking forward to suggestions!
>
> [2] 
> [3] 
> [4] 
> [5] 
> [6] 

Hi Christoph,

Do I read right that in 7.3 pcre gets UCD 10 while mbstring gets UCD 11?

And intl's UCD depends on the linked ICU library, right?

Tom


[PHP-DEV] Re: [RFC] Improve openssl_random_pseudo_bytes()

2018-10-22 Thread Tom Worster

Hi Sammy,

On 22 Oct 2018, at 9:46, Sammy Kaye Powers wrote:


What makes the function obsolete? The addition of the `random_bytes()`


Yes.


What makes the function obsolete? The addition of the `random_bytes()`
CSPRNG (which uses the kernel's CSPRNG) doesn't invalidate OpenSSL's
CSPRNG.


According to one argument that has a lot of currency, it does.

From the point of view of a consumer of unpredictable randoms comparing 
two APIs: a CSPRNG implemented in user-space memory vs. the system call 
to the kernel's non-blocking CSPRNG, the user-space CSPRNG 1) cannot do 
anything you need that the system call cannot, and 2) relies on the 
kernel for its entropy and periodic reseeding. Hence the user-space 
CSPRNG adds potential failure modes and adds to the attack surface and 
is therefore less trustworthy.
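
To make the consumer's side of that comparison concrete, here is a minimal sketch in Python rather than PHP. `os.urandom()` asks the operating system's CSPRNG for bytes, which is roughly the role `random_bytes()` plays in PHP 7. The helper name `unpredictable_token` is hypothetical.

```python
import os


def unpredictable_token(num_bytes: int = 16) -> str:
    """Return a hex token drawn directly from the OS CSPRNG.

    There is no user-space generator state to seed, reseed or
    protect across fork(); the kernel owns all of it.
    """
    return os.urandom(num_bytes).hex()


token = unpredictable_token()
print(len(token))  # 32 hex characters for 16 bytes
```

The point of the sketch is how little there is to get wrong on the caller's side: one call, no state.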


The same thing stated from a different point of view: Nobody knows how 
to verify CSPRNG code so stacking CSPRNGs is a bad idea.


This argument has perhaps most famously been made by Thomas Ptacek
https://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/

The argument seems to win a lot including on this list where I've never 
seen it fail.


Personally, I find the argument generally convincing and I wouldn't dare 
argue with Thomas Ptacek. While it's a broad argument with sweeping 
consequences, I think it is sound general advice for most consumers and 
probably all PHP coders.


 and A Bad Choice™ in all versions of PHP except when OS==Windows 
AND 5.4.0 <= PHP < 7.0.


What makes it a Bad Choice™? :) Historically, it certainly has had
its fair share of implementation disasters:

1. It implemented `RAND_pseudo_bytes()` (PRNG) instead of
`RAND_bytes()` (CSPRNG) all the way up until PHP 5.6.11:
https://github.com/php/php-src/blob/PHP-5.6.11/ext/openssl/openssl.c#L5408
but that got fixed: https://bugs.php.net/bug.php?id=70014
2. `RAND_bytes()` is not fork safe by default causing major issues for
packages like ramsey/uuid:
https://benramsey.com/blog/2016/04/ramsey-uuid/#when-uuids-collide but
that got fixed in PHP 5.6.24: https://bugs.php.net/bug.php?id=71915
3. It fails open (this patch will fix that)
4. The API is confusing with the second `$crypto_strong` parameter
which doesn't do anything (the other thing that this patch will fix)


Such a list supports concern over potential failure modes and attack 
surface. It's good to fix bugs but it's even better to avoid them. The 
only thing you can lose by avoiding openssl_random_pseudo_bytes is 
hazard.


Personal story: sometime early in the 7.0 epoch, I tried to argue that 
random_compat should not use openssl_random_pseudo_bytes except on 
Windows. I failed and, iirc, item 2 in your list was a consequence.



The only reason to keep this function is BC


I'm not sure of the reasons for why this would be the case. :)


It's because anyone authoring new code should use the safer option.


`RAND_bytes()` is a proper CSPRNG and isn't set up to be used as a
deterministic PRNG.


Yes.

(Btw, "a proper CSPRNG" might be misinterpreted as a **bold** claim. 
Some CSPRNG implementations have a relatively good record (so far) but 
that's about as much as you can confidently say. I saw some interesting 
research on formal verification at 33c3 but I believe it's still true 
that code review is the state of the art. 
https://fahrplan.events.ccc.de/congress/2016/Fahrplan/events/8099.html 
https://formal.iti.kit.edu/klebanov/software/entroposcope/ from which I 
also learned to my shame that statistical tests are useless as they only 
test the output bit mixing function.)



My original ping to internals was to alias
`openssl_random_pseudo_bytes()` to `random_bytes()`, but as others
have pointed out, having more than one CSPRNG isn't necessarily a bad
thing to have. :)


I missed those remarks. I see it differently: I think it is a good thing 
if PHP offers only one CSPRNG.



Thanks again for the feedback! Maybe the vote should be split up into
two parts: 1) Fail closed & 2) deprecate the second parameter. :)


Idk. It's your RFC and I kinda hijacked the thread.

Tom

[PHP-DEV] Re: [RFC] Improve openssl_random_pseudo_bytes()

2018-10-21 Thread Tom Worster

On 19 Oct 2018, at 16:46, Sammy Kaye Powers wrote:


I'd like to start a discussion on the "Improve
openssl_random_pseudo_bytes()" RFC:
https://wiki.php.net/rfc/improve-openssl-random-pseudo-bytes

TL;DR:

CSPRNG implementations should always fail closed so this change would
make `openssl_random_pseudo_bytes()` fail closed.

The second `$crypto_strong` parameter doesn't do anything despite the
docs stating otherwise. This unnecessarily confusing parameter would
be deprecated.


At first glance I believed you were proposing that 
`openssl_random_pseudo_bytes()` should fail with an exception and that 
this would be an improvement. I would agree with that. With a little 
more concentration I see you're proposing something less ambitious that 
I'm less enthusiastic about.
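
As an illustration of the distinction at stake, here is a rough sketch in Python, not PHP and not the RFC's patch. All names in it (`random_bytes_fail_open`, `random_bytes_fail_closed`, `RandomSourceError`, `broken_source`) are hypothetical.

```python
import os


class RandomSourceError(RuntimeError):
    """Raised when no unpredictable bytes can be produced."""


def random_bytes_fail_open(length, source=os.urandom):
    # Fail open: swallow the error and hand back a falsy value,
    # which a careless caller may go on to use as key material.
    try:
        return source(length)
    except OSError:
        return False


def random_bytes_fail_closed(length, source=os.urandom):
    # Fail closed: the caller cannot continue without handling the error.
    try:
        return source(length)
    except OSError as exc:
        raise RandomSourceError("no secure random bytes available") from exc


def broken_source(length):
    # Hypothetical stand-in for a generator that cannot deliver.
    raise OSError("entropy source unavailable")
```

With `broken_source`, the fail-open variant quietly returns `False` while the fail-closed variant raises, so a caller that forgets to check the result is stopped rather than handed a bad value.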


The function has been obsolete since 7.0 and A Bad Choice™ in all 
versions of PHP except when OS==Windows AND 5.4.0 <= PHP < 7.0.


The only reason to keep this function is BC but removing the second 
param breaks BC for ALL conscientious and safe uses, i.e. seeking 
unpredictable (i.e. crypto strong) randoms from 5.4.0 <= PHP < 7.0 on 
Windows. There's no valid reason to ask for predictable randoms from 
OpenSSL and, afaik, it's not unpredictable (i.e. it's unsafe) on other 
OSs.


I'd love to see an RFC along the lines of: "Improve PHP's OpenSSL API by 
deprecating and eventually removing openssl_random_pseudo_bytes()". Idk 
the right schedule for removing it but how could deprecating it in 7.4 
do more harm than good?


Tom


[PHP-DEV] Re: [RFC] [Discussion] Operator functions

2017-09-08 Thread Tom Worster

On 8 Sep 2017, at 17:41, Andrea Faulds wrote:


Hi everyone!

Here's an RFC for a small, simple, self-contained feature with no 
backwards-compatibility breaks and which in fact doesn't even touch 
the language's syntax (it's 50%+1 eligible!) but which could make PHP 
a bit more expressive and consistent, especially with potential later 
features. It even has a test designed to impose minimal maintenance 
burden while testing a fairly large possibility space!


Anyway, the RFC in question is this: 
https://wiki.php.net/rfc/operator_functions


Please tell me what you think and suggest any potential improvements 
or anything you think might have been an omission.


Yes!

I have wanted this for many years. In the first programming language in 
which I achieved real proficiency, this was vernacular. It would make me 
happy to return to it in the language I now use most. An anonymous 
function that turns an operator into three lines looks dumb and makes me 
sad.
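
By way of analogy only, this is what the difference looks like in Python, whose standard `operator` module exposes operators as functions. It is not the PHP API the RFC proposes.

```python
import operator
from functools import reduce

values = [1, 2, 3, 4]

# Wrapping the operator by hand in an anonymous function.
total_lambda = reduce(lambda a, b: a + b, values)

# Passing the operator itself as a function.
total_operator = reduce(operator.add, values)

print(total_lambda, total_operator)  # 10 10
```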


Tom

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] PHP 7.1.0 to 7.2.0beta2 mt_rand() modulo bias bug

2017-09-08 Thread Tom Worster

On 8 Sep 2017, at 8:31, Solar Designer wrote:


On Fri, Sep 08, 2017 at 07:56:23AM -0400, Tom Worster wrote:

From: Nikita Popov <nikita@gmail.com>


Sorry for the long delay. I've just applied
https://github.com/php/php-src/commit/fd07302024bc47082b13b32217147fd39d1e9e61
to the 7.2 branch.

Davey, Joe, do we want to take action here for 7.1? It's a pretty severe
bias, but fixing it is going to change seed sequences. I think at this
point we're too far in the 7.1 cycle to apply this kind of change.


I think it is very unlikely that anyone has PHP software that relies on
predictable output given a 64-bit seed. And, yes, the bias is bad so I
would not worry about fixing it asap.


This sounds confused.  There's no 64-bit seed - PHP's mt_srand() only
supports 32-bit seeds.  Then you say "the bias is bad" and at the same
time "would not worry about fixing it asap", which look inconsistent.

The original problem I reported applies to 64-bit builds of PHP - which
is probably most builds these days - when mt_rand() is invoked with a
range that fits in 32 bits - which again is the typical case for the use
of ranges.  However, the bias can be large only for large ranges (yet
not exceeding 32 bits).  For typically used small ranges, the bias is
small.  Also, fixing the bug doesn't fully change the sequence of
generated random numbers - for typically used small ranges, the
probability that the fix changes a random number to another (for the
same seed) is small.  So the sequences will change, but not fully.  I'm
not sure if this is good or bad, as sometimes complete failure of
something that worked for someone before is preferable; I merely point
out what will actually happen.

Later in the discussion, Nikita pointed out an extra problem (also
causing biases) that affected the rarely-used 64-bit ranges.  Similarly,
fixing it doesn't fully change the sequence of generated random numbers -
again, for typically used small ranges (this time relative to the
64-bit space), the probability that the fix changes a random number to
another (for the same seed) is small.

Another detail is that these fixes make 32- and 64-bit builds of PHP
consistent, which isn't the case for 7.1.x now.  So retaining the bugs
in 7.1.x for consistent behavior doesn't exactly achieve that - it does
for consistency within 7.1.x series, but not across 32- vs. 64-bit
builds.  Fixing the bugs would achieve the latter, but break the former.


I have no strong preference here.  I merely point out the confusion and
try to correct it.


Yes, I was confused. I meant to talk about large ranges but even so your 
summary is an education so thank you.


My input is to offer an opinion on the relative importance of 
considerations.


Fixing the bias would be an urgent priority because I think a lot of 
programs are written assuming a uniform distribution.


While I broadly agree with what you describe as "typical", it might be 
hard for a user to know how big a problem the bias is in their 
situation. Fixing the bias eliminates that doubt and the handwaving 
about what is typical.
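
To put rough numbers on "small for small ranges, large for large ranges", here is a sketch in Python of modulo bias in general. It is not the exact code path in PHP 7.1, and the function name `modulo_bias` is hypothetical. It computes exactly how unevenly a uniform 32-bit value spreads over a range when reduced with `%`.

```python
def modulo_bias(range_size: int, source_bits: int = 32):
    """Exact outcome counts when a uniform source_bits-bit integer
    is reduced with '% range_size'.

    Returns (low_count, high_count, num_favoured): num_favoured
    outcomes occur high_count times, the rest occur low_count times.
    """
    source_size = 1 << source_bits
    low_count, num_favoured = divmod(source_size, range_size)
    high_count = low_count + 1 if num_favoured else low_count
    return low_count, high_count, num_favoured


for n in (100, 3 * 2**30):
    low, high, favoured = modulo_bias(n)
    print(n, high / low)
```

For a range of 100 the favoured outcomes are more likely by about one part in 43 million; for a range of 3 * 2^30 they are exactly twice as likely as the rest. A user cannot easily tell where between those extremes their own range falls, which is the doubt that fixing the bias removes.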


Programs that exploit the predictable property are specialized 
(comparing different Monte Carlo experiments based on the same 
pseudorandom input is the only example I know) and I think much less 
common in PHP. (Note: these programs are also likely to need unbiased 
stats.) I doubt that something will fail (i.e. break as in BC break) due 
to inconsistency within 7.1.x but the change might cause some extra work 
or faulty experimental conclusions. If I were an experimenter dealing 
with this change I'd rather rerun the cases I ran on the buggy RNG than 
continue with a known bad RNG. And I'd rather do this sooner than later.


I think we serve this specialized community better (if it exists at 
all!?) fixing it in 7.1, which also helps make these users aware of the 
bug. Everyone else is probably either unaffected by the fix or their 
programs will behave better.


Tom




Re: [PHP-DEV] PHP 7.1.0 to 7.2.0beta2 mt_rand() modulo bias bug

2017-09-08 Thread Tom Worster

From: Nikita Popov 


Sorry for the long delay. I've just applied
https://github.com/php/php-src/commit/fd07302024bc47082b13b32217147fd39d1e9e61
to the 7.2 branch.

Davey, Joe, do we want to take action here for 7.1? It's a pretty severe
bias, but fixing it is going to change seed sequences. I think at this
point we're too far in the 7.1 cycle to apply this kind of change.


I think it is very unlikely that anyone has PHP software that relies on 
predictable output given a 64-bit seed. And, yes, the bias is bad so I 
would not worry about fixing it asap.


Tom




Re: [PHP-DEV] hash_hkdf() signature

2017-02-07 Thread Tom Worster

On 2/7/17 3:22 PM, Scott Arciszewski wrote:

One such real-world use case: Defuse v1 used HKDF without a salt.

https://github.com/defuse/php-encryption/blob/b87737b2eec06b13f025cabea847338fa203d1b4/Crypto.php#L157-L170
https://github.com/defuse/php-encryption/blob/b87737b2eec06b13f025cabea847338fa203d1b4/Crypto.php#L358

In version 2, we included a 32-byte random salt for each encryption, which
was stored next to the AES-256-CTR nonce in the ciphertext. (Both the nonce
and HKDF-salt, as well as the version information header, are covered by
the HMAC of the ciphertext.)

The end result: Instead of having to worry about birthday collisions after
you've seen 2^64 AES outputs (because 128-bit randomly generated nonce),
now you need 2^192 before you have a useful collision.


In this situation shouldn't you either use a longer random IKM or not 
use HKDF at all?


If your IKM is so weak that it needs a salt then shouldn't you use an 
iterated hash instead of HKDF?


Tom




[PHP-DEV] Re: internals Digest 3 Feb 2017 23:56:52 -0000 Issue 4435

2017-02-04 Thread Tom Worster

On 3 Feb 2017, at 18:56, internals-digest-h...@lists.php.net wrote:


HKDF w/o salt is OK, but with salt, it's much stronger than w/o it.


That's not correct.

The salt defends against certain attacks on predictable input key 
material, i.e. weak passwords. But HKDF should not normally be used for 
passwords because it is unsuitable.
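
For readers who want to see where the salt actually enters, here is a minimal sketch of HKDF as described in RFC 5869, written in Python with only the standard `hmac` and `hashlib` modules. The function name `hkdf_sha256` and the choice of SHA-256 are assumptions for illustration; this is not PHP's `hash_hkdf()`.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, length: int, salt: bytes = b"", info: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA-256: extract, then expand."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long")
    if not salt:
        # RFC 5869: an absent salt is treated as hash_len zero bytes.
        salt = bytes(hash_len)
    # Extract: the salt is used once, as the HMAC key over the IKM.
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output is produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]
```

The salt appears in exactly one HMAC call and there is no iteration count or work factor anywhere, which is one way to see why the construction does little for guessable inputs such as passwords.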


There is something like a weird pattern to your attempts to help PHP 
programmers use the wrong function for the job -- HKDF for passwords, 
uniqid and mt_rand for unpredictable randoms.


Tom




Re: [PHP-DEV] Improving mt_rand() seed

2017-02-02 Thread Tom Worster

On 1 Feb 2017, at 22:47, Yasuo Ohgaki wrote:


Posting RFC draft before discussion

https://wiki.php.net/rfc/improve_predictable_prng_random

This RFC includes results of recent PRNG related discussions.
I would like to keep it simple, but basic object feature will be
implemented.

Methods could raise exceptions for invalid operations rather than ignoring.


Comments?


I don't see in it any results of recent PRNG related discussions. I only 
see your ideas and a disregard for other opinions.



# 1 The very first sentence

"Current predictable PRNG, i.e. mt_rand() and rand(), produces very weak 
random values even produces non random values."


is plain nonsense. mt_rand implements the Mersenne Twister 19937 with 
32-bit seed. It's a standard algorithm that's been used widely. It does 
exactly what it's supposed to do.


The idea that it produces "very weak random values" expresses your 
obsession with its use to produce unpredictable values, which it cannot 
possibly do. Nobody else is trying to make it do this.



# 2 Purpose of object API

You've misunderstood the idea of the "object-based PRNG interface". Its 
purpose is not to instrument mt_rand internals but to introduce 
alternative new generators. We concluded last year that there's very 
little we can do about mt_rand without breaking BC and/or making more 
mess, so it should be left to rot while we offer users something better. 
The new OOP API might offer things like MT, Xorshift, PCG, legacy LCGs, 
or whatever you fancy (maybe even MT-PHP :), distributions and 
utilities.



# 3 This API

Where to start? Have you any experience with the random APIs of any 
decent statistical packages? Take a look at Boost Random, R, ...


Who wants a low-level interface to the MT algorithm's internals?


# 4 "Fix" for abuse of mt_rand

Your concern that people abuse mt_rand to get unpredictable values is 
valid. But I don't think your attempts to mitigate this are useful.


Hypothetically, say we take a radical approach and make mt_rand draw 
from php_random_bytes by default on every call. How much good would it 
do?


Unmaintained legacy apps are unaffected because nobody upgrades them 
from whatever PHP 4 or 5 they are on. Only apps that have stopped 
receiving security updates but that are nevertheless actively maintained 
to migrate them to new major PHP versions can benefit. It's a niche we can 
better target with new APIs and education. W.r.t. unpredictable randoms, 
that's already done in 7.0.


Hence I don't believe abuse of mt_rand is something we can fix by 
modifying its behavior. It's too late.


But if you really insist on doing *something* to mt_rand then make its 
automatic seed use php_random_bytes and, if that fails, fall back to 
GENERATE_SEED. While I doubt it will make the world any safer, it is as 
effective as my hypothetical radical "fix" and should be harmless so 
long as you don't introduce new bugs.
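
The shape of that suggestion, sketched in Python rather than in php-src's C: prefer the OS CSPRNG for the automatic seed, but never fail. `fallback_seed` is a hypothetical stand-in for a GENERATE_SEED-style fallback and does not reproduce PHP's actual macro.

```python
import os
import time


def fallback_seed() -> int:
    # Hypothetical stand-in for a GENERATE_SEED-style fallback:
    # mixes clock and process id, predictable but always available.
    return (time.time_ns() ^ (os.getpid() << 16)) & 0xFFFFFFFF


def automatic_seed(csprng=os.urandom) -> int:
    """Return a 32-bit seed, preferring the OS CSPRNG but never failing."""
    try:
        return int.from_bytes(csprng(4), "little")
    except (OSError, NotImplementedError):
        return fallback_seed()
```

The seed stays 32 bits wide and a caller that seeded automatically before still gets a working generator when the CSPRNG is unavailable, which is the backward-compatibility property argued for above.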


Tom


Re: [PHP-DEV] Improving mt_rand() seed

2017-02-02 Thread Tom Worster

On 2 Feb 2017, at 8:24, Christoph M. Becker wrote:


On 02.02.2017 at 12:51, Yasuo Ohgaki wrote:

Although users must never do this, but there are codes that generate random
password/access key by mt_rand().


There is also code that stores clear text passwords.  How would you
prevent that?

IMHO, if users don't care to read the docs[1], it's their fault, and we
shouldn't waste our time to fix their bugs.

[1] 


We cannot fix these bugs without making mt_rand a CSPRNG, which means it 
is no longer mt_rand.


All we can do is mitigate the problem (to some unknowable extent) by 
seeding mt_rand from php_random_bytes. I don't care if we do this or not 
so long as the change is simple and BC, i.e. 32-bit seed that falls back 
to something else if php_random_bytes fails.


Tom




Re: [PHP-DEV] Re: Improving mt_rand() seed

2017-01-29 Thread Tom Worster

On 1/28/17 4:32 PM, Yasuo Ohgaki wrote:


Could you give some examples?

I'm not sure what kind of IoT devices/OS that support PHP do not have
CSPRNG.


I'm sorry, my reply ended up with subject "Re: internals Digest 27 Jan 
2017 10:58:15 -0000 Issue 4425". My fault. I'll copy it here...


There are two problems. One is [embedded OSs with crummy 
RNGs](http://samvartaka.github.io/cryptanalysis/2017/01/03/33c3-embedded-rngs).


The other is any OS in a "low-entropy environment", fancy-talk for the 
situation when the OS's techniques for gathering "noise" from devices 
are frustrated by their absence, or little to no activity on those 
devices, or the activity not being random.


I don't want to get into an argument about on which IoT Things you might 
find PHP. But we know it's growing fast, the Things are significant in 
[botnets](https://krebsonsecurity.com/2016/09/krebsonsecurity-hit-with-record-ddos/), 
and that the Things often come with a web server for admin. It's not 
unreasonable to use PHP+SQLite to admin a Linux-based baby monitor, for 
example.



OSes can provide CSPRNG w/o hardware based RNG. Security on IoT matters
a lot, especially for IoT that supports PHP. CSPRNG features are in PHP
core already. Secure PHP scripts wouldn't work anyway on such devices anyway.
e.g. generating nonce or like.


Correct. But we're talking about mt_rand() and therefore not about 
crypto. Apps that used it *correctly*, i.e. that did not rely on its 
output being unpredictable, will stop working if you make mt_rand() fail 
on these systems.


It's a policy that selects competent developers that understand 
mt_rand() for greatest punishment.




Issues are
 - Current mt_rand() is not fully exploited. It wastes more than 99% of its
   random cycle.


This was discussed in Aug last year and dropped.


 - Current uniqid()'s entropy is extremely poor and there is fair chances
   for collisions.


I don't much care about uniqid().



Question is
 - Are we going to keep these poor behaviors as PHP spec/standard forever
   or not.


idk.

My only argument in this is to disagree with you that it's ok to change 
these functions so they don't work in situations in which they did 
before. You can improve their seeding without doing that. Just don't 
make them fail when the improved seeding fails.


Apart from that, I don't mind you flogging these two dead horses.

Tom


[PHP-DEV] Re: internals Digest 27 Jan 2017 10:58:15 -0000 Issue 4425

2017-01-27 Thread Tom Worster

On 27 Jan 2017, at 5:58, internals-digest-h...@lists.php.net wrote:



One would like to think so but low entropy environments exist. The problem
may even be getting more widespread as embedded systems become more
widespread.



Could you tell us which platforms could have problem with CSPRNG usage?


There are two problems. One is [embedded OSs with crummy 
RNGs](http://samvartaka.github.io/cryptanalysis/2017/01/03/33c3-embedded-rngs).


The other is any OS in a "low-entropy environment", fancy-talk for the 
situation when the OS's techniques for gathering "noise" from devices 
are frustrated by their absence, or little to no activity on those 
devices, or the activity not being random.


I don't want to get into an argument about on which IoT Things you might 
find PHP. But we know it's growing fast, the Things are significant in 
[botnets](https://krebsonsecurity.com/2016/09/krebsonsecurity-hit-with-record-ddos/), 
and that the Things often come with a web server for admin. It's not 
unreasonable to use PHP+SQLite to admin a Linux-based baby monitor, for 
example.



As I stated before, I'm supposing CSPRNG availability is not a problem for
PHP environment today, OSes provide CSPRNG value unless there is something
really bad things happened. i.e. hardware failure, serious OS bug.


The "[Just](http://www.2uo.de/myths-about-urandom/) 
[use](https://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/) 
[urandom](https://twitter.com/FiloSottile/status/765982275515408384)" 
meme spread virally in the last couple of years. That's good to help 
counter the mistrust that Linux man random(4) creates and get people 
away from more exotic RNGs. But it shouldn't be understood to mean "we 
can always trust urandom to be present and correct".




I could be wrong about this. Do you have idea what platforms will be
affected?


For example, Lauri Kenttä has been testing with Raspberry Pi. Depending 
what it's connected to, it might be.


I think PHP programs that worked before using mt_rand() should be 
allowed to continue to work.


Tom


[PHP-DEV] Re: [Discussion] HKDF

2017-01-11 Thread Tom Worster
Hi Andrey,

Is there a draft of end-user docs for the PHP function?

Tom




[PHP-DEV] PHP 7.1.0 Released

2016-12-02 Thread Tom Worster
I saw "**PHP 7.1.0 Released**" on php.net

yay! props to contributors. and thank you.

tom

Re: [PHP-DEV] HashDoS

2016-09-23 Thread Tom Worster

On 9/22/16 3:46 AM, Rowan Collins wrote:


I think I'm right in saying that the power of the attack comes in the fact that 
the total time doesn't scale linearly but exponentially.


quadratic is what i read in the previous thread, iirc. even so, it's 
still a useful gain.
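
A toy count, in Python, of why the growth is quadratic: every insert into a chained bucket first scans the existing chain, so n keys that all collide cost 0 + 1 + ... + (n - 1) comparisons. The function name is hypothetical and the model ignores everything about PHP's real hash table except chaining.

```python
def comparisons_for_colliding_inserts(num_keys: int) -> int:
    """Key comparisons needed to insert num_keys distinct keys that
    all land in the same bucket of a chained hash table."""
    comparisons = 0
    chain = []
    for key in range(num_keys):
        # Each insert first scans the existing chain for a duplicate.
        for existing in chain:
            comparisons += 1
            if existing == key:
                break
        else:
            chain.append(key)
    return comparisons


for n in (500, 1_000, 2_000):
    print(n, comparisons_for_colliding_inserts(n))
```

Doubling the number of colliding keys roughly quadruples the work: 124750, 499500 and 1999000 comparisons for the three sizes above, i.e. n(n-1)/2.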




That doesn't exactly answer the question of whether 1000 is the right value, of 
course.


it's the parameter for what's in effect a statistical hypothesis test 
for randomness, built on the assumption that key patterns that are not 
hostile are quasi-random and those that are not random are hostile. 1000 
seems large if testing randomness, were that the only consideration.


but i guess there is a concern that, in some cases, legitimate use could 
have key patterns with regularities that lead to accumulation in some bins.


so it should work, to a useful extent, if there is a parameter value

- low enough to give a worthwhile degree of dos attack protection

- high enough to false positive only for benign patterns that would in 
any case cause terrible performance degradation


tom





Re: [PHP-DEV] HashDoS

2016-09-21 Thread Tom Worster

On 9/21/16 8:37 AM, Rowan Collins wrote:

On 21 September 2016 13:02:20 BST, Glenn Eggleton  wrote:

What if we had some sort of configuration limit on collision length?


Previous discussions have come to the conclusion that the difference between 
normal collision frequency and sufficient for a DoS is so large that the only 
meaningful settings would be on or off. e.g. the proposed limit is 1000, and 
randomly inserting millions of rows produces about 12.

The problem with long running applications is not that they need to raise the 
limit, it's that they need to handle the error gracefully if they are in fact 
under attack. Because hash tables are so ubiquitous in the engine, there's no 
guarantee that that's possible, so an attacker would have the ability to crash 
the process with the limit turned on, or hang the CPU with the limit turned off.


Right. It seems like count-and-limit pushes the problem onto the user 
who then has to discriminate normal from malicious causes for rising 
counters and find appropriate actions for each.


Even a sophisticated user who understands hash collision counters may 
not welcome this since it adds complexity that's hard to test and 
involves questionable heuristics.
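
A toy version of count-and-limit, in Python, to show where the burden lands. The class and exception names are hypothetical; the default limit of 1000 is the figure mentioned in the thread, and nothing here mirrors the engine's actual implementation.

```python
class CollisionLimitExceeded(Exception):
    """Raised when one bucket's chain grows past the configured limit."""


class LimitedHashTable:
    def __init__(self, num_buckets=64, collision_limit=1000):
        self.buckets = [[] for _ in range(num_buckets)]
        self.collision_limit = collision_limit

    def insert(self, key, value, hash_func=hash):
        chain = self.buckets[hash_func(key) % len(self.buckets)]
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                chain[i] = (key, value)  # overwrite, chain does not grow
                return
        if len(chain) >= self.collision_limit:
            # The table cannot tell hostile keys from unlucky ones;
            # that judgement is pushed onto whoever catches this.
            raise CollisionLimitExceeded(key)
        chain.append((key, value))
```

Whatever code catches `CollisionLimitExceeded` has to decide whether it is looking at an attack or at an unusual but legitimate key pattern, and what to do in each case.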


Tom


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] HashDoS

2016-09-21 Thread Tom Worster

On 9/20/16 10:25 PM, Stanislav Malyshev wrote:

Note that to avoid problems with opcache we can only randomize on
initial boot (even then synchronizing among different processes sharing
opcache may be challenging). That means that the process would be
running for extended time (at least days, in theory as long as uptime
allows) with the same seed. Given that, I'm not sure how much
randomization would really improve.


While randomization doesn't eliminate the problem, isn't it still a 
valid complication for attackers? If everybody's PHP instance is running 
with a different hash key, that's harder to attack than if they 
all have the same key, even if the key isn't frequently changed.


It reminds me of when Logjam was in the news and we realized it wasn't 
smart for everyone to use the same default DH primes.


Tom





[PHP-DEV] Re: HashDoS

2016-09-20 Thread Tom Worster

On 9/15/16 2:48 PM, Scott Arciszewski wrote:


Would the Internals team be open to discussing mitigating HashDoS in a
future version of PHP? i.e. everywhere, even for json_decode() and friends,
by fixing the problem rather than capping the maximum number of input
parameters and hoping it's good enough.

I'd propose SipHash (and/or a derivative): https://www.131002.net/siphash/

(Look at all the other languages that already adopted SipHash.)


I briefly looked through the "Users" list and didn't find anything 
equivalent to using it as PHP's internal base hash.


Python and Rust have an implementation available to users. Ruby is using 
it internally but I think it's focused on JSON.


There's some good info[1] on the situation in Perl 5. While SipHash is 
available it requires a non-default compile-time option.


Correct me if I'm not reading the situation right.

Tom

[1] 
http://news.perlfoundation.org/2012/12/improving-perl-5-grant-report-11.html







Re: [PHP-DEV] HashDoS

2016-09-20 Thread Tom Worster

On 9/16/16 1:59 AM, Thomas Hruska wrote:


If anyone wants a VERY rough estimate of relative performance
degradation as a result of switching to SipHash, here's a somewhat naive
C++ implementation of a similar data structure to that found in PHP:

https://github.com/cubiclesoft/cross-platform-cpp

(See the "Hash performance benchmark" results at the above link.)

In short, there's a significant degradation just switching from djb2 to
SipHash depending on key type.  A similar effect would probably be seen
in PHP.


The difference is big enough that people won't want this as a precaution 
affecting all of PHP's hashes.


But it's small enough that people might opt for it as a defensive 
measure in case of serious attacks in the wild. So having an 
implementation but not compiling it by default would be interesting.




Randomizing the starting hash value for djb2 during the core startup
sequence *could* also be effective for mitigating HashDoS.  Extensive
testing would have to be done to determine how collision performance
plays out with randomized starting hash values.  I can't find any
arguments anywhere against using randomized starting hash values for
djb2.  Also of note, the 33 multiplier seems more critical than anything
else for mixing bits together.


This is consistent with what Nicholas Clark wrote[1] that I mentioned 
already in my reply to Scott.


However, he also says

> I've got a sneaking suspicion that this story still has legs, and
> that someone will pop up with some new surprise or twist. Hence I'm
> keeping an eye open to spot any more developments in this decade old
> saga, in case there is action Perl 5 needs to take.


In which case it is nice for Perl to have SipHash implemented but not 
compiled by default.
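
For anyone who wants to experiment with the randomized starting value for djb2 suggested in the quoted message, here is a small sketch in Python. It assumes the textbook times-33 form of djb2 truncated to 32 bits; `PROCESS_SEED` and `seeded_djb2` are hypothetical names, and this is not PHP's internal hash function.

```python
import os

MASK32 = 0xFFFFFFFF


def djb2(data: bytes, start: int = 5381) -> int:
    """djb2 (times-33) hash, truncated to 32 bits."""
    h = start & MASK32
    for byte in data:
        h = (h * 33 + byte) & MASK32
    return h


# Drawn once at startup, so it stays fixed for the life of the process.
PROCESS_SEED = int.from_bytes(os.urandom(4), "little")


def seeded_djb2(data: bytes) -> int:
    return djb2(data, PROCESS_SEED)


print(djb2(b"Ez") == djb2(b"FY"))                # True
print(seeded_djb2(b"Ez") == seeded_djb2(b"FY"))  # True
```

One thing the sketch makes easy to check: in this form the starting value contributes start * 33^len to the result, so two keys of equal length that collide under one starting value collide under every other. That is the kind of behavior the "extensive testing" mentioned above would need to look at.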


Tom


[1] 
http://news.perlfoundation.org/2012/12/improving-perl-5-grant-report-11.html





Re: [PHP-DEV] [RFC] Make uniqid() more unique

2016-09-09 Thread Tom Worster

On 9/9/16 6:12 AM, Nikita Popov wrote:


The problem with "fixing" this function to be cryptographically
unpredictable (rather than just unique, for a limited definition of unique)
is that it will necessarily change the size of the output, on which there
may be assumptions. A 128 bit random value is 22 chars in base64, which is
a good bit larger than the current uniqid() output.

I agree with Niklas, this function should simply be deprecated.
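
The size difference mentioned above can be checked directly. This is a small sketch, assuming PHP 7.0 or later for random_bytes(); the variable names are only illustrative.

```php
<?php
// 128 bits = 16 bytes from the CSPRNG.
$raw = random_bytes(16);

// base64 of 16 bytes is 24 characters, the last two being "=" padding.
$id = rtrim(base64_encode($raw), '=');

echo strlen($id), "\n";       // 22
echo strlen(uniqid()), "\n";  // 13, the default uniqid() output
```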


It is already in the sin bin, with that warning that steers users to 
safer options, so it makes more sense to deprecate than to reform.


Tom





Re: [PHP-DEV] [RFC] Make uniqid() more unique

2016-09-09 Thread Tom Worster

On 9/9/16 7:48 AM, Yasuo Ohgaki wrote:


Some of us feel returning almost random value from uniqid() is
overkill. This is reasonable.


How would it be overkill if uniqid() used, say, php_random_bytes()?

Tom





Re: [PHP-DEV] [RFC] Make uniqid() more unique

2016-09-09 Thread Tom Worster

On 9/9/16 7:18 AM, Arvids Godjuks wrote:

2016-09-09 13:37 GMT+03:00 Niklas Keller:


Most people think getting true random is overkill and implement things
insecurely.


Most? Idk. But there certainly are many programmers that still believe 
the **myth** that one should be conservative of the randoms. It's 
nonsense. And uniqid() should either step aside or be properly random.




I just don't need true random here, just some form of replacing an integer
ID with a value, that cannot be changed just by "+1"


You can't have a "true" random -- at best you can have the kind of 
pseudo-randomness the CSPRNG provides, which, btw, is ideal for this 
purpose.


Another way of doing this is encrypt your predictable IDs. But this is 
rather similar to using random IDs since the CSPRNG is using the key 
stream of a symmetric encryption cipher.




That's exactly where uniqid SHOULD NOT be used. It's predictable. Anyone
can easily guess these URLs. If you want to prevent that, you should use
non-predictable secure random, also called cryptographically secure random:
CSPRNG. See random_bytes and random_int.


I agree with Niklas.
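
To make the predictability concrete, here is a short sketch that decodes a default uniqid() value back into the time at which it was generated. It assumes the default 13-character format (8 hex digits of seconds followed by 5 hex digits of microseconds) and a 64-bit PHP build.

```php
<?php
// uniqid() with no arguments returns 13 hex characters:
// 8 for the current Unix time in seconds, 5 for the microseconds part.
$id = uniqid();

$seconds = hexdec(substr($id, 0, 8));
$micros  = hexdec(substr($id, 8, 5));

echo $id, "\n";
echo date('c', $seconds), "\n";  // the moment the ID was generated
echo $micros, "\n";              // microseconds within that second

// An unpredictable value of similar size: 80 random bits as 20 hex characters.
echo bin2hex(random_bytes(10)), "\n";
```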



The way the system works and that this is a semi-closed tool for business
purposes, the only real thing why we need these ID's is to track people.
Before this plain numeric ID's from the DB records were used. With the
rewrite the client asked to make ID's so you can't just do a +1 and see
something different. No one will ever want to try and break the uniqid algo
just to get the other page (probably the same text). I also use the
extended version of the uniqid.


Sounds like old-fashioned "security through obscurity", where the level 
of obscurity is somewhere between using strong crypto and trivially 
predictable.


I don't think PHP should encourage it. There is no reason for it. If you 
want unpredictability, there is no reason to hesitate to use 
random_bytes() or random_int() as much as needed. You won't break anything.




Could you outline why you need 200 - 600 IDs in a single action?



Because it's a CSV import and I need to assign every record an ID at that
moment. Those ID's are then exported by admins to a 3rd party system.


Go ahead and read 64 KiB from random_bytes() if that's what you need. 
It's safe and not worth your time to worry about it.
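
As a rough sketch of what that could look like for a CSV import, the helper below draws all the bytes for a batch in one call and slices them into IDs. The function name and the choice of 10 bytes (20 hex characters) per ID are assumptions made for illustration.

```php
<?php
// Hypothetical helper: one 20-character hex ID per imported record.
// $count is assumed to be at least 1.
function make_record_ids(int $count, int $bytesPerId = 10): array
{
    // One CSPRNG read for the whole batch, e.g. 600 * 10 = 6000 bytes.
    $pool = random_bytes($count * $bytesPerId);

    $ids = [];
    foreach (str_split($pool, $bytesPerId) as $chunk) {
        $ids[] = bin2hex($chunk);
    }
    return $ids;
}

$ids = make_record_ids(600);
echo count($ids), ' IDs, first: ', $ids[0], "\n";
```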




Sure, but for that you can as well just use `microtime` or `time`. As
shown, it's easily misused, you're the perfect example. :-)



microtime and time are easier to guess. And time() is not an option,
because I will get 600 equal ID's then. Microtime is an option, but then
you get number only string and it looks awfully sequential :) Hence the
uniqid usage, that is basically time + microtime if I understand from the
manual, but it generates a bit more random result and I'm sure I get a
unique value on every call. Improving it so it does not look awfully
sequential would suffice for the use cases it is needed. In my case this
was a clearly conscious choice with full understanding how it works.


You are free to devise your own algorithm for generating "somewhat 
unpredictable" identifiers.


But PHP should not be involved. I think we should either 
deprecate uniqid() or make it use php_random_bytes() and ignore its 
second parameter.


Tom





Re: [PHP-DEV] [RFC] Make uniqid() more unique

2016-09-09 Thread Tom Worster

On 9/9/16 4:36 AM, Arvids Godjuks wrote:


It's also useful in other cases, where using a full blown true random
source is just overkill.


Users should not hesitate to use random_bytes() or php_random_bytes() or 
any of the functions that use them.




For example, my recent usage was to use the result of uniqid('', true) as a
few parameters in URLs instead of a plain numeric ID. The client just wanted
to make sure users can't do a +1 and see someone else's result page that might have a
different text or a different campaign even. And I do need to generate
those id's in bursts - 200 to 600 id's in a single action, I would imagine
generating 600 random strings of ~20 char length can be hard on the source
of the randomness, may even deplete it.


It is not possible to deplete this source of randomness.



And I expect the numbers to grow.
So, deprecating it I think is really an overreaction. It's a handy tool. It
can be used to generate filenames too, and a lot of other stuff.

My thoughts are - improve it. Yes, the standard uniqid() is a bit too
short, I have never used it without the second "true" parameter and that
dot in the middle of the string is annoying - I had to strip it out every
use case I had.








[PHP-DEV] Re: [RFC] Deprecate PEAR/PECL & Replace with composer/pickle

2016-09-04 Thread Tom Worster

On 9/2/16 3:32 PM, Davey Shafik wrote:

Hi internals,

I'd like to introduce a new RFC to deprecate pear/pecl (in 7.2, and remove
in 8.0),


Yes!

It's good for the software (and its users) if it is maintained and 
contributed to using _currently_prevailing_conventions_ in f/oss.




as well as add composer/pickle (optional in 7.2, default in 7.3+)
in their place.


Not so sure about this.

While composer seems to be the currently prevailing convention for PHP 
deps right now (I don't know pickle), each project could instead have 
its own installation guide containing whatever makes sense (perhaps even 
something Windows-based for Tony).


The PHP project could usefully maintain a policy on installation for the 
projects it has fingers in (in effect, a PHP style guide for installation guides).


But I don't think it helps to *dictate* those tools and methods and tie 
them to PHP releases. Just liberate the code and *encourage* uniform 
practices but let them evolve with some autonomy.


Tom





Re: [PHP-DEV] [RFC][VOTE] Add session_create_id() function

2016-08-19 Thread Tom Worster

On 8/16/16 10:51 AM, Lester Caine wrote:

On 16/08/16 13:08, Tom Worster wrote:

The default 128 bits Session ID is large enough to ignore collisions
https://wiki.php.net/rfc/session-create-id#discussions

It describes a single application, but PHP is a platform.
There are millions of PHP apps or more and there could be billions of
active sessions. Tens of thousands of new session IDs or more could be
created. Apply the calculation for the expected time to a possible
collision.

Are you still sure "There will be no collisions at all"?

The calculation underestimates the difficulty of finding collisions by 38
decimal orders of magnitude. The number of different SIDs in default PHP
config is 2^192, not 2^64. So yes, I am still sure.


In a distributed system which would be required to handle millions of
sessions at the same time, then one will have thousands of copies of PHP
running and shared via some sort of traffic manager. So unless some sort
of mechanism is included to provide identification of the PHP instance
then it is probable that different instances will all produce the same
sequence of numbers. A UUID generator provided to ensure every
distributed service has a uniquely identifiable id for every 'session'
is not something that forms part of a single instance of PHP. It must be
centrally managed with a central session store. All that a single
instance of PHP should be worrying about is a few hundred active sessions?


(I think you could use a hash for this. But that's beside the point 
because...)


I have no problem with session_create_id().

I have a problem with saying that CSPRNG is so untrustworthy that users 
must find ways to compensate for its faults in their code.


And I have a problem with a statement to this effect being in the RFC. 
And with that statement obscuring the crux of the argument with 
misleading math about an SID database.


Tom





Re: [PHP-DEV] Re: [RFC][DISCUSSION] Argon2 Password Hash

2016-08-17 Thread Tom Worster
On 8/17/16, 3:48 PM, "Charles R. Portwood II" wrote:

>Hi everyone,
>
>I've spent the last week and a half playing around with various cost
>factors on different virtual machines and hardware (including compiling
>this down for armv6 and testing on a Pi Zero), and looking over the spec
>a bit more and would like to update the proposal to use the following
>cost factors:
>
>
>memory_cost = 1 MiB
>time_cost = 2
>threads = 2
>
>
>There are no "bad" cost factors for Argon2, but obviously more work is
>better than less. These cost factors provide sufficient work effort
>without exhausting system resources. Argon2 is pretty fast with these
>cost factors even on a Pi Zero, which is the most resource constrained
>system I could get my hands on. In all my testing I wasn't ever able to
>get memory exhaustion to occur just from running argon2 hashing.
>
>I'd like to gather some last feedback and make sure there aren't any
>serious objections to these cost factors (or anything else for that
>matter) before putting this up for a vote. Please let me know your
>thoughts.

Hi Charles,

I trust your judgement in drawing conclusions from these experiments.

Thank you for the work you've put in.

Tom






Re: [PHP-DEV] [RFC][VOTE] Add session_create_id() function

2016-08-16 Thread Tom Worster
On 8/15/16, 5:39 PM, "Yasuo Ohgaki" <yohg...@ohgaki.net> wrote:

>On Tue, Aug 16, 2016 at 6:03 AM, Yasuo Ohgaki <yohg...@ohgaki.net> wrote:
>> On Tue, Aug 16, 2016 at 5:21 AM, Tom Worster <f...@thefsb.org> wrote:
>>> On 8/14/16 4:13 PM, Yasuo Ohgaki wrote:
>>>
>>>> "Now assume a 128 bit session identifier that provides 64 bits of
>>>> entropy.
>>>
>>>
>>> What exactly does this mean?
>>
>> When you have random 128 bits value, it does not mean it has full size
>>entropy.
>>
>> Anyway, why you insist? CSPRNG should be good enough for security
>> purpose, but nobody proves CSPRNG that PHP uses are collision free.
>> Session ID validation is cheap cost for serious web users.
>>
>> Basically you're saying “We do know it may happen, but you just had
>> rare bad luck. Even though protection could be implemented, whatever
>> consequences are your responsibility. It's the PHP way”.

That is not what I am basically saying.


>> I strongly disagree with this kind of attitude.
>>
>> If there are users who really do not want collision detection at all,
>> they should do it by their own responsibility and risk.
>
>Above discussion is added to the RFC.
>
>The default 128 bits Session ID is large enough to ignore collisions
>https://wiki.php.net/rfc/session-create-id#discussions
>
>It describes a single application, but PHP is a platform.
>There are millions of PHP apps or more and there could be billions of
>active sessions. Tens of thousands of new session IDs or more could be
>created. Apply the calculation for the expected time to a possible
>collision.
>
>Are you still sure "There will be no collisions at all"?

The calculation underestimates the difficulty of finding collisions by 38
decimal orders of magnitude. The number of different SIDs in default PHP
config is 2^192, not 2^64. So yes, I am still sure.

Tom






Re: [PHP-DEV] [RFC][VOTE] Add session_create_id() function

2016-08-15 Thread Tom Worster

On 8/14/16 4:13 PM, Yasuo Ohgaki wrote:


"Now assume a 128 bit session identifier that provides 64 bits of
entropy.


What exactly does this mean?

If it means that an attacker knows how to eliminate 2^128 - 2^64 
impossible SID values from a search then that SID generation is 
insecure, dangerous garbage. (This isn't the only statement I've seen on 
OWASP that strikes me as very odd.)


Each bit of output from a CSPRNG such as random_bytes() is equally and 
independently unpredictable. Hence a brute force attack cannot know that 
some values are not in its output and may therefore be skipped in a search.


There are 64^32 = 2^192 ~= 6.3e+57 different 32-character base-64 string 
values. If a session DB has 1e+7 such SIDs chosen at random then each 
blind insertion/trial has ~1 in 6.3e+50 chance of a hit. At 1e+4 
trials/sec the chance of a hit is ~1 in 6.3e+46 in one second. The age 
of planet Earth is ~1.4e+17 seconds.


Your calculation (I assume based on that sentence from OWASP) has 
128-bit SIDs of which only half the bits are unpredictable. So there are 2^64 ~= 
1.8e+19 different SIDs and (at 1e+4 trials/sec on a DB of 1e+7 SIDs) 
the chance of a hit in one second is ~1 in 1.8e+8, which is obviously 
insufficient.
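
Both estimates can be reproduced with a few lines of arithmetic. The sketch below uses the figures from the text (1e+7 stored SIDs, 1e+4 trials per second); the function name is hypothetical, and the simple division is only a good approximation while the resulting chance is small.

```php
<?php
// Returns N such that the chance of a hit within one second is about 1 in N,
// for blind guesses against a store of randomly chosen IDs.
function one_in_per_second(float $idSpace, float $storedIds, float $trialsPerSecond): float
{
    return $idSpace / ($storedIds * $trialsPerSecond);
}

$stored = 1e7;  // SIDs in the session DB
$trials = 1e4;  // guesses per second

printf("2^192 IDs: 1 in %.1e per second\n", one_in_per_second(2 ** 192, $stored, $trials));
printf("2^64 IDs:  1 in %.1e per second\n", one_in_per_second(2 ** 64, $stored, $trials));
```

The first figure comes out near 6.3e+46 and the second near 1.8e+8, matching the numbers above.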


But so what? Four-letter passwords are obviously insufficient too. The 
calculation doesn't support the conclusion...



IMHO, it's nonsense to argue "Session ID collision very rare and
cannot happen", "PHP Session ID safe without collision detection",
etc.


If the math of random SIDs is nonsense that cannot be trusted then it is 
because either a) the CSPRNG or b) the code deriving SIDs from it is 
**dangerous garbage**.


Either way, it's the dangerous garbage that should be fixed. Nobody should 
just accept such disgraceful SID generation and patch it up with 
collision detection.


Tom




[PHP-DEV] Re: [RFC][VOTE] Add session_create_id() function

2016-08-15 Thread Tom Worster

On 8/13/16 9:02 PM, Yasuo Ohgaki wrote:

Hi Tom,

On Sun, Aug 14, 2016 at 12:35 AM, Tom Worster <f...@thefsb.org> wrote:

Rather than argue the details of randomness, I have more basic comments.

1. If an app needs to access session values, it can and should do this
without indirection through the PHP session ID table.


I don't get point. Why?


1) It is not necessary. An app can instead store session-related data in 
a DB that provides lookup and/or search on the session data itself 
rather than the PHP session key. Lester Caine described this.


2) Searching PHP's session database is nasty. It's low-level and the app 
has to understand the handler. But I want apps to work independent of 
sessions being in memcache or Redis or Galera or whatever. Depending on 
the handler, it can involve a scan, which is slow. And it's hard to make the 
overall operation that uses session table lookup transactional. And the 
session store may be distributed; see the example below (*).


So it's nasty **and** unnecessary. Instead, the app can and should 
implement the business logic to be entirely above and ignorant of PHP 
session mechanics.




2. Users should generally let PHP choose random IDs.


I agree.



3. If PHP is to allow a user to choose their own session IDs, avoiding
collisions should be that user's responsibility.


No. I've already explained why this is difficult.


It is easy if you choose to **avoid** rather than **detect**. Use a 
random component long enough for your needs. In other words, I disagree 
with the sentence in your RFC:


> Something like above code is required to implement recommended user 
session save handlers currently.


It is not needed because if 64^32 SID values is inadequate for your app 
then you may increase session.sid_length.




Please read previous mail.
Or try to write session save handler that detects collisions with
memcached, then you'll see why.


I understand your point. At the same time I see no need for collision 
detection.


Questions: When I get a value from session_create_id(), what kind of 
guarantee comes with it? Is the ID reserved for me? If so, for how long?




4. Generating unique unpredictable IDs (without requiring collision
detection) is a common problem with known and trusted solutions.


I agree. It's common because many unique ID generators do not have a
centralized database to avoid collisions. In contrast, session has a
centralized database and it's just a matter of one lookup. (Therefore,
session module should lookup database)


(*)Some session stores are federated, e.g. a cluster of 3 hosts each 
with a memcached server and each with PHP configured with session 
redundancy to save two copies.


While session_create_id() could potentially use the same hashes that the 
memcache/d extensions use to associate SIDs with memcached servers, the 
app has to search them to find the entry with a given SID prefix.




Regards,

P.S. I'll add optimization that eliminates SID validation lookup for
normal operations. You don't have to worry about session performance
if I add this.


--
Yasuo Ohgaki
yohg...@ohgaki.net







[PHP-DEV] Re: [RFC: PATCH v1] Implement mt_srand_array

2016-08-15 Thread Tom Worster

Hi Lauri,

Do you have a PR against php-src on github? It's easier to read and comment.

Tom




[PHP-DEV] Re: mt_srand with array seed?

2016-08-14 Thread Tom Worster
On 8/14/16, 5:45 AM, "Lauri Kenttä" <lauri.ken...@gmail.com> wrote:

>On 2016-08-13 18:53, Tom Worster wrote:
>> On 8/12/16 2:48 PM, Lauri Kenttä wrote:
>>> On 2016-08-12 21:40, Tom Worster wrote:
>>>> mt_srand() will work. But what would be in the array? Integers from
>>>> which the upper 32 bits, if they exist, are discarded?
>>> 
>>> mt19937ar.c contains init_by_array.
>>> Compatibility with that would probably be a good goal,
>>> unless someone can point out another widely used implementation.
>> 
>> Would mt_srand([1,2,3,4,5,6,7,8]) set the same seed on 32 and 64 bit
>> machines?
>
>Of course it would.
>Or let's phrase it this way: can you think of any reason why it
>shouldn't?

An array of N PHP (signed) integers in a 64bit PHP runtime can provide 2N
unsigned long seed values to MT. If mt_srand() did that then I don't see
how portability is maintained.

Otoh, if, for the sake of portability, which I imagine most would prefer,
mt_srand() discards bits from 64bit input integers then mt_rand() will
produce the same sequence for many different mt_srand() seeds. This would
be new and at odds with conventions of how seeds affect PRNG outputs.
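
The second case can be illustrated with a short sketch. It assumes a 64-bit PHP build and a hypothetical normalisation step that keeps only the low 32 bits of each element; it is not a description of what init_by_array in mt19937ar.c or any proposed patch actually does.

```php
<?php
// Assumes a 64-bit PHP build (PHP_INT_SIZE === 8).
// Hypothetical normalisation: keep only the low 32 bits of each array element,
// so the same array yields the same seed words on 32-bit and 64-bit builds.
function low32_words(array $seed): array
{
    return array_map(function (int $v): int {
        return $v & 0xFFFFFFFF;
    }, $seed);
}

$a = [1, 2, 3];
$b = [1 + (1 << 32), 2 + (5 << 32), 3];  // differs from $a only in the upper 32 bits

// Two different seed arrays reduce to the same 32-bit words.
var_dump(low32_words($a) === low32_words($b));
```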

Hence I asked on Friday about discarding bits from the array on 64bit PHP.
You pointed me to mt19937ar.c, in which I did not find the answer. My
skill in C is inadequate.

It is still not clear to me how the improved mt_srand() would work.

Tom






[PHP-DEV] Re: mt_srand with array seed?

2016-08-13 Thread Tom Worster

On 8/12/16 2:48 PM, Lauri Kenttä wrote:

On 2016-08-12 21:40, Tom Worster wrote:

mt_srand() will work. But what would be in the array? Integers from
which the upper 32 bits, if they exist, are discarded?


mt19937ar.c contains init_by_array.
Compatibility with that would probably be a good goal,
unless someone can point out another widely used implementation.


Would mt_srand([1,2,3,4,5,6,7,8]) set the same seed on 32 and 64 bit 
machines?


Tom





[PHP-DEV] Re: [RFC][VOTE] Add session_create_id() function

2016-08-13 Thread Tom Worster

On 8/10/16 5:14 AM, Yasuo Ohgaki wrote:

Hi all,

This is RFC for adding session_create_id() function.

Session ID string uses special binary to string conversion. Users
should write lengthy and slow code to have the same session ID string
as session module does. It also validates and makes sure generated
session ID string has no collision. (This cannot be done easily by
user script and 3rd party C written save handlers)


Rather than argue the details of randomness, I have more basic comments.

1. If an app needs to access session values, it can and should do this 
without indirection through the PHP session ID table.


2. Users should generally let PHP choose random IDs.

3. If PHP is to allow a user to choose their own session IDs, avoiding 
collisions should be that user's responsibility.


4. Generating unique unpredictable IDs (without requiring collision 
detection) is a common problem with known and trusted solutions.


Tom





Re: [PHP-DEV] [RFC][VOTE] Add session_create_id() function

2016-08-13 Thread Tom Worster
Hi Yasuo,

On 8/12/16, 4:38 PM, "Yasuo Ohgaki" wrote:

>Base 64 adds padding for extra bytes. The Padding char is "=" and it's
>illegal char as SID. Therefore trims is mandatory.

substr(base64_encode(random_bytes($n)), 0, $n). There is no "=" in the
value for any $n.
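
The claim is easy to check. In the sketch below the helper name is hypothetical; the expression inside it is the one given above. Note that the result can still contain "+" and "/", which would need translating if the string must use PHP's session ID character set.

```php
<?php
// Hypothetical helper name wrapping the expression given above.
function random_base64_string(int $n): string
{
    // $n random bytes encode to 4 * ceil($n / 3) characters, of which at most
    // the last two are "=". The first $n characters are always data characters.
    return substr(base64_encode(random_bytes($n)), 0, $n);
}

for ($n = 1; $n <= 64; $n++) {
    $s = random_base64_string($n);
    if (strlen($s) !== $n || strpos($s, '=') !== false) {
        throw new RuntimeException("unexpected result for n = $n");
    }
}
echo "no padding for n = 1..64\n";
```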


>Don't you think it's nice to make "PHP just works" without knowing such
>details?

Not in this particular instance, no. Anybody that takes on the
responsibility of designing a scheme for session ID generation needs to
know such details. The typical PHP programmer does/should not do such work
because it is difficult and dangerous and because the requirement seldom
arises. PHP should not try to patch up garbage session IDs by rejecting
IDs already in use.


>Session ID security is key factor of Web application security. There
>is huge difference between 'cannot happen' and 'very rare and almost
>cannot happen'. In addition, proving 'very rare and almost cannot
>happen' is difficult.

This is equivalent to saying that random_bytes() cannot be trusted as a
CSPRNG.


>NIST requires SHA2 or better hash for collision sensitive usage. SHA2
>has 256 bits or more. We only use 128 bits, i.e. MD5 and
>bits_per_character=5 now. (130 bits for 7.1 and later).

Hashing is not the same problem. It's different in some important ways.



>Therefore, I
>don't think collision check is needless. We should have 256 bits
>session ID at least which requires 52 chars with 5 bits_per_character.

I don't think NIST's hashing recommendations should be the basis for
saying how many random bits PHP session IDs should have. I think we should
make our own calculations from first principles.


>We know it's relatively easy to check if a PRNG meets NIST
>requirement, but requirement fulfillment does not mean PRNG is
>generating excellent random. Measuring quality of PRNG is difficult.
>(This does not mean we should add low quality entropy such as time and
>pid to create SID and hash it. We depends on PRNG security anyway, use
>of hash and low quality entropy only makes SID weaker. Thus I proposed
>raw PRNG usage for SID)

Since 7.0, PHP uses the best available practices to obtain random bytes
from the operating system's CSPRNG. These practices are generally
considered good enough for strong cryptography, which demands as much from
a CSPRNG as PHP sessions.

It doesn't make sense to distrust random_bytes() but at the same time
trust the host PHP runs on.


>These things make me think collision detection is mandatory to say
>"PHP session is secure".
>
>Anyway, I'm not a cryptographer and just following their advices. I
>suggest you do the same.

I don't agree with the conclusions you draw from NIST hashing
recommendations.

Tom






[PHP-DEV] Re: mt_srand with array seed?

2016-08-12 Thread Tom Worster

On 8/11/16 10:13 AM, Lauri Kenttä wrote:

Hello,

Any thoughts about supporting a longer seed array for mt_srand? Does
anyone really need it? Should it be in mt_srand or mt_srand_array?

See: https://bugs.php.net/bug.php?id=32145


The second question is controversial.

People have asserted that nobody does statistical work in PHP, or that 
if they do it's not a "legitimate" use of PHP. There were demands for 
evidence that anyone has ever legitimately needed to seed an RNG in PHP.


I disagree. #32145 in itself is sufficient for me.

For statistical work 2^32 starting positions is not enough. MT's period 
is Vast, like Dennett's Vast with a capital V. Nobody needs *that* much 
but if MT is what we've got then it seems reasonable to allow access to 
more of it.


mt_srand() will work. But what would be in the array? Integers from 
which the upper 32 bits, if they exist, are discarded?


Tom





Re: [PHP-DEV] [RFC][VOTE] Add session_create_id() function

2016-08-12 Thread Tom Worster

On 8/11/16 6:58 PM, Yasuo Ohgaki wrote:

Hi Leigh,

On Fri, Aug 12, 2016 at 3:25 AM, Leigh wrote:

On Wed, 10 Aug 2016 at 10:15 Yasuo Ohgaki wrote:


Hi all,

This is RFC for adding session_create_id() function.

Session ID string uses special binary to string conversion. Users
should write lengthy and slow code to have the same session ID string
as session module does.



I disagree, this pretty much covers it:

function session_create_id()
{
    // 32 bytes from the CSPRNG, base64-encoded
    $encoded = base64_encode(random_bytes(32));
    // Use same charset as PHP
    return rtrim(strtr($encoded, '+/', ',-'), '=');
}


Thank you for insight!

You've missed to set SID to proper length and SID validation.


Replacing rtrim() with substr() fixes that.



function session_create_id(string $prefix)
{
$encoded = base64_encode(ini_get('session.sid_length')*2);


Did you omit random_bytes() in this line?



// Use same charset as PHP
$sid = substr(rtrim(strtr($encoded, '+/', ',-'), '='), 0,
  ini_get('session.sid_length'));

$sid .= $prefix;

// Now validate SID so that it does not have collisions
when session is active, connect to database and validate SID
  try to fetch sid
if sid is not there
  try again to generate SID few times
  if SID validation failed
 fatal error
  return safe SID
   when session is inactive
  return unvalidated SID
}

This is what proposed session_create_id() does.
I used pseudo, but it should be easy to imagine it would be lengthy code.


You don't need to waste time checking for collisions if the SID has a 
random component of sufficient length. 32 random base-64 characters is 
sufficient.
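
A back-of-the-envelope check of that statement, using the standard birthday approximation. The session counts are illustrative assumptions, and the function name is hypothetical.

```php
<?php
// Birthday approximation: with $k IDs drawn uniformly from $space values,
// P(at least one collision) is roughly k * (k - 1) / (2 * space) while that is small.
function collision_probability(float $k, float $space): float
{
    return $k * ($k - 1) / (2 * $space);
}

$space = 64 ** 32;  // 32 random base-64 characters = 2^192 values

printf("1e6 IDs: %.1e\n", collision_probability(1e6, $space));
printf("1e9 IDs: %.1e\n", collision_probability(1e9, $space));
```

Even with a billion live IDs the estimate stays below 1e-40.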


There are lots of purposes for random strings with negligible chance of 
collision. Hence some frameworks provide the function, e.g.
http://www.yiiframework.com/doc-2.0/yii-base-security.html#generateRandomString()-detail 



Only the search of the session database for collisions seems hard to 
me. But I don't understand why it is needed.


Tom






[PHP-DEV] Re: BC break with rand() where min > max

2016-08-10 Thread Tom Worster

On 8/8/16 5:36 PM, Leigh wrote:

Hi all,

There has been an unforeseen break with rand() when the minimum value is
greater than the maximum.

Prior to 7.1 rand() would happily accept backwards parameters and return a
value, however in the 7.1 branch it now emits a warning and returns false.

I've preemptively committed a fix to allow min > max and return a value as
in previous versions, but have kept the warning.

Looking for some feedback/opinions on whether anyone else thinks this
should be fixed differently (or not at all).

N.B. this also changes the behaviour of mt_rand to now accept min > max


Your fix seems fine for rand() but less so for mt_rand().

Applying this fix will break much less mt_rand()-using code than not
applying it will break rand()-using code. From that point of view,
applying it is the better choice.

Otoh, it's like copy-pasting a weird old bug from rand() to mt_rand().
The plan was to make rand() alias mt_rand(). Now I'm not sure that's a
smart plan.
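
Code that does not want to depend on either behaviour can order the bounds itself before calling the generator. A minimal sketch, with a hypothetical wrapper name:

```php
<?php
// Hypothetical wrapper: accept the bounds in either order.
function rand_between(int $a, int $b): int
{
    return mt_rand(min($a, $b), max($a, $b));
}

echo rand_between(10, 1), "\n";  // some value in 1..10
```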

Tom





Re: [PHP-DEV] Re: [RFC][DISCUSSION] Argon2 Password Hash

2016-08-07 Thread Tom Worster
On 8/6/16, 1:55 PM, "Charles R. Portwood II" wrote:

>Typically a run time of under 50 ms is the target goal. Argon2 can be
>tweaked to use a specific amount of memory, time, or CPU cores. Trying to
>find good default cost factors is problematic since all 3 of those
>factors are variable on any given machine.

(Yes ... the very reason I wanted ... never mind.)

Can anyone think of a way to organize or even automate collection of
timing information together with the relevant specs of the machine the
test was run on? I'm not sure if the qa-repo...@lists.php.net process
still operates. Maybe some other idea?

Tom






Re: [PHP-DEV] Re: [RFC][DISCUSSION] Argon2 Password Hash

2016-08-06 Thread Tom Worster
On 8/5/16, 2:20 PM, "Charles R. Portwood II" wrote:

>It breaks the API in the interim between this RFC and a potential future
>one. The $options parameter for both password_hash and
>password_needs_rehash is optional. Making it required for one algorithm
>but not another changes the API's for both methods. The expectations
>outlined in the original password_hash RFC make the third parameter for
>tuning the algorithm, not for making the algorithm work. Without default
>values, both password_hash and password_needs_rehash would fail unless
>the costs are provided.

OK. I misunderstood what qualifies as "broken". Looks like most
people want to set default costs right away so I'll leave it here. As for
choosing the right default values for PHP, what are the criteria?

Tom






Re: [PHP-DEV] Re: [RFC][DISCUSSION] Argon2 Password Hash

2016-08-05 Thread Tom Worster
On 8/5/16, 12:36 PM, "Charles R. Portwood II" wrote:

>I understand what you're saying. Ryan said it a bit more clearly than I
>did, making the options required causes backwards-incompatible changes to
>the password_hash API. That's my real reservation behind not providing
>defaults. 
>
>A separate RFC would be needed for 7.4 to make PASSWORD_ARGON2I =
>PASSWORD_DEFAULT. If the supplied constants need to be changed for that,
>I think that would be the time to do so. I think for now something needs
>to be provided to ensure the password_hash API doesn't change.

I can understand an argument that it's too much to expect a user to
provide an options array when using Argon2. But I don't understand how my
suggestion breaks BC. In my idea, a future RFC would propose default cost
constants. Changing PASSWORD_DEFAULT to PASSWORD_ARGON2I depends on those
constants so they would need to be defined before changing
PASSWORD_DEFAULT or at the same time. So...

password_hash('password', PASSWORD_DEFAULT) will always work.

password_hash('password', PASSWORD_ARGON2I) works as soon as Argon2 is
introduced in your proposal, but has to wait for another future RFC in my
suggested change.


password_hash('password', PASSWORD_ARGON2I, [costs]) will always work.

How does a BC break happen?
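
For concreteness, the three call forms above can be sketched as follows. This assumes the API as it later shipped in PHP 7.2 (the PASSWORD_ARGON2I constant and the memory_cost, time_cost and threads option keys). The cost values are arbitrary illustrations, not recommendations, and the Argon2 call is guarded because not every build includes Argon2.

```php
<?php
$password = 'correct horse battery staple';

// Form 1: PHP picks the algorithm; no options are ever required.
$default = password_hash($password, PASSWORD_DEFAULT);

// Form 3: Argon2i with explicit costs (illustrative values only).
$explicit = null;
if (defined('PASSWORD_ARGON2I')) {
    $costs = [
        'memory_cost' => 1 << 14, // in KiB, i.e. 16 MiB
        'time_cost'   => 2,       // number of iterations
        'threads'     => 1,
    ];
    $explicit = password_hash($password, PASSWORD_ARGON2I, $costs);
}

// Form 2, password_hash($password, PASSWORD_ARGON2I) with no options, is the
// only form whose availability differs between the two proposals.
var_dump(password_verify($password, $default));
```

Forms 1 and 3 behave the same under either proposal, which is the point of the question above.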


Tom



Re: [PHP-DEV] Re: [RFC][DISCUSSION] Argon2 Password Hash

2016-08-05 Thread Tom Worster
On 8/5/16, 11:08 AM, "Ryan Pallas"  wrote:

>Please keep it so that defaults will work, but $options is available for
>tuning as that's how the feature currently works.

My suggestion doesn't affect that. I agree that password_hash($password,
PASSWORD_DEFAULT) should always "just work".

Instead, I think there should be an interim status, before changing
PASSWORD_DEFAULT, in which password_hash($password, PASSWORD_ARGON2I)
requires $options. Reasons given in my first reply.

There is no hurry to change PASSWORD_DEFAULT, afaik.

Tom



[PHP-DEV] Re: [RFC][DISCUSSION] Argon2 Password Hash

2016-08-05 Thread Tom Worster
On 8/5/16, 10:49 AM, "Charles R. Portwood II" wrote:

>I think for clarity, PASSWORD_ARGON2I would be sufficient. What are your
>thoughts?

Looks good.


>The rationale for providing defaults is to ensure the password_*
>functions remain easy to use.

I understand. I was actually suggesting that we deliberately make it
harder to use!


> Assuming that at some point PASSWORD_ARGON2I (or any new algorithm)
>would become PASSWORD_DEFAULT, the end user's expectations would be that
>password_hash($password, PASSWORD_DEFAULT) just works, without needing to
>specify additional arguments.

I agree entirely. I'm not against introducing default cost constants. I am
instead proposing we allow a period of time after introduction of Argon2
into PHP before deciding what the default costs should be and define the
constants at the same time as setting PASSWORD_DEFAULT = PASSWORD_ARGON2I,
or possibly before.

Please reread my previous message for the reasons behind this (odd, I
admit) idea.

Tom



[PHP-DEV] Re: [RFC][DISCUSSION] Argon2 Password Hash

2016-08-05 Thread Tom Worster

On 8/5/16 8:47 AM, Charles R. Portwood II wrote:


The RFC is available at: https://wiki.php.net/rfc/argon2_password_hash.


Hi Charles,

Thanks for doing this. I'm glad Argon2 is coming to PHP.

You can have a longer voting period if you like, which I think would be
a good idea.

I think it's confusing to have two consts to identify the algorithm. I
don't understand the analogy to PASSWORD_DEFAULT. If we only provide
Argon2i, one const is easier. If we anticipate adding another Argon2
algo in the future that is not backward compatible with this one then I
don't think we would want to change PASSWORD_ARGON2 to point to it.

Finally, I wonder if it wouldn't be better if, for the time being, we
do not provide default costs constants. Argon2 is new (as crypto algos
go) and very early in a gradual introduction in deployments. And it is
hard to use because of the three cost factors. Correctly tuning those
for different machines is not yet a commonly-understood skill. (You
even can find conflicting advice on how to tune Bcrypt's time factor.)
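
As an illustration of what "tuning" means in practice, here is a minimal sketch of the usual empirical approach: time the hash on the target machine and raise the cost until it reaches a chosen budget. It uses bcrypt because it is available in every build; Argon2 would need the same kind of search over three parameters instead of one. The helper name findBcryptCost and the time budget are assumptions made for the example.

```php
<?php
// Hypothetical helper: returns the lowest bcrypt cost whose hash time reaches
// $targetSeconds on this machine, capped at $maxCost.
function findBcryptCost(float $targetSeconds, int $minCost = 4, int $maxCost = 12): int
{
    for ($cost = $minCost; $cost < $maxCost; $cost++) {
        $start = microtime(true);
        password_hash('benchmark', PASSWORD_BCRYPT, ['cost' => $cost]);
        if (microtime(true) - $start >= $targetSeconds) {
            break; // this cost is slow enough for the budget
        }
    }
    return $cost;
}

echo findBcryptCost(0.05), "\n";
```

The result depends entirely on the hardware it runs on, which is why a single built-in default is hard to choose.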

If we offer default costs then it will appear, to some people, even
those who know little about it beyond the name, as though deploying
Argon2 is just a matter of using it with the defaults. I'm not sure
this is a good idea.

If, on the other hand, we omit the constants and require the $options
argument then it discourages inexpert users. At the same time it
encourages experimentation and understanding of the costs, among those
who take an interest, which I think is just what we want.

Those who want to use Argon2 are going to make special efforts to get
the lib and enable it in PHP. So I don't think it's unreasonable to
expect the early adopters to give some thought to the costs.

Tom


[PHP-DEV] Re: [RFC][VOTE] RNG fixes

2016-07-07 Thread Tom Worster

On 7/7/16 6:39 AM, Leigh wrote:

As the discussion thread has been quiet for a while, moving this RFC to voting.

https://wiki.php.net/rfc/rng_fixes

https://github.com/php/php-src/pull/1986



Nice work.

The discussion persuaded me (Nikita mostly) that aliasing rand() to
mt_rand() is sensible. And the compromise to fix the mt_rand() bug is
good enough. Everything else is pretty much uncontroversial, I would
think. A more efficient RNG can wait for another day.

Tom


Re: [PHP-DEV] Re: [RFC][VOTE] Session ID without hashing

2016-07-05 Thread Tom Worster

On 7/5/16 11:37 AM, Christoph Becker wrote:

On 05.07.2016 at 16:32, Leigh wrote:


On 5 July 2016 at 04:02, Pierre Joye  wrote:

We can argue about the provided PRNG being CS but it is not PHP's job to
decide.


I think we need to drop the concerns about exposing "RNG state".

A reminder of what php_random_bytes looks at (in order):
* CryptGenRandom on Windows
* arc4random_buf on modern BSD (where ChaCha20 is used)
* Linux getrandom(2) syscall where available
* /dev/urandom where available
* Throws an exception if it cannot access one of the above


Would that imply that in this latter case sessions couldn't be used
anymore?


I hope so.

It's not safe to use sessions if PHP cannot get unpredictable randoms 
for session IDs. PHP should therefore error so that the sys op can be 
alerted and fix the problem.
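
A minimal sketch of that fail-loudly policy in userland terms, assuming the PHP 7 random_bytes() function, which throws when no secure source is available. The helper name newSessionToken is hypothetical and this is not how ext/session is implemented.

```php
<?php
// Hypothetical helper: produce an unpredictable token or fail, never fall back.
function newSessionToken(int $bytes = 16): string
{
    try {
        return bin2hex(random_bytes($bytes));
    } catch (\Exception $e) {
        // Deliberately no fallback to rand(), mt_rand() or uniqid():
        // a predictable ID is worse than a visible failure.
        throw new \RuntimeException('No secure randomness source available', 0, $e);
    }
}

echo newSessionToken(), "\n";
```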


Tom



Re: [PHP-DEV] [RFC] RNG fixes

2016-06-23 Thread Tom Worster

On 6/23/16 12:56 PM, Stanislav Malyshev wrote:

Hi!


I would prefer something like random_fast_int() == mt_rand() == rand(),
with clear documentation on when to use random_fast_int() instead of
random_int(), and a note on the others that "since 7.2, mt_rand() is an
alias for random_fast_int()" etc. (Not wedded to the name
random_fast_int, we can bikeshed that later.)


That sounds to me like a good way to proceed too. I don't think it's a
big deal if mt_rand won't be using the specific MT algorithm anymore, I see
very small number of places where it would matter.


For these cases https://packagist.org/packages/leigh/mt-rand

Tom




Re: [PHP-DEV] [RFC] RNG fixes

2016-06-23 Thread Tom Worster
On 6/22/16, 5:19 PM, "Nikita Popov"  wrote:

>I haven't been following this thread, just jumping in to comment on this
>point. My suggestion to deprecate rand() was motivated by the fact that
>rand() produces extremely low quality random numbers on Windows, while at
>the same time having the name people are most likely to try first if they
>want to have a random number. It's a bad state of things if there's a
>rand() and an mt_rand() function and the latter is preferable in *all*
>situations, while the former is more likely to be used. However, this
>concern is completely alleviated by aliasing rand() to mt_rand(). If we
>do this, I see no reason to deprecate rand(), at least in the short term.

Alternatively, if you fix rand() by making *it* the new, fast,
platform-independent RNG (e.g. Xoroshiro128+) and leave mt_rand() alone
then:

1. The "bad state of things" you described is resolved.

2. The various complaints about mt_rand() become irrelevant because rand()
will be preferable in *all* situations (except security and backwards
compat).

Tom



Re: [PHP-DEV] [RFC] RNG fixes

2016-06-22 Thread Tom Worster

On 6/21/16 3:29 PM, Fleshgrinder wrote:


My favorite:


The PHP approach seems to be that any crazy behavior is acceptable as
long as it's documented.


People love raggin' on PHP.

It's a virulent meme. It propagates so well in our coder culture because 
it's easy, it's just provocative enough to attract attention but it's 
pretty safe because you can always find people to agree with you. 
Participants can feel smart and superior even if their contribution 
amounts to "me too".


It's like people raggin' on C++. "They named the language after the 
worst feature in C: pointer arithmetic!" Knowingly superior chuckles. I 
wonder how many of the people who pile on (propagating the meme) have 
the understanding of real experience. My own history with C++ was short 
but so miserable that I sometimes join in with this one.


Same thing with Perl. Before that it was COBOL. Raggin' on COBOL is 
legit even if you've no idea what COBOL code looks like.


It's a form of social behavior for establishing groups and belonging. It 
works by using people's need for a sense of identity and validation. 
Computer people at a party can use these memes as small-talk to develop 
relations, either friendly or not. Our modern comms platforms' 
gamification literally rewards this behavior.


But once you're aware of it, it's like American stand-up comics raggin' 
on New Jersey. Usually good for a cheap laugh but in reality it's tired 
out, past its due date, old, utterly unimaginative, very, very boring, 
and, in the Frankfurtian sense[1], bullshit.


There. I finally said it. I've wanted to get it off my chest for years. 
I apologize that I rely too much on American vernacular and culture. And 
for totally hijacking Leigh's RFC thread.


Richard, nothing internals could do will stop PHP being the butt of 
these dreary jokes and insults. And there are more effective ways to 
push your agenda. Please consider using them.


Tom

[1] https://en.wikipedia.org/wiki/On_Bullshit


Re: [PHP-DEV] [RFC] RNG fixes

2016-06-22 Thread Tom Worster

On 6/21/16 3:23 PM, Lester Caine wrote:


Can someone explain why I should need 'crypto safe' random numbers when
ALL *I* use rand for is to give a random order to content items on the
page.


I cannot.



Something more in sync with the shuffle and array_rand without the
need to recode to actually use the array functions, or simply select an
entry at random from a list.


Similarly, when I randomize the time before a daemon next wakes up, I 
don't think random_int() is appropriate. mt_rand() is entirely suitable. 
And upgrading to a more efficient RNG like PCG would be daft.
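
The daemon case looks roughly like this. It is a sketch only; the helper name nextWakeDelay and the interval values are made up for the example.

```php
<?php
// Hypothetical helper: base interval plus up to $jitter extra seconds, so that
// many daemons started together do not all wake at the same moment.
function nextWakeDelay(int $base, int $jitter): int
{
    return $base + mt_rand(0, $jitter);
}

// Nothing here needs to be unpredictable to an attacker, only spread out.
echo nextWakeDelay(60, 15), "\n";
```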


Tom


Re: [PHP-DEV] [RFC] RNG fixes

2016-06-22 Thread Tom Worster

On 6/21/16 2:32 PM, Stanislav Malyshev wrote:


What is for you "obviously faulty stuff" for literally thousands of
people is "code that works". I appreciate that there's a number of new
hip randomness tests that mt_rand may not satisfy


As far as I can tell

https://gist.github.com/tom--/a12175047578b3ae9ef8

mt_rand() tests just the same as MT19937 and both produce "good quality" 
random variates.


(Btw: there's nothing hip or new about this kind of testing. It's old 
and boring :p)


The issue I have with mt_rand() is speed and memory efficiency, where it 
is orders of magnitude worse than alternatives. But this won't matter in 
many uses of mt_rand(). I checked and it doesn't matter in any of the 
uses in my software.


The only argument for removing it that has any legs is: "people use it 
for security-related unpredictable randoms and PHP should fix that." My 
judgement is that the real world benefits of doing this don't justify 
the disruption.


Tom




Re: [PHP-DEV] [RFC] RNG fixes

2016-06-22 Thread Tom Worster
On 6/21/16, 1:43 PM, "Fleshgrinder"  wrote:

>Yes, let's ask the users! But we don't do that, we just discuss it here.
>Howe could we create such a poll that reaches many people? Maybe Reddit?

Perhaps you misunderstand what I intended by leaving the choice to users.
If we add a new RNG and keep the existing ones then each user can make an
independent choice.


>That being said, I repeat myself now, nikic also proposed to deprecate
>rand() and having pcg_rand() as a modern replacement for mt_rand()

I admire O'Neill's work and her paper and I find the generators and
related theory very interesting. I'm not sure they are sufficiently well
scrutinized and tested. Afaik, the status of this work is: there's an
unpublished paper, a web site, some implementations and a conversation on
reddit. Among other things, O'Neill makes claims about suitability for
crypto. If PHP chooses PCG as its new RNG, that constitutes a strong
endorsement and I wonder who among us can confirm the work.

I think there's also an argument against using an RNG that makes specific
unpredictability claims since this confuses the distinction between it and
random_bytes(). People may think that once seeded it's a fast alternative.
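
To make the distinction concrete, here is a small sketch of what "seeded" implies for a deterministic generator such as mt_rand(): anyone who knows or guesses the seed can reproduce the whole sequence. The helper name sequence and the seed value are arbitrary.

```php
<?php
// Hypothetical helper: the first $count outputs of mt_rand() for a given seed.
function sequence(int $seed, int $count): array
{
    mt_srand($seed);
    $out = [];
    for ($i = 0; $i < $count; $i++) {
        $out[] = mt_rand();
    }
    return $out;
}

$first  = sequence(1234, 5);
$second = sequence(1234, 5);

// Same seed, same numbers: useful for repeatable test data, useless for secrets.
var_dump($first === $second);

// random_bytes() and random_int() take no seed and cannot be replayed this way.
$secret = random_bytes(16);
```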

Tom



Re: [PHP-DEV] [RFC] RNG fixes

2016-06-19 Thread Tom Worster
On 6/19/16, 12:59 PM, "Fleshgrinder"  wrote:

>This matches Tom Worster's analysis of mt: it's just crap. :P

Actually I satisfied myself that both MT19937 and PHP's mt_rand() produce
good quality random variates and I posted the evidence behind the belief.
I don't think being slow and inefficient with memory justifies removal or
deprecation (premature optimization).

I think the decision to change this RNG or not is best left to the users.
Only they understand the pros and cons of their specific context.
Furthermore, I would prefer that if they decide nothing, perhaps even
being unaware of the question, they can upgrade PHP and their programs
still work.

This is my opinion.


>I am sorry if it seems to you as if I am ignoring you, Quite the
>opposite is the case. It is just unbelievable to me that we are trying
>to keep these functions if there are so many better alternatives that we
>can provide to our users. There is nothing bad about a deprecation
>together with a much better alternative. I cannot imagine that anyone
>has a problem with that.

It is quite common that different people can have full and correct
appreciation of the technical aspects of something and have different
judgements regarding the best action. So I am surprised you cannot imagine
that someone who disagrees with your conclusions could understand the
facts of the matter.

In the language of politics and policy, since that's what this really
is... You advocate a top-down structural approach to changing individual
behavior for their own and the greater good. I advocate for new
facilities, education, and the individual's responsibility to decide
what's best for them. Subjective differences like this shouldn't be
unbelievable, they should be expected.


>* Let me know if I missed any other argument that clearly explains why
>mt_rand() cannot be deprecated and removed. Oh, yes, I am ignoring the
>legitimate usage from a private software that is unsharable because this
>argument cannot be verified.

As a general matter of taste, I don't like to be drawn by the "prove me
wrong" rhetorical method. And in this specific position of this
php-internals thread I don't see any chance of changing minds by arguing
over what constitutes a legitimate use of a random in a PHP program. So,
on both counts, I prefer not to.

You have clearly stated your positions and explained your reasons. Please
grant that other people with different positions and reasons may not feel
any need or desire to prove you wrong and please don't represent this as
evidence in support of your assertions.

Tom



[PHP-DEV] Re: [RFC] RNG fixes

2016-06-16 Thread Tom Worster
Hi Leigh,

I need to change stance wrt MT.


On 6/16/16, 2:31 PM, "Leigh"  wrote:

>I get your point, but most people probably use mt_rand() because "it's
>better than rand". mt_rand is also incredibly slow and has a huge state
>when compared to modern algorithms. I should probably note the
>performance gains in the RFC.

I spent some time trying to understand the weird PHP mt_rand(). I took the
binary MT19937_02 generator from TestU01 and made a variant with the PHP
bug. I added a side-by-side diff of the results from running BigCrush on
both here:

https://gist.github.com/tom--/a12175047578b3ae9ef8

I can't see any significant difference between them.

More interesting was how this work changed my appreciation of Mersenne
Twister. I used to think it was a good RNG. But that dates back a long
time to when George Marsaglia had the best tests for RNGs and he was
challenging sci.math to factor enormous numbers to use in new generators
with ever more extravagant periods. I took it on authority that MT was
decent. 

But after spending time with the code I see you're right! Its state and
period are crazy. It's one thing to be slow but on top of that it's
chewing up cache lines as though nothing else needs them.

My opinion on rand() is that it is historical, like the crummy old RNGs
kicking around in various libcs and elsewhere. Don't use them. Now I feel
the same about mt_rand() -- like MD4 and DES, it's interesting history.

I think every self-respecting programming environment should provide a
good deterministic RNG. And now it seems I've persuaded myself that it's
time for PHP to move on from MT.

So I need to update my opinion on your RFC. I still think rand() and
mt_rand() implementations can stay but I now agree with you that it's time
for a new RNG. And I agree that xoroshiro128+ is a good choice.

Specifically, rand() docs should say that the underlying RNGs are
obsolete, not portable and have questionable quality on some platforms.
mt_rand() docs should mention the poor performance and reference #71152.

Tom



Re: [PHP-DEV] [RFC] RNG fixes

2016-06-15 Thread Tom Worster

On 6/15/16 9:04 AM, Jordi Boggiano wrote:


Just a thought here, if the goal is to provide a better interface,


Hi Jordi,

Iiuc, Leigh's goal, which I support, is to fix known bugs. It is not to 
provide a better interface.


I already suggested that if people want new RNGs or a new API then we 
should divorce that discussion from the bug fixes. Let's not use this 
RFC or thread for that.


Tom


Re: [PHP-DEV] [RFC] RNG fixes

2016-06-15 Thread Tom Worster

On 6/15/16 6:33 AM, Pierre Joye wrote:

> * Alternatively, fixing the current mt_rand() implementation to make it
> standard

That sounds more reasonable. An option (please no ini as it is a
programmatic flow feature, not a php configuration problem) to keep the
old behavior for BC. Having to add an option for 7.1 or 7.2+ is
reasonable enough for the cases where the current seed and predictable
sequences are desired (same data generations for example using one
seed for example).



Hi Pierre,

I'm glad you mentioned a compatibility mode. Let's say we would offer:

int mt_rand ( $mode = MT_RAND_COMPAT )

int mt_rand ( int $min, int $max, $mode = MT_RAND_COMPAT )

MT_RAND_COMPAT = 1

MT_RAND_MT19937 = 2

A PHP user needs to make the right choice of what to use in their 
situation. A technical description of the modes would be confusing and 
unhelpful to most users. I have no idea how to document this simply, 
honestly and accurately, and without jumping to conclusions about 
suitability.


This is why I think a compat/correct mode switch doesn't improve PHP. 
It's inconsistent with the spirit set out in the preamble of "PHP RFC: 
Your Title Here"[1].


[1] https://wiki.php.net/rfc/template

Similarly, the $mode arg allows us to add MT_RAND_XOROSHIRO128_PLUS or 
whatever, but such additions (interesting to some of us, more "modern", 
perhaps arguably more "strong" or otherwise "better") aren't 
improvements to PHP unless users are asking for them.


Tom


Re: [PHP-DEV] [RFC] RNG fixes

2016-06-14 Thread Tom Worster

On 6/14/16 3:12 PM, Fleshgrinder wrote:


Call me ignorant but is this required in typical web applications?


PHP is used for various things, not just web apps. I use it for various 
other things because it's the language in which I am most fluent.


And the requirements of *typical* apps using PHP should not be the basis 
for removing functions that are in fact used in existing programs.


It's possible to change programs so they don't use mt_rand() etc. but 
most people won't thank you for forcing them to rewrite software that works.


Leigh, iiuc, is trying to fix bugs. Let's not change the discussion to 
cleaning up PHP's API.


Tom


Re: [PHP-DEV] [RFC] RNG fixes

2016-06-14 Thread Tom Worster

On 6/14/16 1:45 PM, Fleshgrinder wrote:


Why do we need so many functions to get a random int anyways if we now
have random_int()?


For backwards compatibility. There are programs that use these and 
little to gain from breaking them.


Tom


[PHP-DEV] Re: [RFC] RNG fixes

2016-06-14 Thread Tom Worster

On 6/14/16 12:46 PM, Leigh wrote:


The RFC can be found here: https://wiki.php.net/rfc/rng_fixes


Hi Leigh,

Thanks for putting this together. I am strongly pro on two points and 
moderately contra on the other two. I'd prefer separated votes, even 
though I don't have a vote. I numbered the 4 bullets in your intro 1 thru 4.


4. Insecure usage. I think we should replace the internal insecure uses 
of php_rand(). I can't see a reason not to.


3. Poor scaling of bounded outputs. I think RAND_RANGE() should be 
fixed. Users surely expect unbiased distribution. There's a BC argument 
but the bug is pretty serious. I think this should apply to array_rand() 
too.
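
For readers unfamiliar with the bias problem: mapping a raw generator output onto a range whose size does not divide the number of raw outputs makes some results more likely than others. One common remedy is rejection sampling, sketched below. This is an illustration of the general technique, not the patch from the RFC; the helper name uniformBelow and its parameters are assumptions made for the example.

```php
<?php
// Hypothetical helper. $next must return uniform integers in [0, 2^$bits - 1].
// Assumes 64-bit PHP integers and 1 <= $n <= 2^$bits.
function uniformBelow(callable $next, int $n, int $bits = 32): int
{
    $range = 1 << $bits;               // number of distinct raw outputs
    $limit = $range - ($range % $n);   // largest multiple of $n not above $range
    do {
        $r = $next();
    } while ($r >= $limit);            // discard the tail that would cause bias
    return $r % $n;
}

// mt_rand() without arguments yields 31 bits, so pass $bits = 31.
echo uniformBelow('mt_rand', 6, 31) + 1, "\n"; // a die roll in 1..6
```

With a 3-bit source (raw values 0 to 7) and $n = 3, the raw values 6 and 7 are rejected, leaving exactly two raw values for each of the three results.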


1. Incorrect implementations.

I don't think we should dictate that programs currently using mt_rand() 
shall in future use MT19937 any more than we should dictate that 
XorShift64 or any other PRNG better fits their requirements.


The incorrectness of the mt_rand() implementation with respect to its 
documentation can be fixed either in the code or in the docs. Given 
that, as far as we know, mt_rand()'s byte-stream looks like a decent 
PRNG[1] it's not clear that the actual MT19937 sequence is more 
important than backward compatibility. I for one think it's very unlikely.


[1] https://gist.github.com/tom--/a12175047578b3ae9ef8

I also don't think we should assume the responsibility of correcting 
people's insecure programs using rand() or mt_rand() (e.g. for keys, 
IVs, salts) by changing the algorithm. Programs this bad need more 
rework than we can provide. These functions have had scary-colored 
cautions on them for a long time.


2. Roughly the same argument applies to rand(). The function is PHP's 
API to the OS's rand(3). There's value to that and probably people who 
rely on it.



Summarizing 1. and 2., it's not clear what we fix in the real world with 
the proposed changes to rand() and mt_rand(). But I do see BC breakage. 
I would prefer to fix these bugs in the docs.



With respect to PRNGs completely new to PHP (you mentioned Xoroshiro128+ 
and PCG), I would prefer to completely divorce this question from the bugs 
discussed above. If some PHP users need efficient implementations of 
such algorithms then I would urge whoever wants to write them to use a 
new API and to provide them via PECL. In software, "better" is always 
with respect to context. While there are specific, well-known uses for 
random numbers (e.g. crypto) where we can make recommendations, in 
general we cannot.


Tom


Re: [PHP-DEV] [RFC] Libsodium - Discussion

2016-06-05 Thread Tom Worster

On 6/5/16 4:31 AM, Scott Arciszewski wrote:

> - memzero, memcmp, hex2bin
>
> I am not totally convinced that memzero and maybe memcmp names are
> good nor they should be there. Both would be very useful as operator
> on variables. Given the simplicity of the implementations, it could be
> very useful in many other areas in case this ext is not installed

>

IMO: memzero is fine; memcmp isn't that great.


memzero() stands out as unusual and interesting because PHP scripts 
don't usually get to manipulate memory, only variables. From the "Using" 
guide


void \Sodium\memzero( $secret);

it looks like it's for zeroing strings.

What arg types does memzero() work with?

Does it check argument type?

Is it safe to use with opcache? Interned strings is an interesting case 
but there might be others.


Tom


Re: [PHP-DEV] [RFC][Vote] Typed Properties

2016-05-26 Thread Tom Worster
On 5/26/16, 12:48 PM, "Fleshgrinder"  wrote:

>> 
>> Under another 5th option, the problem you state does not arise. Disallow
>> "public int $x;". Under this option you may declare $x with type int and
>> an initial value or you may declare $x without type but you may not
>> declare $x with type (nullable or not) and undefined initial value.
>> 
>> Tom
>> 
>
>This would be a valid approach too, yes. I personally would be against
>it because I do not want to initialize all my properties.
>
>  class A {
>
>private int $x;
>
>public function getX() {
>  if (empty($this->x)) {
>$this->x = 42;
>  }
>  return $this->x;
>}
>
>  }
>
>This would not yield an E_NOTICE because both isset() and empty() never
>do. This allows the attentive programmers to keep up there coding
>practices without the necessity to assign meaningless values everywhere.
>
>  class A {
>
>/** -1 is invalid */
>public int $x = -1;
>
>/** 'INVALID' is invalid but empty string is allowed */
>public string $s = 'INVALID';
>
>/** Null byte is invalid but anything else is valid */
>public string $welcome_to_the_c_world = '\0';
>
>  }

If you want that kind of thing, you can do it the old PHP way like this

class A {
private ?int $x = null;
...
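
Spelled out in full, the snippet above might look like this. It is a sketch that assumes the typed-property syntax as it later shipped in PHP 7.4; the getter mirrors the lazy initialization in the quoted example.

```php
<?php
class A
{
    // Nullable type with an explicit initial value: never "uninitialized".
    private ?int $x = null;

    public function getX(): int
    {
        if ($this->x === null) {
            $this->x = 42; // lazily computed on first access
        }
        return $this->x;
    }
}

echo (new A())->getX(), "\n";
```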


Tom



Re: [PHP-DEV] [RFC][Vote] Typed Properties

2016-05-26 Thread Tom Worster
Hi Thomas,

On the face of it, I'm not enthusiastic to introduce new magic numbers
(which would be false, 0, 0.0, "", and [], right?) that PHP assigns when
coercing a typed, uninitialized property read by a file in liberal mode.

This is like taking the most confusing thing about 7.0's dual-mode, scalar
type declaration of function arguments and boosting the confusion power. I
would want a new name for this complement-of-strict mode. "Weak" and
"liberal" don't quite do it. Promiscuous mode? ;)

Tom


On 5/26/16, 10:40 AM, "Thomas Bley" <ma...@thomasbley.de> wrote:

>I think strict_types=1 should give a fatal error for accessing
>non-initialized typed properties, instead of notice.
>Example:
>
>declare(strict_types=1);
>
>class A {
>   public int $x;
>   public ?int $y = null;
>   public int $z = 42;
>   public ?int $u;
>   public ?datetime $v;
>   public datetime $w;
>}
>
>$a = new A;
>var_dump($a->x); // Fatal error, uninitialized...
>var_dump($a->y); // null
>var_dump($a->z); // 42
>var_dump(isset($a->z)); // true
>unset($a->z);
>var_dump(isset($a->z)); // false
>var_dump($a->z); // Fatal error, uninitialized...
>var_dump($a->u); // Fatal error, uninitialized...
>var_dump($a->v); // Fatal error, uninitialized...
>var_dump($a->w); // Fatal error, uninitialized...
>
>var_dump(isset($a->x)); // false
>var_dump(isset($a->y)); // false
>var_dump(isset($a->u)); // false
>var_dump(isset($a->v)); // false
>var_dump(isset($a->w)); // false
>
>Regards
>Thomas
>
>Tom Worster wrote on 26.05.2016 15:53:
>
>> On 5/25/16 5:52 PM, Thomas Bley wrote:
>>> I'm not seeing a problem here:
>>>
>>> class A {
>>>   public int $x;
>>>   public ?int $y = null;
>>>   public int $z = 42;
>>>   public ?int $u;
>>>   public ?datetime $v;
>>>   public datetime $w;
>>> }
>>>
>>> $a = new A;
>>> var_dump($a->x); // 0 + notice
>>> var_dump($a->y); // null
>>> var_dump($a->z); // 42
>>> var_dump(isset($a->z)); // true
>>> unset($a->z);
>>> var_dump(isset($a->z)); // false
>>> var_dump($a->z); // 0 + notice
>>> var_dump($a->u); // null + notice
>>> var_dump($a->v); // null + notice
>>> var_dump($a->w); // Fatal error, uninitialized...
>>>
>>> var_dump(isset($a->x)); // false
>>> var_dump(isset($a->y)); // false
>>> var_dump(isset($a->u)); // false
>>> var_dump(isset($a->v)); // false
>>> var_dump(isset($a->w)); // false
>> 
>> Is the file containing these examples in liberal mode?
>> 
>> What changes if declare(strict_types=1) precedes $a = new A;?
>> 
>> Tom
>> 
>



Re: [PHP-DEV] [RFC][Vote] Typed Properties

2016-05-26 Thread Tom Worster
On 5/26/16, 12:30 PM, "Fleshgrinder"  wrote:

>The problem is a completely different one, how should the following code
>behave?
>
>  class A {
>
>public int $x;
>
>  }
>
>  (new A)->x;
>
>The property has no value assigned but it is being accessed. The current
>PHP behavior is to simply initialize it with null. But this is
>impossible according to the type definition.
>
>There are not many ways to handle this. I think we already had all of
>them proposed:
>
>0. Fatal error after __construct was called.
>1. Fatal error and abort.
>2. Initialize with appropriate type.
>3. Initialize with null.

Under another 5th option, the problem you state does not arise. Disallow
"public int $x;". Under this option you may declare $x with type int and
an initial value or you may declare $x without type but you may not
declare $x with type (nullable or not) and undefined initial value.
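
For comparison, here is a sketch of how this case behaves in the typed-properties implementation that eventually shipped in PHP 7.4. That version is not the option proposed in this message: it accepts the declaration and instead throws an Error when the property is read before being written.

```php
<?php
class A
{
    public int $x;          // typed, no initial value: starts uninitialized
    public ?int $y = null;  // typed, explicit initial value
}

$a = new A();

try {
    $value = $a->x;         // read before any write
    $caught = false;
} catch (Error $e) {
    $caught = true;         // PHP 7.4+ throws here instead of yielding null
}

var_dump($caught, isset($a->x), $a->y);
```

Under the option described above, the declaration of $x would be rejected at compile time, so the read could never be reached.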

Tom



Re: [PHP-DEV] [RFC][Vote] Typed Properties

2016-05-26 Thread Tom Worster
On 5/26/16, 11:02 AM, "James Gilliland"  wrote:

>Sarcasm aside, I still can't figure out how fundamentally changing how
>people interact with uninitialized properties like this improves
>developer experience. Can someone explain a case where this is better and
>catches a bug or something? Since this is a new feature I would assume
>its not covered by "BC" but this seems like a painful gotcha for people
>developing across typed and untyped code.

Talk of improving developer experiences is too subjective for me. I only
want to say that one option we have is to require that a PHP property with
a type declaration must also have an initial value declaration. This
obviates some confusion, which is a good thing in my opinion.

I understand that it will be surprising to some that the lazy old `public
$var;` (that initializes to null if it is read before written to) is not
available if you insert a type declaration. Such surprise will dissipate
quickly if the compiler rejects such cases, one way or the other: either
don't declare type or declare type plus initial value.

If I'm wrong in this estimation and we in fact need to protect developers
from the pain of this experience then I'd prefer to reject typed
properties for the time being.

Tom






Re: [PHP-DEV] [RFC][Vote] Typed Properties

2016-05-26 Thread Tom Worster

On 5/25/16 5:52 PM, Thomas Bley wrote:

I'm not seeing a problem here:

class A {
  public int $x;
  public ?int $y = null;
  public int $z = 42;
  public ?int $u;
  public ?datetime $v;
  public datetime $w;
}

$a = new A;
var_dump($a->x); // 0 + notice
var_dump($a->y); // null
var_dump($a->z); // 42
var_dump(isset($a->z)); // true
unset($a->z);
var_dump(isset($a->z)); // false
var_dump($a->z); // 0 + notice
var_dump($a->u); // null + notice
var_dump($a->v); // null + notice
var_dump($a->w); // Fatal error, uninitialized...

var_dump(isset($a->x)); // false
var_dump(isset($a->y)); // false
var_dump(isset($a->u)); // false
var_dump(isset($a->v)); // false
var_dump(isset($a->w)); // false


Is the file containing these examples in liberal mode?

What changes if declare(strict_types=1) precedes $a = new A;?

Tom





Re: [PHP-DEV] [RFC][Vote] Typed Properties

2016-05-26 Thread Tom Worster

On 5/25/16 6:03 PM, Stanislav Malyshev wrote:

Hi!


> Andrea already said that we would not use it for untyped properties,
> hence, no BC.

Again, it's not that simple. Properties are not local. That means any
code that can deal with a class that may have typed properties (which
may be library class, for example, so you don't even know what it has
inside) has to deal with the possibility of it being of the new type. So
if the old code uses is_null($object->foo) as means to check if the
value wasn't initialized, and it's no longer null, then that code is
broken and needs to be rewritten. That's a BC break. Yes, it will never
happen if you never use typed properties, and never use any libraries
that might use typed properties, but then what's the point of the whole
thing? The point of BC is that if you don't use new features, you don't
have to change your code and it will keep working. With the proposed
solution, it won't be the case.



If you want, you can easily write a backwards-compatible new class that 
uses declared type properties with


public int $property = null;

I don't think loss of the (as yet hypothetical) lazy shortcut to write 
the same thing:


public int $property;

is a BC break.

Tom




Re: [PHP-DEV] [RFC][Vote] Typed Properties

2016-05-25 Thread Tom Worster

On 5/25/16 9:03 AM, Nikita Popov wrote:

On Wed, May 25, 2016 at 10:30 AM, Joe Watkins  wrote:


Morning Dmitry,

   > I made this check(s) to be invariant. You may like to do this
differently...

   I think this is what everyone expects, isn't it ?

   I did omit to mention that part ...

   > RFC doesn't define how uninitialized nullable typed properties should
behave.

  It does:

   > *Nullable typed properties will not raise an exception when accessed
before initialization.*



I don't agree with this choice, for three reasons:

a) This unnecessarily restricts what can be expressed in the type system.
With these semantics it will no longer be possible to express that a
property should be nullable, but have no default value. This situation is
not uncommon in practice, in particular anytime you have a nullable
constructor argument, you will want the corresponding property to be
nullable without a default, to ensure that it is explicitly initialized.

b) This directly contradicts the meaning of ?Type for parameters. For
parameters ?Type means that it's a nullable parameter **without a default
value**. That's the very thing that distinguishes it from the Type $prop =
null syntax. And now ?Type for properties should mean the exact opposite?

c) If you view this in a larger scope of union types, this *special case*
becomes even more weird. Why does the particular union Type|null get
special treatment, while all other unions don't? Or is it actually not
specific to "null", but to single value types? E.g. if we also allowed
Type|false, would that also receive an implicit false default value? What
about the type null|false? Does that get an implicit default, and if so,
which? I realize this is not quite in scope for type properties, but the
further evolution of the type system should be kept in mind.

Please keep things consistent: If there is no default, there is no default.


Object properties being in an uninitialized state is unfamiliar in 
current PHP practice. We are accustomed to not worrying about an 
"uninitialized error" when we read a property. We are assured to get 
either null or whatever was last written to the property.


I suspect this unfamiliarity lies behind some people's preference for an 
implicit initial null value of a nullable typed property.


But I'm inclined to agree with Nikita that the following should be 
different:


public ?Client $mark;
public ?Client $mark = null;

... and that the difference should be coherent with that between:

function con(?Client $mark) {}
function con(?Client $mark = null) {}

... which is that you *must* give $mark a value in the first and you don't 
need to in the second because it has an explicit default. We are 
more-or-less accustomed already to the difference between:


function con(Client $mark) {}
function con(Client $mark = null) {}

... so I think your consistency argument is good.
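
A small runnable sketch of the parameter side of this comparison, assuming PHP 7.1 or later for the ?Type syntax. Client and the function names are hypothetical stand-ins.

```php
<?php
// Client is a hypothetical class used only for illustration.
class Client {}

// No default: the caller must pass an argument, though it may be null.
function withoutDefault(?Client $mark) { return $mark; }

// Explicit default: the argument may be omitted altogether.
function withDefault(?Client $mark = null) { return $mark; }

var_dump(withDefault());        // NULL
var_dump(withoutDefault(null)); // NULL

$threw = false;
try {
    withoutDefault();           // too few arguments
} catch (ArgumentCountError $e) {
    $threw = true;              // nullable is not the same as optional
}
var_dump($threw);               // bool(true)
```

The consistency argument is that a property declared `public ?Client $mark;` should likewise demand an explicit write before it is read.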

Moreover, the "good old" implicit initial null for untyped properties 
(i.e. that `public $foo;` means `public $foo = null;`) is just a lazy 
shortcut that I wouldn't defend except for BC, which isn't an issue here.


Tom






[PHP-DEV] Re: [RFC][Vote] Typed Properties

2016-05-20 Thread Tom Worster

On 5/20/16 2:05 AM, Joe Watkins wrote:

Morning internals,

Since we have our answer on nullable types, typed properties can now go
to vote.

https://wiki.php.net/rfc/typed-properties#vote

Note that, support for nullability as RFC'd will be merged when the
implementation for nullable_types is merged into master.

Please participate.


I lack suffrage but I have a question about coercion and strictness.

When I assign a value to a property declared with scalar type, is it the 
strictness of the file containing the assignment that controls if 
coercion happens or not? And is the strictness of the file containing 
the declaration irrelevant?


I guess yes on both to be consistent with coercion on function 
invocation but I couldn't find this mentioned in the RFC.
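
For reference, a sketch of the coercive case, assuming the semantics that typed properties eventually shipped with in PHP 7.4, where (to my understanding) the strict_types setting of the file performing the write is what counts. The Point class is hypothetical.

```php
<?php
// This file has no declare(strict_types=1), so writes made here are coercive.
class Point // hypothetical class for illustration
{
    public int $x = 0;
}

$p = new Point;
$p->x = "42";      // numeric string is coerced to int in coercive mode
var_dump($p->x);   // int(42)

try {
    $p->x = "abc"; // not coercible to int: TypeError in either mode
} catch (TypeError $e) {
    echo "TypeError\n";
}
var_dump($p->x);   // int(42), the failed write leaves the old value
```

Had the same assignment been made from a file that begins with declare(strict_types=1), the "42" write would also raise a TypeError.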


Tom





Re: [PHP-DEV] Re: [RFC] Pre-vote notice for Nullable Types

2016-05-08 Thread Tom Worster
On 5/7/16, 1:19 PM, "Nikita Popov"  wrote:

>This RFC has one primary vote and one secondary vote. The primary vote
>determines whether we want to add nullable types to our type system. The
>secondary vote decides how precisely this will happen, in this instance
>deciding whether nullable types will be restricted to return types only
>or not. This is a standard voting layout, with precedent in a number of
>other RFCs.
>
>The reason why the second vote must use a 1/2 majority is symmetry. You,
>as somebody who does not like nullable parameter types, argue from a
>perspective of one 2/3 majority RFC for introducing nullable returns and
>another 2/3 majority RFC for introducing nullable params. I, as somebody
>who thinks supporting this syntax only for returns is wildly
>inconsistent, will argue from a perspective of a 2/3 majority RFC for
>introducing nullable *types* and another 2/3 majority RFC for restricting
>them to return types only. Depending on the perspective this would
>require either a 2/3 majority, or a 1/3 "majority" for unrestricted
>nullable types. Using a 1/2 majority vote ensures that there is no bias
>for either choice.

The explanation is very clear. Thank you.

Tom


(Btw, I don't disagree about the inconsistency you mentioned. But I don't
think it's a wild inconsistency, rather a justified one, given our
context.)







[PHP-DEV] Re: [RFC] Pre-vote notice for Nullable Types

2016-05-07 Thread Tom Worster

On 5/6/16 3:41 PM, Levi Morrison wrote:

The [RFC for Nullable Types][1] is going to go into the voting phase
soon. There have been a few changes to the RFC in the meantime:

   - More example for documentation's sake
   - The vote is now split into two parts: one for nullable parameter
types and one for nullable return types.


The vote counting surprises me. Say, for the sake of argument, a 
hypothetical nullable returns RFC were to pass with 2/3. After that a 
2nd hypothetical RFC for nullable parameters goes to vote. This 2nd RFC 
would need 2/3 to pass. Your RFC defines the same two separate language 
changes as two votes but one of them requires only a majority.


Also, could you clarify in the RFC text how the voting works. For 
example, is it the case that the entire nullable parameter vote is 
discarded if the nullable return vote does not pass?


Tom





Re: [PHP-DEV] Re: Request to withdraw RFC's for nullable types for only return values

2016-04-28 Thread Tom Worster

On 4/28/16 4:41 PM, Björn Larsson wrote:


Can't resist jumping into this discussion, but when I first read
both RFC's, I found them quite complementary.


In one sense, I agree. But when it comes to the question: let's vote on 
the options to decide what, if anything, happens to PHP, they are 
options with important differences. There's no way I can see to organize 
the vote that would be fair to everyone.


For example, I prefer nullable returns only. I could settle for nullable 
hints/returns (NRH) as a second choice but I would prefer to make no 
change at all over union types. That being the case, I would want the 
voting to be structured in one specific way.



> I was actually a
> bit tempted to combine them into one just as a writing exercise
> for my self (wanted to train on writing RFC's).
>
> My suggestion would be that you merge them into one and put
> it into vote quickly, maybe having you both as authors or one of
> you taking the lead?

I considered how this might work but I can't imagine how combining RFCs 
makes it easier. If more than one yes/no voting option is presented in 
one RFC, it's likely to be more confusing than voting on more than one RFC.


Tom





Re: [PHP-DEV] Re: Request to withdraw RFC's for nullable types for only return values

2016-04-28 Thread Tom Worster
Levi,

From one reasonable point of view, Union and Nullable are in conflict with
each other. If one prefers Union then one might argue in favor of Union
over related but different proposals. When it comes to the vote, it's
difficult to support both except with the argument that "I can settle for
Nullable if Union doesn't pass vote", which, when you think about it, is
not really supporting both.

If Union goes to vote before anything else, voters will have to take into
account what they expect to subsequently go to vote. So your stance
relative to that matters. Hence it's not really clear what you want while
you continue to own both.

This is how I understand Dmitry's concerns (correct me if I'm wrong,
Dmitry).

It would be easier to understand if you would *either* abandon Union (for
7.1) and throw your support behind Nullable *or* disown Nullable, let
Dmitry champion it, and let the two RFCs go to vote as alternatives.

I understand that you see Union as a kind of superset of Nullable (correct
me if I'm wrong) but when it comes to the voting, there's no fair way to
organize that. Someone's going to be unhappy.

Tom


On 4/28/16, 3:16 PM, "Dmitry Stogov" <dmi...@zend.com> wrote:

>Levi, I provided an implementation for your RFC on February 2015, and I
>would be glad if your RFC was accepted that time.
>But since that time you block it in respect to "Union Types"
>
>See conversation at PR https://github.com/php/php-src/pull/1045
>
>I would be also glad if your "Nullable Types" RFC was accepted now, but I
>don't trust in your intention to support it.
>
>
>From: morrison.l...@gmail.com <morrison.l...@gmail.com> on behalf of Levi
>Morrison <le...@php.net>
>Sent: Thursday, April 28, 2016 10:02:20 PM
>To: Dmitry Stogov
>Cc: Joe Watkins; internals; Tom Worster
>Subject: Re: [PHP-DEV] Re: Request to withdraw RFC's for nullable types
>for only return values
>
>On Thu, Apr 28, 2016 at 12:54 PM, Dmitry Stogov <dmi...@zend.com> wrote:
>> Levi, I don't understand, why do you keep trying to own "Nullable
>>Types" RFC, if you like completely different "Union Types".
>
>I don't understand; I wrote the RFC. What do you mean, "keep trying to
>own" it? I wrote both Nullable Types and Union Types. Some view those
>RFC's as competing, but they can also be orthogonal. I see the value
>in having both.






Re: [PHP-DEV] [RFC] IntlCharsetDetector

2016-04-27 Thread Tom Worster

On 4/26/16 12:10 PM, Sara Golemon wrote:

On Tue, Apr 26, 2016 at 2:06 AM, Yasuo Ohgaki  wrote:

Things might have been changed, but as you've mentioned encoding
detection is unstable and ICU is poor compared to mbstring's detection
at least for Japanese encodings.


For me, the difference is that I expect further work to be done on
improving ICU,


Why do you expect that?

When I researched this problem some years ago I had the impression a 
number of attempted solutions had been published and abandoned. I took 
this to mean that there was a learning experience that ended with the 
understanding that it's insoluble.


That's why I'm curious if you know of ongoing efforts in ICU. I took a 
look and saw little activity in the last 10 years.




while I lack that confidence for mbstring.  If the API
is in place early on, the library can improve underneath it to the
point it becomes more trustworthy later, but still be usable on older
versions of PHP (linked against newer libicu).


How would it become more trustworthy? A way to make it trustworthy 
would need to exist. And somebody would have to work on it.


Tom





Re: [PHP-DEV] [RFC] Patch for Union and Intersection Types

2016-04-26 Thread Tom Worster

On 4/26/16 10:58 AM, Bob Weinand wrote:

Yeah, I'd like to not allow ?Foo in any case if union types pass.
If they fail, ?Foo is fine for me.


I am persuaded that using the HHVM grammar is best. I personally don't 
like it but it makes sense.


If the Union RFC would propose only the | grammar and both Nullable RFCs 
would propose only ? grammar then the decision process could be 
relatively clear:


1 Union
2 Nullable hints and return
3 Nullable return
4 None of the above

Tom





Re: [PHP-DEV] [RFC] Patch for Union and Intersection Types

2016-04-26 Thread Tom Worster

On 4/26/16 12:21 PM, Christoph Becker wrote:

On 26.04.2016 at 16:24, Dmitry Stogov wrote:


At first, I'm glad this implementation is ready.
At least it's possible to analyze its pros and cons.
I'm also sure that both RFCs have their opponents and advocates.

Now, I just like to make the final voting fair.


I'm a bit confused, as there is actually a third related RFC, namely Tom
Worster's "Nullable Return Type Declaration"
(), which apparently is still
under discussion.

Will voting on this RFC also start tomorrow?


I would like to know too.

I would also prefer if the discussion of voting options (decision tree?) 
would happen under a suitable Subject line. I didn't know this 
discussion was here until Christoph cc-ed (thanks, Christoph). I'm 
probably not alone in having missed all this.


Tom




Re: [PHP-DEV] [RFC] Nullable Types

2016-04-24 Thread Tom Worster
Hi Thomas,

Sorry for the delay. I was traveling last week.

By convention `return;` in PHP is an early return for a function that
returns nothing at all. I think it can be confusing when reading a
function to look at a `return;` line and have to remember to look
elsewhere to discover what that means. And it can mean something different
in every function.

I prefer to write functions that return only the declared type. In some
cases I need to write functions that return only the declared type or
null. That's the limit of what I think PHP 7.1 should provide.

Hence, for me this has no attraction but it does introduce new ways to
write bugs. So I am not enthusiastic.

Tom


On 4/21/16, 12:33 PM, "Thomas Bley"  wrote:

>Hello Tom,
>
>with default return value I mean to return a certain value if nothing
>else is returned (similar to method parameters with a default value).
>
>example 1:
>
>declare(strict_types=0);
>
>function my_strpos(string $haystack, string $needle): int = false {
>  return 42; // return 42
>  return false; // return false
>  return true; // return 1
>  return; // return false
>}
>
>example 2:
>
>declare(strict_types=1);
>
>function my_strpos(string $haystack, string $needle): int = false {
>  return 42; // return 42
>  return false; // return false
>  return true; // fatal error
>  return; // return false
>}
>
>Regards
>Thomas
>
>
>f...@thefsb.org wrote on 21.04.2016 15:05:
>
>> Hi Thomas,
>> 
>> 
>> What is a default return declaration? Is this for branches within the
>>function
>> that do not lead to a return statement?
>> 
>> 
>> Tom
>> 
>> 
>> 
>> 
>> 
>> From: Thomas Bley
>> Sent: Wednesday, April 20, 2016 12:53 PM
>> To: guilhermebla...@gmail.com, cornelius.h...@gmail.com, dmi...@zend.com
>> Cc: f...@thefsb.org, internals@lists.php.net
>> 
>> 
>> 
>> 
>> 
>> What do you think about default return values?
>> 
>> e.g.
>> 
>> function foo(): db_result = null {
>> }
>> 
>> function canLogin(): bool = false {
>> }
>






Re: [PHP-DEV] Re: Improving PHP's type system

2016-04-20 Thread Tom Worster

On 4/19/16 7:21 PM, Rick Widmer wrote:

Are too many of these incompatible shiny things, too fast, the main
reason so many PHP users are on older versions?

IMHO, yes.


This would mean, by and large, that people had tried a more recent 
version of PHP and found that their code was incompatible. I think on 
the contrary that they haven't tried because they have little motive. A 
lot of running apps are in maintenance mode with no significant 
investments in new code, without which it's easier to take the attitude 
that it's not broken so don't mess around with it.


Tom





Re: [PHP-DEV] Re: Improving PHP's type system

2016-04-18 Thread Tom Worster

On 4/18/16 4:34 AM, Tony Marston wrote:


I repeat, where was the insult in the post in question?  What exactly
were the insulting words?


I chose just one example:

> Those who cannot write effective software without these "clever"
> additions to the language are doing nothing but announcing to the
> world that they are not clever enough to write effective software
> using their own limited abilities.

I think it's hard to avoid construing an implication that people 
proposing and/or supporting changes to how PHP handles type in the 
current discussions here are incompetent programmers.


There's no doubt that this sentence posits a class of incompetent 
programmers who need crutches ('these "clever" additions') and a 
complementary class of competent programmer who don't. Saying so is 
pointless without some assignment (imaginary, implied or real) of 
individuals to the classes. It's hard to imagine that present company or 
the people whose interests we attempt to represent are not involved in 
the assignment. I find this a bit insulting.


Insult is something experienced as well as something performed. If 
enough people experience it then probably it was performed, regardless 
of intent. So to this extent I just disagree that...


> The fact that you don't like what I say does
> not make it an insult.

"It's Not What You Say, It's What People Hear"


But we are now completely off topic. To bring us back on topic I repeat 
my request that you try to be specific about what you want and why, with 
respect to the RFCs under discussion.


Tom





Re: [PHP-DEV] [RFC] Nullable Return Type Declaration

2016-04-18 Thread Tom Worster

On 4/18/16 2:24 PM, Stanislav Malyshev wrote:


I would like to note in general that following the latest fashion in
academic development is not always a good idea for PHP. It's fine when
you experiment with academic languages, but when you have language that
a) focused on simplicity and low entrance barrier and b) is in
production use by millions and has 20 years of existing practices,
libraries and habits, we have to be a bit more wary, I think. I am *not*
saying we should not improve, or ignore academic developments, I am
saying that we should be careful with not jumping to the idea-of-the-day
bandwagon too fast, before it is clear it is good and necessary for PHP.


...

I agree with Stas, not just this paragraph but pretty much all of the email.

Within the context of today's PHP language, practices, and the tried and 
true libraries and frameworks we rely on, I don't know how, for certain 
necessary semantics, to improve on Something or null return contracts.


In all honesty, I've looked for alternatives that satisfy a specific type 
but I can't find anything better.


For example, I use the active record ORM from the Yii 2 framework. My 
User model class therefore extends ActiveRecord. If I search for a User 
record matching given criteria (e.g. an email address) the parent class 
needs a generic way to represent that no matching record exists and that 
this is not an exception.


The convention in PHP is to return null in this situation. I have tried 
to imagine how the search method might return an instance of the User 
model class that represents "not a user record", i.e. the absence of any 
user matching the search criteria. PHP's version of OOP seems not to 
have an intrinsic feature for a function to return an object with type 
Something but void of any Something object instance.


If PHP has nothing to model this, what convention can we invent as a 
workaround to *encode* "empty" in an actual instance of ActiveRecord or 
any subtype? A magic property can conflict with the app's model 
attributes. But an ActiveRecord::isEmpty() method could work. This 
satisfies the desire to return a specific subtype of ActiveRecord (e.g. 
User) but it also introduces hazardous complexity: what do you do with 
an instance in which the model's attributes, including primary key, are 
valid but isEmpty() returns true, or vice versa?


Moreover, in what way is this better than returning PHP null? What have 
we gained with this isEmpty() convention?
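
A rough sketch of the two conventions side by side, assuming PHP 7.1 or later. The User class and both finder functions are hypothetical and are not Yii 2's actual ActiveRecord API.

```php
<?php
// Hypothetical minimal model; not the real Yii 2 ActiveRecord.
class User
{
    public $id;
    private $empty;

    public function __construct($id = null, $empty = false)
    {
        $this->id = $id;
        $this->empty = $empty;
    }

    public function isEmpty(): bool
    {
        return $this->empty;
    }
}

// Convention A: encode "no match" inside an instance.
function findWithSentinel(array $rows, int $id): User
{
    return isset($rows[$id]) ? new User($id) : new User(null, true);
}

// Convention B: return null when nothing matches.
function findOrNull(array $rows, int $id): ?User
{
    return isset($rows[$id]) ? new User($id) : null;
}

$rows = [7 => 'tom'];

// The caller has to branch either way...
$a = findWithSentinel($rows, 8);
$b = findOrNull($rows, 8);
var_dump($a->isEmpty()); // bool(true)
var_dump($b === null);   // bool(true)

// ...but only the sentinel convention permits a contradictory state:
$odd = new User(7, true); // valid primary key, yet flagged "empty"
```

Both callers still need one check per lookup, so the sentinel buys a narrower return type at the price of a state that null cannot express.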


Tom







Re: [PHP-DEV] Re: Improving PHP's type system

2016-04-16 Thread Tom Worster

On 4/16/16 5:04 AM, Tony Marston wrote:

"Marco Pivetta"  wrote in message
news:CADyq6sJfPYgQvhQt=uvcbqkoojjoupcz1sufzwxc+55hl0p...@mail.gmail.com...


Tony, that sounds really like "real programmers use `dd -if -of`". Please
stop with that argument, as it really doesn't reflect reality.


That is not what I said. As a follower of the KISS principle I believe
that good programmers write simple code that anyone can understand,
while less-than-good programmers write complex code that only a select
few can understand.


...or perhaps nobody can understand. Agreed. So I guess that makes me
a follower of the KISS principle too. I prefer boring, obvious, 
conventional code and the kind of strict style guides that artisan coders 
hate. I'm also conservative wrt changing PHP -- a gradualist. And I 
dislike arguments proposing feature X because such-and-such more 
fashionable language has it.


That said...

I have found that my programs, my team's and the libs I use are more 
obvious, boring and easy to understand when they are clear about type.


So in recent years I've tried to be more and more rigorous in using 
PHPdoc2 type tags, the IDE's linters to run static checks, and working 
towards eliminating using "mixed" and "|" in the type spec.


PHP 7.0's type is better than relying on conventional annotations that 
evolved from a doc generator. I like these specific "shiny new features".


I believe that being stricter with type has helped reduce the rate at 
which we introduce bugs. It's been totally worth it. As I see it, I 
can't afford not to.


So I don't like Union Type because it will encourage sloppy type in libs 
that I might otherwise want to use. I don't want nullable hints for the 
same reason. With some reluctance and acknowledging the inconsistency I 
*do* advocate nullable return because eliminating something or null from 
PHP conventions seems a stretch and I'd rather these were declared than 
not. That's just my position.


I don't think I'm behaving like an academic researcher or computer 
scientist with a PhD. I prefer to see myself as a practical computer 
programmer with a deep concern for long term maintenance of my programs.



> The problem with adding all these new and shiny features ...
> only for the benefit of the few who think programming should be
> restricted to those who have Phd's.

I see two problems with arguing along these lines. 1. It's not specific 
enough. 2. It's a bit insulting both to me and, I imagine, to academics.


So please try to be more specific about both what you want and why. At 
the moment you appear to be arguing against any change to PHP and 
justifying this with the argument that anyone who wants to change it is 
an incompetent programmer.


Tom





Re: [PHP-DEV] [RFC] Nullable Types

2016-04-15 Thread Tom Worster

On 4/15/16 1:58 PM, Dmitry Stogov wrote:

A week ago, I actually wrote my own RFC 
https://wiki.php.net/rfc/nullable_return_types


You proposed the ?Something grammar. With ?: and ?? appearing in recent 
PHP and proposals for ??= if not ?:= and now this, I feel we're heading 
to regex hell :p


Tom


but didn't push it for discussion in favor of Levi's  nullable_type RFC (they 
are almost the same).






Re: [PHP-DEV] [RFC] Nullable Types

2016-04-15 Thread Tom Worster

On 4/14/16 3:50 AM, Dmitry Stogov wrote:


The up to date implementation for return-type-hints may be found at
https://github.com/php/php-src/pull/1851/files


Splendid!

Thank you, Dmitry. I will refer to it in the nullable_returns RFC[1].

Tom

[1] https://wiki.php.net/rfc/nullable_returns





Re: [PHP-DEV] [RFC] Nullable Return Type Declaration

2016-04-15 Thread Tom Worster

On 4/15/16 12:22 AM, Levi Morrison wrote:


My point is that `foo(bar(), $val)` won't die because bar may return
null. Bar is expected to return null sometimes.

For example, let's consider an administrator page where they look up
user information based on an identifier. The routine we'll use will
have this signature:

 function get_user(string $id): User | Null;

It is possible for an identifier to not exist and this is not an error
(database successfully returned no results). If there is no User data
to display then it makes sense for the UI to present that differently.
Thus it makes sense to pass that User | Null onto the code that will
present it:

 $user_data = get_user($id);
 // ...
 $user_html = render_user_data($user_data);

In fact this is a common operation that is encountered in many code
bases (I think every single one I've ever looked at).


This is a good example.

My opinion is that *because* get_user() can return null (a red flag) I 
prefer to see explicit handling of the null case before doing anything else.


If I would end up with `render_user_data(get_user($id))` I would 
consider it fair to not hint the param because I didn't earn it. I 
invented the faux docblock tag @sorry for this kind of thing.
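
A sketch of the explicit handling meant here, assuming PHP 7.1 or later. User, get_user() and render_user_data() are stand-ins for the routines in the quoted example, and user_page() and the in-memory lookup are invented for illustration.

```php
<?php
// Stand-in for the User type in the quoted example.
class User
{
    public $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }
}

// Returns null when the identifier does not exist (not an error).
function get_user(string $id): ?User
{
    $users = ['42' => new User('Ada')]; // hypothetical data source
    return $users[$id] ?? null;
}

// Non-nullable parameter: null has to be dealt with before calling this.
function render_user_data(User $user): string
{
    return '<p>' . htmlspecialchars($user->name) . '</p>';
}

function user_page(string $id): string
{
    $user = get_user($id);
    if ($user === null) {
        return '<p>No such user.</p>'; // the null case, handled first
    }
    return render_user_data($user);
}

echo user_page('42'), "\n"; // <p>Ada</p>
echo user_page('7'), "\n";  // <p>No such user.</p>
```

Handling null at the call site lets render_user_data() keep a plain User hint, so the nullable part of the contract stops at the function that produced it.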


Tom





Re: [PHP-DEV] [RFC] Nullable Return Type Declaration

2016-04-15 Thread Tom Worster

On 4/14/16 8:48 PM, Larry Garfield wrote:


I am highly, highly sceptical about nullable parameters or returns, and
frankly would rather they were not included in the language.  By nature
they undermine type safety.  At best, they indicate to all callers
"*every* time you call this function, you MUST put is_null() around it
or your program may fail randomly."


Yes.



While that's better to know
explicitly than not (which is the case for any untyped return, aka any
PHP code pre-7.0), it would be better still to, well, not have to worry
about that billion dollar mistake[1] cropping up in my code.


I agree.

To be clear, I do not intend the RFC to encourage nullable return or 
suggest that it's a fine thing to use. But given where we are, it's hard 
to imagine how to extirpate it.


When we started using PHP 7.0 type, initially when authoring new models 
(and using Yii2), it immediately became clear that we lacked two things: 
this and void returns. We're getting the latter in 7.1. It would be very 
nice if we could have both.


I'm a practical PHP user, with a generally conservative attitude to the 
language, often unmoved by proposals to add a feature because some other 
more fashionable language has it (I call it language envy, to borrow 
from Freud). And while PHP 7.0 is good, I'd rather have Something|null 
in the return declaration than just in the docblock. That's all.




In a sense, if we really must allow for value-or-null (which I consider
a code smell in the 98% case) I'd prefer if it was ONLY available via
union types: That is, Something|null.  That's longer and clumsier to
type, and harder to read.  Which it should be. (Static casts in C++ have
a fugly syntax, which has been defended by the language designers on the
grounds that static casts are fugly, so the syntax for them should be as
well to remind you to stop doing it. There is a fair amount of validity
to that argument on affordance grounds, at least within C++.)  Using an
easy short hand notation for something that is inherently a code smell
when you're already typing your code only serves to encourage something
we should be training people out of in the first place.


With regard to syntax, I prefer the long form `Something|null`. That 
seems very clear to me. The proposed short-hand ? syntax always makes me 
think of what I hate most about regex.


Tom





Re: [PHP-DEV] Re: Improving PHP's type system

2016-04-15 Thread Tom Worster

On 4/14/16 3:25 PM, Fleshgrinder wrote:

On 4/14/2016 8:59 PM, Stanislav Malyshev wrote:

>Hi!
>

>>I don't know what is complicated about "string|Stringable" or "Foo|Bar"
>>since it is super self-explanatory. However, I find myself checking the

>
>It may be self-explanatory for you. It's much less self-explanatory for
>somebody just starting to learn. It is also very dangerous - if it's
>either Foo or Bar, can you call Foo::stuff on it or not? If it's string
>or not string, can you call strlen on it? Etc., etc. It adds a lot of
>cognitive load and complicates the whole picture. You may have a
>specific use case where it is useful (which we have yet to see btw) but
>please remember it's a language with literally millions of use cases and
>users.
>

Reduce assertions*, enhance self-documentation, making code more robust,


I disagree here. I think our programs are more robust when programmers 
avoid passing mixed types and write more simple code instead.


Hence I agree with Stas about the danger part. Union type hints are a 
hazard. Adding them to PHP as a new feature is like saying "here's a 
great new tool, pick it up and use it" but the tool is really a footgun.
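
A small sketch of the hazard, assuming two unrelated hypothetical classes `Foo` and `Bar`. The parameter is left undeclared because PHP 7.0 has no union syntax, so the docblock carries the Foo-or-Bar contract. The point is the branching that the callee is forced into.

```php
<?php
// Hypothetical classes with no common interface.
class Foo
{
    public function stuff()
    {
        return 'foo stuff';
    }
}

class Bar
{
    public function other()
    {
        return 'bar other';
    }
}

/**
 * Accepts either type, so it must inspect the argument before every use.
 *
 * @param Foo|Bar $thing
 * @return string
 */
function describe($thing)
{
    if ($thing instanceof Foo) {
        return $thing->stuff();
    }
    if ($thing instanceof Bar) {
        return $thing->other();
    }
    throw new InvalidArgumentException('Expected Foo or Bar');
}

echo describe(new Foo()), "\n";
echo describe(new Bar()), "\n";
```

Two single-purpose functions, one taking `Foo` and one taking `Bar`, would need no branching at all.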


Tom





Re: [PHP-DEV] Improving PHP's type system

2016-04-15 Thread Tom Worster

On 4/13/16 5:06 PM, Stanislav Malyshev wrote:


Types are designed in a way enhancing the languages experience while
avoiding nearly every impact for people who want to ignore them.


This is not true. If it's in language, you have to understand it to be
able to use the language. Nobody writes code in vacuum - there are
libraries, communities, teams, best practices, tutorials, etc. So if
(hypothetically) you want to introduce algebraic types in PHP, then
since that moment you can not really be a PHP programmer if you don't
understand algebraic types. Otherwise you would not be able to
communicate with the rest of the community, understand and use code
written by others, contribute to projects, etc.


I agree. This is an important point. I should include it in my RFC[1] 
that argues pro nullable return but contra nullable params or unions. 
May I copy-paste?


Tom

[1] https://wiki.php.net/rfc/nullable_returns





Re: [PHP-DEV] [RFC] Nullable Return Type Declaration

2016-04-14 Thread Tom Worster
On 4/14/16, 5:46 PM, "Levi Morrison"  wrote:

>Having a separate method instead of `foo(null, "value")` makes it
>difficult to use for the result of a function.

I suspect that might be a good thing:) I don't know for sure but the
possibility exists.


>Assume `bar()` returns
>`Bar | Null`; this would no longer work if they were separated:
>
>foo(bar(), "value")
>
>Functions are often composed so if something is the output of one
>function it will likely be the input to another. Thus if returning
>optionally null values is important to you so should optionally null
>parameters.

This was a chin-scratcher for me. On one hand, I see what you mean. On the
other I couldn't think of an example from my experience (which I admit is
very narrow -- I live a sheltered life) of such a bar() that I would feed
straight to foo().

The semantic convention for Something or null return, as I see it, is when
bar() returns null, it is saying "I got nothing". What kind of foo() does
the same thing to nothing at all as it does to a Something object? This is
where I got stuck.

Say I was doing the composition instead via chaining.
Something::bar()->foo("value") is nonsense if bar() could return null.
This suggests to me that the other composition might not be wise.

*Either* bar() should not be returning Something or null (maybe it should
instead return some other type that can represent all the possible
returns) *or* we shouldn't try to compose like this and should test for
the Somethingness of bar()'s return before applying ->foo("value") or foo(…,
"value") to it. Or maybe this API needs an even more fundamental redesign.

So, from my perspective, this might be an example of the limitation
nudging us to think harder about the design.
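
The second option above, testing the result before composing, might look like the following sketch. `Something`, `bar()` and `foo()` are stand-ins for the names used in this thread, `compose()` is a hypothetical wrapper, and all the bodies are invented for illustration.

```php
<?php
class Something
{
    public function foo($value)
    {
        return "Something::foo($value)";
    }
}

/**
 * Returns null to say "I got nothing".
 *
 * @param bool $found
 * @return Something|null
 */
function bar($found)
{
    return $found ? new Something() : null;
}

/**
 * Composes bar() with foo(), handling the null case explicitly
 * instead of passing it through.
 *
 * @param bool $found
 * @return string
 */
function compose($found)
{
    $something = bar($found);
    if ($something === null) {
        // The caller decides what "nothing" means here.
        return 'nothing to do';
    }
    return $something->foo('value');
}

echo compose(true), "\n";
echo compose(false), "\n";
```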

Tom






Re: [PHP-DEV] [RFC] Nullable Return Type Declaration

2016-04-14 Thread Tom Worster
On 4/14/16, 1:33 PM, "Fleshgrinder"  wrote:

>On 4/14/2016 6:35 PM, Levi Morrison wrote:
>>I can appreciate that you want only the restricted union with null.
>> However, I do not see the point of disallowing it for parameter types
>>
>My guess is that this RFC only wants to get it for return because it
>might be an easier vote?

Hi Richard,

That wasn't really my intent. I tried to set out my argument contra
nullable param type in the RFC and elaborate it in my answer to Levi,
which I hope you read.

My attitude to programming reversed since 10 years ago. I used to prefer
to have all the options and be allowed to exercise my judgement. But over
those years I had to remain responsible for most of my code, which led to
a blinding conversion. Now I am so acutely aware of how likely I am to
write bugs that I more often than not want the language to get smaller.

My sense is that nullable params won't turn out to be one of the good
parts, in the Crockford sense. Something|null return, otoh, is so
established as a convention I can't imagine getting away from it.


I'm aware that some people won't understand my point of view. If that's
still the case, ask again and I'll try a different answer.

Tom






Re: [PHP-DEV] [RFC] Nullable Return Type Declaration

2016-04-14 Thread Tom Worster
On 4/14/16, 12:35 PM, "Levi Morrison"  wrote:

>I can appreciate that you want only the restricted union with null.
>However, I do not see the point of disallowing it for parameter types
>while allowing it for return types:
>
>function setLeft(Node $n = null) {
>$this->left = $n;
>$this->updateHeight();
>}
>
>Why disallow the explicit union with null here instead of the default
>parameter which does not exactly capture the desired semantics?
>Calling `$node->setLeft()` is just odd, almost as if it was a mistake.
>I would much prefer `$node->setLeft(null)` here. Basically, if we have
>a feature for return types that exactly matches the semantics that we
>occasionally want for the parameter types why forbid it?

I agree that `$node->setLeft()` is weird but I find `$node->setLeft(null)`
still a bit weird. Perhaps something like `$node->resetLeft()` would work?
Was that the idea?
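
A sketch of that alternative, assuming a simple binary tree `Node` like the one in Levi's example. `resetLeft()`, `getHeight()` and the `updateHeight()` body are hypothetical. The idea is that `setLeft()` keeps a strict, non-nullable parameter.

```php
<?php
class Node
{
    /** @var Node|null */
    private $left = null;
    private $height = 1;

    // Strict parameter: passing null is rejected with a TypeError.
    public function setLeft(Node $n)
    {
        $this->left = $n;
        $this->updateHeight();
    }

    // Removing the child is a separate, explicitly named operation.
    public function resetLeft()
    {
        $this->left = null;
        $this->updateHeight();
    }

    public function getHeight()
    {
        return $this->height;
    }

    private function updateHeight()
    {
        // Only the left subtree is modeled in this sketch.
        $this->height = 1 + ($this->left === null ? 0 : $this->left->getHeight());
    }
}

$node = new Node();
$node->setLeft(new Node());
echo $node->getHeight(), "\n"; // 2
$node->resetLeft();
echo $node->getHeight(), "\n"; // 1
```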


>Additionally, on occasion I'll see functions like this:
>
>function foo(Bar $b = null, $not_optional_param);

The only thing we can know for sure from this is that the programmer
urgently needs reeducation :)


>Why not allow nullable types on parameters to avoid that wonkiness
>caused by default values of null?
>
>function foo(Bar | Null $b, $not_optional_param);
>
>This is much better.

Yes but still a code smell to me. I'd need to know more about the
programmer's intent for `foo(null, "value")`. It might be better to swap
order, or change the method name, or add another method... Who knows? Need
to take each case individually.

This kind of asking questions about intent in code review is good for code
quality. That's why I like how PHP doesn't allow this. It encourages the
question asking. Every case is different, of course, so you can surely
find counter examples. But on balance I'd say it's better to disallow it.


Does this help you understand my preference? I think the restriction
encourages a healthy discipline.

Otoh, I think nullable return is a pressing need.

Tom






[PHP-DEV] Re: [RFC] Union Types

2016-04-14 Thread Tom Worster

On 4/13/16 11:46 PM, Levi Morrison wrote:

As alluded to in an earlier email today[1] I am now moving the Union
Types RFC[2] to the discussion phase. The short summary of the RFC is
that it permits a type declaration to be one of several enumerated
types.



I look forward to a helpful and meaningful discussion!

   [1]: http://news.php.net/php.internals/92252
   [2]: https://wiki.php.net/rfc/union_types


Hi Levi,

Your email [1] excellently summarizes the overall historical and present 
context. In it you listed three specific things that 7.0 cannot 
represent. My RFC[3] basically argues in favor of implementing only the 
first of these in 7.1. I like to see this as a more conservative version 
of yours, preferring a more gradual introduction of these three 
loosenings of PHP 7.0's type features.


[3] https://wiki.php.net/rfc/nullable_returns

I hope you will consider this a constructive contribution to the discussion.

Tom





[PHP-DEV] [RFC] Nullable Return Type Declaration

2016-04-14 Thread Tom Worster
I would like to introduce for discussion an RFC proposing and arguing for
Nullable Return Type Declaration in 7.1 and deferring for now more general
relaxations of 7.0 type as proposed in Levi's two RFCs.

  https://wiki.php.net/rfc/nullable_returns

If anyone would like to collaborate on the RFC, I have a repo you may fork:
https://gist.github.com/tom--/e95a10fbe4d34f8a72c9 (although GitHub's
formatting isn't lovely). I'm looking for help with implementation.

Tom





Re: [PHP-DEV] IntlCharsetDetector

2016-04-14 Thread Tom Worster

On 4/11/16 6:11 PM, Sara Golemon wrote:

On Mon, Apr 11, 2016 at 9:36 AM, Stanislav Malyshev  wrote:

The point is even imperfect detection may be useful in certain
circumstances, and detector being part of ICU hints that people find it
useful enough to spend time implementing and supporting it. We should
not ignore that.


Well, Stas, your informal thumbs up to the idea means enough to me to
at least formalize it into an RFC even though I was previously feeling
negative on it.

I may yet vote no on my own RFC after the discussion period, but as
you say it's worth considering the fact that someone thought it
reasonable enough to actually build into ICU...


The general problem is impossible. If you constrain the question, for 
example as Stas says by knowing the language and choosing between a 
given set of codes, then you may have success. And I'm sure I'm not 
alone in sometimes using a simple heuristic to choose between cp1252 and 
utf8.
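
One such constrained heuristic, sketched under the assumption that the input can only be UTF-8 or Windows-1252 and that the mbstring extension is loaded. `toUtf8()` is a hypothetical helper name. Text that validates as UTF-8 is taken to be UTF-8, and anything else is treated as cp1252.

```php
<?php
/**
 * Hypothetical helper: normalize a string known to be either UTF-8 or
 * Windows-1252 to UTF-8.
 *
 * @param string $bytes
 * @return string
 */
function toUtf8($bytes)
{
    // Valid UTF-8 is rarely produced by accident in cp1252 text.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes;
    }
    return mb_convert_encoding($bytes, 'UTF-8', 'Windows-1252');
}

var_dump(toUtf8("caf\xE9") === "caf\xC3\xA9");     // cp1252 input is converted
var_dump(toUtf8("caf\xC3\xA9") === "caf\xC3\xA9"); // UTF-8 input is unchanged
```

The heuristic can misfire: a cp1252 string whose bytes happen to form valid UTF-8 sequences is left alone. That is the price of narrowing the question this far.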


But this does not logically imply that ICU CharsetDetector is a suitable 
solution in such cases or that it's a good API or a decent 
implementation. Or that PHP should expose it. An SO chat doesn't 
necessarily count as a feature request.


I'd rather people engineered real solutions specific to their 
requirements than resort to any of the failed attempts to solve the 
general problem.


Tom





[PHP-DEV] #71915 openssl_random_pseudo_bytes is not "fork-safe"

2016-03-29 Thread Tom Worster
mathieuk has requested feedback on a patch to mitigate this problem in PHP.

https://bugs.php.net/bug.php?id=71915

Tom




[PHP-DEV] Re: RFC about automatic template escaping

2016-03-21 Thread Tom Worster

Hi Daniel,

When I write scripts that need to behave the same independently of the 
value of mbstring.func_overload then I have to remember to be careful 
with the functions it affects. It's a drag. I resent having to write 
things like mb_strlen($str, '8bit') to get a byte-count knowing that the 
scripts would be cleaner and easier to understand if I could dictate the 
value of mbstring.func_overload (or if it had never been invented).
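
For readers unfamiliar with the idiom, a minimal illustration, assuming the mbstring extension is loaded. With function overloading enabled, plain `strlen()` could be replaced by a character-counting version, so passing the explicit `'8bit'` encoding was the way to ask for bytes regardless of the setting.

```php
<?php
// "héllo" encoded as UTF-8: five characters, six bytes.
$str = "h\xC3\xA9llo";

$bytes = mb_strlen($str, '8bit');  // always counts bytes
$chars = mb_strlen($str, 'UTF-8'); // counts code points

var_dump($bytes); // int(6)
var_dump($chars); // int(5)
```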


Would your proposal have any sort of similar effect? I mean, would it 
complicate the task of HTML-escaping output when the scripts need to 
work the same regardless of the '__auto_escape' ini setting?


Tom





Re: [PHP-DEV] Re: PRNG: Raise warning and/or provide better pseudorandom generator?

2016-02-23 Thread Tom Worster

On 2/23/16 7:13 PM, Yasuo Ohgaki wrote:


>http://www.pcg-random.org/

It's simple and supports 64bit int out of the box.
Looks great!


PCG is very interesting. But it's new and hasn't been peer reviewed yet. 
It's in the "experimental" stage while others are more "well known".


xorshift+ seems fairly popular.

Tom




Re: [PHP-DEV] Re: PRNG: Raise warning and/or provide better pseudorandom generator?

2016-02-23 Thread Tom Worster
I agree that mt_rand() should warn before delivering bogus outputs. But 
when it works, it works ok:


https://gist.github.com/tom--/a12175047578b3ae9ef8

Given that it hasn't been MT19937 for many years, it probably doesn't 
need to be.


If there is really a need for fast repeatable RNGs (the kind popular in 
monte-carlo sims) then MT is no longer the most likely candidate. A new 
API could allow the user to select a generator -- there are a lot to 
choose from (see below)!
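
What such an API might look like, purely as a sketch: the `SeedableGenerator` interface, the `Xorshift32` class and the `sample()` function below are hypothetical and not part of PHP. The generator is Marsaglia's 32-bit xorshift with the (13, 17, 5) shift triple, chosen only because it is short. The code assumes a 64-bit PHP build so that the intermediate shifts fit in an integer.

```php
<?php
// Hypothetical interface: any repeatable generator can plug in here.
interface SeedableGenerator
{
    public function seed($seed);
    public function next();
}

final class Xorshift32 implements SeedableGenerator
{
    private $state = 1;

    public function seed($seed)
    {
        // Keep 32 bits; the all-zero state is a fixed point, so avoid it.
        $state = $seed & 0xFFFFFFFF;
        $this->state = $state === 0 ? 1 : $state;
    }

    public function next()
    {
        $x = $this->state;
        $x ^= ($x << 13) & 0xFFFFFFFF;
        $x ^= $x >> 17;
        $x ^= ($x << 5) & 0xFFFFFFFF;
        $this->state = $x;
        return $x;
    }
}

// User code depends only on the interface, not on one algorithm.
function sample(SeedableGenerator $rng, $seed, $count)
{
    $rng->seed($seed);
    $out = [];
    for ($i = 0; $i < $count; $i++) {
        $out[] = $rng->next();
    }
    return $out;
}

// The same seed replays the same sequence.
var_dump(sample(new Xorshift32(), 2016, 3) === sample(new Xorshift32(), 2016, 3));
```

Swapping in another algorithm would mean writing one more class that implements the interface, and callers such as `sample()` would not change.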


Tom


These are the built-in generators in dieharder 3.31.1. Note the xorshift 
family isn't present. Melissa O'Neill's new PCG family is interesting too.



000 borosh13
001 cmrg
002 coveyou
003 fishman18
004 fishman20
005 fishman2x
006 gfsr4
007 knuthran
008 knuthran2
009 knuthran2002
010 lecuyer21
011 minstd
012 mrg
013 mt19937
014 mt19937_1999
015 mt19937_1998
016 r250
017 ran0
018 ran1
019 ran2
020 ran3
021 rand
022 rand48
023 random128-bsd
024 random128-glibc2
025 random128-libc5
026 random256-bsd
027 random256-glibc2
028 random256-libc5
029 random32-bsd
030 random32-glibc2
031 random32-libc5
032 random64-bsd
033 random64-glibc2
034 random64-libc5
035 random8-bsd
036 random8-glibc2
037 random8-libc5
038 random-bsd
039 random-glibc2
040 random-libc5
041 randu
042 ranf
043 ranlux
044 ranlux389
045 ranlxd1
046 ranlxd2
047 ranlxs0
048 ranlxs1
049 ranlxs2
050 ranmar
051 slatec
052 taus
053 taus2
054 taus113
055 transputer
056 tt800
057 uni
058 uni32
059 vax
060 waterman14
061 zuf
203 ca
204 uvag
205 AES_OFB
206 Threefish_OFB
207 XOR (supergenerator)
208 kiss
209 superkiss
400 R_wichmann_hill
401 R_marsaglia_multic.
402 R_super_duper
403 R_mersenne_twister
404 R_knuth_taocp
405 R_knuth_taocp2





[PHP-DEV] Re: [PHP-CVS] com php-src: Revert "Fix #71152: mt_rand() returns the different values from original mt19937ar.c": ext/standard/rand.c ext/standard/tests/math/mt_rand_value.phpt

2016-02-22 Thread Tom Worster
Embarrassing correction to my message: the last four words of the 2nd 
para here should have been at the end of the previous para.


On 2/22/16 8:23 AM, Tom Worster wrote:


PHP is an unlikely language for the typical programs that specifically
need MT19937. I doubt we would sort out anyone's existing problems by
fixing it. If I'm wrong and there is indeed a need for this kind of RNG
then I'd rather see an API that supports more than just this one generator.

So I don't think PHP should feel obliged to provide a correct MT19937,
although it should correctly document what it does provide, like the
hash ext.


4c4,5
< then I'd rather see an API that supports more than just this one generator.
---
> then I'd rather see an API that supports more than just this one
> generator, like the hash ext.
7,8c8
< although it should correctly document what it does provide, like the
< hash ext.
---
> although it should correctly document what it does provide.




[PHP-DEV] Re: [PHP-CVS] com php-src: Revert "Fix #71152: mt_rand() returns the different values from original mt19937ar.c": ext/standard/rand.c ext/standard/tests/math/mt_rand_value.phpt

2016-02-22 Thread Tom Worster
I ran mt_rand() through dieharder and it appears to perform well. I put 
the results here:


https://gist.github.com/tom--/a12175047578b3ae9ef8


On 2/19/16 8:39 PM, Andrea Faulds wrote:


PHP's implementation of the Mersenne Twister algorithm is buggy, so it
doesn't produce the same output as in other languages. But the buggy
algorithm produces sufficiently random sequences of apparently the same
quality as the proper algorithm.


I don't think it's safe to say that mt_rand() has the *same* qualities 
as MT19937. mt_rand()'s output has been tested using the available 
randomness testers and seems ok. But randomness testing is tricky and 
shows only that an RNG probably passes those specific tests, not that it 
has, for example, 623-dimensional equidistribution.




So we *could* simply consider this as a
documentation issue if we wanted to. I'm not saying that's the right
course of action, though.


mt_rand() is really weird.

- Some unique RNG not described or studied in the literature that by 
some fluke(*) appears to work, in a manner of speaking.


- Its output is 31 bits wide.

- Its scaling to a given [min, max] range is crazy.

It's so weird I would suggest documenting its problems and leaving it 
alone.
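
The output width, at least, is easy to check from user code. `mt_getrandmax()` reports the largest value `mt_rand()` can return, which is 2^31 - 1 rather than the 2^32 - 1 of a full 32-bit MT19937 output word.

```php
<?php
// The maximum is 2^31 - 1, i.e. 31 usable bits.
var_dump(mt_getrandmax() === 2147483647);
var_dump(mt_getrandmax() === 0x7FFFFFFF);

// Every draw without arguments falls in [0, mt_getrandmax()].
$value = mt_rand();
var_dump($value >= 0 && $value <= mt_getrandmax());
```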


Users that don't need to reseed and regenerate a sequence can use 
random_bytes() and random_int(). Those that *do* need to reseed but 
don't need specifically MT19937 are probably adequately served by mt_rand().
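
The split between the two groups of users can be sketched as follows. `random_int()` and `random_bytes()` draw from the system CSPRNG and cannot be seeded, while `mt_srand()` lets a script replay a sequence. The seed value 42 is arbitrary.

```php
<?php
// No reseeding needed: use the CSPRNG-backed functions from PHP 7.0.
$roll  = random_int(1, 6);
$token = bin2hex(random_bytes(8));

// Reseeding needed: the same seed replays the same mt_rand() sequence
// within one PHP version. The values themselves are not shown here.
mt_srand(42);
$first = [mt_rand(), mt_rand(), mt_rand()];
mt_srand(42);
$second = [mt_rand(), mt_rand(), mt_rand()];

var_dump($roll >= 1 && $roll <= 6); // bool(true)
var_dump(strlen($token));           // int(16)
var_dump($first === $second);       // bool(true)
```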


PHP is an unlikely language for the typical programs that specifically 
need MT19937. I doubt we would sort out anyone's existing problems by 
fixing it. If I'm wrong and there is indeed a need for this kind of RNG 
then I'd rather see an API that supports more than just this one generator.


So I don't think PHP should feel obliged to provide a correct MT19937, 
although it should correctly document what it does provide, like the 
hash ext.


Tom


(*) At one level it astonishes me that the buggy mt_rand() works at all 
as an RNG, given that its algorithm presumably was never actually 
designed. But the fact that it passes the standard statistical tests 
makes me wonder if it is MT19937 in disguise. I tried to figure out if 
its output is a function of MT19937's, perhaps a bit permutation, for 
example, but didn't get far.







Re: [PHP-DEV] Re: Internals and Newcomers and the Sidelines -- "let's proceed to ideas"

2016-01-14 Thread Tom Worster
On 1/14/16, 9:37 AM, "Pierre Joye"  wrote:

>I think we get every one point about where we stand, between the
>people against a CoC, against a CoC with teeth etc.

I wasn't talking about the Code of Conduct. Different topic.
 


>This is getting
>nowhere and we are really off topic.
>
>I would suggest to stop talking in circle for now and wait the next
>version of the RFC. Then we can focus on the content of the CoC, let
>me rephrase that, then we can focus only on the content of the CoC and
>the eventual "CoC group" and its role.






