date:20150222

Re: [PHP-DEV] [RFC] Exceptions in the engine

2015-02-22 Thread Tony Marston

Pierre Joye  wrote in message 
news:CAEZPtU7vt=ppk4p3vfzflaepzi_wfr2hr_av+dtzvd6d2dz...@mail.gmail.com...


On Feb 21, 2015 2:08 AM, Tony Marston tonymars...@hotmail.com wrote:


Nikita Nefedov  wrote in message news:op.xuco5eutc9evq2@nikita-pc...



On Fri, 20 Feb 2015 12:39:33 +0300, Tony Marston 
tonymars...@hotmail.com

wrote:



I disagree. Exceptions were originally invented to solve the
semipredicate problem which only exists with procedural functions, not
object methods. Many OO purists would like exceptions to be thrown
everywhere, but this would present a huge BC break. If it were possible to get
these functions to throw an exception ONLY when they are included in a try
... catch block then this would not break BC at all.




Tony, first of all - this still breaks BC, because an exception is being
thrown in a place where it used not to be...



I disagree. The following function calls would not throw exceptions:
   fopen(...);
   fwrite(...);
   fclose(...);

while the following code would:
   try {
       fopen(...);
       fwrite(...);
       fclose(...);
   } catch () {

   }


When some function's result heavily depends on the context it makes said
function much harder to reason about, and creates mental overhead for those
who'll have to read the code with this function.


And again, if you need exceptions for fopen please consider using

SplFileObject.



For file usage, yes. But are there any other procedural functions without

an Spl* alternative which would benefit from this technique?

Expected failures should not raise exceptions. For example, IOs are expected
to fail (be it network, filesystem etc.), so I would really not be in favor of
adding exceptions for similar cases. This is normal control flow.


Then why do SplFileInfo::openFile and SplFileObject::__construct throw 
exceptions if the file cannot be opened?
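
For reference, the difference being discussed can be sketched as follows. This is only a minimal illustration: the path is a hypothetical one chosen so that it should not exist, and the variable names are made up for the example.

```php
<?php
// Hypothetical path that should not exist on any machine.
$missing = sys_get_temp_dir() . '/no-such-dir-' . uniqid() . '/missing.txt';

// Procedural API: failure is signalled through the return value.
// The @ operator only silences the accompanying E_WARNING.
$handle = @fopen($missing, 'r');
$proceduralResult = ($handle === false) ? 'fopen returned false' : 'fopen succeeded';

// SPL API: the constructor throws a RuntimeException instead.
try {
    $file = new SplFileObject($missing, 'r');
    $splResult = 'SplFileObject opened the file';
} catch (RuntimeException $e) {
    $splResult = 'SplFileObject threw ' . get_class($e);
}

echo $proceduralResult, PHP_EOL, $splResult, PHP_EOL;
```

Both calls hit the same expected I/O failure; only the reporting channel differs.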


--
Tony Marston


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Re: AW: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread rich gray


On 22/02/2015 13:09, Robert Stoll wrote:

[snip]

...

PHP 7.1: necessary bug-fixes introduced with PHP 7.0
PHP 7.x: deprecate even more if required
PHP 8:
   - introduce scalar type hints which reflect the conversion rules as defined (adding
strict type hints as well is possible of course, whether with an ini-setting, a declare
statement or individually with a modifier, something like "strict int" for a
single parameter, "strict function" for all parameters incl. return type, or "strict class"
for every type defined in the class, is up to discussion)
   - exchange the behaviour of (bool), (int) etc. - use the new conversion
rules instead
   - change internal functions which do not yet obey the new conversion rules
   - change the operators which do not yet obey the new conversion rules (for
instance, + would also emit an E_RECOVERABLE_ERROR for "a" + 1)
   - (change the control structures in order that they obey the new conversion
rules as well) => as mentioned above, probably too strict for PHP

Back to this RFC. I think this RFC goes in the right direction with the
specified conversion rules. The only thing to get rid of are the implicit
conversions to bool from string, float and int IMO.
Moreover, I like that the RFC already has different steps for adding the new 
behaviour. Yet, I think it should slow down a little bit as shown. I think we 
need more time to come up with a very good strategic solution.

Thoughts?
  
+1 - good analysis - a single mode approach with consistent type 
coercion rules across the board makes absolute sense even if STH are put 
back until PHP 8.x




Re: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread Jefferson Gonzalez


On 02/22/2015 09:00 AM, Etienne Kneuss wrote:


There have been several attempts:
for JS: http://users-cs.au.dk/simonhj/tajs2009.pdf
or similar techniques applied to PHP, quite outdated though:
https://github.com/colder/phantm

You are right that the lack of static information about types is one of
the main issues. Recovering the types typically has a huge performance
cost, or is unreliable.

But seriously, time is getting wasted on this argument; it's actually a
no-brainer: more static information helps tools that rely on static
information. Yes. Absolutely. 100%.

The question is rather: at what weight should we take (potential/future)
external tools into account when developing language features?



Previously on the list, the nodejs JIT engine was mentioned as a working
example of a JIT without the language having any sort of type
information. While this is true, I think the amount of checks and
resources required for the generated machine code to achieve this
should also be considered. In these tests you can see that in most
situations the JavaScript V8 engine used by nodejs uses much more memory
than current PHP (compare it to C also)
(benchmarksgame.alioth.debian.org/u64/compare.php?lang=v8&lang2=php).
Yes, it is faster, but it consumes much more CPU and RAM in most
situations, and I'm sure that it is related to the dynamic nature of the
language.


A JIT or AOT machine code generator IMHO will never make decent use of
system resources without some sort of strong/strict typing rules;
somebody explain if that's not the case.


As I see it, here is an example, supposing the JIT generated C++ code that is
then compiled to machine code:


function calc(int $val1, int $val2) : int {return $val1 + $val2;}

In weak mode I see the generated code being something like this:

Variant* calc(Variant val1, Variant val2) {
    if(val1.isInt() && val2.isInt())
        return new Variant(val1.toInt() + val2.toInt());

    else if(val1.isFloat() && val2.isFloat())
        return new Variant(val1.toInt() + val2.toInt());
    else
        throw new RuntimeError();
}

while in strict mode the generated code could be:

int calc(int val1, int val2) {
    return val1 + val2;
}

So in this scenario it is clear that strict mode performance and memory
usage would be better. Conversion code would be required only in
some cases, for example:


calc(1, 5) // No need for casting

calc((int) "12", (int) "15") // Needs casting depending on how the
parser deals with it


If my example is right, it means strict would be better than weak for
achieving good performance, since weak is almost the same situation we have
now with zvals. Also, I think it is wrong to say that casting will always
take place in strict mode.
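
To make the call-site side of this concrete, here is a small PHP sketch. It assumes the declare(strict_types=1) semantics that PHP 7.0 eventually shipped with, which may differ in detail from the proposals under discussion in this thread; the variable names are only for illustration.

```php
<?php
declare(strict_types=1); // assumption: PHP 7.0+ strict mode for this file

function calc(int $val1, int $val2): int {
    return $val1 + $val2;
}

// Integers pass straight through, no conversion needed.
$plain = calc(1, 5);

// Strings have to be converted explicitly at the call site.
$cast = calc((int) "12", (int) "15");

// Without the cast, strict mode rejects the call with a TypeError.
try {
    calc("12", "15");
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true;
}

var_dump($plain, $cast, $rejected);
```

Only the second call pays for a conversion, and it does so visibly in the caller's code rather than inside calc().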


So I have some questions floating in my mind about the coercive RFC.

1. Could weak mode provide the required rules to implement a JIT
with a sane level of memory and CPU usage?


2. I see that the proponents of dual weak/strict modes are offering to
write an AOT implementation if strict makes it, and the impressive work of
Joe (JITFU) and Anthony on recki-ct with the strict mode could be taken
to another level of integration with PHP and performance. IMHO it is harder
and more resource hungry to implement a JIT/AOT using weak mode. With
that said, if a JIT implementation is developed, will the story of the
ZendOptimizer being a commercial solution be repeated, or would this
JIT implementation be part of the core?


That's all that comes to mind now, and while many people don't care about
performance, IMHO a programming language mainly targeted at the web
should show some care in this department.



Re: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread Shashank Kumar

On Sat, Feb 21, 2015 at 2:39 PM, Pavel Kouřil pajou...@gmail.com wrote:

 On Sat, Feb 21, 2015 at 11:25 PM, Zeev Suraski z...@zend.com wrote:
  There’s a fundamental difference between the two RFCs that goes beyond
  whether using a global INI setting and the other per-file setting.  The
  fundamental difference is that the endgame of the Dual Mode RFC is having
  two modes – and whatever syntax we’ll add, will be with us forever;  and
 in
  the Coercive STH RFC – the endgame is a single mode with no INI entries,
  and opening the door to changing the rest of PHP to be consistent with
 the
  same rule-set as well (implicit casts).   The challenge with the Coercive
  STH RFC is figuring out the best transition strategy, but the endgame is
  superior IMHO.

 Hello,

 the two modes was something that I didn't like, at all, as a userland
 developer. It seems really scary that decision to add 2 modes would
 mean that every PHP code could have been written in any of these 2
 ways and it would stick forever with PHP, because removing it again if
 it proved to be a bad feature would be IMHO really painful.

 So a single mode is infinitely better than 2 modes.

 Also, personally, I would prefer #1 or #2 version for internal
 functions, but definitely without an INI switch. Not being able to
 change it on some hostings could make development for the transition
 period kinda painful.

 Regards
 Pavel Kouril


As a userland developer I will chime in and say that I love this RFC
better than the dual mode. In my opinion, having dual mode will
put unnecessary cognitive burden on me, especially when reading other
people's code and libraries. While the current RFC's conversion rules are also
different from what exists in PHP, I can get used to them and they are more
in line with what should ideally be there (except contentious ones like
bool).

I would certainly love to see the type coercion rules being unified and
slightly tightened up over time, and this is a good first step for that.
I am not a language expert, but of the languages I have learnt
and developed in, I can't think of any that allow dual mode type coercion
rules in such an explicit manner as the one proposed by Andrea/Anthony.

Thanks
Shashank

Re: [PHP-DEV] Unnecessary extensions ...

2015-02-22 Thread Michael Wallner


 On 22 02 2015, at 11:31, Lester Caine les...@lsces.co.uk wrote:
 
 With the discussion on adding http extension by default and not now
 having other key extensions in a normal build I'm looking at what I NEED
 and what I can get away without. On the current PHP7 test build I do not
 have mysqlnd installed as I don't use mysql, but I can't make the mysql
 section available in a second php-fpm instance because I can't add
 mysqlnd as a shared module.


Just to clarify that bit: enabling ext/http by default wouldn’t mean it’s not 
possible to disable it.

An extension being enabled by default does not imply that it cannot be
disabled, unlike standard or spl.



Re: [PHP-DEV] PDO_DBLIB type handling

2015-02-22 Thread Matteo Beccati


On 21/02/2015 23:12, Yasuo Ohgaki wrote:

Hi Adam,

On Sat, Feb 21, 2015 at 2:22 AM, Adam Baratz adam.bar...@gmail.com wrote:


This driver returns all column data as a string, regardless of how it's
represented in the DB. I created a patch for my own use that syncs up the
type handling with the behavior of the MSSQL extension. This seems like it
would be of general use. Does anyone have any feedback before I put
together an RFC? My main question would be whether people would rather have
this be the default/only behavior, or whether it should be opted into
via PDO::ATTR_STRINGIFY_FETCHES.



Databases return string data in order to return the data exactly as stored in
the DB. The most obvious case is the NUMERIC data type, which can have
arbitrary precision. We may have 128 bit INT in the near future also.

So it should return string by default, and PHP may optionally convert types
into PHP native types. Not the other way around. IMHO.


The default behaviour of mysql/pgsql drivers is to convert to the 
matching PHP type, if possible. That can be turned off via 
PDO::ATTR_STRINGIFY_FETCHES = true.


If PDO_DBLIB doesn't behave like that, I'd say it's a bug that needs to 
be fixed, but possibly only applied to a major/minor release due to the 
BC break.
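
As a rough sketch of the opt-out mentioned above: the snippet below uses an in-memory SQLite database purely as a stand-in (an assumption made so the example is self-contained; it needs the pdo_sqlite extension, and the mysql, pgsql and dblib drivers discussed here may differ in detail).

```php
<?php
// Stand-in connection: in-memory SQLite, chosen only so the example runs anywhere.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Ask PDO to hand every fetched value back as a string.
$pdo->setAttribute(PDO::ATTR_STRINGIFY_FETCHES, true);

$row = $pdo->query('SELECT 42 AS n')->fetch(PDO::FETCH_ASSOC);

// With stringification enabled the integer column arrives as a string.
var_dump($row['n']);
```

What the same query returns with the attribute set to false depends on the driver and the PHP version, which is exactly the inconsistency under discussion.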



Cheers
--
Matteo Beccati

Development & Consulting - http://www.beccati.com/


Re: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread Etienne Kneuss

On Sat Feb 21 2015 at 21:08:39 Anthony Ferrara ircmax...@gmail.com wrote:

 Zeev,

 I won't nit-pick every point, but there are a few I think need to be
 clarified.

   Proponents of Dynamic STH bring up consistency with the rest of the
  language, including some fundamental type-juggling aspects that have
 been
  key tenets of PHP since its inception. Strict STH, in their view, is
  inconsistent
  with these tenets.
 
  Dynamic STH is apparently consistent with the rest of the language's
  treatment of scalar types. It's inconsistent with the rest of the
  language's treatment of parameters.
 
  Not in the way Andrea proposed it, IIRC.  She opted to go for consistency
  with internal functions.  Either way, at the risk of being shot for
 talking
  about spiritual things, Dynamic STH is consistent with the dynamic
 spirit of
  PHP, even if there are some discrepancies between its rule-set and the
  implicit typing rules that govern expressions.  Note that in this RFC I'm
  actually suggesting a possible way forward that will align *all* aspects
 of
  PHP, including implicit casting - and have them all governed by a single
 set
  of rules.

 The point I was making up to there is that we currently have 2 type
 systems: user-land object and ZPP-scalar. So in any given function you
 have 2 type systems interacting. The current ZPP scalar type is
 dynamic, and user-land object static.

 With the proposal here, you'd unify user-land scalar to behave as
 zpp-scalar. So you'd have two type systems in any given function:
 scalar and object (which behave differently).

 My proposal gives you the same two by default (scalar and object) and
 a strict switch to collapse them into a single, unified type system.

 This is even more apparent with the int-float acceptance, because we
 can mentally model Float as an object that extends Int. Then it makes
 perfect sense why you'd accept ints where you see floats, but not the
 opposite.

  However there's an important point to make here: a lot of best practice
  has
  been pushing against the way PHP treats scalar types in certain cases.
  Specifically around == vs === and using strict comparison mode in
  in_array,
  etc.
 
  I think you're correct on comparisons, but not so much on the rest.
 Dynamic
  use of scalars in expressions is still exceptionally common in PHP code.
  Even with comparisons, == is still very common - and you'd use == vs. ===
  depending on what you need.
 
  So while it appears consistent with the rest of PHP, it only does so if
  you
  ignore a large part of both the language and the way it's commonly used.
 
  Let's agree to disagree.  That's one thing we can always agree on!  :)

 I'm talking about the object system. I don't think you're disagreeing
 that it's static. Hence coercive scalars are consistent only if you
 look at 1/2 the type system. That was the point I was making there.

  3. Just Do It but give users an option to not - This has the problems
  that
  E_DEPRECATED has, but it also gets us back to having fundamental code
  behavior controlled by an INI setting, which for a very long time this
  community has generally seen as a bad thing (especially for portability
  and
  code re-use).
 
  I do too, and I was upfront about their cons, not just pros.  And yet,
 they
  all bring us to a much better outcome within a relatively short period of
  time (in the lifetime of a language) than the Dual Mode will.

 Let's agree to disagree that an ini setting will be better than a
 per-file setting.

 In fact, I personally think this is major enough of an issue that I
 will vote no simply on this reason alone (type behavior depending on
 an ini setting in any way shape or form).

   Further, the two sets can cause the same functions to behave
   differently depending on where they're being called
 
  I think that's misleading. The functions will always behave the same.
  The difference is how you get data into the function. The behavior
  difference
  is in your code, not the end function.
 
  I'll be happy to get a suggestion from you on how to reword that.
  Ultimately, from the layman user's point of view, she'd be calling foo()
  from one place and have it accept her arguments, and foo() from another
  place and have it reject the very same arguments.

 Let me think on it and I will come up with something.

  With strict mode, you'd have to embed a cast (smart or explicit) to
  convert to
  an integer at the point the data comes in.
 
  First, I'm not aware of smart/safe casts being available or proposed at
 this
  point.
  Secondly, why at the point the data comes in?  That would be ideal for
  static analyzers, but it's probably a lot more common that it will be
 done
  at the first point in time where it gets rejected.

 By smart cast I was referring to a function which checked
 is_numeric(). Not a new language construct.

  I have a hard time connecting to the 'power' approach.  I think
 developers
  want their code to work, with

[PHP-DEV] Unnecessary extensions ...

2015-02-22 Thread Lester Caine

With the discussion on adding http extension by default and not now
having other key extensions in a normal build I'm looking at what I NEED
and what I can get away without. On the current PHP7 test build I do not
have mysqlnd installed as I don't use mysql, but I can't make the mysql
section available in a second php-fpm instance because I can't add
mysqlnd as a shared module.

Just what is the current state on what is 'required' and what is still
optional. I will return to banging on about breaking up php-src so that
one CAN get away with building individual modules and I see no reason
why those who want 'strict' can't have that as a pecl module to replace
'lax' operations.

-- 
Lester Caine - G8HFL
-
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk


Re: [PHP-DEV] Unnecessary extensions ...

2015-02-22 Thread Joe Watkins

 I see no reason
why those who want 'strict' can't have that as a pecl module to replace
'lax' operations.

Simple, the most robust implementation is inferior to internal support.

Making a call to this:

function (int $some, double $other) {

}

behave as if Zend is strict is quite easy, what is difficult is:

class Foo {
    public function bar(int $some) {

    }
}

class Qux extends Foo {
    public function bar(double $some) {

    }
}

Enforcing our current rules is so hard you might as well call it impossible.

TL;DR because internal support is much much much better, in every possible
way

Cheers
Joe

Even if you managed it, it would not be robust, in any reasonable opinion.



On Sun, Feb 22, 2015 at 10:31 AM, Lester Caine les...@lsces.co.uk wrote:

 With the discussion on adding http extension by default and not now
 having other key extensions in a normal build I'm looking at what I NEED
 and what I can get away without. On the current PHP7 test build I do not
 have mysqlnd installed as I don't use mysql, but I can't make the mysql
 section available in a second php-fpm instance because I can't add
 mysqlnd as a shared module.

 Just what is the current state on what is 'required' and what is still
 optional. I will return to banging on about breaking up php-src so that
 one CAN get away with building individual modules and I see no reason
 why those who want 'strict' can't have that as a pecl module to replace
 'lax' operations.

 --
 Lester Caine - G8HFL
 -
 Contact - http://lsces.co.uk/wiki/?page=contact
 L.S.Caine Electronic Services - http://lsces.co.uk
 EnquirySolve - http://enquirysolve.com/
 Model Engineers Digital Workshop - http://medw.co.uk
 Rainbow Digital Media - http://rainbowdigitalmedia.co.uk


Re: [PHP-DEV] Unnecessary extensions ...

2015-02-22 Thread Joe Watkins

Apologies for terribly formatted communication there.

Cheers
Joe

On Sun, Feb 22, 2015 at 10:38 AM, Michael Wallner m...@php.net wrote:


  On 22 02 2015, at 11:31, Lester Caine les...@lsces.co.uk wrote:
 
  With the discussion on adding http extension by default and not now
  having other key extensions in a normal build I'm looking at what I NEED
  and what I can get away without. On the current PHP7 test build I do not
  have mysqlnd installed as I don't use mysql, but I can't make the mysql
  section available in a second php-fpm instance because I can't add
  mysqlnd as a shared module.


 Just to clarify that bit: enabling ext/http by default wouldn’t mean it’s
 not possible to disable it.

 An extension enabled by default, does not implicitly mean it cannot be
 disabled, like standard or spl.



RE: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread Zeev Suraski

From: Etienne Kneuss [mailto:col...@php.net]
Sent: Sunday, February 22, 2015 3:00 PM
To: Anthony Ferrara; Zeev Suraski
Cc: PHP internals
Subject: Re: [PHP-DEV] Coercive Scalar Type Hints RFC

 The question is rather: at what weight should we take (potential/future)
 external tools into account when developing language features?

Agreed!  My answer - Static Analyzers need to be designed for Languages,
rather than Languages being designed for Static Analyzers.

Will send additional thoughts on Static Analysis in a separate, off-list
email to put this argument to rest as both Anthony and I agreed to.

Zeev


AW: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread Robert Stoll

Hi Zeev,

 -----Original Message-----
 From: Zeev Suraski [mailto:z...@zend.com]
 Sent: Saturday, 21 February 2015 18:22
 To: PHP internals
 Subject: [PHP-DEV] Coercive Scalar Type Hints RFC
 
 All,
 
 
 
 I’ve been working with François and several other people from internals@ and 
 the PHP community to create a single-mode
 Scalar Type Hints proposal.
 
 
 
 I think the RFC is a bit premature and could benefit from a bit more 
 time, but given the time pressure, as well as the fact
 that a not fully compatible subset of that RFC was published and has people 
 already discussing it, it made the most sense to
 publish it sooner rather than later.
 
 
 
 The RFC is available here:
 
 
 
 wiki.php.net/rfc/coercive_sth
 
 
 
 Comments welcome!
 
 
 Zeev

First of all, thank you and all others working on this RFC, and also the people
working on the other RFC related to scalar type hints. It is good that PHP will
get scalar type hints eventually.

Although I think the strict mode as proposed in the v0.5 RFC is nice as such, I
prefer this RFC simply because it does not introduce different modes.
I genuinely believe that different modes will be very harmful for PHP.
Yet, this RFC is not perfect either. IMO PHP is not ready for scalar type
hints, at least not for PHP 7.0, and we should instead focus on clearing
the way for the introduction of scalar type hints in PHP 7.x and hence
introduce in PHP 7.0 all the BC breaks which are necessary in order that
scalar type hints can be added to PHP 7.x later on.

I am provoking on purpose of course. But rightly so, because I think in all the 
debate about strict/weak scalar type hints we lost focus on what really matters 
in language growth, namely its maturation. 

For instance, Pierre and others complained that string -> bool and float ->
bool are accepted by this RFC. While I agree that it is a bad idea to apply
implicit conversions to such input (I would not even allow int -> bool to be
honest), it makes total sense for PHP to behave like this at the moment. I
would even claim that null, array, object, literally everything should be
accepted as well, since the explicit (bool) accepts everything and some
implicit casts such as the one in an if statement also accept everything.

From questions like these:
 Boolean STH (bool):
 this is by far too weak. How could strings be considered valid, how?
 "true" -> Boolean true? I suppose then "false" will be boolean false?
 What is the boolean value of float 0.5?
 At the very least only integer should be accepted, 0 -> false, anything >= 1
 true

I get the impression that even internals start to get confused about the
conversion rules PHP has. Implicitly converting something to bool should be
exactly the same as an explicit conversion (and is thus straightforward to
verify: http://3v4l.org/nVgbG ).
We should start to eliminate the different behaviour of implicit/explicit 
castings [1], to have a consistent and predictable/obvious behaviour in the 
long run. Or in other words, and that is what I meant above, PHP's type system 
needs to mature. While I can understand that it looks beneficial to have all 
kind of reliefs for the beginner, it is rather harmful in the long run. PHP has 
so many inconsistencies and requires a user to be aware of all kind of edge 
cases that I think bugs are introduced more frequently than necessary. We 
already have different conversion mechanisms in PHP and I guess the reason why 
https://wiki.php.net/rfc/safe_cast was declined is based on the fact that most 
people did not want to see yet another group of conversion rules.

There were people claiming that PHP follows the philosophy that a user does not 
need to know anything about scalar types. PHP will deal with it via type 
juggling. A function/operator requires an int? Just pass a scalar and PHP will 
convert it automatically via type juggling to int. 
That is long gone (probably was never there) because the user had to know 
exactly what type can be passed or rather what values, otherwise bugs are 
inevitable. Consider the following:

"a" % 1;
fmod("a", 0.5);

Kind of logical that % accepts any kind of scalar where fmod does not, right? I
do not want to exaggerate too much on this, but I think you get my position that
PHP needs to get rid of these inconsistencies rather than adding yet another
obstacle which makes consistency harder to reach. Once scalar type hints are in
place they should follow the conversion rules which we want to have in PHP in the
long run; otherwise the BC impact of changing them would be too big
and we would at least need to wait till PHP 8 if not even PHP 9.

So what does that mean for scalar types?
IMO it means that way more important than adding scalar type hints to PHP 7.0 
is to agree on a new set of conversion rules for the long run. PHP should 
strive to have one consistent set of conversion rules which apply in all places 
where implicit or explicit conversion are

Re: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread Pavel Kouřil

On Sun, Feb 22, 2015 at 2:09 PM, Robert Stoll p...@tutteli.ch wrote:

 I see the migration plan roughly as follows:

 PHP 7.0:
   - reserve keywords: bool, int, float including alternatives
   - deprecate alternative type names such as boolean, integer etc.

   - introduce new conversion functions which reflect the current behaviour of 
 (bool), (int) etc.
   -- as mentioned above, they could be named oldSchoolBoolConversion etc.
   -- Encourage users to use this function instead of (bool), (int) etc 
 since (bool) etc. will change with PHP 8.0. Also mention, that this function 
 should only be used if the weakness is really required otherwise use the new 
 conversion functions from below

   - introduce new conversion functions which reflect the new defined 
 conversion rule set (which shall be the only one encouraged in the future) 
 Those functions shall trigger an E_RECOVERABLE_ERROR
 -- encourage users to use this functions instead of (bool), (int) and 
 oldSchoolBoolConversion etc.  (unless the weakness is really required, then 
 use oldSchoolBoolConversion)

   - update the docs in order to reflect the new encouraged way. Also mention 
 that:
  - (bool), (int) etc. will change their behaviour in PHP 8.0
  - internal functions will use the new conversion rules if not already 
 done this way in PHP 8.0 (for instance, strstr will no longer accept a scalar 
 as third parameter in the case where we do not support implicit casts to bool)
  - operators will use the new conversion rules if not already done this 
 way in PHP 8.0
  - (control structures will use the new conversion rules if not already
 done this way in PHP 8.0) => Maybe this is too strict for most of you and goes
 against the spirit of PHP (I suppose some of you will say that - fair enough,
 I guess you are right). In this case, I would at least use the term "loose
 comparison" as mentioned here:
 http://php.net/manual/en/types.comparisons.php#types.comparisions-loose
 instead of using the term "conversion", then it is compatible with the
 changes introduced in PHP 8.0

 PHP 7.1: necessary bug-fixes introduced with PHP 7.0
 PHP 7.x: deprecate even more if required
 PHP 8:
   - introduce scalar type hints which reflect the conversion rules as defined 
 (adding strict type hints as well is possible of course, whether with an 
 ini-setting, a declare statement or individually with a modifier something 
 like strict int for a single parameter or strict function for all 
 parameters incl. return type or strict class for every type defined in the 
 class is up to discussion)
   - exchange the behaviour of (bool), (int) etc. - use the new conversion 
 rules instead
   - change internal functions which do not yet obey the new conversion 
 rules
   - change the operators which do not yet obey the new conversion rules 
 (for instance, + would also emit an E_RECOVERABLE_ERROR for "a" + 1)
   - (change the control structures so that they obey the new conversion 
 rules as well) => as mentioned above, probably too strict for PHP

 Back to this RFC. I think this RFC goes in the right direction with the 
 specified conversion rules. The only thing to get rid of is the implicit 
 conversion to bool from string, float and int IMO.
 Moreover, I like that the RFC already has different steps for adding the new 
 behaviour. Yet, I think it should slow down a little bit as shown. I think we 
 need more time to come up with a very good strategic solution.


Hello,

Am I understanding correctly that you are suggesting changes to type
casting? This seems like a bad idea. Explicit and implicit conversions
are something really different. Generally, implicit conversions are OK
only when no data is lost and explicit conversions (casts) are used
when you realize some information can get lost and you still want to
proceed with the conversion. Having only one type of conversion is
IMHO weird.
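
To make the distinction concrete, here is a minimal sketch of how explicit and implicit conversions can differ. It assumes PHP 7 or later in its default (coercive) mode, which is not necessarily what the plan quoted above would produce, and takesInt() is a hypothetical helper used only for illustration.

```php
<?php
// Sketch only: assumes PHP 7+ in the default coercive mode.
// takesInt() is a hypothetical helper, not part of any RFC.
function takesInt(int $n) {
    return $n;
}

// Explicit cast: always succeeds, silently discarding information.
$cast = (int) "abc";

// Implicit conversion at a typed parameter: refuses the same input.
try {
    takesInt("abc");
    $implicitAccepted = true;
} catch (TypeError $e) {
    $implicitAccepted = false;
}

// A lossless numeric string is accepted by both paths.
$both = [(int) "42", takesInt("42")];

var_dump($cast, $implicitAccepted, $both);
```

The cast is the "I know data may be lost" path; the parameter coercion only goes through when nothing is lost.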

Also, I'm not a fan of having to wait for scalar type hints for a few
more years. :(

Regards
Pavel Kouril

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Re: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread Joe Watkins

 1. Could weak mode provide the rules required to implement a JIT
with a sane level of memory and CPU usage?

There is no objective answer to the question while it has the clause "with
a sane level of".

The assertion in the RFC that says there is no difference between strict
and weak types, in the context of a JIT/AOT compiler, is wrong.

function add(int $l, int $r) {
return $l + $r;
}

The first instruction in the interpreted code is not ZEND_ADD; first,
parameters must be received from the stack.

If that's a strict function, then un-stacking parameters is relatively
easy; if it's a dynamic function, then you have to generate code that is
considerably more complicated.

This is an inescapable difference, of the kind that definitely does
have a negative impact on implementation complexity, runtime, and
maintainability.

To me, it only makes sense to compile strict code AOT or JIT; If you want
dynamic behaviour, we have an extremely mature platform for that.
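
As a rough illustration of the work done at the function boundary, here is a sketch that assumes PHP 7 or later in its default (non-strict) mode. It only shows the observable behaviour, not what any JIT would actually emit.

```php
<?php
// Sketch only: assumes PHP 7+ running in the default coercive mode.
function add(int $l, int $r) {
    return $l + $r;
}

// Integers pass through the boundary untouched.
$fromInts = add(2, 3);

// Numeric strings are coerced to int before the body runs; this is the
// extra receive-time work that non-strict callers require.
$fromStrings = add("2", "3");

// Non-numeric input is rejected at the boundary, before any addition.
try {
    add("two", 3);
    $rejected = false;
} catch (TypeError $e) {
    $rejected = true;
}

var_dump($fromInts, $fromStrings, $rejected);
```

In all three calls the body only ever sees integers; what varies is how much has to happen before the body starts.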

 2. ... With that said, if a JIT implementation is developed, will the
story of the ZendOptimizer being a commercial solution be repeated, or
would this JIT implementation be part of the core?

There should hopefully be no need to complicate the core with the
implementation. The number of people who are capable of maintaining Zend
is low enough already; the number of people able to maintain something as
new (for us) and complex as a JIT/AOT engine is even lower, I fear.

I think it's likely that Anthony and I, and Dmitry want different things
for a JIT/AOT engine. I think Anthony and I are preferring an engine that
requires minimal inference because type information is present (or
implicit), while Dmitry probably favours the kind that can infer at
runtime, the dynamic kind, like Zend is today. They are a world apart, I
think, I'll be happy to be proven wrong about that.

I like to think that even if Dmitry wrote it all by himself, it would be
open source from the start; in fact I don't think that will happen. I'm
hoping we'll all work on the same solution together.

Cheers
Joe




On Sun, Feb 22, 2015 at 2:24 PM, Jefferson Gonzalez jgm...@gmail.com
wrote:

 On 02/22/2015 09:00 AM, Etienne Kneuss wrote:

  There have been several attempts:
 for JS: http://users-cs.au.dk/simonhj/tajs2009.pdf
 or similar techniques applied to PHP, quite outdated though:
 https://github.com/colder/phantm

 You are right that the lack of static information about types is one of
 the main issues. Recovering the types typically has a huge performance
 cost, or is unreliable.

 But seriously, time is getting wasted on this argument; it's actually a
 no-brainer: more static information helps tools that rely on static
 information. Yes. Absolutely. 100%.

 The question is rather: how much weight should we give (potential/future)
 external tools when developing language features?


 Previously on the list, the nodejs JIT engine was mentioned as a working
 example of a JIT for a language without any sort of type information. While
 this is true, I think the amount of checks and resources required by the
 generated machine code to achieve this should also be considered. In these
 tests you can see that in most situations the JavaScript V8 engine used by
 nodejs uses much more memory than current PHP (compare it to C also)
 (benchmarksgame.alioth.debian.org/u64/compare.php?lang=v8&lang2=php)
 Yes, it is faster, but it consumes much more CPU and RAM in most
 situations, and I'm sure that it is related to the dynamic nature of the
 language.

 A JIT or AOT machine code generator IMHO will never make decent use of
 system resources without some sort of strong/strict typing rules; somebody
 please explain if that's not the case.

 As I see it, for example, if the JIT generated C++ code in order to then
 generate the machine code:

 function calc(int $val1, int $val2) : int {return $val1 + $val2;}

 In weak mode I imagine the generated code would be something like this:

 Variant* calc(Variant val1, Variant val2) {
     if (val1.isInt() && val2.isInt())
         return new Variant(val1.toInt() + val2.toInt());
     else if (val1.isFloat() && val2.isFloat())
         return new Variant(val1.toInt() + val2.toInt());
     else
         throw new RuntimeError();
 }

 while in strict mode the generated code could be:

 int calc(int val1, int val2) {
     return val1 + val2;
 }

 So in this scenario it is clear that strict mode performance and memory
 usage would be better. Conversion code would be required only in some cases,
 for example:

 calc(1, 5) // No need for casting

 calc((int) "12", (int) "15") // Needs casting depending on how the parser
 deals with it

 If my example is right, it means strict would be better for achieving good
 performance than weak, which is almost the same situation we have now
 with zvals. Also, I think it is wrong to say that casting will always take
 place in strict mode.

 So I have some questions on my mind about the coercive RFC.

RE: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread Robert Stoll

Hi Pavel,

 -Original Message-
 From: Pavel Kouřil [mailto:pajou...@gmail.com]
 Sent: Sunday, February 22, 2015 15:54
 To: Robert Stoll
 Cc: Zeev Suraski; PHP internals
 Subject: Re: [PHP-DEV] Coercive Scalar Type Hints RFC
 
 Hello,
 
 Am I understanding correctly that you are suggesting changes to type casting? 
 This seems like a bad idea. Explicit and
 implicit conversions are something really different. Generally, implicit 
 conversions are OK only when no data is lost and
 explicit conversions (casts) are used when you realize some information can 
 get lost and you still want to proceed with the
 conversion. Having only one type of conversion is IMHO weird.

Yes, I am suggesting that conversions behave the same regardless of whether 
they are implicit or explicit. The only difference between the two should be 
that one is stated explicitly by the user whereas the other is applied 
implicitly. Other programming languages behave like this and are more 
predictable for users as well as developers because one does not need to learn 
two sets of conversion rules.

 
 Also, I'm not a fan of having to wait for scalar type hints for a few more 
 years. :(
 
 Regards
 Pavel Kouril
 

Re: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread Jefferson Gonzalez


On 02/22/2015 11:06 AM, Joe Watkins wrote:


This is an inescapable difference, of the kind that definitely does
have a negative impact on implementation complexity, runtime, and
maintainability.

To me, it only makes sense to compile strict code AOT or JIT; If you want
dynamic behaviour, we have an extremely mature platform for that.


So basically weak mode coupled with a JIT will be almost the same thing 
we have today. The only difference is that the opcache would be replaced 
with machine code (for a bit more performance), but the same logic and 
code used in the Zend engine would also be used in the code generated by 
the JIT (more bloat).


On the other hand, a strict type mode would allow the generation of 
machine code that is much cleaner, similar to a C-to-machine-code 
translation. It would be more efficient and less resource hungry, to the 
point that function code generated by the AOT or JIT compiler would be 
more efficient than the functions provided by the Zend engine, which do 
lots of type checking/parsing (less bloat).




There should hopefully be no need to complicate the core with the
implementation, the number of people that are capable of maintaining Zend
is low enough already, the number of people able to maintain something as
new (for us) and complex as a JIT/AOT engine is even less, I fear.


And that's why I asked about the commercial stuff. The way things are 
looking, from a technical perspective strict mode opens the door to an 
easier implementation of AOT or JIT, while a weak mode would only make it 
harder for others in the community to work on such things. That again 
raises the question: does this whole idea of favoring a weak model by the 
minority serve as an impediment/complication for others so that they 
(those who favor weak) can force a commercial solution?



I think it's likely that Anthony and I, and Dmitry want different things
for a JIT/AOT engine. I think Anthony and I are preferring an engine that
requires minimal inference because type information is present (or
implicit), while Dmitry probably favours the kind that can infer at
runtime, the dynamic kind, like Zend is today. They are a world apart, I
think, I'll be happy to be proven wrong about that.

I like to think that even if Dmitry wrote it all by himself, it would be
opensource from the start, in fact I don't think that will happen. I'm
hoping we'll all work on the same solution together.


And it would be ideal, from a community point of view, to have the most 
capable people develop this solution as a single team. IMHO a dual 
weak/strict mode is the best way of getting people to work together in a 
way that benefits the community. Otherwise, a single person working alone 
on a solution could serve as a justification to commercialize something 
that is currently being offered by others (HHVM).




[PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Zeev Suraski

 -Original Message-
 From: Jefferson Gonzalez [mailto:jgm...@gmail.com]
 Sent: Sunday, February 22, 2015 4:25 PM
 To: Etienne Kneuss; Anthony Ferrara; Zeev Suraski
 Cc: PHP internals
 Subject: Re: [PHP-DEV] Coercive Scalar Type Hints RFC

Jefferson,

Please note that Anthony, the lead of the Dual Mode RFC, said this earlier
on this thread, referring to the claim that Strict STH can improve JIT/AOT
compilation:

"A statement which I said on the side, and I said should not impact RFC or
voting in any way. And is in no part in my RFC at all."

Please also see:

marc.info/?l=php-internals&m=142439750614527&w=2

So while Anthony and I don't agree on whether there are performance gains
to be had from Strict STH, both of us agree that it's not at a level that
should influence our decision regarding the RFCs on the table.

I wholeheartedly agree with that stance, which is why I also listed the
apparently extremely widespread misconception (IMHO) that Strict STH can
meaningfully help JIT/AOT in my RFC.

Despite that, as your email suggests, there are still (presumably a lot)
of people out there that assume that there are, in fact, substantial gains
to be had from JIT/AOT if we introduce Strict STH.  I'm going to take
another stab at explaining why that's not the case.

 A JIT or AOT machine code generator IMHO will never have a decent use of
 system resources without some sort of strong/strict typed rules, somebody
 explain if thats not the case.

It kind of is and kind of isn't.

There's consensus, I think, that if PHP was completely strongly typed -
i.e., all variables need to be declared and typed ahead of time, cannot
change types, etc. - we'd clearly be able to create a lot of optimizations
in AOT that we can't do today.  That's the part that 'is the case'.  But
nobody is suggesting that we do that.  The discussion on the table is
very, very narrow:

-- Can the code generated for a strict type hint somehow be optimized
significantly better than the code generated for a dynamic/coercive type
hint?

And here, I (as well as Dmitry, who actually wrote a JIT compiler for PHP)
claim that it isn't the case.  To be fair, there's no consensus on this
point.

Let me attempt, again, to explain why we don't believe there are any gains
associated with Strict STH, be it with the regular engine, JIT or AOT.

Consider the following code snippet:

function strict_foo($x)
{
  if (!is_int($x)) {
    trigger_error();
  }
  .inner_code.
}

function very_lax_foo($x)
{
  $x = (int) $x;
  .inner_code.
}

function test_strict()
{
  .outer_code.
  strict_foo($x);
}

function test_lax()
{
  .outer_code.
  very_lax_foo($x);
}

test_strict();
test_lax();

strict_foo() implements a pretty much identical check to the one that a
Strict integer STH would perform.
very_lax_foo() implements an explicit type conversion to int, that can
pretty much never fail - which is significantly more lax than what is
proposed for weak types in the Dual Mode RFC, and even more so compared to
the Coercive STH RFC.
.inner_code. is identical between the two foo() functions, and
.outer_code. is identical between the two tester functions.

The claim that strict types can be more efficiently optimized than more
lax types suggests it should be possible to optimize the code flow for
test_strict()/strict_foo() significantly better than for
test_lax()/very_lax_foo() using JIT/AOT.

Let's dive in.  Beginning with the easy part, that's been mentioned
countless times - it's clear that it's just as easy (or hard) to optimize
the .inner_code. block in the two implementations of foo().  It can bank
on the exact same assumption - $x would be an integer.  So we can
optimize the two function bodies to exactly the same level.  For example,
if we're sure that $x inside the function never changes type - it can be
optimized down to a C-level long.  That's oversimplifying things a bit,
but the important thing here is that it can be easily proven that the two
function bodies can be optimized to the exact same level, for better or
worse.  The only difference between them is how they handle non-integer
inputs: the strict implementation errors out if it gets a non-integer
typed value, while the lax version happily accepts everything.  But that's
a functionality difference, not a performance one (i.e., if you want the
value to be accepted in the strict case, you'd manually conduct the
conversion before the call is made, or sooner - resulting in roughly the
same behavior and performance).
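
For anyone who wants to run the comparison, here is a minimal sketch of the two functions with a concrete body. The body ($x * 2) is a hypothetical stand-in for .inner_code., and an exception replaces the bare trigger_error() call so that the failure can be observed; neither detail comes from the snippet above.

```php
<?php
// Sketch only: concrete stand-ins for the two functions discussed above.
// The body ($x * 2) is a hypothetical placeholder for .inner_code.
function strict_foo($x) {
    if (!is_int($x)) {
        // Swapped in for trigger_error() so the failure is catchable.
        throw new InvalidArgumentException('expected int');
    }
    return $x * 2;   // $x is known to be an int here
}

function very_lax_foo($x) {
    $x = (int) $x;
    return $x * 2;   // $x is known to be an int here as well
}

// Same result for integer input.
$a = strict_foo(21);
$b = very_lax_foo(21);

// Only the boundary behaviour differs for non-integer input.
$c = very_lax_foo("21");
try {
    strict_foo("21");
    $strictAccepted = true;
} catch (InvalidArgumentException $e) {
    $strictAccepted = false;
}

var_dump($a, $b, $c, $strictAccepted);
```

Past the first statement, both bodies operate on an int, which is the point being made about .inner_code.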

Now the slightly trickier part - the .outer_code. block.  What can we say
about the type of $x, without knowing what code is in there?  Not a whole
lot.  We know that if $x isn't going to be of type int at the end of this
block, test_strict() is going to fail, but that doesn't mean $x will truly
be an int.  The fact I want to be young and healthy doesn't mean I'm going
to magically become young and healthy :)

Let's dive further.  Assuming we don't have strong variable type
declarations (i.e., int

Re: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread Pavel Kouřil

On Sun, Feb 22, 2015 at 7:30 PM, Robert Stoll p...@tutteli.ch wrote:
 Hi Pavel,

 Yes, I am suggesting that conversions behave the same regardless of whether 
 they are implicit or explicit. The only difference between the two should be 
 that one is stated explicitly by the user whereas the other is applied 
 implicitly. Other programming languages behave like this and are more 
 predictable for users as well as developers because one does not need to 
 learn two sets of conversion rules.


Actually this is not true. Other languages have differences between
explicit conversions (aka casting) and implicit conversions as well.
C# is the language I use the most after PHP, so I'll bring that one up
(see https://msdn.microsoft.com/en-us/library/ms173105.aspx), but I
believe other languages (probably Java?) act the same way.

Regards
Pavel Kouril


Re: [PHP-DEV] new json, push generated file?

2015-02-22 Thread Anatol Belski

Hi Jakub,

On Tue, February 17, 2015 17:53, Anatol Belski wrote:
 Hi Jakub,


 On Sun, February 15, 2015 21:18, Jakub Zelenka wrote:

 On Wed, Feb 11, 2015 at 11:56 AM, Jakub Zelenka bu...@php.net wrote:




 I would like to push the bison tab files shortly as the majority
 of people in this thread (including me) are for having them in the
 repo. The only thing that I would like is to have a specific version in
 the repo to prevent big diffs for small changes in the source files. For
 now I would like to have there re2c 0.13.6 (thanks for regenerating it
 back ;) ) and bison 2.7.1 gen files. I will update Readme at some point as
 well and add there that info.


 Hey just a quick update. I bumped the version in the repo for re2c
 0.13.7.5 (latest - no changes in generated states) and bison to 3.0.4. I
 updated Readme as well.



 I have pushed the bison files in
 http://git.php.net/?p=php-src.git;a=commit;h=911f7b10b1f4c9529bc01580d298a93a5cd6bbd2
 . There is an explanation why the C preprocessor guard macro names are
 file system dependent. That is why there is
 YY_PHP_JSON_YY_HOME_JAKUB_PROG_PHP_MASTER_EXT_JSON_JSON_PARSER_TAB_H_INCLUDED
 .
 It's due to the bison algorithm for creating such a name. As I noted, the
 only solution that works for me is using a different yacc.c skeleton. I have
 done it in jsond in
 https://github.com/bukka/php-jsond/commit/583619d7962fa57bf97dcdac4147d8b599a42672
 where I have optional bison generation, which means that I can stick with
 one bison version and use a custom skeleton file. This is however not
 possible in the core, where we allow a range of bison versions, which
 doesn't play well with skeletons that are version specific.


 thanks for pushing. I'm using re2c 0.13.7.5 now for master as well. With
 bison, it'll however be hard to move from 2.4.1 on Windows (and it's not
 that critical), but the file generated with it seems to work. Anyway,
 nothing prevents you or anyone from regenerating it and pushing over, just
 in case. Most important is that fixes land in the *.re/*.y sources. And one
 knows who to ping for verifications :)

FYI I had to downgrade re2c to 0.13.6 as the latest randomly crashes.

Regards

Anatol



Re: [PHP-DEV] Re: [RFC] Exceptions in the engine

2015-02-22 Thread Nikita Popov

On Thu, Feb 19, 2015 at 4:13 PM, Dmitry Stogov dmi...@zend.com wrote:

 Hi Nikita,

 I refactored your implementation: https://github.com/php/php-src/pull/1095

 I introduced a class hierarchy to minimize effect on existing code.
 catch (Exception $e) won't catch new types of exceptions.

 BaseException (abstract)
  +- EngineException
  +- ParseException
  +- Exception
      +- ErrorException
      +- all other exceptions

 In case of an uncaught ParseException or EngineException PHP writes the
 compatible error message.
 I made it mainly to keep thousands of PHPT tests unchanged.
 If you like, we may introduce an ini option that will lead to emitting
 "Uncaught Exception ..." with a backtrace instead.

 The internal API was changed a bit, e.g. EngineExceptions are thrown from
 zend_error() if the error_code has the E_EXCEPTION bit set.
 So to change some error into an exception we should now change
 zend_error(E_ERROR, ...) into zend_error(E_EXCEPTION|E_ERROR, ...)

 All tests pass.
 I'm not sure about sapi/cli/tests/bug43177.phpt, because it started to
 fail in master as well.

 We may need to replace E_RECOVERABLE_ERROR with E_ERROR and fix the
 corresponding tests.
 Despite this, I think the patch is good enough to be merged into master.

 We may decide to convert more fatal errors later, but that shouldn't
 prevent us from putting the RFC to a vote.

 Thoughts?


I've updated some minor points in the RFC to be consistent with the new
patch. The BaseException based hierarchy will be a separate vote [1]. If
there are no further concerns I'd start voting on this RFC tomorrow.
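
For readers trying this on a released PHP version, the sketch below shows the effect of such a hierarchy on existing catch blocks. Note the assumption: it uses the names that eventually shipped in PHP 7 (Throwable and Error) rather than the BaseException/EngineException names discussed in this thread, and undefined_function_for_demo() is a deliberately nonexistent, hypothetical function.

```php
<?php
// Sketch only: uses the names that shipped in PHP 7 (Throwable, Error),
// not the BaseException/EngineException names proposed in this thread.
function callMissing() {
    // Calling an undefined function raises an engine-level error.
    return undefined_function_for_demo();
}

$caughtBy = null;
$caught = null;
try {
    try {
        callMissing();
    } catch (Exception $e) {
        // Not reached: engine errors sit outside the Exception branch.
        $caughtBy = 'Exception';
    }
} catch (Throwable $t) {
    $caught = $t;
    $caughtBy = get_class($t);
}

var_dump($caughtBy);
```

Existing catch (Exception $e) blocks are left alone, which is the backward-compatibility argument for a separate branch in the hierarchy.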

One point wrt the patch: Do you think it would be hard to use a clean
shutdown on uncaught exceptions? This would make sure we don't forget to
free anything when throwing exceptions.

Nikita

[1]: https://wiki.php.net/rfc/engine_exceptions_for_php7#hierarchy

Re: [PHP-DEV] [RFC] Coercive Scalar Type Hints

2015-02-22 Thread Ole Markus With



On 02/21/2015 09:10 PM, Pádraic Brady wrote:

 On the RFC rules themselves, a few comments:
 
 1. Happy to see leading/trailing spaces excluded.
 2. Rules don't make mention of leading zeroes, e.g. "0003"
 3. "1E07" might be construed as overly generous assuming we are
 excluding stringy integers like hex, oct and binary
 4. I'm assuming the stringy ints are rejected?
 5. Is ".32" coerced to float or only "0.32"? Merely for clarification.
 6. Boolean coercion from other types... Not entirely sure myself.
 Completely off the cuff: <=0: false, >0: true, floats and strings need
 not apply.
 7. In string to float, only capital E or also small e?
 8. I'll never stop calling them stringy ints.
 

In my mind, certainly a better proposition than those introducing dual
mode. Agree with the comments above, except that I am entirely sure that
boolean coercion from other types should not be allowed.
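
As a point of reference for the string questions above, here is a sketch of how released PHP (7 or later, default coercive mode) treats those inputs. This is an assumption-laden comparison point only: it shows shipped behaviour, which may differ from the rules in the RFC, and asInt()/asFloat() are hypothetical helpers.

```php
<?php
// Sketch only: shows shipped PHP 7+ coercive behaviour, which may differ
// from the rules discussed in the RFC. asInt()/asFloat() are hypothetical.
function asInt(int $n) { return $n; }
function asFloat(float $f) { return $f; }

$leadingZeros   = asInt("0003");    // leading zeros are accepted
$noLeadingDigit = asFloat(".32");   // ".32" is a valid float string
$upperE         = asFloat("1E07");  // both exponent spellings parse
$lowerE         = asFloat("1e07");

try {
    asInt("0x1A");                  // hex strings are not numeric in PHP 7+
    $hexAccepted = true;
} catch (TypeError $e) {
    $hexAccepted = false;
}

var_dump($leadingZeros, $noLeadingDigit, $upperE, $lowerE, $hexAccepted);
```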

Cheers,
Ole Markus


RE: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread Zeev Suraski

 -Original Message-
 From: Jefferson González [mailto:jgm...@gmail.com]
 Sent: Sunday, February 22, 2015 11:59 PM
 To: Stanislav Malyshev
 Cc: Etienne Kneuss; Anthony Ferrara; Zeev Suraski; PHP internals
 Subject: Re: [PHP-DEV] Coercive Scalar Type Hints RFC

 2015-02-22 16:38 GMT-04:00 Stanislav Malyshev smalys...@gmail.com:

   Yes, that's not the case, at least nobody ever showed that to be the
   case. In general, as JS example (among many others) shows, it is
   completely possible to have JIT without strict typing. In particular,
   coercive typing provides as much information as strict typing about
   variable type after passing the function boundary - the only difference
   is what happens _at_ the boundary and how the engine behaves when the
   types do not match, but I do not see where big performance difference
   would come from - the only possibility for different behavior would be
   if your app requires constant type juggling (checks are needed in strict
   mode anyway, since variables are not typed) - but in this case in strict
   mode you'd have to do manual type conversions, which aren't in any way
   faster than engine type conversions.
   So the case for JIT being somehow better with strict typing so far
   remains a myth without any substantiation.

 Well, strict in a JIT environment may not have been proven, but it surely
 has been proven in statically compiled languages like C.

Jefferson,

Strict type hints will not make PHP even remotely similar or otherwise
comparable to a statically typed language like C.  It will take totally
different facilities being added to PHP, which are not being considered.

 Currently, a JIT in most cases can't compete with the bare performance of
 a statically compiled language, in either resources or CPU, so how is non
 strict better in that sense?

Nobody is saying it's better in that sense.  Nobody is also suggesting that
a statically compiled language like C isn't a lot easier to optimize than a
dynamic language with JIT.  What we are saying is that Strict type hints are
of no help making JIT any better or easier, compared to Dynamic/Coercive
hints.  I provided what I believe to be detailed proof for that in my email
titled 'JIT'.

 I thought those checks could be optional if generated at call time, that's
 why I gave these 2 examples:

 calc(1, 5) - no need for type checking or conversion, do a direct call
 calc("12", "15") - calc(strToInt(value1), strToInt(value2))
 calc($var1, $var2) - needs type checking and conversion if required

That's the wrong comparison to make.  We should be comparing the same calls
with the two systems, rather than one call in one system and a different one
in a different system.
Taking your example:

calc(1, 5) - no need for type checking or conversion *in either Strict
type hints or Coercive/Dynamic type hints*, do a direct call.  Identical
performance.
calc("1", "5") - fails in strict type hints, succeeds in Dynamic/Coercive
type hints (cannot be optimized)

Again, this illustrates that the difference between the two is that of
*functionality*, not performance.

If you're saying that calc("1", "5") is slower than calc(1, 5) when using
dynamic type hints - then that would be correct, but also pretty meaningless
from a performance standpoint, if what you have are string values.  And if
you have integer values, well then, we've already established there's no
difference between the two type hinting systems.

Typically, you obtain the data you need in a type that's not under your
control.  You're getting data from the browser, database, filesystem, web
service or some algorithm - the type of the values you get is determined by
the API functions you're using to get the data from.

So what are your options if what you have in your hand is "1" and "5",
because that's how the APIs provided the data to you, as opposed to 1 and 5?

Before they can be added, they need to be converted to integer format,
whether this is done by explicitly casting them (likely outcome in case of a
strict type hint), casting them through a safe coercive STH, or letting
PHP's + operator implementation do it for you.  The data needs to be
converted somewhere.
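
The three conversion routes mentioned above can be sketched as follows, assuming PHP 7 or later in its default coercive mode. calc() mirrors the function used earlier in the thread; the variable names are hypothetical.

```php
<?php
// Sketch only: assumes PHP 7+ in the default coercive mode.
function calc(int $val1, int $val2) {
    return $val1 + $val2;
}

$a = "1";   // e.g. values as they arrive from a request or a database
$b = "5";

$viaCast     = calc((int) $a, (int) $b);  // explicit casts at the call site
$viaCoercion = calc($a, $b);              // conversion at the boundary
$viaOperator = $a + $b;                   // conversion inside the + operator

var_dump($viaCast, $viaCoercion, $viaOperator);
```

Each route performs one string-to-int conversion per value; they differ only in where that conversion is written down.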

 So what you are saying is that there is no way of determining the type of
 a variable (only at runtime), as Zeev explained on the previous messages,
 since variables aren't typed, checks are mandatory either way.

There are ways to infer typing information during compile time, and also to
make an 'educated guess' as to what the data type is going to be based on
runtime information, but:
1. No, it's absolutely not possible to always determine the type of
variables during compile-time; you'd often (perhaps more often than not)
know the data type with absolute certainty only at runtime.
2. Whatever you CAN infer, you can infer equally regardless of whether a
piece of code uses strict type hints or dynamic ones.

   Please

Re: [PHP-DEV] [VOTE] Expectations

2015-02-22 Thread Stanislav Malyshev

Hi!

 I do not see much gain today to improve them while I do not see why we
 should not. It does not hurt.

The gain is simple - today, assertions have costs so people that are
performance-conscious (rightly or wrongly) use them less than they
could. We can make them cost-less in production, while preserving their
advantages in the test environments (also makes code easier to follow
btw if asserts show which invariants are being enforced). Unit tests
only provide for assertions in test code, but cost-less asserts can help
you ensure that code works the way you intended all the way through,
without paying for that with performance.
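
A minimal sketch of what such an invariant looks like in code, assuming PHP 7 or later, where assert() is controlled by the zend.assertions INI setting. average() is a hypothetical example function.

```php
<?php
// Sketch only: assumes PHP 7+, where assert() is a language construct
// controlled by the zend.assertions INI setting.
function average(array $values) {
    // Invariant documented in code; not compiled when zend.assertions=-1.
    assert(count($values) > 0, 'average() needs at least one value');
    return array_sum($values) / count($values);
}

$mean = average([2, 4, 6]);

// Reports how the current runtime is configured:
// "1" = checked, "0" = skipped at runtime, "-1" = not compiled at all.
$mode = ini_get('zend.assertions');

var_dump($mean, $mode);
```

The same source file can then run with checks in a test environment and without them in production, by changing configuration only.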
-- 
Stas Malyshev
smalys...@gmail.com


[PHP-DEV] Re: JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread JGM

2015-02-22 14:37 GMT-04:00 Zeev Suraski z...@zend.com:

 I think it's fair to say that Dmitry - who led the PHPNG effort - cares *a
 lot* about performance.  I'm sure you'd agree.  I tend to think that I also care
 a lot about performance, and so does Xinchen.  We all spent substantial
 parts of our lives working to speed PHP up.  It's not whether we think
 performance is important - it is (although we do believe we should build
 optimizers for languages, more so than languages for optimizers).  It's
 just that we all fail to see how the flavor of STH can have any meaningful
 influence on performance.

 Thanks for the feedback!

 Zeev


Thanks for the insightful response! Now it would be nice to also see the
opinions of the other camp.

Re: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread Stanislav Malyshev

Hi!

 A JIT or AOT machine code generator IMHO will never have a decent use of
 system resources without some sort of strong/strict typed rules,
 somebody explain if that's not the case.

Yes, that's not the case, at least nobody ever showed that to be the
case. In general, as the JS example (among many others) shows, it is
completely possible to have JIT without strict typing. In particular,
coercive typing provides as much information as strict typing about
variable type after passing the function boundary - the only difference
is what happens _at_ the boundary and how the engine behaves when the
types do not match, but I do not see where a big performance difference
would come from - the only possibility for different behavior would be
if your app requires constant type juggling (checks are needed in strict
mode anyway, since variables are not typed) - but in this case in strict
mode you'd have to do manual type conversions, which aren't in any way
faster than engine type conversions.
So the case for JIT being somehow better with strict typing so far
remains a myth without any substantiation.

 while on strict mode the generated code could be:
 
 int calc(int val1, int val2) {
 return val1 + val2;
 }

No, it can't be (at least it can't be the _entire_ code of this
function), since the user still can pass non-int into this function -
nothing introducing strict typing in functions, as it is proposed now,
prevents it. What strict typing does is to ensure the error in this
case, but to generate the error you still need the checks!
BTW, your weak mode code is wrong too - there's no need to generate
Variants if you typed the variables as int. You know once coercion is
done they are ints. At least in the model that was now proposed.
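
The argument above can be sketched in userland PHP. This is a simplified illustration, assuming PHP 7.0 or later for TypeError; the functions strict_int_boundary(), coercive_int_boundary() and calc() are hypothetical stand-ins for what an engine would do at a typed parameter boundary, and the coercion rule shown is deliberately cruder than either RFC.

```php
<?php
// Hypothetical stand-in for a strict int parameter: reject anything else.
function strict_int_boundary($value): int
{
    if (!is_int($value)) {            // the check is still required
        throw new TypeError('int expected');
    }
    return $value;
}

// Hypothetical stand-in for a coercive int parameter: same check first,
// conversion only on the slow path.
function coercive_int_boundary($value): int
{
    if (is_int($value)) {             // identical fast path
        return $value;
    }
    if (is_string($value)) {
        $converted = filter_var($value, FILTER_VALIDATE_INT);
        if ($converted !== false) {
            return $converted;
        }
    }
    throw new TypeError('int-compatible value expected');
}

function calc($a, $b, callable $boundary): int
{
    $a = $boundary($a);
    $b = $boundary($b);
    // Past the boundary both operands are known to be ints in either mode.
    return $a + $b;
}

var_dump(calc(1, 5, 'strict_int_boundary'));
var_dump(calc("12", "15", 'coercive_int_boundary'));
```

Both variants test the incoming value before the function body runs; they differ only in what happens when the test fails.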

 If my example is right it means strict would be better to achieve good

Unfortunately, your example is not right.

 to another level of integration with PHP and performance. IMHO it is harder
 and more resource hungry to implement a JIT/AOT using weak mode. With

Please provide a substantiation for this opinion. So far what was
provided was not correct.

 That's all that comes to mind now, and while many people don't care for
 performance, IMHO a programming language mainly targeted for the web
 should show some care in this department.

Please do not strawman. A lot of people here care about performance, and
you have not yet made the case that strict typing has any benefit for
performance, so implying that opponents of strict typing somehow don't
care about performance while you champion it does not match the real
situation.
-- 
Stas Malyshev
smalys...@gmail.com


Re: [PHP-DEV] [RFC][DISCUSSION] Context Sensitive lexer

2015-02-22 Thread Stanislav Malyshev

Hi!

 RFC: https://wiki.php.net/rfc/context_sensitive_lexer
 TL;DR commit: https://github.com/marcioAlmada/php-src/commit/c01014f9
 PR: https://github.com/php/php-src/pull/1054

I like the idea. But we need to examine the cases carefully so we don't
block some future routes - especially with regard to such things as type
names, which we wanted to reserve.

I.e. method name resolution is probably clear, since method names appear
after -> or ::, but for class names the context may be much more varied.
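
A short sketch of the method-name case, assuming PHP 7.0 or later, where reserved words are accepted as method names; the class Query and its methods are hypothetical. Because the names only ever appear after `function`, `->` or `::`, the parser can tell them apart from the keywords.

```php
<?php
// Hypothetical class: "list" and "new" are reserved words used as method names.
class Query
{
    public function list(): array
    {
        return ['a', 'b'];
    }

    public static function new(): self
    {
        return new self();
    }
}

// After :: and -> the tokens can only be member names, never keywords.
$items = Query::new()->list();
var_dump($items);
```

Class names have no such fixed position, which is why they need the more careful examination mentioned above.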
-- 
Stas Malyshev
smalys...@gmail.com


Re: [PHP-DEV] [RFC] [VOTE] Parameter skipping RFC

2015-02-22 Thread Stanislav Malyshev

Hi!

 I see a LOT of no votes against this RFC but can't find the thread
 outlining the reasoning for such resistance.

I think my attempts to explain that this was a step towards named
params, not a contradiction with them, failed - people read it, say "we
understood it" and then say "no, we don't want it, we want named params
instead!". Well, let's hope somebody (not me) writes a patch for named
params instead. In the meantime, 7.0 will have neither.
-- 
Stas Malyshev
smalys...@gmail.com


Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Anthony Ferrara

Zeev,

On Sun, Feb 22, 2015 at 1:37 PM, Zeev Suraski z...@zend.com wrote:
 -Original Message-
 From: Jefferson Gonzalez [mailto:jgm...@gmail.com]
 Sent: Sunday, February 22, 2015 4:25 PM
 To: Etienne Kneuss; Anthony Ferrara; Zeev Suraski
 Cc: PHP internals
 Subject: Re: [PHP-DEV] Coercive Scalar Type Hints RFC

 Jefferson,

 Please note that Anthony, the lead of the Dual Mode RFC, said this earlier
 on this thread, referring to the claim that Strict STH can improve JIT/AOT
 compilation:

 A statement which I said on the side, and I said should not impact RFC or
 voting in any way. And is in no part in my RFC at all.

 Please also see:

 marc.info/?l=php-internals&m=142439750614527&w=2

 So while Anthony and I don't agree on whether there are performance gains
 to be had from Strict STH, both of us agree that it's not at a level that
 should influence our decision regarding the RFCs on the table.

 I wholeheartedly agree with that stance, which is why I also listed the
 apparently extremely widespread misconception (IMHO) that Strict STH can
 meaningfully help JIT/AOT in my RFC.

So you agree we shouldn't discuss it, then you go ahead and discuss
it. I guess that shouldn't surprise me.

 Despite that, as your email suggests, there are still (presumably a lot)
 of people out there that assume that there are, in fact, substantial gains
 to be had from JIT/AOT if we introduce Strict STH.  I'm going to take
 another stab at explaining why that's not the case.

 A JIT or AOT machine code generator IMHO will never have a decent use of
 system resources without some sort of strong/strict typed rules,
 somebody explain if that's not the case.

 It kind of is and kind of isn't.

 There's consensus, I think, that if PHP was completely strongly typed -
 i.e., all variables need to be declared and typed ahead of time, cannot
 change types, etc. - we'd clearly be able to create a lot of optimizations
 in AOT that we can't do today.  That's the part that 'is the case'.  But
 nobody is suggesting that we do that.  The discussion on the table is
 very, very narrow:

There's no consensus there. As I've pointed out to you more than once,
plenty of other languages manage this through type inference or
reconstruction. Many (like Go) only require explicit types on the
parameters, not on variables.

Heck, I did **exactly** that in Recki-CT. So please don't dismiss
something that's being done **IN THE PHP WORLD** just because you
don't think it's possible.

 -- Can the code generated for a strict type hint somehow be optimized
 significantly better than the code generated for a dynamic/coercive type
 hint?

 And here, I (as well as Dmitry, who actually wrote a JIT compiler for PHP)
 claim that it isn't the case.  To be fair, there's no consensus on this
 point.

And me, who wrote an AOT compiler that does **exactly** this, claim
that it is the case. Along with other people who've worked on
compilers. See the reply in a private thread you started that shows
the tradeoffs, specifically in generated code efficiency and memory
usage.

You can keep ignoring the arguments, but PLEASE don't keep spreading
them as fact.

Also: if Dmitry worked on a JIT compiler, why isn't that code out in
the open? And if the code isn't out, why isn't the knowledge open? Are
we just supposed to rely on a single person's experience (especially
when more than one other person's shared experiences differ)?

 Let me attempt, again, to explain why we don't believe there are any gains
 associated with Strict STH, be they with the regular engine, JIT or AOT.

 Consider the following code snippet:

 function strict_foo($x)
 {
   if (!is_int($x)) {
     trigger_error();
   }
   .inner_code.
 }

 function very_lax_foo($x)
 {
   $x = (int) $x;
   .inner_code.
 }

 function test_strict()
 {
   .outer_code.
   strict_foo($x);
 }

 function test_lax()
 {
   .outer_code.
   very_lax_foo($x);
 }

 test_strict();
 test_lax();

 strict_foo() implements a pretty much identical check to the one that a
 Strict integer STH would perform.
 very_lax_foo() implements an explicit type conversion to int, that can
 pretty much never fail - which is significantly more lax than what is
 proposed for weak types in the Dual Mode RFC, and even more so compared to
 the Coercive STH RFC.
 .inner_code. is identical between the two foo() functions, and
 .outer_code. is identical between the two tester functions.

 The claim that strict types can be more efficiently optimized than more
 lax types, suggests it should be possible to optimize the code flow for
 test_strict()/strict_foo() significantly better than for very_lax_foo()
 using JIT/AOT.

Assuming that they were split in the files appropriately, you are
missing **THE** key thing we've been trying to tell you this entire
time. Looking at a single function, yes there is no difference if it's
strict or not (well, you can save some time on the next function call
inside, but it's small). However we're not talking about

Re: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread Pavel Kouřil

On Sun, Feb 22, 2015 at 9:42 PM, Robert Stoll p...@tutteli.ch wrote:

 Probably it is a philosophical question how to look at it. IMO the only
 difference in C# (as well as in Java) lies in the way the conversions are
 applied. Implicit conversions are applied automatically by the compiler, whereas
 explicit conversions are applied by the user. The difference lies in the fact
 that C# is statically typed and implicit conversions are only applied when it
 is certainly safe to apply one. However, implicit conversions in C# behave
 the same as explicit conversions, since implicit conversions which fail simply
 do not exist (there is no implicit conversion from double to int, for
 instance). That is the way I look at it. You probably look at it from another
 point of view and would claim an implicit conversion from double to int in C#
 exists but just fails all the time => ergo implicit and explicit are
 different (that is my interpretation of your statement above). In this sense
 I would agree. But even when you think in these terms, you have to admit
 they are fundamentally different in the way that implicit conversions which
 are different from explicit conversions always fail, in all cases - pretty
 much as if they do not exist. There are no cases, neither in C# nor in Java
 which I am aware of, where an implicit cast succeeds in certain cases but not
 in all and an explicit conversion succeeds in at least more cases than the
 implicit conversion. Hence, something like "a" should also not work in an
 explicit conversion in PHP IMO if it is not supported by the implicit
 conversion (otherwise strict mode is useless btw.)

 Try out the following C# code:
 dynamic d1 = 1.0;
 int d = d1;
 You will get the error "Cannot implicitly convert type `double` to `int`" at
 runtime.

 We see a fundamental difference between C# and PHP here. PHP is dynamically
 typed and relies on values rather than types (in contrast to C#). Therefore,
 the above code emits a runtime error even though the data could be converted
 to int without precision loss.
 This shall be different in PHP according to this RFC and I think that is
 perfectly fine. Yet, even more important, it seems to me, is that
 implicit/explicit conversions behave the same way.
 At first it might seem strange to have just one conversion rule set in PHP
 since PHP is not known to be a language which shines due to its consistency...
 OK, I am serious again. Think about it from the following point of
 view: A user writes an explicit conversion in order to state explicitly that
 some value will be converted (this is something which will be necessary in a
 strict mode). Why should this explicit conversion be different from the
 implicit one? There should not be any difference between explicit knowledge
 and implicit one. That is my opinion. If you really do not care about data
 loss and just want to squeeze a float/string into an int no matter what the
 value really is, then you can use the @ in conjunction with ?? and provide the
 desired default value to fall back on if the conversion fails. If conversions
 like "a" to int really matter that much to the users of PHP, then we could
 keep the oldSchoolIntConversion function (as proposed in my first email) even
 in PHP 10 (I would probably get rid of them at some point).

 Cheers,
 Robert


Well,

I look at it this way (in a simplified manner). Hopefully this will
make you understand my point of view more.

- Implicit conversions work only when you are sure you won't lose stuff
- Explicit conversions are for forcing (casting) a variable to become
another type, and when you, as the user, explicitly call one, you are
aware you can lose values

Sure, the literal meaning in C# and PHP differs a little bit (because
of static and dynamic typed language differences and stuff), but the
*intent* is IMHO the same; implicit conversions can happen in the
background safely, while for dangerous conversions, you have to
cast by hand. And I see use cases for both of these types of
conversions.
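
To make the two kinds of conversion concrete, here is a small sketch, assuming PHP 7.1 or later (for the nullable return type) on a 64-bit build. The casts are standard PHP behavior; the helper to_int_or_null() is hypothetical and only illustrates what a conversion that refuses to lose data could look like.

```php
<?php
// Explicit casts never fail; they silently drop whatever does not fit.
var_dump((int) "12abc"); // int(12)
var_dump((int) "a");     // int(0)
var_dump((int) 1.9);     // int(1)

// Hypothetical helper (not a PHP built-in): converts only when nothing is lost.
function to_int_or_null($value): ?int
{
    if (is_int($value)) {
        return $value;
    }
    if (is_float($value)) {
        // Accept only whole numbers that a float can represent exactly.
        $isWhole = is_finite($value) && floor($value) === $value && abs($value) < 2 ** 53;
        return $isWhole ? (int) $value : null;
    }
    if (is_string($value)) {
        $converted = filter_var($value, FILTER_VALIDATE_INT);
        return $converted === false ? null : $converted;
    }
    return null;
}

var_dump(to_int_or_null("12"));
var_dump(to_int_or_null("12abc"));
```

The cast is the "I know what I am doing" path; the helper approximates the kind of conversion that could safely happen in the background.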

Also, you are assuming that there will be a strict mode; I sincerely
hope there won't. Since the introduction of 2 modes, I have always been
saying that there should be only one mode - I don't really care
whether it would be strict or weak, but just only one.

Regards
Pavel Kouril


Re: [PHP-DEV] [RFC] [FINAL DISCUSSION] Script only include/require

2015-02-22 Thread Stanislav Malyshev

Hi!

 I think this will be the final discussion before vote.
 This RFC is to make PHP stronger against script inclusion attacks just like
 other languages.
 
 https://wiki.php.net/rfc/script_only_include

I still think this RFC takes a wrong road for the following reasons:

1. Having any code in your app that allows running include on
user-controlled files (I'm not talking about filtered cases but user
data controlling the path) is insecure and can not be made secure. It
should just never be done. Trying to find workarounds for this is like
safe_mode - good idea in theory, leads to worse security in practice.

2. Default configuration would break tons of PHP scripts with extensions
other than .php (very frequent case). The BC break potential of this is
very big as it modifies core functionality.

3. Prohibiting phar uploads would also be a BC break, but more
importantly, there still probably are ways to work around this by using
phar files with extension different than .phar and then asking to
include files within that phar file. As long as the eventual path would
end in .php, your code would allow it.

Also, the claim that move_uploaded_file() is obsolete is not based on
anything as far as I can see. Why is it obsolete?
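
Point 1 can be illustrated with a minimal sketch, assuming PHP 7.1 or later; the function resolve_page() and the page paths are hypothetical. User input only ever selects a key from a fixed map, so it never becomes part of the include path and no extension check is needed.

```php
<?php
// Hypothetical allow-list: the request picks a key, never a path.
function resolve_page(string $requested): ?string
{
    $pages = [
        'home'    => __DIR__ . '/pages/home.php',
        'contact' => __DIR__ . '/pages/contact.php',
    ];

    // Anything that is not a known key is rejected outright.
    return $pages[$requested] ?? null;
}

// Typical use (commented out so the sketch has no side effects):
// $path = resolve_page($name);   // $name comes from the request
// if ($path === null) { http_response_code(404); exit; }
// require $path;

var_dump(resolve_page('phar://upload.jpg/evil.php'));
```

With this shape, a crafted value such as a phar:// URL or a path with ../ segments simply fails the lookup.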

-- 
Stas Malyshev
smalys...@gmail.com


Re: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread Robert Stoll

 -Original Message-
 From: Pavel Kouřil [mailto:pajou...@gmail.com]
 Sent: Sunday, February 22, 2015 22:18
 To: Robert Stoll
 Cc: Zeev Suraski; PHP internals
 Subject: Re: [PHP-DEV] Coercive Scalar Type Hints RFC
 
 Well,
 
 I look at it this way (in a simplified manner). Hopefully this will make you 
 understand my point of view more.
 
 - Implicit conversions work only when you are sure you won't lose stuff
 - Explicit conversions are for forcing (casting) a variable to become another
 type, and when you, as the user, explicitly call one, you are aware you can
 lose values
 

I see. I see and think you are not alone with this opinion. I give you another 
example and hope you reconsider your position (up to you what position you take 
afterwards of course).

 Consider the following in C#

class A{}
class B : A{}
class C : A{}

A a = new B();
B b = a; // will fail, needs a conversion
C c1 = a; // will fail, needs a conversion
C c2 = (C) a;   //will fail at runtime

And now imagine C# would not be based on types but on values. Then the 
following would be perfectly legal as well:

B b = a; // is fine since a is of type B
C c1 = a; // will fail since a is not of type C
C c2 = (C) a; // still fails since a is not of type C

Or to illustrate it differently. Imagine you have a shop and your main currency 
is $. However, you accept € as well as long as they are banknotes. In this case 
the customer can insert the banknotes in a currency exchange machine at the 
till. 
Now imagine the following four use cases: 
1. A customer buys something with $ - everything is fine

RE: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread François Laupretre

Hi Robert,

 So what does that mean for scalar types?
 IMO it means that way more important than adding scalar type hints to PHP
 7.0 is to agree on a new set of conversion rules for the long run. PHP should
 strive to have one consistent set of conversion rules which apply in all
 places where implicit or explicit conversions are used.

That's exactly what I mean. I think people should keep in mind, when talking 
about enabling/disabling a given conversion, that the implicit scope is every 
explicit or implicit conversion implemented in PHP.

In an ideal world, we would proceed in reverse order. We wouldn't start
considering modifying the ZPP ruleset before having aligned every
implicit/explicit conversion existing in PHP on a single ruleset.
Unfortunately, if we want to keep a chance with STH in 7.0, we cannot do that.
So, we will probably evaluate potential BC breaks on ZPP ruleset modifications
only, meaning we'll make a decision without a good evaluation of the BC breaks
introduced by aligning other PHP conversions on the newly-proposed ruleset. So,
we'll need to extrapolate from ZPP-only results.

Regards

François



RE: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread François Laupretre

Hi Stas,

It seems the actual problem is that we have too many compiler / code analysis 
experts in the community ;)

(don't get me wrong, I am not saying that about you, I just admire your patience
explaining the same again and again to people who have never read a single line
of the PHP core source).

Regards

François



Re: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread Stanislav Malyshev

Hi!

 Well, strict on a JIT environment may not have been proved, but it surely
 has been proved on statically compiled languages like C. Currently, a

I understand that using the same concept of typing in both cases can be
confusing, but that's pretty much where the similarity ends. Strict
typing in C has very little to do with what is proposed as strict typing
in PHP, and so far nobody is considering making PHP strictly typed in
the way C is (let alone more strict languages than C are). So bringing C
into the discussion is misleading.

 JIT in most cases can't compete with the bare performance of a statically
 compiled language, both in resources and CPU, so how is non strict
 better in that sense?

Dynamic typing is not better in that sense. That's my whole point - from
the JIT perspective, they are the same, so the claim that strict typing,
as proposed, provides performance benefits, is incorrect.

 previous message, at runtime it consumes more memory and cpu and this is
 mostly due to all the type checking it requires. In that sense if the

As I already mentioned, current strict proposal requires type checking
too. The only one that doesn't is complete strict typing at compile-time
- which nobody is proposing.

 strict proposal could improve that situation it would be a benefit.

You keep repeating that, but that claim does not become more true
because it is repeated more times. It still is as unsubstantiated and
lacking base as it was the first time it was introduced. Please provide
some proof (logical or experimental) as to why it must happen (yes, this
includes the "if" too, since it is pointless to bring it up as a possibility
if we do not have any way for this possibility to be realized).

 I thought those checks could be optional if generated at call time,
 that's why I gave these 2 examples:

I don't see how they can be optional with strict typing.

 
 calc(1, 5) -> no need for type checking or conversion, do a direct call
 calc("12", "15") -> calc(strToInt(value1), strToInt(value2))
 calc($var1, $var2) -> needs type checking and conversion if required

The same can be said about dynamic typing, with exactly the same words.
The only difference is what happens *after* checking - but this is only
relevant if the code relies on conversions, in which case in strict case
it just won't work - hardly a performance improvement worth considering.

 I was thinking in the sense that before calling a function, type
 checking could take place and conversion if required, but maybe that's
 even more complicated...

This can be done in dynamic case too, provided the type information is
present (i.e. constants). No current proposal does this, though, AFAIK.

 Static typed languages -> Direct conversion to machine code
 Dynamic typed languages with JIT -> Intermediate representation ->
 Checks -> Conversion to machine code with checks.

We're not talking about making PHP a statically typed language, are we? So
this advantage - while without any doubt real - does not apply to PHP.
-- 
Stas Malyshev
smalys...@gmail.com


RE: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Zeev Suraski

Anthony,

I started writing this long response, but instead, I want to localize the
whole discussion to the one true root difference.  Your position on that
difference is the basis for your entire case, and my position on this
argument is the base for my entire case.

There we go:

 And note that this can only work with strict types since you can do the
 necessary type inference and reconstruction (both forward from a function
 call, and backwards before it).

Please do explain how strict type hints help you do inference that you
couldn't do with dynamic type hints.  Ultimately, your whole argument hinges
on that, but you mention it in parentheses almost as an afterthought.
I claim the opposite - you cannot infer ANYTHING from Strict STH that you
cannot infer from Coercive STH.  Consequently, everything you've shown, down
to the C-optimized version of strict_foo() can be implemented in the exact
same way for very_lax_foo().  Being able to optimize away the value
containers is not unique to languages with strict type hints.  It's done in
JavaScript JIT engines, and it was done in our JIT POC.

 With lax (weak, coercive) types, the ability to do type reconstruction
 drops significantly. Because you can no longer do any backwards inference
 from other function calls. Which means you can't prove if a type is stable
 in most cases (won't change). Therefore, you'll always have to allocate a
 ZVAL, and then the optimizations I showed above would stop working.

Again, using the scientific method I'm familiar with, that would be a
theory, and it would require proof.  So far I haven't seen any proof, and I
believe I pretty much proved the opposite with my example.

Zeev


[PHP-DEV] [RFC] Script only include/require

2015-02-22 Thread Yasuo Ohgaki

Hi all,

I wrote a patch and made adjustments to the RFC.
https://wiki.php.net/rfc/script_only_include
https://github.com/php/php-src/pull/
Where the filename extension is checked is subject to change.
At first, I thought implementing this as PHP code was good, but
I've changed my mind. It seems better to do it in Zend code.
Opinions are appreciated.

This RFC aims to make PHP as secure as other languages
with respect to script inclusion attacks.
Note: File inclusion is not in the scope of this RFC.

INI Changes:
 - php_script -> zend.script_extensions
 - Allow all files: "*" -> NULL or ""

Open Issues:
 - Error type - Is it OK to raise E_ERROR/E_RECOVERABLE_ERROR in
   zend_language_scanner.c?
 - Vote type - 50%+1 or 2/3

If anyone would like to vote no on this RFC,
I would like to know the reason and try to address/resolve the issue you have.

Thank you.

Regards,


--
Yasuo Ohgaki
yohg...@ohgaki.net

Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Lester Caine

On 22/02/15 18:37, Zeev Suraski wrote:
 Variant* calc(Variant val1, Variant val2) {
if(val1.isInt() ) {
 // type checking
 if (!val1.coerceToInt()) {
   throw new RuntimeError();
 }
 if (!val2.coerceToInt()) {
   throw new RuntimeError();
 }
 
 // function body begins here
 int result = Variant(val1.intValue() + val2.intValue());
 return result;
 }

A more practical example would be to replace coerceToInt() with
inRange() which includes an int check/'coerce' as part of the range
check, and produce a number of errors based on the result.
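
A minimal sketch of what such a range check might look like in PHP follows. The name inRange() comes from the paragraph above, but its signature and error messages are assumptions made for illustration; it is not an existing PHP function.

```php
<?php
// Hypothetical helper, not a PHP built-in: checks that $value is an integer
// (or an integer-like string) and that it lies within [$min, $max].
// Returns a list of error messages; an empty list means the value is usable.
function inRange($value, int $min, int $max): array
{
    $errors = [];
    // FILTER_VALIDATE_INT returns the int on success and false on failure.
    $int = filter_var($value, FILTER_VALIDATE_INT);
    if ($int === false) {
        $errors[] = 'not an integer';
        return $errors;
    }
    if ($int < $min) {
        $errors[] = 'below minimum ' . $min;
    }
    if ($int > $max) {
        $errors[] = 'above maximum ' . $max;
    }
    return $errors;
}
```

The caller can then decide which of the reported errors are fatal, rather than getting a single coerce-or-fail answer.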

-- 
Lester Caine - G8HFL
-
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk


[PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread Anthony Ferrara

Adding in a thread that was started in private, but absolutely is
worth sharing with the group:

-- Forwarded message --
From: Etienne Kneuss col...@php.net
Date: Sun, Feb 22, 2015 at 8:42 AM
Subject: Re: [PHP-DEV] Coercive Scalar Type Hints RFC
To: Zeev Suraski z...@zend.com
Cc: Anthony Ferrara ircmax...@gmail.com, Dmitry Stogov dmi...@zend.com

On Sun Feb 22 2015 at 14:23:58 Zeev Suraski z...@zend.com wrote:

  There have been several attempts:
  for JS: http://users-cs.au.dk/simonhj/tajs2009.pdf
  or similar techniques applied to PHP, quite outdated though:
  https://github.com/colder/phantm

 Looks like WebKit's type inference is doing a pretty good job of
 analyzing code, although I'm not sure how much of it is static vs. dynamic.
 My guess is that a lot of it is static:
 twitter.com/kangax/status/558974257724940288

My guess would be that it's almost entirely dynamic, or probabilistic
(e.g. this nice recent work done at ETH: http://www.jsnice.org/).

I think you underestimate the difficulty of statically recovering
precise types from unannotated code without runtime witnesses ;) You
don't want WebKit analysing the JS for 10 minutes before it renders the
page. It is much more profitable to JIT these.

  You are right that the lack of static information about types is one of
  the main issues. Recovering the types typically has a huge performance
  cost, or is unreliable

 We're not really talking about a performance issue here, as static analysis is
 a separate activity that is unrelated to runtime performance.

What I meant was: it is a performance issue for the static analyzer,
not PHP itself.

  But seriously, time is getting wasted on this argument; it's actually a
  no-brainer: more static information helps tools that rely on static
  information. Yes. Absolutely. 100%.

 There's still disagreement between us on whether the different behavior of
 Strict STH constitutes additional static information or not, as it doesn't
 give you any extra information on the value being fed to the function, and
 it doesn't give you any extra information on what the function will receive.
 It only gives you information about how the function would behave if it gets
 a wrongly-typed value.

1) for forward analyses (which are the most common for these
applications): it gives you precious information from the beginning of
the function and forward. You can consider it similarly to a cast: You
don't necessarily know what the value coming in is, but you know which
type you are having from that point forward.

2) backward analyses could piggy-back the type constraints from the
functions (strict or no strict) and check that they are met when
constructing the value fed to the function.
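
The two directions can be sketched on a small PHP example. The function names below are made up for illustration, and the comments describe what an analysis tool could conclude, not what any particular tool does.

```php
<?php
function area(int $width, int $height): int
{
    // Forward analysis: from here on $width and $height are known to be int,
    // whatever the caller passed, much like after a cast.
    $result = $width * $height;   // int * int: int (float only on overflow)
    return $result;
}

function rowArea(array $row): int
{
    // Backward analysis: area() demands int arguments, so a tool can ask
    // whether the expressions building them can ever produce anything else.
    $w = count($row);             // always int: constraint met
    $h = $row['height'] ?? 1;     // type unknown statically: worth flagging
    return area($w, $h);
}
```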

Having worked several years on static analysis tools for languages
such as PHP, I can guarantee you that this information would help a
lot. However, the other dynamic features of PHP would still make
analyses slow/unreliable/imprecise. Let's not imagine that this is the
only thing missing for PHP to be static-analysis-wonderland, far from
it.

 But my bottom line is exactly the bottom line you ended with, and what I
 answered you on-list - how much weight should Static Analysis improvements
 have on our decision to introduce new language features?  My answer is not
 that much, if they have downsides.  Static Analyzers should be designed for
 languages and not vice versa.

I fully agree in general that the flow should be this way. But it
remains a bonus if a certain feature, as a plus, would help external
tools. I believe it is worth mentioning.

 Thanks,

 Zeev


Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Stanislav Malyshev

Hi!

 You can tell because you know the function foo expects an integer. So
 you can infer that $x will have to have the type integer due to the
 future requirement. Which means the expression $something / 2 must
 also be an integer. We know that's not the case, so we can raise an
 error here.

OK, so your claim is that the compiler with strict typing can detect
some situations which the dynamic one can not and reject some of the
code. Without going too much into details, I agree with this, this is an
obvious difference between strict and dynamic. However, this is not a
performance advantage, obviously - since you are comparing running code
with non-running one - your model just accepts less code. Obviously,
this works if non-accepted code was wrong - and doesn't work if it was
not. But we talked about running code, I thought.

 At that point the developer has the choice to explicitly cast or put
 in a floor() or one of a number of options.

That's exactly what I claim would be the defect of the strict model -
people would start putting in excessive casts, ensuring there would be cases
where information is lost. For example, assume we knew $something is even:

function bar(int $something): int {
    assert($something % 2 == 0);
    $x = $something / 2;
    return foo($x);
}

Now everything is fine (ignoring the typing for a second), right? We're
dealing with integers, /2 always divides evenly, all is great. Now we
introduce strictness, so we'd need to say something like:

function bar(int $something): int {
    assert($something % 2 == 0);
    $x = $something / 2;
    return foo((int)$x);
}

Now assume somebody messed up on the routine code reformatting merge and
the code somehow ended up like:

function bar(int $something): int {
    $x = $something / 2;
    return foo((int)$x);
}

Do you see what the problem is? Now we lost the check for $something
being even, but we would never know about it since the type system forced
us to insert (int) (which we didn't need) and thus disabled the controls
for the bug of $something not being even (which we did need).

But the more important question is - with (int) the coercive model can use
this information too, so what's the difference from the strict model on that
code? There seems to be none.
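
For illustration only, one way to keep the evenness check and the integer result together, so that removing one without the other is harder, might look like the sketch below. The helper name halveEven() is hypothetical and is not something proposed in this thread; intdiv() is a PHP 7 function.

```php
<?php
// Hypothetical helper: halves an integer and refuses odd input, so the
// result is an int without a cast that could silently drop a fraction.
function halveEven(int $n): int
{
    if ($n % 2 !== 0) {
        throw new InvalidArgumentException("expected an even number, got $n");
    }
    return intdiv($n, 2); // intdiv() is available from PHP 7
}
```

With such a helper, bar() could call foo(halveEven($something)) and an odd value would fail loudly in either typing model.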

 Without strict typing this code is always stable, but you still need
 to generate full type assertions in a compiled version of foo() and
 use ZVALs for $x, hence reducing the effect of the optimization
 significantly.

Wait, you said this code is invalid so no code will be generated. Did
you mean code after introducing (int)? Then strict has no advantage
anymore as we can derive the info from (int) anyway.
Otherwise, I can't see how you can avoid generating typechecks in foo()
unless the only place it can ever be called from is bar() - but I don't
see how you can ensure that in PHP, and if you could, I don't see why
weak model could not make the same conclusions on the same code.

So far the only advantage I've seen seems to be that your compiler
would reject code that looks suspicious to it and thus force the
programmer to coerce the variables into the types manually - by (int) or
floor() - something that the coercive model would do for you
automatically. Once coerced, the same code would have the same type info
(and thus same potential optimizations) in both models. I don't think it
is a gain in general, and I don't think forcing people to modify their
code qualifies as JIT performance gain.
-- 
Stas Malyshev
smalys...@gmail.com


Re: [PHP-DEV] PDO_DBLIB type handling

2015-02-22 Thread Yasuo Ohgaki

Hi Matteo,

On Sun, Feb 22, 2015 at 7:45 PM, Matteo Beccati p...@beccati.com wrote:

 The default behaviour of mysql/pgsql drivers is to convert to the matching
 PHP type, if possible. That can be turned off via
 PDO::ATTR_STRINGIFY_FETCHES = true.

 If PDO_DBLIB doesn't behave like that, I'd say it's a bug that needs to be
 fixed, but possibly only applied to a major/minor release due to the BC
 break.


Please write up an RFC now. It's time :)

Regards,

--
Yasuo Ohgaki
yohg...@ohgaki.net

Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Anthony Ferrara

Stas,

 -- Can the code generated for a strict type hint can somehow be optimized
 significantly better than the code generated for a dynamic/coercive type
 hint.
 And me, who wrote an AOT compiler that does **exactly** this, claim

 Sorry, did exactly what? Here a bit more explanation would help.

Optimized statically typed PHP functions. Or more specifically,
function calls inside compiled code are treated strictly (so trying
to pass a float to an int-typed function would error at compile time).
The outer function call (from non-compiled PHP) is parsed using ZPP
rules, but once it's inside it's strict.

https://github.com/google/recki-ct/blob/master/doc/0_introduction.md
https://github.com/google/recki-ct/blob/master/doc/2_basic_operation.md

 However, since test_strict() is compiled, there's no reason to
 dispatch back up to PHP functions for strict_foo(). In fact, that
 would be exceedingly slow. So instead, we'd compile strict_foo() as a
 C function, and do a native function call to it. Never having to check
 types because they are passed on the C stack.

 Doesn't that assume strict_foo() is always called with the right type of
 arguments? What exactly ensures that it does in fact happen? Shouldn't
 you have the type check _somewhere_ to be able to claim this happens?
 test_foo() doesn't do any checks, so what ensures $x is of the right
 type for C? And if the check is there, how is it better?

Yes, it does check the types, but at compile time. My AOT compiler
backend has no concept of a mixed or ZVAL type. All types are
determined at compile time, and in the very few cases where that isn't
possible it will error. The type inference engine uses all available
information (prior context, current context, future context) to
determine what the type is.

It does also detect type changes (via assignment) and is able to
correctly generate code based on that as well.

 And note that this can only work with strict types since you can do
 the necessary type inference and reconstruction (both forward from a
 function call, and backwards before it).

 I don't get the backwards part - I think you claimed it last time we
 discussed it but I haven't seen your answer explaining why it's OK to
 just ignore cases when the variable is of the wrong type. Right now, it
 looks like you claim that if somebody has a call strict_foo($x) and
 strict_foo() accepts integers, that magically makes $x integer and you
 can generate code everywhere (not only inside strict_foo but outside)
 assuming $x is integer without actually needing a check. I don't see how
 this can work.

Ok, let's take another example:

<?php declare(strict_types=1);
function foo(int $int): int {
    return $int + 1;
}

function bar(int $something): int {
    $x = $something / 2;
    return foo($x);
}

^^ In that case, without strict types, you'd have to generate code for
both integer and float paths. With strict types, this code is invalid.

You can tell because you know the function foo expects an integer. So
you can infer that $x will have to have the type integer due to the
future requirement. Which means the expression $something / 2 must
also be an integer. We know that's not the case, so we can raise an
error here.

At that point the developer has the choice to explicitly cast or put
in a floor() or one of a number of options.

The function bar itself didn't give us that information. We needed
to use the type information from foo() to infer the type of $x prior
to foo()'s call. Or more specifically, we inferred the only stable
type that it could be. Which let us determine that $x's assignment was
where the error was (since it wasn't a stable assignment).

Without strict typing this code is always stable, but you still need
to generate full type assertions in a compiled version of foo() and
use ZVALs for $x, hence reducing the effect of the optimization
significantly.
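
As a rough runtime illustration of the same mismatch, assuming scalar type declarations with strict_types as they later shipped in PHP 7: an AOT compiler as described above would reject bar() ahead of time, whereas the stock engine only fails when a float actually reaches foo(). The names barFixed() and tryBar() are additions for this sketch.

```php
<?php declare(strict_types=1);

function foo(int $int): int {
    return $int + 1;
}

function bar(int $something): int {
    $x = $something / 2;   // int when $something is even, float otherwise
    return foo($x);        // strict mode: a float argument is a TypeError
}

function barFixed(int $something): int {
    // One of the explicit options: integer division (PHP 7+).
    return foo(intdiv($something, 2));
}

// Small wrapper so the outcome can be inspected without an uncaught error.
function tryBar(int $n): string {
    try {
        return (string) bar($n);
    } catch (TypeError $e) {
        return 'TypeError';
    }
}
```

Here tryBar(4) succeeds because 4 / 2 is the integer 2, while tryBar(5) reports the TypeError caused by passing 2.5 to foo().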

Anthony


Re: [PHP-DEV] Reclassify E_STRICT notices

2015-02-22 Thread Lester Caine

On 22/02/15 22:30, Nikita Popov wrote:
 I would like to propose reclassifying our few existing E_STRICT notices and
 removing this error category:
 
 https://wiki.php.net/rfc/reclassify_e_strict
 
 As we don't really have good guidelines on when which type of error should
 be thrown, I'm mainly going by what category other similar errors use. I'm
 open to suggestions, but hope this will not deteriorate into total bikeshed.

At last something which fits my roadmap ...

Only problem is 'Redefining a constructor' which I certainly accept
while upgrading code, but I also appreciate why 'Remove PHP4
Constructors' might not be accepted leaving a difficulty. I think one
ends up with still needing a 'mode' switch if the legacy constructors
are retained?

-- 
Lester Caine - G8HFL

Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Anthony Ferrara

Zeev,

 And note that this can only work with strict types since you can do the
 necessary type inference and reconstruction (both forward from a function
 call, and backwards before it).

 Please do explain how strict type hints help you do inference that you
 couldn't do with dynamic type hints.  Ultimately, your whole argument hinges
 on that, but you mention it in parentheses almost as an afterthought.
 I claim the opposite - you cannot infer ANYTHING from Strict STH that you
 cannot infer from Coercive STH.  Consequently, everything you've shown, down
 to the C-optimized version of strict_foo() can be implemented in the exact
 same way for very_lax_foo().  Being able to optimize away the value
 containers is not unique to languages with strict type hints.  It's done in
 JavaScript JIT engines, and it was done in our JIT POC.

I do here: http://news.php.net/php.internals/83504

I'll re-state the specific part in this mail:

<?php declare(strict_types=1);
function foo(int $int): int {
    return $int + 1;
}

function bar(int $something): int {
    $x = $something / 2;
    return foo($x);
}

^^ In that case, without strict types, you'd have to generate code for
both integer and float paths. With strict types, this code is invalid.

You can tell because you know the function foo expects an integer. So
you can infer that $x will have to have the type integer due to the
future requirement. Which means the expression $something / 2 must
also be an integer. We know that's not the case, so we can raise an
error here.

At that point the developer has the choice to explicitly cast or put
in a floor() or one of a number of options.

The function bar itself didn't give us that information. We needed
to use the type information from foo() to infer the type of $x prior
to foo()'s call. Or more specifically, we inferred the only stable
type that it could be. Which let us determine that $x's assignment was
where the error was (since it wasn't a stable assignment).

Without strict typing this code is always stable, but you still need
to generate full type assertions in a compiled version of foo() and
use ZVALs for $x, hence reducing the effect of the optimization
significantly.

Anthony


[PHP-DEV] RE: JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Zeev Suraski

[I hope I managed to get the threading right although I think it's more
than likely I didn't;  apologies in advance if that's the case]

 -Original Message-
 From: Joe Watkins [mailto:pthre...@pthreads.org]
 Sent: Sunday, February 22, 2015 5:07 PM
 To: Jefferson Gonzalez
 Cc: Etienne Kneuss; Anthony Ferrara; Zeev Suraski; PHP internals
 Subject: Re: [PHP-DEV] Coercive Scalar Type Hints RFC

  1. Could weak mode provide the required rules to implement a JIT
 with a sane level of memory and CPU usage?

 There is no objective answer to the question while it has the clause with a
 sane level of

 The assertion in the RFC that says there is no difference between strict and
 weak types, in the context of a JIT/AOT compiler, is wrong.

Can we take a hint from youtu.be/9MV1ot9HkPg please? :)

 function add(int $l, int $r) {
 return $l + $r;
 }

 The first instruction in the interpreted code is not ZEND_ADD, first,
 parameters must be received from the stack.

 If that's a strict function, then un-stacking parameters is relatively easy,
 if it's a dynamic function then you have to generate code that is considerably
 more complicated.

Can you explain how so, assuming you know the same about what's waiting on
the stack in both cases?  If you know they're already ints, you can
optimize the function to take this assumption into account - identically -
in both the case of a strict or dynamic hint (and as a matter of fact,
even if there's no hint at all).  If, however, you don't know what's
waiting on the stack - you would have to conduct checks in both cases.
Sure, the checks are different - but the difference between them is
behavioral, semantic - not performance related.
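
A rough userland sketch of that point follows. The two boundary functions are hypothetical stand-ins for the check an engine would perform on entry, and filter_var() only approximates the coercion rules under discussion; the function body that runs afterwards is the same in both cases.

```php
<?php
// Hypothetical stand-ins for the check an engine performs at the function
// boundary; neither is a real engine API.

// Strict flavour: anything that is not already an int is rejected.
function strictBoundary($value): int
{
    if (!is_int($value)) {
        throw new TypeError('int expected, ' . gettype($value) . ' given');
    }
    return $value;
}

// Coercive flavour: integer-like input is converted, the rest is rejected.
function coerciveBoundary($value): int
{
    $int = filter_var($value, FILTER_VALIDATE_INT);
    if ($int === false) {
        throw new TypeError('value cannot be coerced to int');
    }
    return $int;
}

// The body is the same in both cases: past the boundary both operands are ints.
function addBody(int $l, int $r): int
{
    return $l + $r;
}
```

Both variants pay for one check per argument when the incoming type is unknown; they differ in which inputs they accept, not in what addBody() can assume.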

 This is an inescapable difference, of the kind that definitely does have a
 negative impact on implementation complexity, runtime, and maintainability.

This inescapable difference somehow manages to escape a bunch of people
here, myself included...

 To me, it only makes sense to compile strict code AOT or JIT; If you want
 dynamic behaviour, we have an extremely mature platform for that.

I'll have to disagree with that statement.  I'm not yet sure what we can
achieve with JIT in real world PHP web apps, but I've seen enough to know
that with JIT, certain use cases which are impractical today due to
performance (mainly data crunching) suddenly become viable.  And that's
with no STH at all, strict or otherwise.  Our JIT POC runs bench.php 25
times faster than PHP 7, without a single byte changed in the source (no
typo, 25 times faster, not 25%).  Unfortunately, most real world app code
doesn't look anything like what bench.php does - but it should illustrate
the point about the viability of JIT/AOT for dynamic platforms.

To take another example, JavaScript is extremely lax, are you suggesting
that it doesn't make sense to use JIT with JavaScript?  JIT made JS
performance literally explode, without changing the language to be strict.

If what you can already discern from dynamic type inference (which is
quite a lot, as JS proves) isn't enough for you - what you need isn't some
specific kind of type hinting (strict or otherwise) - but more tools to
tell what the type is in situations where today you can't (or find it hard
to) infer it.  Strictly typed variable declarations would be one such
thing, changing how our operators work to be more strict would be another
(for the record, I'm absolutely not suggesting we do that!).  In short,
changing PHP to be a much more strongly typed language than it is today,
with or without Strict STH.

  2. ... With that said, if a JIT implementation is developed, will the story
 of the ZendOptimizer being a commercial solution be repeated, or would this
 JIT implementation be part of the core?

 I think it's likely that Anthony and I, and Dmitry want different things for a
 JIT/AOT engine. I think Anthony and I are preferring an engine that requires
 minimal inference because type information is present (or implicit),

I don't see how that's possible, unless you add facilities such as the
ones I mentioned above.  If that's the goal, I think it should be clearly
stated.  Without that, you need the exact same type inference with strict
and weak types in order to develop useful JIT/AOT.

 while Dmitry probably favours the kind that can infer at runtime, the dynamic
 kind, like Zend is today. They are a world apart, I think, I'll be happy to be
 proven wrong about that.

The difference is really not about what kind of JIT implementation anyone
prefers to have, but what kind of language behavior it's going to have to
address.  Trust me, I can tell you with absolute confidence that anybody
who writes a JIT engine - Dmitry and JS engine devs included - would
prefer to have the types just handed to him, rather than have to infer them.
Type inference is hard.  But that would be the task at hand.

Thanks,

Zeev


Re: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread Robert Stoll



 -Original Message-
 From: Pavel Kouřil [mailto:pajou...@gmail.com]
 Sent: Sunday, February 22, 2015 20:02
 To: Robert Stoll
 Cc: Zeev Suraski; PHP internals
 Subject: Re: [PHP-DEV] Coercive Scalar Type Hints RFC
 
 On Sun, Feb 22, 2015 at 7:30 PM, Robert Stoll p...@tutteli.ch wrote:
  Hi Pavel,
 
  Yes, I am suggesting to make conversions behave the same regardless of whether
  they are implicit or explicit. The only difference between the two should be
  that one is stated explicitly by the user where the other is applied
  implicitly. Other programming languages behave like this and are more
  predictable for users as well as developers because one does not need to
  learn two sets of conversion rules.
 
 
 Actually this is not true. Other languages have differences between explicit
 conversions (aka casting) and implicit conversions as well.
 C# is the language I use the most after PHP, so I'll bring that one up (see
 https://msdn.microsoft.com/en-us/library/ms173105.aspx), but I believe other
 languages (probably Java?) act the same way.
 
 Regards
 Pavel Kouril
 

Hm... I reconsidered my statements and that is a good thing :)
I am not sure if I got your viewpoint. I will try to elaborate more on mine
and explain how I interpret your statement.

Probably it is a philosophical question how to look at it. IMO the only
difference in C# (as well as in Java) lies in the way the conversions are
applied. Implicit conversions are applied automatically by the compiler whereas
explicit conversions are applied by the user. The difference lies in the fact
that C# is statically typed and implicit conversions are only applied when it
is certainly safe to apply one. However, implicit conversions in C# behave the
same as explicit conversions, since implicit conversions which fail simply do
not exist (there is no implicit conversion from double to int, for instance).
That is the way I look at it.

You probably look at it from another point of view and would claim that an
implicit conversion from double to int in C# exists but just fails all the
time, ergo implicit and explicit are different (that is my interpretation of
your statement above). In this sense I would agree. But even when you think in
these terms you have to admit that they are fundamentally different in the
way that implicit conversions which differ from explicit conversions always
fail, in all cases - pretty much as if they did not exist. There are no cases,
neither in C# nor in Java, which I am aware of where an implicit cast succeeds
in certain cases but not in all and an explicit conversion succeeds in at least
more cases than the implicit conversion. Hence, something like "a" should also
not work in an explicit conversion in PHP IMO if it is not supported by the
implicit conversion (otherwise strict mode is useless, btw.)

Try out the following C# code:
dynamic d1 = 1.0;
int d = d1;
You will get the error Cannot implicitly convert type `double` to `int` at 
runtime.

We see a fundamental difference between C# and PHP here. PHP is dynamically
typed and relies on values rather than types. C#, in contrast, relies on types;
therefore the above code emits a runtime error even though the data could be
converted to int without precision loss.
This shall be different in PHP according to this RFC and I think that is
perfectly fine. Yet, it seems even more important to me that implicit and
explicit conversions behave the same way.

At first it might seem strange to have just one conversion rule set in PHP
since PHP is not known to be a language which shines due to its consistency...
OK, I am serious again. Think about it from the following point of view:
a user writes an explicit conversion in order to state explicitly that some
value will be converted (this is something which will be necessary in a strict
mode). Why should this explicit conversion be different from the implicit one?
There should not be any difference between explicit knowledge and implicit
knowledge. That is my opinion. If you really do not care about data loss and
just want to squeeze a float/string into an int no matter what the value really
is, then you can use the @ in conjunction with ?? and provide the desired
default value to fall back on if the conversion fails. If conversions like "a"
to int really matter that much to the users of PHP, then we could keep the
oldSchoolIntConversion function (as proposed in my first email) even in PHP 10
(I would probably get rid of it at some point).

Cheers, 
Robert



Re: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread Jefferson González

2015-02-22 16:38 GMT-04:00 Stanislav Malyshev smalys...@gmail.com:

 Yes, that's not the case, at least nobody ever showed that to be the
 case. In general, as JS example (among many others) shows, it is
 completely possible to have JIT without strict typing. In particular,
 coercive typing provides as much information as strict typing about
 variable type after passing the function boundary - the only difference
 is what happens _at_ the boundary and how the engine behaves when the
 types do not match, but I do not see where big performance difference
 would come from - the only possibility for different behavior would be
 if your app requires constant type juggling (checks are needed in strict
 mode anyway, since variables are not typed) - but in this case in strict
 mode you'd have to do manual type conversions, which aren't in any way
 faster than engine type conversions.
 So the case for JIT being somehow better with strict typing so far
 remains a myth without any substantiation.


Well, the benefit of strict in a JIT environment may not have been proven, but
it surely has been proven in statically compiled languages like C. Currently, a
JIT in most cases can't compete with the bare performance of a statically
compiled language, in both memory and CPU, so how is non-strict better
in that sense? You can argue a lot about nodejs, but as I said in a previous
message, at runtime it consumes more memory and CPU, and this is mostly due
to all the type checking it requires. In that sense, if the strict proposal
could improve that situation it would be a benefit.


 No, it can't be (at least it can't be the _entire_ code of this
 function), since the user still can pass non-int into this function -
 nothing introducing strict typing in functions, as it is proposed now,
 prevents it. What strict typing does is to ensure the error in this
 case, but to generate the error you still need the checks!
 BTW, your weak mode code is wrong too - there's no need to generate
 Variants if you typed the variables as int. You know once coercion is
 done they are ints. At least in the model that was now proposed.


I thought those checks could be optional if generated at call time, that's
why I gave these examples:

calc(1, 5) - no need for type checking or conversion, do a direct call
calc("12", "15") - calc(strToInt(value1), strToInt(value2))
calc($var1, $var2) - needs type checking and conversion if required

I was thinking in the sense that type checking, and conversion if required,
could take place before calling a function, but maybe that's even more
complicated...

So what you are saying is that the type of a variable can only be determined
at runtime and, as Zeev explained in the previous messages, since variables
aren't typed, checks are mandatory either way.


 Please provide a substantiation for this opinion. So far what was
 provided was not correct.


Statically typed languages -> Direct conversion to machine code
Dynamically typed languages with JIT -> Intermediate representation -> Checks
-> Conversion to machine code with checks.


  Please do not strawman. A lot of people here care about performance, and
 you have not yet made case that strict typing has any benefit on
 performance, so implying that opponents of strict typing somehow don't
 care about performance while you champion it does not match the real
 situation.


My intention is just that: to clear up the doubts. I thought, and may still
think, that strict has some advantages, but I've been proven wrong, and many
people might be as well, given all this insightful information.

Re: [PHP-DEV] new json, push generated file?

2015-02-22 Thread Jakub Zelenka

Hi Anatol,

On Sun, Feb 22, 2015 at 6:09 PM, Anatol Belski anatol@belski.net
wrote:


 FYI I had to downgrade re2c to 0.13.6 as the latest randomly crashes.


Ok. :) There are no differences in the generated DFA so it's not a problem
for me to use 0.13.6 too.

The preferred versions are more about nicer diffs when regenerating files.
So it's not a big issue if it gets regenerated with another supported
version. I test all supported versions when I do some changes to the parser
or scanner and I can always regenerate it back if someone else needs to do
some urgent changes ;)

Cheers

Jakub

[PHP-DEV] Reclassify E_STRICT notices

2015-02-22 Thread Nikita Popov

Hi internals!

I would like to propose reclassifying our few existing E_STRICT notices and
removing this error category:

https://wiki.php.net/rfc/reclassify_e_strict

As we don't really have good guidelines on when which type of error should
be thrown, I'm mainly going by what category other similar errors use. I'm
open to suggestions, but hope this will not deteriorate into total bikeshed.

Thanks,
Nikita

Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Stanislav Malyshev

Hi!

 -- Can the code generated for a strict type hint can somehow be optimized
 significantly better than the code generated for a dynamic/coercive type
 hint.
 And me, who wrote an AOT compiler that does **exactly** this, claim

Sorry, did exactly what? Here a bit more explanation would help.

 However, since test_strict() is compiled, there's no reason to
 dispatch back up to PHP functions for strict_foo(). In fact, that
 would be exceedingly slow. So instead, we'd compile strict_foo() as a
 C function, and do a native function call to it. Never having to check
 types because they are passed on the C stack.

Doesn't that assume strict_foo() is always called with the right type of
arguments? What exactly ensures that it does in fact happen? Shouldn't
you have the type check _somewhere_ to be able to claim this happens?
test_foo() doesn't do any checks, so what ensures $x is of the right
type for C? And if the check is there, how is it better?

 And note that this can only work with strict types since you can do
 the necessary type inference and reconstruction (both forward from a
 function call, and backwards before it).

I don't get the backwards part - I think you claimed it last time we
discussed it but I haven't seen your answer explaining why it's OK to
just ignore cases when the variable is of the wrong type. Right now, it
looks like you claim that if somebody has a call strict_foo($x) and
strict_foo() accepts integers, that magically makes $x integer and you
can generate code everywhere (not only inside strict_foo but outside)
assuming $x is integer without actually needing a check. I don't see how
this can work.
-- 
Stas Malyshev
smalys...@gmail.com

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Re: [PHP-DEV] [VOTE] Remove PHP 4 Constructors

2015-02-22 Thread Levi Morrison

On Sun, Feb 22, 2015 at 11:33 PM, Yasuo Ohgaki yohg...@ohgaki.net wrote:
 Hi Levi,

 On Mon, Feb 23, 2015 at 1:39 PM, Levi Morrison le...@php.net wrote:

 I have moved the RFC for removing PHP 4 constructors[1] into voting
 phase. As there are a lot of RFCs in discussion and voting right now I
 will leave this RFC in voting phase until the evening (UTC-7) of March
 6th which is 12 days away; this will hopefully allow everyone to be
 able to review this RFC and vote on it without being rushed.


 This may be a bit off topic. During this RFC discussion, I mentioned the
 trait method name issue: a trait method that has the PHP 4 constructor name
 for the class is treated as the class constructor.

 http://3v4l.org/gHbdq (Trait method is called as constructor)

 If there is __construct() in the class

 http://3v4l.org/HEBMl (Fatal error: A has colliding constructor definitions
 coming from traits)

 Is this bug fixed as well, or do we have to wait until PHP 8?

Aside from the new warning that is emitted and the old one that is
removed no other behavior is changed. This means the bug will remain
through the PHP 7 lifecycle even if this RFC passes. If the RFC passes
the colliding constructor issue will be removed in PHP 8 when the rest
of the old-style constructor support is removed.

[PHP-DEV] Re: [RFC] Script only include/require

2015-02-22 Thread Yasuo Ohgaki

Hi all, Zend engine experts especially,

On Mon, Feb 23, 2015 at 6:23 AM, Yasuo Ohgaki yohg...@ohgaki.net wrote:

 I wrote a patch and adjusted the RFC:
 https://wiki.php.net/rfc/script_only_include
 https://github.com/php/php-src/pull/
 Where the filename extension is checked is subject to change.
 At first, I thought implementing this as PHP code would be good, but
 I've changed my mind. It seems better to do it in Zend code.
 Opinions are appreciated.


I noticed very strange behavior under a ZTS build with this patch.
It turned out that compiler_globals is not accessible under a ZTS build,
according to gdb.

Is this intended? If so, where should I put the script_extensions char array?

Thank you.

--
Yasuo Ohgaki
yohg...@ohgaki.net

RE: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread François Laupretre

Hi,

For those interested in evaluating the impact of ZPP ruleset modifications on 
internal and userland code, a pull request is now available:

https://github.com/php/php-src/pull/1110

Please note that this is not a mere implementation of the RFC ruleset, although 
it comes preconfigured this way. It contains a set of 12 configurable options, 
each one enabling/disabling a particular ruleset modification. This allows for 
a much more powerful exploration of potential modifications and BC breaks 
against the existing codebase. Every combination of individual behaviors is 
possible, providing a theoretical number of about 3,000 potential rulesets. Of 
course, a lot of these are not consistent, but it still allows for creative 
thinking.

Given the time I had to write it, I didn't perform extensive testing. I just 
ensured that the ruleset described in the RFC and the one you get when activating 
every possible change both compile and seem to work as expected. I'll test 
more cases tomorrow. So, code review is a key priority, and every error (compile 
or runtime) you may get should be reported as fast as possible.

Overall configuration possibilities include and go beyond the STH RFC, with the 
exception of numeric strings, whose proposed restrictions are not implemented 
yet, but will be soon.

So, I hope you'll enjoy the new toy. And thoughts are welcome, as usual.

Regards

François


Re: [PHP-DEV] Reclassify E_STRICT notices

2015-02-22 Thread Yasuo Ohgaki

Hi Nikita,

On Mon, Feb 23, 2015 at 7:30 AM, Nikita Popov nikita@gmail.com wrote:

 I would like to propose reclassifying our few existing E_STRICT notices and
 removing this error category:

 https://wiki.php.net/rfc/reclassify_e_strict

 As we don't really have good guidelines on when which type of error should
 be thrown, I'm mainly going by what category other similar errors use. I'm
 open to suggestions, but hope this will not deteriorate into total
 bikeshed.


+1 overall.

Regarding "Only variables should be assigned by reference":

Most of the errors are appropriate, but some of them could be removed.
For example, literals do not make sense here, so the current behavior is good.

$ php -r 'array_pop([1,2,3]);'
PHP Fatal error:  Only variables can be passed by reference in Command line
code on line 1

However, emitting "Only variables should be assigned by reference" for this

$top = array_pop(some_func_returns_array()); // Code needs only top element

seems too strict, for example. I would rather have PHP behave like HHVM:

 - http://3v4l.org/5AIrb.
 - http://3v4l.org/O0SXE

Is it possible to relax the error for tmp variables?
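
For reference, the pattern above can already be written without triggering the message by giving the temporary a name first. The following is only a minimal sketch: `some_func_returns_array()` is the placeholder from the example above with an invented body, and `top_element()` is a hypothetical wrapper. For this particular call PHP words the diagnostic as "Only variables should be passed by reference".

```php
<?php
// Placeholder from the example above; the body is an assumption for illustration.
function some_func_returns_array()
{
    return array(1, 2, 3);
}

// Workaround: bind the returned array to a named variable first, so that
// array_pop() receives something it can take by reference.
function top_element()
{
    $tmp = some_func_returns_array();
    return array_pop($tmp);
}

var_dump(top_element()); // int(3)
```

The extra variable is exactly the boilerplate the question asks to make unnecessary.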

Regards,

--
Yasuo Ohgaki
yohg...@ohgaki.net

Re: [PHP-DEV] [VOTE] Remove PHP 4 Constructors

2015-02-22 Thread Yasuo Ohgaki

Hi Levi,

On Mon, Feb 23, 2015 at 1:39 PM, Levi Morrison le...@php.net wrote:

 I have moved the RFC for removing PHP 4 constructors[1] into voting
 phase. As there are a lot of RFCs in discussion and voting right now I
 will leave this RFC in voting phase until the evening (UTC-7) of March
 6th which is 12 days away; this will hopefully allow everyone to be
 able to review this RFC and vote on it without being rushed.


This may be a bit off topic. During this RFC discussion, I mentioned the
trait method name issue: a trait method that has the PHP 4 constructor name
for the class is treated as the class constructor.

http://3v4l.org/gHbdq (Trait method is called as constructor)

If there is __construct() in the class

http://3v4l.org/HEBMl (Fatal error: A has colliding constructor definitions
coming from traits)

Is this bug fixed as well, or do we have to wait until PHP 8?

Regards,

P.S. I voted yes, of course.

--
Yasuo Ohgaki
yohg...@ohgaki.net

Re: [PHP-DEV] [VOTE] Remove PHP 4 Constructors

2015-02-22 Thread Yasuo Ohgaki

Hi Levi,

On Mon, Feb 23, 2015 at 3:40 PM, Levi Morrison le...@php.net wrote:

 Aside from the new warning that is emitted and the old one that is
 removed no other behavior is changed. This means the bug will remain
 through the PHP 7 lifecycle even if this RFC passes. If the RFC passes
 the colliding constructor issue will be removed in PHP 8 when the rest
 of the old-style constructor support is removed.


Thank you for the answer.
Everyone should vote yes for this RFC.

Regards,

--
Yasuo Ohgaki
yohg...@ohgaki.net

Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Anthony Ferrara

Stas,

On Sun, Feb 22, 2015 at 6:47 PM, Stanislav Malyshev smalys...@gmail.com wrote:
 Hi!

 You can tell because you know the function foo expects an integer. So
 you can infer that $x will have to have the type integer due to the
 future requirement. Which means the expression $something / 2 must
 also be an integer. We know that's not the case, so we can raise an
 error here.

 OK, so your claim is that the compiler with strict typing can detect
 some situations which the dynamic one can not and reject some of the
 code. Without going too much into details, I agree with this, this is an
 obvious difference between strict and dynamic. However, this is not a

Alright, we're getting somewhere.

 performance advantage, obviously - since you are comparing running code
 with non-running one - your model just accepts less code. Obviously,
 this works if non-accepted code was wrong - and doesn't work if it was
 not. But we talked about running code, I thought.

It is still a performance advantage, because since we know the types
are stable at compile time, we can generate far more optimized code
(no variant types, native function calls, etc).

And yes, it accepts less code. It refuses to accept code that is not
type stable. More on that in a second:

 At that point the developer has the choice to explicitly cast or put
 in a floor() or one of a number of options.

 That's exactly what I claim would be the defect of the strict model -
 people would start putting excessive casts ensuring there would be cases
 where information is lost. For example, assume we knew $something is even:

 function bar(int $something): int {
     assert($something % 2 == 0);
     $x = $something / 2;
     return foo($x);
 }

 Now everything is fine (ignoring the typing for a second), right? We're
 dealing with integers, /2 always divides evenly, all is great. Now we
 introduce strictness, so we'd need to say something like:

 function bar(int $something): int {
     assert($something % 2 == 0);
     $x = $something / 2;
     return foo((int)$x);
 }

 Now assume somebody messed up on the routine code reformatting merge and
 the code somehow ended up like:

 function bar(int $something): int {
     $x = $something / 2;
     return foo((int)$x);
 }

 Do you see what the problem is? Now we lost the check for $something
 being even, but we would never know about it since type system forced us
 to insert (int) (which we didn't need) and thus disabled the controls
 for the bug of $something not being even (which we did need).

Actually, in this case, the int cast does tell us something. It says
that the result (truncation) is explicitly wanted. Not to the compiler
(tho that happens), but to the developer.

With coercive typing as proposed in Zeev's RFC, that would need to
happen anyway. In both proposals that would generate a runtime error.
The difference is, with strict types, we can detect the error ahead of
time and warn about it.

 But more important question is - with (int) the coercive model can use
 this information too, so what's the difference from strict model on that
 code? There seems to be none.

In this precise example there is none, because division is not type
stable (it depends on the values of its arguments). Let's take a
different example:

function foo(float $something): int {
    return $something + 0.5;
}

With coercive types, you can't tell ahead of time if that will error
or not. With static types, you can.

 Without strict typing this code is always stable, but you still need
 to generate full type assertions in a compiled version of foo() and
 use ZVALs for $x, hence reducing the effect of the optimization
 significantly.

 Wait, you said this code is invalid so no code will be generated. Did
 you mean code after introducing (int)? Then strict has no advantage
 anymore as we can derive the info from (int) anyway.
 Otherwise, I can't see how you can avoid generating typechecks in foo()
 unless the only place it can ever be called from is bar() - but I don't
 see how you can ensure that in PHP, and if you could, I don't see why
 weak model could not make the same conclusions on the same code.

No, I was talking about trying to do the same trick (using native
function calls) with coercive types.

 So far the only advantage I've seen seems to be that your compiler
 would reject code that looks suspicious to it and thus force the
 programmer to coerce the variables into the types manually - by (int) or
 floor() - something that the coercive model would do for you
 automatically. Once coerced, the same code would have the same type info

Actually, no. Coercive as proposed by Zeev would cause 8.5 to error
if passed to an int type hint. So you'd need the cast there as well.
Either that, or error at runtime as well.

Hence, in both cases casts would be required. One could tell you ahead
of time where you forgot a cast, the other would wait until runtime
(when the edge-case was hit).

 (and thus same potential optimizations)

Re: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread Jefferson Gonzalez


On 02/22/2015 06:28 PM, François Laupretre wrote:

Hi Stas,

It seems the actual problem is that we have too many compiler / code analysis 
experts in the community ;)

(don't get me wrong, I am not saying that for you, I just admire your patience 
explaining the same again and again to people who never read one line from PHP 
core source).




Well, I have never worked on a JIT/AOT compiler, and I have to admit I haven't 
made any contributions to the PHP engine (and it seems I do not have any 
right to write a couple of messages expressing concerns/views 
because of that).


On the other hand, I took over the wxWidgets extension in an effort to revive 
it (because I believe PHP can have other use cases). I improved its code 
generator (and other parts that involved working with the PHP source 
code), which now generates more than 905941 lines of code that constitute 
the extension (github.com/wxphp/wxphp/tree/master/src).


So I have indeed read source from PHP core. In any case, sorry if I have 
annoyed anyone; that was never my intention. We as humans can't possess all 
the knowledge of the world, which is why we always learn from somebody 
else. What's the purpose of a community without participation? :)


Cheers!



Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Stanislav Malyshev

Hi!

 It is still a performance advantage, because since we know the types
 are stable at compile time, we can generate far more optimized code
 (no variant types, native function calls, etc).

I don't see where it comes from. So far you said that your compiler
would reject some code. That doesn't generate any code, optimized or
otherwise. For the code your compiler does not reject, still no
advantage over dynamic model.

 Actually, in this case, the int cast does tell us something. It says
 that the result (truncation) is explicitly wanted. Not to the compiler
 (tho that happens), but to the developer.

No, it doesn't say that in this case. The developer didn't actually want
truncation. They just wanted to call foo(). You forced them to use
truncation because that's the only way to call foo() in your compiler.
They said it's OK since the truncation is applied to a value that is an int
anyway, and they are right, except when it stops being true in the future.
That generates brittle code because it forces the developer to take risks
they otherwise wouldn't take, such as using much stronger forced
conversions instead of more appropriate dynamic ones.

 With coercive typing as proposed in Zeev's RFC, that would need to
 happen anyway. In both proposals that would generate a runtime error.

No, it wouldn't need to happen, since a conversion without data loss is allowed.

 The difference is, with strict types, we can detect the error ahead of
 time and warn about it.

A static analyzer can warn about it regardless of the type model. The only
difference in the strict model is that when compiling (not ahead of time,
but at runtime) it would produce a hard error even in the case of an even
number, which can work just fine without it.

 In this precise example there is none, because division is not type

That's what I am saying - if the code runs, there's no difference. The
only difference is that your model runs less code, and forces (or, rather,
strongly incentivizes) people to write more dangerous code because some
of the non-dangerous code is not allowed.

 stable (it depends on the values of its arguments). Let's take a
 different example
 
 function foo(float $something): int {
     return $something + 0.5;
 }
 
 With coercive types, you can't tell ahead of time if that will error
 or not. With static types, you can.

I'm not sure what this proves. Yes, of course there are cases where
strict typing (please let's not confuse it with static typing - these
are different things, static typing is when everything's type is known
in advance and this is not happening in PHP, that's kind of the whole
point) would disallow some code that dynamic typing allows. Nobody
argues with that. What I am arguing with is that this difference is
somehow useful - especially for JIT optimizations.

 No, I was talking about trying to do the same trick (using native
 function calls) with coercive types.

I'm not sure what you are comparing to what. You provide some code and
say "in my compiler, this code A would not work, while in the dynamic model
it would. Instead, you should write code B. This code B would run faster
in my compiler." But that is not a proof your compiler is better!
Because code B would also run faster in the dynamic model, and in addition,
code A would also run (though indeed not faster than B).

 Actually, no. Coercive as proposed by Zeev would cause 8.5 to error
 if passed to an int type hint. So you'd need the cast there as well.
 Either that, or error at runtime as well.

We were talking about the case where the argument was even; you must
have missed that part. If the argument is not even, indeed both models
would produce the same error, no difference there. The only difference
in your model vs. the dynamic model so far is that you forced the developer
to do a manual (int) instead of doing a much smarter coercive check on
entry to foo(). There's no performance improvement in that, and
there's a reliability decrease.

 Hence, in both cases casts would be required. One could tell you ahead
 of time where you forgot a cast, the other would wait until runtime
 (when the edge-case was hit).

You imply it's always a case of "forgot" and the casts always should
be there, which is not the case - actually, as I already said, I think
this is the main defect of your model, forcing manual casts everywhere.
Otherwise, I agree - that's the only difference. I still struggle to see
any JIT gain. So far the only advantage demonstrated was the obvious one
- if you obviously pass an obvious non-int to an int parameter in the strict
model, this can be detected statically. It would be stupid to deny that,
as it pretty much immediately follows from the definition of the strict
model. But that's the only difference I see, and not much of an advantage
in my eyes, as a) patently obvious cases would be pretty rare, b) in
many cases it would also not be what the developer wanted, leading to manual
casts, and c) last but not least, a static analyzer doing that can be as
easily written without having these strict rules in core PHP!

Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Lester Caine

On 23/02/15 00:25, Anthony Ferrara wrote:
 And as the static analyzer traces back, if it finds possibilities that
 don't match (for example, if you assigned it directly from $_POST),
 it's able to say that either the original assignment or the function
 call is an error.

Why would using an integer I've passed in a URL be a 'fault'? All of the
data navigation functions pass their state via the URL and one simply
protects against hackers by filtering the state to a default value if it
does not return the correct integer data.
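
The filtering described here could look roughly like the sketch below. It is only an illustration: `int_from_query()` and the `page` key are hypothetical names, while `filter_var()` with `FILTER_VALIDATE_INT` and its `default` option are standard PHP.

```php
<?php
// Hypothetical helper: read an integer state value from URL parameters,
// falling back to a default when it is missing or not a valid integer.
function int_from_query(array $query, $key, $default)
{
    if (!isset($query[$key])) {
        return $default;
    }
    // FILTER_VALIDATE_INT rejects "abc", "8.5" and "" instead of truncating;
    // the 'default' option supplies the fallback value on failure.
    return filter_var($query[$key], FILTER_VALIDATE_INT, array(
        'options' => array('default' => $default),
    ));
}

var_dump(int_from_query(array('page' => '3'), 'page', 1));   // int(3)
var_dump(int_from_query(array('page' => 'abc'), 'page', 1)); // int(1)
var_dump(int_from_query(array(), 'page', 1));                // int(1)
```

In a real script the first argument would typically be `$_GET`. After this call the value is known to be an integer, even though its source was untyped.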

-- 
Lester Caine - G8HFL
-
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk

Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Anthony Ferrara

Zeev,


 Partially.

 The static analysis and compilation would be pure AOT. So the errors would
 be told to the user when they try to analyze the program, not run it.
 Similar to HHVM's hh_client.

 How about that then:

 1. The developer runs a static analyzer on the program.
 2. It fails because the static analyzer detects float being fed to an int.
 3. The user changes the code to convert the input to int.
 4. You can now optimize the whole flow better, since you know for a fact
 it's an int.

 Is that an accurate flow?

Yes. At least for what I was talking about in this thread.

 However, there could be a runtime compiler which compiles in PHP's
 compile flow (leveraging opcache, etc). In that case, if the type
 assertion isn't stable, the function wouldn't be compiled (the external
 analyzer would error, here it just doesn't compile). Then the code would
 be run under the Zend engine (and error when called).

 Got you.  Is it fair to say that if we got to that case, it no longer
 matters what type of type hints we have?

Once you get to the end, no. Recki-CT proves that.

The difference though is the journey. The static analyzer can reason
about far more code with strict types than it can without (due to the
limited number of possibilities presented at each call). So this
leaves the dilemma: compiled code that behaves slightly differently
(what Recki does) or whether it always behaves the same.

 So think of it as a graph. When you start the type analysis, there's one
 edge between $input and foo() with type mixed. Looking at foo's argument,
 you can say that the type of that graph edge must be int.
 Therefore it becomes an int. Then, when you look at $input, you see that
 it can be other things, and therefore there are unstable states which can
 error at runtime.

 So when you say it 'must be an int', what you mean is that you assume it
 needs to be an int, and attempt to either prove that or refute that.  Is
 that correct?
 If you manage to prove it - you can generate optimal code.
 If you manage to refute that - the static analyzer will emit an error.
 If you can't determine - you defer to runtime.

 Is that correct?

Basically yes.
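
As a rough illustration of the three outcomes just agreed on, here is a toy model. It is not how Recki-CT or any real analyzer is implemented; `classify_call()` is a hypothetical name, and an expression is reduced to a duplicate-free list of the type names it may have at runtime.

```php
<?php
// Toy model: $possibleTypes lists the types an argument expression may have
// at runtime; $requiredType is the type the parameter demands.
function classify_call(array $possibleTypes, $requiredType)
{
    if (!in_array($requiredType, $possibleTypes, true)) {
        return 'refuted';   // can never match: report an error ahead of time
    }
    if (count($possibleTypes) === 1) {
        return 'proved';    // always matches: the unchecked native call is safe
    }
    return 'deferred';      // may or may not match: keep the runtime check
}

var_dump(classify_call(array('int'), 'int'));          // string(6) "proved"
var_dump(classify_call(array('string'), 'int'));       // string(7) "refuted"
var_dump(classify_call(array('int', 'float'), 'int')); // string(8) "deferred"
```

The third case is the `$something / 2` situation from earlier in the thread: the division may yield an int or a float, so nothing can be proved without a cast or a check.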

Anthony

Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Anthony Ferrara

Stas,

On Sun, Feb 22, 2015 at 8:35 PM, Stanislav Malyshev smalys...@gmail.com wrote:
 Hi!

 The difference though is the journey. The static analyzer can reason
 about far more code with strict types than it can without (due to the
 limited number of possibilities presented at each call). So this
 leaves the dilemma: compiled code that behaves slightly differently
 (what Recki does) or whether it always behaves the same.

 Wait, so are you saying that advantage of having strict typing in PHP
 core is that some analyzer - which does not share code with PHP core,
 AFAIU - if it interpreted PHP types in strict manner and provided
 warnings where types it can statically deduce do not match and the
 authors of the code agreed with its suggestions and rewrote their code
 so that the analyzer would not complain, would in some cases result in
 code that might be JIT-optimized more efficiently?

 That is not a claim about strict typing in PHP core having any benefit
 at all. I'm not sure even this claim is true (as adding (int) doesn't
 actually improve performance - it just shifts around the place where the
 conversion is done, and once conversion is done, you can do the same
 optimizations as before) - but even if there's some situation where it
 is true, I don't see how it makes difference for PHP core (even in
 situation of PHP core + JIT extension or non-Zend PHP runtime with
 AOT/JIT).

Please don't twist my words. Look at everything I said, don't take one
statement from one very specific topic out of context as some sort of
proof that there are no benefits.

Anthony

Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Jefferson Gonzalez


On 02/22/2015 09:15 PM, Stanislav Malyshev wrote:

We were talking about the case where the argument was even; you must
have missed that part. If the argument is not even, indeed both models
would produce the same error, no difference there. The only difference
in your model vs. the dynamic model so far is that you forced the developer
to do a manual (int) instead of doing a much smarter coercive check on
entry to foo(). There's no performance improvement in that, and
there's a reliability decrease.



How is coercive much smarter? Basically, what coercive would do is 
similar to what the intval(), floatval(), etc. set of functions do, 
with some type checking in the mix to ensure a value matches some set of 
rules.


How could casting with (int) be such a dangerous thing? Let's take for example 
this code:


echo (int) "whats cooking!";
echo intval("whats cooking");

Both statements print 0, so how is casting unsafe???

RE: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Zeev Suraski

 -Original Message-
 From: Anthony Ferrara [mailto:ircmax...@gmail.com]
 Sent: Monday, February 23, 2015 3:21 AM
 To: Zeev Suraski
 Cc: PHP internals
 Subject: Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints
 RFC)

 Zeev,

  Partially.

  The static analysis and compilation would be pure AOT. So the errors
  would be told to the user when they try to analyze the program, not run
  it. Similar to HHVM's hh_client.

  How about that then:

  1. The developer runs a static analyzer on the program.
  2. It fails because the static analyzer detects float being fed to an
  int.
  3. The user changes the code to convert the input to int.
  4. You can now optimize the whole flow better, since you know for a
  fact it's an int.

  Is that an accurate flow?

 Yes. At least for what I was talking about in this thread.

OK.

So the code after the fix would look like this:
<?php declare(strict_types=1);
function foo(int $int): int {
    return $int + 1;
}

function bar(int $something): int {
    // (int) or whatever else makes it clear it's an int; the parentheses
    // matter, because a cast binds tighter than the division
    $x = (int) ($something / 2);
    return foo($x);
}
?>

Let me explain how this could play out with coercive type hints:
<?php
function foo(int $int): int {
    return $int + 1;
}

function bar(int $something): int {
    $x = $something / 2;
    return foo($x);
}

We can all agree that determining the types of just about anything here is
ultra-easy, so easy you could do it with a static analyzer, as you
suggested.  $int and $something are integers, while $x is either an integer
or a float.  We also know that both foo() and bar() expect integers.

What's the optimal code we could generate here?
First, on the function body of foo(), we can clearly and easily translate
the whole into machine code, as we know we'll get a long and need to return
a long.
Moving to the caller scope in bar(), given we know $x is either a float or
an integer, we could either generate code that calls coerce_to_int($x), or
even some optimized machine code that checks zval.type and either uses the
lval or converts dval.  This can be done in AOT, no need to wait for
runtime.  Once we know for a fact we have an integer in our hands - we can
make the call directly to the optimized foo(), a C level call without the
overhead of a PHP function call.

If you look at the generated code, it's going to be remarkably similar
between the two cases.  If the developer chooses to pick the casting route,
it will look almost identical - except it will be convert_to_long() that is
called instead of coerce_to_int(), the former being more aggressive than the
latter.
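
To make the difference between the two conversions concrete, here is a userland sketch. It is an assumption-laden stand-in, not engine code: this `coerce_to_int()` only handles ints and floats (the RFC's rules also cover other inputs), and `half_coerced()` / `half_cast()` are hypothetical names for the two spellings of bar() reduced to the conversion step.

```php
<?php
// Userland stand-in for a coercive conversion: accept ints and floats that
// have an exact int value, reject everything else.
function coerce_to_int($value)
{
    if (is_int($value)) {
        return $value;
    }
    if (is_float($value)
        && floor($value) === $value      // no fractional part (also false for NAN)
        && $value >= -PHP_INT_MAX        // range checks also exclude +/-INF
        && $value < PHP_INT_MAX
    ) {
        return (int) $value;
    }
    throw new InvalidArgumentException('not representable as int without data loss');
}

function half_coerced($something)
{
    return coerce_to_int($something / 2);   // 7 / 2 = 3.5 is rejected
}

function half_cast($something)
{
    return (int) ($something / 2);          // 7 / 2 = 3.5 silently becomes 3
}

var_dump(half_coerced(8)); // int(4)
var_dump(half_cast(8));    // int(4)
var_dump(half_cast(7));    // int(3)

try {
    half_coerced(7);
} catch (InvalidArgumentException $e) {
    echo "rejected: ", $e->getMessage(), "\n";
}
```

For even inputs the two versions agree; they differ only on the odd input, where the cast truncates and the coercive check refuses.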

Can you see anything impossible or otherwise wrong with my description of
how the AOT compiler would work in this case, with coercive type hints?  If
not, there are no performance benefits for the Strict typed version after
the user alters his code to behave similarly to what coercive type hints
would bring.

Based on our Twitter discussion, I think I may have not made my position
clear regarding where our differences are.  I'm not claiming that you can't
do the optimizations you say you can do.  Not at all.  My point is that we
can do the very same optimizations with coercive types as well - basically,
that there is no delta.

  However, there could be a runtime compiler which compiles in PHP's
  compile flow (leveraging opcache, etc). In that case, if the type
  assertion isn't stable, the function wouldn't be compiled (the
  external analyzer would error, here it just doesn't compile). Then
  the code would be run under the Zend engine (and error when called).

  Got you.  Is it fair to say that if we got to that case, it no longer
  matters what type of type hints we have?

 Once you get to the end, no. Recki-CT proves that.

Do you mean that the statement is unfair or that it no longer matters?   If
it's the former, can you elaborate as to why?

 The difference though is the journey. The static analyzer can reason about
 far more code with strict types than it can without (due to the limited
 number of possibilities presented at each call). So this leaves the
 dilemma: compiled code that behaves slightly differently (what Recki does)
 or whether it always behaves the same.

  So think of it as a graph. When you start the type analysis, there's
  one edge between $input and foo() with type mixed. Looking at foo's
  argument, you can say that the type of that graph edge must be int.
  Therefore it becomes an int. Then, when you look at $input, you see
  that it can be other things, and therefore there are unstable states
  which can error at runtime.

  So when you say it 'must be an int', what you mean is that you assume
  it needs to be an int, and attempt to either prove that or refute
  that.  Is that correct?
  If you manage to prove it - you can generate optimal code.
  If you manage to refute that - the static analyzer will emit an error.
  If you can't determine - you defer to runtime.

  Is that correct?

 Basically

Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Stanislav Malyshev

Hi!

 It rejects code because doing code generation on the dynamic case is
 significantly harder and more resource intensive. Could that be built
 in? Sure. But it's a very significant difference from generating the
 static code.

I can appreciate that. Dynamic typing is hard to translate into
statically typed code efficiently. But I don't see how that is related
to PHP having strict types - surely even strict types do not make PHP
statically typed, in fact, I don't see how they improve much - so far
you've shown me code examples that your compiler *wouldn't* handle. I
don't see how not being able to handle code is an advantage. Could I see
examples of code that strict model *can* handle and that work better in
that model?

 And even if we generated native code for the dynamic code, it would
 still need variants, and hence ZPP at runtime. Hence the static code
 has a significant performance benefit in that we can indeed bypass
 type checks as shown in the PECL example a few messages up (more than
 a few).

I don't see how you can bypass type checks unless you know the variable
types at the time of the call, from some external source or some
information you collected about the code. If you know that, you could as
well generate the same check-less code for weak/dynamic model.

 Passing a float to an integer parameter would result in a runtime
 E_RECOVERABLE_ERROR if the float has dataloss.
 
 So in the case I cited: foo($someint / 2), that will generate an
 E_RECOVERABLE_ERROR in Zeev's proposal, as well as in the static
 typing mode of mine.

It sounds like you've missed the part of my reply where I was saying
that I am considering the case of even numbers.

 With coercive typing as proposed in Ze'ev's RFC, that would need to
 happen anyway. In both proposals that would generate a runtime error.

 No, it wouldn't need to happen since no-DL conversion is allowed.
 
 Sure it would. 3/2 is 1.5. Which would fatal if I passed it to
 foo(int) under Zeev's RFC. Because of data loss.

Again, you seem to miss the part where I said that we're considering a
non-DL case. For DL case, both behave the same so there's indeed no
difference (while you claimed there's some advantage for strict model?)

 This very particular case, yes, because of the simplicity of the types
 involved. But with strict typing you only need to look at 1 success
 case, but with coercive typing you need to look at many more.

I do not see why you can ignore the fact that your assumptions about the
variable types could be wrong with strict typing. PHP is not a static
typed language, so unless you can prove definitely that the variable
absolutely can not be anything other than the prescribed type (prior to
the call), you still need to have code that accounts for the other
possibility. If you can, however, prove that, both strict and dynamic
typing would behave exactly the same!
You could, of course, build your static analyzer in a way that would
reject every code where it can not prove all types - however I hope you
understand it is not an option for PHP core?

 Also, in many (I'd argue most) cases coercive has to either issue a
 warning (it doesn't know) or error on valid and functioning code.
 Example:
 
 function isdivisibleby2(string $foo): bool {
 if (preg_match('(\D)', $foo)) {
 return false;
 }
 return 0 == ($foo % 2);
 }
 
 function something2(string $foo): int {
 if (!isdivisibleby2($foo)) {
 return 10;
 }
 return foo($foo / 2);
 }
 
 This code would never raise a runtime error in Zeev's coercive
 proposal. However, when looking at it statically, you can't tell
 (unless you've got a regex decompiler).
 So static analysis on dynamic types will either error on valid code,
 or not error on invalid code (and I'm not even talking about the
 halting problem here).

True, but PHP is built on dynamic types, and neither proposal changes
that. So you either propose to make PHP fully statically typed (which I
hope you do not) or say static analysis is not perfect - which I
wholeheartedly agree, but then again CS is full of unsolvable problems,
and static analysis is, unfortunately, ultimately reducible to one of
them, so no wonder here. The same case would, of course, be true with
strict and non-strict runtime typing - simply because PHP is not
statically typed.

 Whereas with strict typing, the error would appear in both cases
 (static and runtime). And you could fix it.

If you are saying that you can construct code, containing an error,
which will be missed by coercive typing but would fail (not necessarily
because of this specific error, but because of type mismatch) with
strict typing, it is of course trivially true. But so what? This in no
way proves strict typing caught the error - to prove that, the type
failure should be causally connected to the error, in your examples it
is not.

Moreover, you somehow bring example of the code that is actually not
wrong, practically speaking (as it divides by 2 the number

RE: [PHP-DEV] Type hints ...

2015-02-22 Thread François Laupretre

Hi Lester,

I am not sure I understand well, but the extended type syntax partially 
described in https://wiki.php.net/rfc/dbc may correspond to what you describe. 
Such extended syntax will be part of 'Design by Contract', meaning it's 
potentially too slow to run in production and checks can be turned on and off 
globally. When it is available, PHP argument type hints will become simplified 
fast checks that run every time, even in production.

Extended types will support nested syntax as complex as 
'object(Iterable)|array('id' => int(]0:), * => string|array(string))'. No limit 
to the syntax you may support here. It will also be available as a dynamic 
feature which will allow checking a variable against a dynamically-defined 
type. *This* will bring dramatic performance improvement in data validation. I 
don't imagine type hints will bring much in terms of overall performance.

I guess that's what you mean but please confirm. I think this will be my next 
project for PHP, after STH if it passes.

Regards

François

 From: Lester Caine [mailto:les...@lsces.co.uk]

 Currently I have an array of variables and the docblock annotation tells
 me just what each element is intended to be. I process the variables on
 that basis and while it may be helpful to have some higher level of
 'restraint', I have a working flexible system. As a variable is
 processed it is constrained by the appropriate rules. If PHP adds 'Type
 Hints' they will only apply to where I am passing an array variable, and
 the type hint adds additional processing to that which I already
 maintain myself. How will that improve performance?

It won't, except if you remove some redundant checks from your PHP code. Type 
checks performed by STH are faster than the equivalent PHP code, that's the 
only possible performance improvement I imagine.

 Add to this equation that the type and constraints of a variable may
 well vary from one record set to another. It may well be that a fixed
 set of types can be defined, but these are not the types currently being
 defined and would include date types in parallel with a group of numeric
 types.
 
 Passing 'strict' types in some cases just does not compute in my book,
 and even 'coercive' types only addresses a subset of the types needed so
 that it adds another layer of 'checking' over what we already have in
 much of the existing user code base. People keep going on about
 different rule sets but this just adds another set of 'rules' rather
 than a single solution.
 
 --
 Lester Caine - G8HFL
 -
 Contact - http://lsces.co.uk/wiki/?page=contact
 L.S.Caine Electronic Services - http://lsces.co.uk
 EnquirySolve - http://enquirysolve.com/
 Model Engineers Digital Workshop - http://medw.co.uk
 Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
 
 --
 PHP Internals - PHP Runtime Development Mailing List
 To unsubscribe, visit: http://www.php.net/unsub.php



[PHP-DEV] Getting function namespace at runtime

2015-02-22 Thread guilhermebla...@gmail.com

Hi internals,

I have come really close to the final state of my to-be-proposed private
class, interface and trait support here.
However, I have a bug under this circumstance:
https://gist.github.com/guilhermeblanco/3392925014c9f8374acc

I'd love it if someone could give me a hand with how to get the currently
active namespace (which does not exist at runtime, only at compile time) in
order to do the checks inside the VM.
The only way I thought that could somehow work was through something like
EX(called_scope) or EX(func) or EX(call).

I would love it if someone could point me in the right direction to finish
the patch and put the finalized RFC up for voting. =)

[]s,

-- 
Guilherme Blanco
MSN: guilhermebla...@hotmail.com
GTalk: guilhermeblanco
Toronto - ON/Canada

RE: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Zeev Suraski

 -Original Message-
 From: Anthony Ferrara [mailto:ircmax...@gmail.com]
 Sent: Monday, February 23, 2015 1:35 AM
 To: Zeev Suraski
 Cc: PHP internals
 Subject: Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints
 RFC)

 Zeev,

  And note that this can only work with strict types since you can do
  the necessary type inference and reconstruction (both forward from a
  function call, and backwards before it).

  Please do explain how strict type hints help you do inference that you
  couldn't do with dynamic type hints.  Ultimately, your whole argument
  hinges on that, but you mention it in parentheses almost as an
 afterthought.
  I claim the opposite - you cannot infer ANYTHING from Strict STH that
  you cannot infer from Coercive STH.  Consequently, everything you've
  shown, down to the C-optimized version of strict_foo() can be
  implemented in the exact same way for very_lax_foo().  Being able to
  optimize away the value containers is not unique to languages with
  strict type hints.  It's done in JavaScript JIT engines, and it was done
  in our
 JIT POC.

 I do here: http://news.php.net/php.internals/83504

 I'll re-state the specific part in this mail:

 <?php declare(strict_types=1);
 function foo(int $int): int {
 return $int + 1;
 }

 function bar(int $something): int {
 $x = $something / 2;
 return foo($x);
 }

 ^^ In that case, without strict types, you'd have to generate code for
 both
 integer and float paths. With strict types, this code is invalid.
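
A minimal sketch of why the expression $something / 2 needs both an integer and a float path: in PHP the / operator yields an int when both operands are ints and the division is exact, and a float otherwise, while intdiv() (PHP 7+) always yields an int. The helper name half_type() is hypothetical and only there for illustration.

```php
<?php
// Hypothetical helper (not from the thread): report the runtime type of $n / 2.
function half_type(int $n): string
{
    // "/" returns int for an exact int/int division and float otherwise.
    return gettype($n / 2);
}

echo half_type(4), "\n";          // integer
echo half_type(3), "\n";          // double
echo gettype(intdiv(3, 2)), "\n"; // integer: intdiv() always returns int
```

So the type of $x in bar() depends on the value of $something, not only on its type, which is the instability being discussed here.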

Ok, but how does that support your case?  That alludes to the functionality
difference between strict STH and dynamic STH, and perhaps your Static
Analysis argument.

How does it help you generate better code?

Suggesting that the nature of a type hint can help you determine what's the
value that's going to be passed to it is akin to saying that the size and
shape of a door can tell you something about the person or beast that's
standing on the other side.  It just can't.

Let me illustrate it in a less colorful way.

Snippet 1:
  ... code that deals with $input ...
  foo($input);

  function foo(int $x)
  {
 ...
  }

Snippet 2:
  ... code that deals with $input ...
  foo($input);

  function foo(float $x)
  {
 ...
  }

Question:
What can you learn from the signatures of foo() in snippet 1 and 2 about the
type of $input?  Does the fact that I changed the function signature from snippet
1 to 2 somehow affect the type of $input?  In what way?

If I understood you correctly, you're assuming that $input will too come
over using a strict type hint, which would tell you that it's an int and
therefore safe.  But a coercive type hint will do the exact same job.

 You can tell because you know the function foo expects an integer. So you
 can infer that $x will have to have the type integer due to the future
 requirement. Which means the expression $something / 2 must also be an
 integer. We know that's not the case, so we can raise an error here.

This is static analysis, not better code generation.  And it boils down to a
functionality difference, not performance difference.

Zeev


Re: [PHP-DEV] Coercive Scalar Type Hints RFC

2015-02-22 Thread Pierre Joye

Can all of you stop this madness of moving discussions off list?

It is detestable, and goes against almost all the openness and principles
behind an OSS project like PHP. If we can't discuss design, plans, ideas etc. on
the list anymore, then we are doomed, for good.
On Feb 22, 2015 3:49 PM, Anthony Ferrara ircmax...@gmail.com wrote:

 Adding in a thread that was started in private, but absolutely is
 worth sharing with the group:


 -- Forwarded message --
 From: Etienne Kneuss col...@php.net
 Date: Sun, Feb 22, 2015 at 8:42 AM
 Subject: Re: [PHP-DEV] Coercive Scalar Type Hints RFC
 To: Zeev Suraski z...@zend.com
 Cc: Anthony Ferrara ircmax...@gmail.com, Dmitry Stogov dmi...@zend.com




 On Sun Feb 22 2015 at 14:23:58 Zeev Suraski z...@zend.com wrote:
 
   There have been several attempts:
   for JS: http://users-cs.au.dk/simonhj/tajs2009.pdf
   or similar techniques applied to PHP, quite outdated though:
   https://github.com/colder/phantm
 
  Looks like WebKit's type inference is doing some pretty good job at
  analyzing code, although I'm not sure how much of it is static vs.
 dynamic.
  My guess is that a lot of it is static:
  twitter.com/kangax/status/558974257724940288


 My guess would be that it's almost entirely dynamic, or probabilistic
 (e.g. this nice recent work done at ETH: http://www.jsnice.org/).

 I think you underestimate the difficulty of statically recovering
 precise types from no-annotations without runtime witnesses ;) You
 don't want WebKit analysing the JS for 10 minutes until it renders the
 page. It is much more profitable to JIT these.

 
 
 
   You are right that the lack of static information about types is one of
   the main issues. Recovering the types typically has a huge performance
   cost, or is unreliable
 
  We're not really talking about a performance issue here, as static analysis
  is a separate activity that is unrelated to runtime performance.


 What I meant was: it is a performance issue for the static analyzer,
 not PHP itself.

 
 
   But seriously, time is getting wasted on this argument; it's actually a
   no-brainer: more static information helps tools that rely on static
   information. Yes. Absolutely. 100%.
 
  There's still disagreement between us on whether the different behavior of
  Strict STH constitutes additional static information or not, as it doesn't
  give you any extra information on the value being fed to the function, and
  it doesn't give you any extra information on what the function will receive.
  It only gives you information about how the function would behave if it gets
  a wrongly-typed value.



 1) for forward analyses (which are the most common for these
 applications): it gives you precious information from the beginning of
 the function and forward. You can consider it similarly to a cast: You
 don't necessarily know what the value coming in is, but you know which
 type you are having from that point forward.

 2) backward analyses could piggy-back the type constraints from the
 functions (strict or no strict) and check that they are met when
 constructing the value fed to the function.
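
A small sketch of the "similar to a cast" point in 1), assuming scalar parameter types as they eventually shipped in PHP 7 and the default (non-strict) mode. Whatever the caller passes, the body only ever sees an int, because the call either coerces the value or is rejected. The function name type_seen_inside() is hypothetical.

```php
<?php
// Hypothetical function: reports the type the body observes for its parameter.
function type_seen_inside(int $x): string
{
    return gettype($x);
}

echo type_seen_inside(5), "\n";   // integer
echo type_seen_inside("5"), "\n"; // integer: the numeric string was coerced

try {
    type_seen_inside("abc");      // not numeric, so the call is rejected
} catch (TypeError $e) {
    echo "rejected\n";
}
```

From the first line of the body onward the type is known in both modes; the modes differ only in which calls are accepted.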

 Having worked several years on static analysis tools for languages
 such as PHP, I can guarantee you that this information would help a
 lot. However, the other dynamic features of PHP would still make
 analyses slow/unreliable/imprecise. Let's not imagine that this is the
 only thing missing for PHP to be static-analysis-wonderland, far from
 it.

 
  But my bottom line is exactly the bottom line you ended with, and what I
  answered you on-list - how much weight should Static Analysis improvements
  have on our decision to introduce new language features?  My answer is not
  that much, if they have downsides.  Static Analyzers should be designed for
  languages and not vice versa.


 I fully agree in general that the flow should be this way. But it
 remains a bonus if a certain feature, as a plus, would help external
 tools. I believe it is worth mentioning.

 
  Thanks,
 
  Zeev


Re: [PHP-DEV] [RFC][DISCUSSION] Context Sensitive lexer

2015-02-22 Thread Marcio Almada

Hi, Stas

2015-02-22 19:20 GMT-03:00 Stanislav Malyshev smalys...@gmail.com:

 Hi!

 I like the idea. But we need to examine the cases carefully so we don't
 block some future routes - especially this is with regards to such
 things as type names which we wanted to reserve.

 I.e. method names resolution is probably clear, since they appear after
 -> or ::, but for class names the context may be much more varied.
 --
 Stas Malyshev
 smalys...@gmail.com


I agree. You and Nikita are right. Doing more than that with a purely lexical
approach, without migrating to another lexer generator (which was already
attempted before) or using some form of lexer feedback (which in its current
state breaks ext/tokenizer), would be inadequate and create future issues.
I'll probably work on a more ambitious and adequate solution for PHP
7.1~7.2.

For now, as said before, I'll revert the RFC and the proposed patch to
version 0.2, targeting only class|object member declaration and access. This
is perfectly achievable, has no drawbacks and brings many benefits. The RFC
will probably be ready for discussion again in ~2 days.

Thanks,
Márcio

Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Anthony Ferrara

Zeev,

 I think we are indeed getting somewhere, I hope.
 If I understand correctly, effectively the flow you're talking about in your
 example is this:

 1. The developer tries to run the program.
 2. It fails because the static analyzer detects float being fed to an int.
 3. The user changes the code to convert the input to int.
 4. You can now optimize the whole flow better, since you know for a fact
 it's an int.

 Did I describe that correctly?

Partially.

The static analysis and compilation would be pure AOT. So the errors
would be told to the user when they try to analyze the program, not
run it. Similar to HHVM's hh_client.

However, there could be a runtime compiler which compiles in PHP's
compile flow (leveraging opcache, etc). In that case, if the type
assertion isn't stable, the function wouldn't be compiled (the
external analyzer would error, here it just doesn't compile). Then the
code would be run under the Zend engine (and error when called).

 With strict typing at the foo() call site, it tells you that $input has to
 be an int
 or float (respectively between the snippets).

 I'm not following.
 Are you saying that because foo() expects an int or float respectively,
 $input has to be int or float?  What if $input is really a string?  Or a
 MySQL connection?

So think of it as a graph. When you start the type analysis, there's
one edge between $input and foo() with type mixed. Looking at foo's
argument, you can say that the type of that graph edge must be int.
Therefore it becomes an int. Then, when you look at $input, you see
that it can be other things, and therefore there are unstable states
which can error at runtime.

 Or are you saying that there was a strict type hint in the function that
 contains the call to foo(), so we know it's an int/float respectively?  If
 so, how would it be any different with a coercive type hint?

Not all data gets into a function from a parameter:

function bar() {
$x = $_POST['data'];
foo($x);
}

in that case, we know $x can only be a string or an array (unless we
find where that variable was written to in the program). So we know
for a fact that there's a type error, even though it wasn't a
parameter.
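
One way the flagged call could be repaired, as a sketch only: narrow the request value explicitly before it reaches foo(). This assumes PHP 7.1+ for the nullable return type. The helper to_int_or_null() is hypothetical, the body of foo() is taken from the earlier example in this thread, and bar() takes the request data as an argument here (instead of reading $_POST directly) purely so that the snippet can be run on its own.

```php
<?php
function foo(int $x): int
{
    return $x + 1;
}

// Hypothetical helper: narrow a request value (string, array or missing) to int or null.
function to_int_or_null($raw): ?int
{
    // FILTER_VALIDATE_INT yields an int on success and false otherwise,
    // including when $raw is an array.
    $value = filter_var($raw, FILTER_VALIDATE_INT);
    return $value === false ? null : $value;
}

function bar(array $post): ?int
{
    $x = to_int_or_null($post['data'] ?? null);
    // After this check $x can only be an int, so the call to foo() is type-stable.
    return $x === null ? null : foo($x);
}

var_dump(bar(['data' => '41']));  // int(42)
var_dump(bar(['data' => 'abc'])); // NULL
```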

Going deeper, we can look at other cases:

function x() {
if (time() % 360 > 0) {
return 123;
}
}

function bar() {
$x = x();
foo($x);
}

In this case, we know that x() has two possible types: int/null. That
doesn't satisfy the valid possibilities for foo (int), hence there's a
possible type error.
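
A sketch of how the int/null case could be made explicit and handled, assuming PHP 7.1+ nullable types. To keep the snippet deterministic, x() takes the timestamp as a parameter instead of calling time(); that parameter and the fallback value 0 in bar() are assumptions for illustration, not part of the original example.

```php
<?php
function foo(int $x): int
{
    return $x + 1;
}

// The nullable return type spells out the two outcomes: int or null.
function x(int $now): ?int
{
    if ($now % 360 > 0) {
        return 123;
    }
    return null;
}

function bar(int $now): int
{
    $x = x($now);
    if ($x === null) {
        return 0;    // hypothetical fallback for the null branch
    }
    return foo($x);  // here $x is provably an int
}

var_dump(bar(1));   // int(124)
var_dump(bar(360)); // int(0)
```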

The key difference is this: Forward analysis (typing $x by assignment)
can tell you valid modes for your program. Backward analysis
(determining $x's type by its usages) can tell you invalid modes for
your program. Combining them gives you more flexibility in
hard-to-infer/reconstruct situations.

Anthony


RE: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Zeev Suraski

 -Original Message-
 From: Anthony Ferrara [mailto:ircmax...@gmail.com]
 Sent: Monday, February 23, 2015 3:02 AM
 To: Zeev Suraski
 Cc: PHP internals
 Subject: Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints
 RFC)

 Zeev,

  I think we are indeed getting somewhere, I hope.
  If I understand correctly, effectively the flow you're talking about
  in your example is this:

  1. The developer tries to run the program.
  2. It fails because the static analyzer detects float being fed to an
  int.
  3. The user changes the code to convert the input to int.
  4. You can now optimize the whole flow better, since you know for a
  fact it's an int.

  Did I describe that correctly?

 Partially.

 The static analysis and compilation would be pure AOT. So the errors would
 be told to the user when they try to analyze the program, not run it.
 Similar to HHVM's hh_client.

How about that then:

1. The developer runs a static analyzer on the program.
2. It fails because the static analyzer detects float being fed to an int.
3. The user changes the code to convert the input to int.
4. You can now optimize the whole flow better, since you know for a fact
it's an int.

Is that an accurate flow?

 However, there could be a runtime compiler which compiles in PHP's
 compile flow (leveraging opcache, etc). In that case, if the type assertion
 isn't stable, the function wouldn't be compiled (the external analyzer would
 error, here it just doesn't compile). Then the code would be run under the
 Zend engine (and error when called).

Got you.  Is it fair to say that if we got to that case, it no longer
matters what type of type hints we have?

  With strict typing at the foo() call site, it tells you that $input
  has to be an int or float (respectively between the snippets).

  I'm not following.
  Are you saying that because foo() expects an int or float
  respectively, $input has to be int or float?  What if $input is really
  a string?  Or a MySQL connection?

 So think of it as a graph. When you start the type analysis, there's one edge
 between $input and foo() with type mixed. Looking at foo's argument, you
 can say that the type of that graph edge must be int.
 Therefore it becomes an int. Then, when you look at $input, you see that it
 can be other things, and therefore there are unstable states which can error
 at runtime.

So when you say it 'must be an int', what you mean is that you assume it
needs to be an int, and attempt to either prove that or refute that.  Is
that correct?
If you manage to prove it - you can generate optimal code.
If you manage to refute that - the static analyzer will emit an error.
If you can't determine - you defer to runtime.

Is that correct?
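
The three outcomes listed above can be sketched as a tiny decision function. This is only an illustration of the prove / refute / defer split as summarized in this message, not how Recki-CT or any real analyzer is implemented; classify_call() and its string results are hypothetical, and an empty array stands for "type unknown".

```php
<?php
// Hypothetical classifier. $possible is the set of types the analyzer inferred
// for the argument; $required is the type the callee declares.
function classify_call(array $possible, string $required): string
{
    if ($possible === [$required]) {
        return 'proved';   // only the required type can arrive: emit optimal code
    }
    if ($possible !== [] && !in_array($required, $possible, true)) {
        return 'refuted';  // the required type can never arrive: report an error
    }
    return 'deferred';     // undetermined: keep the check at runtime
}

echo classify_call(['int'], 'int'), "\n";             // proved
echo classify_call(['string', 'array'], 'int'), "\n"; // refuted
echo classify_call(['int', 'null'], 'int'), "\n";     // deferred
```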

For now only focusing on these two parts so that we can make some progress;
May come back to others later...

Thanks,

Zeev


Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Anthony Ferrara

Stas,

 It is still a performance advantage, because since we know the types
 are stable at compile time, we can generate far more optimized code
 (no variant types, native function calls, etc).

 I don't see where it comes from. So far you said that your compiler
 would reject some code. That doesn't generate any code, optimized or
 otherwise. For the code your compiler does not reject, still no
 advantage over dynamic model.

It rejects code because doing code generation on the dynamic case is
significantly harder and more resource intensive. Could that be built
in? Sure. But it's a very significant difference from generating the
static code.

And even if we generated native code for the dynamic code, it would
still need variants, and hence ZPP at runtime. Hence the static code
has a significant performance benefit in that we can indeed bypass
type checks as shown in the PECL example a few messages up (more than
a few).

 Actually, in this case, the int cast does tell us something. It says
 that the result (truncation) is explicitly wanted. Not to the compiler
 (tho that happens), but to the developer.

 No, it doesn't say that in this case. The developer didn't actually want
 truncation. They just wanted to call foo(). You forced them to use
 truncation because that's the only way to call foo() in your compiler.
 They said it's ok since truncation is over value that is int anyway, and
 they are true - except when it stops to be true in the future. That
 generates brittle code because it forces the developer to take risks
 they otherwise wouldn't take - such as use much stronger forced
 conversions instead of more appropriate dynamic ones.

Look at the RFC that Zeev proposed:
https://wiki.php.net/rfc/coercive_sth#user-land_additions

Passing a float to an integer parameter would result in a runtime
E_RECOVERABLE_ERROR if the float has dataloss.

So in the case I cited: foo($someint / 2), that will generate an
E_RECOVERABLE_ERROR in Zeev's proposal, as well as in the static
typing mode of mine.

Hence to say casts are needed is a bit over-stating this proposal...

 With coercive typing as proposed in Ze'ev's RFC, that would need to
 happen anyway. In both proposals that would generate a runtime error.

 No, it wouldn't need to happen since no-DL conversion is allowed.

Sure it would. 3/2 is 1.5. Which would fatal if I passed it to
foo(int) under Zeev's RFC. Because of data loss.

 The difference is, with strict types, we can detect the error ahead of
 time and warn about it.

 Static analyzer can warn about it regardless of type model. The only
 difference in strict model is that when compiling - not ahead of time,
 but in runtime - it would produce hard error even in case of even
 number, which can work just fine without it.

This very particular case, yes, because of the simplicity of the types
involved. But with strict typing you only need to look at 1 success
case, but with coercive typing you need to look at many more.

Also, in many (I'd argue most) cases coercive has to either issue a
warning (it doesn't know) or error on valid and functioning code.
Example:

function isdivisibleby2(string $foo): bool {
if (preg_match('(\D)', $foo)) {
return false;
}
return 0 == ($foo % 2);
}

function something2(string $foo): int {
if (!isdivisibleby2($foo)) {
return 10;
}
return foo($foo / 2);
}

This code would never raise a runtime error in Zeev's coercive
proposal. However, when looking at it statically, you can't tell
(unless you've got a regex decompiler).

So static analysis on dynamic types will either error on valid code,
or not error on invalid code (and I'm not even talking about the
halting problem here).

Whereas with strict typing, the error would appear in both cases
(static and runtime). And you could fix it.

 In this precise example there is none, because division is not type

 That's what I am saying - if the code runs, there's no difference. The
 only difference is that your model runs less code, and forces (or, rather,
 strongly incentivizes) people to write more dangerous code because some
 of the non-dangerous code is not allowed.

More dangerous?

 stable (it depends on the values of its arguments). Let's take a
 different example

 function foo(float $something): int {
 return $something + 0.5;
 }

 With coercive types, you can't tell ahead of time if that will error
 or not. With static types, you can.

 I'm not sure what this proves. Yes, of course there are cases where
 strict typing (please let's not confuse it with static typing - these
 are different things, static typing is when everything's type is known
 in advance and this is not happening in PHP, that's kind of the whole
 point) would disallow some code that dynamic typing allows. Nobody
 argues with that. What I am arguing with is that this difference is
 somehow useful - especially for JIT optimizations.

I've shown it a few times in this thread. So far nobody has said not
possible to the code

Re: [PHP-DEV] [RFC] [FINAL DISCUSSION] Script only include/require

2015-02-22 Thread Yasuo Ohgaki

Hi Stas,

On Mon, Feb 23, 2015 at 7:00 AM, Stanislav Malyshev smalys...@gmail.com
wrote:

  I think this will be the final discussion before vote.
  This RFC is to make PHP stronger against script inclusion attacks, just like
  other languages.
 
  https://wiki.php.net/rfc/script_only_include

 I still think this RFC takes a wrong road for the following reasons:

 1. Having any code in your app that allows to run include on
 user-controlled files (I'm not talking about filtered cases but user
 data controlling the path) is insecure and can not be made secure. It
 should just never be done. Trying to find workarounds for this is like
 safe_mode - good idea in theory, leads to worse security in practice.


This is a mitigation proposal against script inclusion. The difference is
made clear by statistics.

Because this is a mitigation, it does not aim to be a perfect solution. It
aims to make PHP as secure as other languages.

I think system admins will feel more comfortable with this change, too.
They know PHP programs are very weak against script inclusion attacks
compared to other languages.



 2. Default configuration would break tons of PHP scripts with extensions
 other than .php (very frequent case). The BC break potential of this is
 very big as it modifies core functionality.


Compatibility can be provided by a one-liner.

ini_set('zend.script_extensions', '.php .phar .inc .phtml .php4 .php5');

ini_set() does not emit any errors for non-existent INI settings.
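
A minimal sketch of that behaviour: ini_set() returns the old value on success and false on failure, and an unknown setting name simply yields false. The wrapper try_ini_set() and the setting name example.no_such_setting are hypothetical; zend.script_extensions is the INI entry proposed by this RFC and is an assumption here, not an existing setting.

```php
<?php
// Hypothetical helper: true when the INI entry exists and could be changed.
function try_ini_set(string $name, string $value): bool
{
    // ini_set() returns the old value on success and false on failure.
    return ini_set($name, $value) !== false;
}

var_dump(try_ini_set('precision', '14'));                 // bool(true)
var_dump(try_ini_set('example.no_such_setting', '.php')); // bool(false)
```

Code that wants to stay compatible with engines lacking the proposed setting can therefore call ini_set() unconditionally and ignore a false result.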


 3. Prohibiting phar uploads would also be a BC break, but more
 importantly, there still probably are ways to work around this by using
 phar files with extension different than .phar and then asking to
 include files within that phar file. As long as the eventual path would
 end in .php, your code would allow it.


Security is a matter of trade-offs, so I think this change is an acceptable
trade-off to disable script inclusion (executing attacker programs).

Users can move uploaded files safely without move_uploaded_file() now.
I just made use of it to provide another mitigation, since script-only
include cannot be a mitigation for uploading script files under the docroot.

 Also, the claim that move_uploaded_file() is obsolete is not based on
 anything as far as I can see. Why is it obsolete?


move_uploaded_file() was needed because of register_globals. With
register_globals, an attacker could specify source files (i.e. in $_FILES)
other than uploaded files.

The current move_uploaded_file() checks that the source filename really is an
uploaded file's filename. It prevents moving other files, so it's not
completely useless, but it offers no real protection now because the
values in $_FILES are safe now.
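
The check described above is also available on its own as is_uploaded_file(). A small sketch of how the two functions relate, where store_upload() is a hypothetical wrapper name and not anything proposed in the RFC:

```php
<?php
// Hypothetical wrapper: refuses any source path that PHP did not itself
// record as an HTTP POST upload for the current request.
function store_upload(string $tmpName, string $destination): bool
{
    if (!is_uploaded_file($tmpName)) {
        return false; // any path that is not a recorded upload is refused
    }
    return move_uploaded_file($tmpName, $destination);
}
```

Called with an ordinary file on disk (for instance the script itself), the wrapper returns false without moving anything.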

I know your point of view, but I hope you like this RFC.
Thank you for your comment. Your comments are very helpful to
come up with this RFC.

Regards,

--
Yasuo Ohgaki
yohg...@ohgaki.net

[PHP-DEV] Re: [RFC] Script only include/require

2015-02-22 Thread Yasuo Ohgaki

Hi Dmitry and Nikita,

On Mon, Feb 23, 2015 at 6:23 AM, Yasuo Ohgaki yohg...@ohgaki.net wrote:

 I wrote a patch and made adjustments to the RFC
 https://wiki.php.net/rfc/script_only_include
 https://github.com/php/php-src/pull/
 Where the filename extension is checked is subject to change.
 At first, I thought implementing this as PHP code is good, but
 I've changed my mind. It seems better to be done in Zend code.
 Opinions are appreciated.

 This RFC aims to make PHP as secure as other languages
 with respect to script inclusion attacks.
 Note: File inclusion is not in the scope of this RFC.

 INI Changes:
  - php_script - zend.script_extensions
  - Allow all files: * - NULL or 

 Open Issues:
  - Error type - Is it OK to raise E_ERROR/E_RECOVERABLE_ERROR in
zend_language_scanner.c?
  - Vote type - 50%+1 or 2/3

 If there is anyone who would like to vote no for this RFC,
 I would like to know the reason and try to address/resolve issue you have.

 Thank you.


We don't have to care much about which error is raised from the Zend engine, since
there will be engine exceptions.

My question is: is it OK to raise E_ERROR or E_RECOVERABLE_ERROR from
zend_language_scanner.c?

https://github.com/php/php-src/pull//files#diff-93ad74868f98ff7232ebea7c8b7fR624

Do engine exceptions catch errors from zend_error_noreturn()?

Thank you.

Regards,

--
Yasuo Ohgaki
yohg...@ohgaki.net

[PHP-DEV] add me

2015-02-22 Thread gopal sharma

add me

Gopal Sharma

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

[PHP-DEV] [VOTE] Remove PHP 4 Constructors

2015-02-22 Thread Levi Morrison

Dear Internals,

I have moved the RFC for removing PHP 4 constructors[1] into voting
phase. As there are a lot of RFCs in discussion and voting right now I
will leave this RFC in voting phase until the evening (UTC-7) of March
6th which is 12 days away; this will hopefully allow everyone to be
able to review this RFC and vote on it without being rushed.

Cheers,

Levi Morrison


  [1]: https://wiki.php.net/rfc/remove_php4_constructors


[PHP-DEV] Type hints ...

2015-02-22 Thread Lester Caine

Silly question time again ...

Currently I have an array of variables and the docblock annotation tells
me just what each element is intended to be. I process the variables on
that basis and while it may be helpful to have some higher level of
'restraint', I have a working flexible system. As a variable is
processed it is constrained by the appropriate rules. If PHP adds 'Type
Hints' they will only apply to where I am passing an array variable, and
the type hint adds additional processing to that which I already
maintain myself. How will that improve performance?

Add to this equation that the type and constraints of a variable may
well vary from one record set to another. It may well be that a fixed
set of types can be defined, but these are not the types currently being
defined and would include date types in parallel with a group of numeric
types.

Passing 'strict' types in some cases just does not compute in my book,
and even 'coercive' types only addresses a subset of the types needed so
that it adds another layer of 'checking' over what we already have in
much of the existing user code base. People keep going on about
different rule sets but this just adds another set of 'rules' rather
than a single solution.

-- 
Lester Caine - G8HFL
-
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk


Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Stanislav Malyshev

Hi!

 The difference though is the journey. The static analyzer can reason
 about far more code with strict types than it can without (due to the
 limited number of possibilities presented at each call). So this
 leaves the dilemma: compiled code that behaves slightly differently
 (what Recki does) or whether it always behaves the same.

Wait, so are you saying that advantage of having strict typing in PHP
core is that some analyzer - which does not share code with PHP core,
AFAIU - if it interpreted PHP types in strict manner and provided
warnings where types it can statically deduce do not match and the
authors of the code agreed with its suggestions and rewrote their code
so that the analyzer would not complain, would in some cases result in
code that might be JIT-optimized more efficiently?

That is not a claim about strict typing in PHP core having any benefit
at all. I'm not sure even this claim is true (as adding (int) doesn't
actually improve performance - it just shifts around the place where the
conversion is done, and once conversion is done, you can do the same
optimizations as before) - but even if there's some situation where it
is true, I don't see how it makes difference for PHP core (even in
situation of PHP core + JIT extension or non-Zend PHP runtime with
AOT/JIT).

-- 
Stas Malyshev
smalys...@gmail.com


Re: [PHP-DEV] Allow to use argument unpacking at any place in arguments list

2015-02-22 Thread Stanislav Malyshev

Hi!

 This makes it impossible to use this feature with some of the ext/std
 functions (array_udiff, array_intersect_ukey, etc.) and just feels a bit
 incomplete...

I see how it can be useful with crazy functions like array_udiff, but
these are in tiny minority. What I am concerned about is that besides
those functions - which are weird anyway - code like foo($a, ...$b, $c)
would be completely unmanageable as it would be impossible to know where
$c is actually going. I think the case for weird array functions is
pretty narrow and can be handled in ad-hoc manner without introducing
this construct.
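
One way the "ad-hoc" handling could look with the existing trailing-only unpacking syntax: build the full argument list first, then unpack it. The helper name call_with_trailing() is hypothetical and is only a sketch of the idea.

```php
<?php
// Hypothetical helper: calls $fn with the elements of $spread followed by
// any trailing arguments, without needing foo(...$spread, $last) syntax.
function call_with_trailing(callable $fn, array $spread, ...$trailing)
{
    return $fn(...array_merge(array_values($spread), $trailing));
}

// array_udiff() takes its comparison callback as the last argument.
$arrays = [[1, 5, 3], [3], [5]];
$diff = call_with_trailing('array_udiff', $arrays, function ($a, $b) {
    return $a - $b;
});
```

Here $diff holds the values of the first array that appear in none of the others, with their original keys preserved.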

 I'm not sure if this change requires an RFC because this is a pretty
 small, advancement of already existing feature that doesn't contain any

It's a new syntax (yes, looking a lot like an old one, but still new),
so I think it requires an RFC.
-- 
Stas Malyshev
smalys...@gmail.com


RE: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Zeev Suraski

 -Original Message-
 From: Jefferson Gonzalez [mailto:jgm...@gmail.com]
 Sent: Monday, February 23, 2015 3:58 AM
 To: Stanislav Malyshev; Anthony Ferrara
 Cc: Zeev Suraski; Jefferson Gonzalez; PHP internals
 Subject: Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints
 RFC)

 How could casting (int) be such a dangerous thing? Let's take for example
 this code:

 echo (int) "whats cooking!";
 echo intval("whats cooking");

 Both statements print 0, so how is casting unsafe???

One key premise behind both strict type hinting and coercive type hinting is
that conversions that lose data, or that 'invent' data, are typically
indicators of a bug in the code.

You're right that there's no risk of a segfault or buffer overflow from the
snippets you listed.  But there are fair chances that if you fed $x into
round() and it contains "whats cooking" (string), your code contains a bug.

Coercive typing allows 'sensible' conversions to take place, so that if you
pass "35.7" (string) to round() it will be accepted without a problem.
Strict typing will disallow any input that is not of the exact type that the
function expects, so in strict mode, round() will reject it.  The point that
was raised by Stas and others is that this is likely to push the user to
explicitly cast the string to float, which, from that point onwards, will happily
accept "whats cooking", keeping the likely bug undetected.
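
A rough sketch of the difference between an explicit cast and a coercive conversion. coerce_to_float() is a hypothetical stand-in built on is_numeric(); the exact acceptance rules in the Coercive STH RFC may differ, so treat this as an approximation only.

```php
<?php
// Hypothetical stand-in for a coercive float parameter: numeric strings are
// converted, anything else is rejected instead of silently becoming 0.0.
function coerce_to_float($value): float
{
    if (is_int($value) || is_float($value)) {
        return (float) $value;
    }
    if (is_string($value) && is_numeric($value)) {
        return (float) $value;
    }
    throw new InvalidArgumentException('Not a well-formed number');
}

$fromCast = round((float) "whats cooking");      // the cast hides the bad input
$fromCoercion = round(coerce_to_float("35.7"));  // a sensible conversion
```

Passing "whats cooking" to coerce_to_float() throws, whereas the explicit cast quietly turns it into 0.0 and the likely bug goes unnoticed.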

Zeev


Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Stanislav Malyshev

Hi!

 How is coercive much smarter? Basically what coercive would do is

It can accept 2.0 but not 2.5. Explicit cast is a sledgehammer - it
would convert both to 2.
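
To make the 2.0 versus 2.5 distinction concrete, here is a small sketch. coerce_float_to_int() is a hypothetical function that only models the "no data loss" rule described above; range checks for very large floats are left out.

```php
<?php
// Hypothetical stand-in for a coercive int parameter: floats are accepted
// only when no fractional part would be lost.
function coerce_float_to_int(float $value): int
{
    if (is_finite($value) && floor($value) === $value) {
        return (int) $value;
    }
    throw new InvalidArgumentException('Float has a fractional part');
}

$castWhole = (int) 2.0;       // explicit cast
$castFraction = (int) 2.5;    // explicit cast, fraction silently dropped
$coercedWhole = coerce_float_to_int(2.0);
```

The cast yields the same int for both inputs, while coerce_float_to_int(2.5) throws instead of truncating.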


 How could casting (int) be such a dangerous thing? Let's take for example
 this code:
 
 echo (int) "whats cooking!";
 echo intval("whats cooking");
 
 Both statements print 0, so how is casting unsafe???

Casting by itself is not dangerous. What is dangerous is using casting
to work around the type system - since in this case it could hide an error
(such as passing the string "whats cooking!" to a function requiring an integer).
Of course, you can say such errors are of no importance to you - in
which case you should never use typed parameters at all and you'll be
fine :) (mostly)
-- 
Stas Malyshev
smalys...@gmail.com


Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Jefferson Gonzalez


On 02/22/2015 10:06 PM, Zeev Suraski wrote:

One key premise behind both strict type hinting and coercive type hinting is
that conversions that lose data, or that 'invent' data, are typically
indicators of a bug in the code.

You're right that there's no risk of a segfault or buffer overflow from the
snippets you listed.  But there are fair chances that if you fed $x into
round() and it contains "whats cooking" (string), your code contains a bug.

Coercive typing allows 'sensible' conversions to take place, so that if you
pass "35.7" (string) to round() it will be accepted without a problem.
Strict typing will disallow any input that is not of the exact type that the
function expects, so in strict mode, round() will reject it.  The point that
was raised by Stas and others is that this is likely to push the user to
explicitly cast the string to float, which, from that point onwards, will happily
accept "whats cooking", keeping the likely bug undetected.


That's true, but I think most problems will arise when dealing
with user input, for example:


Good url
myurl.com/?id=10

Bad url
myurl.com/?id=somehing+else

So in the URL example neither coercive nor strict is safe. IMHO you as a
developer should analyze the input and decide what to do if the value
isn't of an expected type.
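
A minimal sketch of that kind of input analysis using filter_var() with FILTER_VALIDATE_INT. The helper name read_id() and the choice of returning null for bad input are assumptions made for illustration.

```php
<?php
// Hypothetical helper: returns the id as an int, or null when the query
// parameter is missing or is not a well-formed integer.
function read_id(array $query): ?int
{
    if (!isset($query['id'])) {
        return null;
    }
    $id = filter_var($query['id'], FILTER_VALIDATE_INT);
    return $id === false ? null : $id;
}
```

In a web request this would be called as read_id($_GET); the caller then decides what a null result means (an error page, a default, and so on) before any typed function sees the value.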


With strict, you as a developer decide whether casting is acceptable behavior,
for example when dealing with database output, which may return values as
strings, or when reading from config files, where you know the value is (int)
compatible, so the casting is safe. Besides, in the v0.5 STH RFC the
strict mode is optional.


I think both RFCs should be merged, dual mode coercive/strict :), but I 
guess that will not be possible until Anthony convinces the coercive 
camp how strict could be used to do better optimizations. Unless it 
happens the other way around and is proved with code/patches that same 
level of optimizations can be reached with coercive.


Anyway I just hope for scalar type hints, not just to improve code 
reliability, but also to gain some performance out of it. At the end I 
wish the best option is implemented since this is a really impacting 
feature for the future of the language.



Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Anthony Ferrara

Ze'ev,

On Sun, Feb 22, 2015 at 6:57 PM, Zeev Suraski z...@zend.com wrote:
 -Original Message-
 From: Anthony Ferrara [mailto:ircmax...@gmail.com]
 Sent: Monday, February 23, 2015 1:35 AM
 To: Zeev Suraski
 Cc: PHP internals
 Subject: Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints
 RFC)

 Zeev,

  And note that this can only work with strict types since you can do
  the necessary type inference and reconstruction (both forward from a
  function call, and backwards before it).

  Please do explain how strict type hints help you do inference that you
  couldn't do with dynamic type hints.  Ultimately, your whole argument
  hinges on that, but you mention it in parentheses almost as an
 afterthought.
  I claim the opposite - you cannot infer ANYTHING from Strict STH that
  you cannot infer from Coercive STH.  Consequently, everything you've
  shown, down to the C-optimized version of strict_foo() can be
  implemented in the exact same way for very_lax_foo().  Being able to
  optimize away the value containers is not unique to languages with
  strict type hints.  It's done in JavaScript JIT engines, and it was done
  in our
 JIT POC.

 I do here: http://news.php.net/php.internals/83504

 I'll re-state the specific part in this mail:

 <?php declare(strict_types=1);
 function foo(int $int): int {
 return $int + 1;
 }

 function bar(int $something): int {
 $x = $something / 2;
 return foo($x);
 }

 ^^ In that case, without strict types, you'd have to generate code for both
 integer and float paths. With strict types, this code is invalid.

 Ok, but how does that support your case?  That alludes to the functionality
 difference between strict STH and dynamic STH, and perhaps your Static
 Analysis argument.

 How does it help you generate better code?

Because strict types makes that an error case. So I can then tell the
user to fix it. Once they do (via cast, logic change, etc), I know the
types of every variable the entire way through. So I can generate
native code for both calls without using variants.
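
A runnable sketch of the error case and one possible fix, assuming a runtime that implements declare(strict_types=1) as proposed in the v0.5 STH RFC (this is how PHP 7 and later behave). bar_fixed() and the use of intdiv() are illustrative choices, not something taken from the thread.

```php
<?php
declare(strict_types=1);

function foo(int $int): int
{
    return $int + 1;
}

// As in the example above: "/" yields a float whenever the division is inexact.
function bar(int $something): int
{
    $x = $something / 2;
    return foo($x); // TypeError in strict mode when $x is a float
}

// One possible fix: intdiv() always returns an int, so $x has a single type.
function bar_fixed(int $something): int
{
    $x = intdiv($something, 2);
    return foo($x);
}
```

At runtime bar() only fails for odd inputs, which is exactly why a static analyzer has to treat $x as "int or float" and flag the call, while in bar_fixed() every variable has one known type.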

 Suggesting that the nature of a type hint can help you determine what's the
 value that's going to be passed to it is akin to saying that the size and
 shape of a door can tell you something about the person or beast that's
 standing on the other side.  It just can't.

"It just can't", yet it's done all the time. There is working code in
the wild that does exactly that.

It doesn't tell you what's on the other side (which you seem to be
suggesting), but gives you the possibilities **that won't cause
error**.

So then if you find a possibility from the other direction that isn't
in the set of stable possibilities, you can tell the user (because
that would be a runtime error). The division case in the example shows
that.

 Let me illustrate it in a less colorful way.

 Snippet 1:
   ... code that deals with $input ...
   foo($input);

   function foo(int $x)
   {
  ...
   }

 Snippet 2:
   ... code that deals with $input ...
   foo($input);

   function foo(float $x)
   {
  ...
   }

 Question:
 What can you learn from the signatures of foo() in snippet 1 and 2 about the
 type of $input?  Does the fact I changed the function signature from snippet
 1 to 2 somehow affect the type of $input?  In what way?

With strict typing at the foo() call site, it tells you that $input
has to be an int or float (respectively between the snippets).

And as the static analyzer traces back, if it finds possibilities that
don't match (for example, if you assigned it directly from $_POST),
it's able to say that either the original assignment or the function
call is an error.

So yes, it does affect the stable-state types that $input can have.
And if we detect an error, we can tell the dev ahead of time about it.
And hence they can make the appropriate fix.
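
A deliberately simplified toy model of that reasoning, in which a "type set" is just a list of type names. The function names division_types() and rejected_types() are hypothetical; a real analyzer such as the one discussed here works on a full control-flow graph rather than on two arrays.

```php
<?php
// Toy model: a "type set" is a list of type names a value may have.

// int / int may produce int or float in PHP; this sketch treats every
// other combination as float.
function division_types(array $left, array $right): array
{
    if ($left === ['int'] && $right === ['int']) {
        return ['int', 'float'];
    }
    return ['float'];
}

// Returns the inferred types that a strict parameter would reject.
function rejected_types(array $inferred, array $accepted): array
{
    return array_values(array_diff($inferred, $accepted));
}

$x = division_types(['int'], ['int']);    // type set of $something / 2
$problems = rejected_types($x, ['int']);  // foo(int $int) accepts only int
```

A non-empty $problems is what would be reported to the developer ahead of time; once it is empty, every variable on the path has a single known type.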

 If I understood you correctly, you're assuming that $input will also come in
 through a strict type hint, which would tell you that it's an int and
 therefore safe.  But a coercive type hint will do the exact same job.

No. I'm assuming that $input came from something that we can infer a
type set from. Which is basically anything in the language.

 You can tell because you know the function foo expects an integer. So you
 can infer that $x will have to have the type integer due to the future
 requirement. Which means the expression $something / 2 must also be an
 integer. We know that's not the case, so we can raise an error here.

 This is static analysis, not better code generation.  And it boils down to a
 functionality difference, not performance difference.

That static analysis enables better code generation. Which is
precisely what I said in an earlier post:
http://news.php.net/php.internals/83501 And I showed an example of the
better code generation.

I hope that makes my point a little clearer,

Anthony


RE: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Zeev Suraski

 -Original Message-
 From: Anthony Ferrara [mailto:ircmax...@gmail.com]
 Sent: Monday, February 23, 2015 2:25 AM
 To: Zeev Suraski
 Cc: PHP internals
 Subject: Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints
 RFC)

 Ze'ev,

It's Zeev, thanks :)

 Because strict types makes that an error case. So I can then tell the user to
 fix it. Once they do (via cast, logic change, etc), I know the types of every
 variable the entire way through. So I can generate native code for both calls
 without using variants.

I think we are indeed getting somewhere, I hope.
If I understand correctly, effectively the flow you're talking about in your
example is this:

1. The developer tries to run the program.
2. It fails because the static analyzer detects float being fed to an int.
3. The user changes the code to convert the input to int.
4. You can now optimize the whole flow better, since you know for a fact
it's an int.

Did I describe that correctly?

 With strict typing at the foo() call site, it tells you that $input has to be
 an int or float (respectively between the snippets).

I'm not following.
Are you saying that because foo() expects an int or float respectively,
$input has to be int or float?  What if $input is really a string?  Or a
MySQL connection?
Or are you saying that there was a strict type hint in the function that
contains the call to foo(), so we know it's an int/float respectively?  If
so, how would it be any different with a coercive type hint?

 I hope that makes my point a little clearer,

It actually does, I hope.  I think we are getting somewhere, but we're not
quite there yet.

Thanks,

Zeev


RE: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Zeev Suraski

 -Original Message-
 From: Anthony Ferrara [mailto:ircmax...@gmail.com]
 Sent: Monday, February 23, 2015 3:43 AM
 To: Stanislav Malyshev
 Cc: Zeev Suraski; Jefferson Gonzalez; PHP internals
 Subject: Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints
 RFC)

 Stas,

  It is still a performance advantage, because since we know the types
  are stable at compile time, we can generate far more optimized code
  (no variant types, native function calls, etc).

  I don't see where it comes from. So far you said that your compiler
  would reject some code. That doesn't generate any code, optimized or
  otherwise. For the code your compiler does not reject, still no
  advantage over dynamic model.

 It rejects code because doing code generation on the dynamic case is
 significantly harder and more resource intensive. Could that be built in? Sure.
 But it's a very significant difference from generating the static code.

I hope I demonstrated in the other email, which shows how two use cases would
look with coercive type hints, that the strategies and implementation of the
optimizations for those cases where we can infer the type at compile
time would be similar in cost, complexity and resource consumption to the
optimizations you're talking about.  Even if we keep the handling of all
other types as AOTless/JITless, it would still have performance equivalent
to the strict case, given inputs that the strict case accepts.

Either way, I'm happy we all agree that equally-efficient code can be
generated for the dynamic case, which is the point I was making in the
Coercive typing RFC.  We still have the gap on whether it's truly a lot
harder and resource intensive - I don't think it is as we can do the very
same things in compile-time - but that's a smaller gap that I personally
care less about.  I wanted it to be clear to everyone that we can reach the
same level of optimizations for Coercive type hints as we can for Strict.

 And even if we generated native code for the dynamic code, it would still
 need variants, and hence ZPP at runtime. Hence the static code has a
 significant performance benefit in that we can indeed bypass type checks as
 shown in the PECL example a few messages up (more than a few).

We can only eliminate the ZPP structure during compile time if we know with
certainty what the type is.  If we do, we know that for both strict type
hints and coercive type hints (i.e. we either managed to prove it's an int
in the static analyzer in the strict case, or we managed to deduce what the
type is in the coercive case).  If we don't - we keep the ZPP structure in
exactly the same way.

Thanks,

Zeev


Re: [PHP-DEV] JIT (was RE: [PHP-DEV] Coercive Scalar Type Hints RFC)

2015-02-22 Thread Anthony Ferrara

Zeev,

 So the code after the fix would look like this:
 <?php declare(strict_types=1);
 function foo(int $int): int {
 return $int + 1;
 }

 function bar(int $something): int {
 $x = (int) ($something / 2);  // (int) or whatever else makes it clear it's an int
 return foo($x);
 }
 ?>

 Let me explain how this could play out with coercive type hints:
 <?php
 function foo(int $int): int {
 return $int + 1;
 }

 function bar(int $something): int {
 $x = $something / 2;
 return foo($x);
 }

 We can all agree that determining the types of just about anything here is
 ultra-easy, so easy you could do it with a static analyzer, as you
 suggested.  $int and $something are integers, while $x is either an integer
 or a float.  We also know that both foo() and bar() expect integers.

 What's the optimal code we could generate here?
 First, on the function body of foo(), we can clearly and easily translate
 the whole into machine code, as we know we'll get a long and need to return
 a long.
 Moving to the caller scope in bar(), given we know $x is either a float or
 an integer, we could either generate code that calls coerce_to_int($x), or
 even some optimized machine code that checks zval.type and either uses the
 lval or converts dval.  This can be done in AOT, no need to wait for
 runtime.  Once we know for a fact we have an integer in our hands - we can
 make the call directly to the optimized foo(), a C level call without the
 overhead of a PHP function call.

Well, yes and no.

In this simple example, you could generate the division as float
division, then check the mantissa to determine if it's an int.

long bar(long something) {
    /* 2.0 forces floating-point division; something / 2 would truncate */
    double x = something / 2.0;
    if (x != (double)(long)x) {
        raise_error();
    }
    return foo((long) x);
}

You're still doubling the number of CPU ops and adding at least one
branch at runtime, but not a massive difference.

However in general you'd have to use something like div_function and
use a union type of some sort. You mention this (about checking
zval.type at runtime). My goal would be to avoid using unions at all
(and hence no zval). Because that drastically simplifies both compiler
and code generator design.

Especially for a JIT compiler (local, not tracing) simplified design
generally translates to significantly faster runtime. Compare LLVM to
libjit: 50x difference in compile time.

 If you look at the generated code, it's going to be remarkably similar
 between the two cases.  If the developer chooses to pick the casting route,
 it will look almost identical - except it will be convert_to_long() that is
 called instead of coerce_to_int(), the former being more aggressive than the
 latter.

I wouldn't even bother with that, I'd just use a C cast (well, the ASM
equivalent). Saves function calls, zval representation, etc.

 Can you see anything impossible or otherwise wrong with my description of
 how the AOT compiler would work in this case, with coercive type hints?  If
 not, there are no performance benefits for the Strict typed version after
 the user alters his code to behave similarly to what coercive type hints
 would bring.

It's very much not about impossible. It's about complexity. Strict
code is easier to reason about, it's easier to analyze and it's easier
to code-generate because of the reduced number of cases that you need to
support. And we're not talking about making users change their code
drastically. We're talking about -in many cases- minor tweaks.

Minor tweaks that would need to be done with your proposal as well. So
if we're going to require users to change their code, why not make it
opt-in and give them the predictability that we can?

  Got you.  Is it fair to say that if we got to that case, it no longer
  matters what type of type hints we have?

 Once you get to the end, no. Recki-CT proves that.

 Do you mean that the statement is unfair or that it no longer matters?   If
 it's the former, can you elaborate as to why?

No, I meant that Recki proves what you said (once you get to a stable
type analysis of even untyped code, it doesn't matter whether the hints exist
or not).

 
  So when you say it 'must be an int', what you mean is that you assume
  it needs to be an int, and attempt to either prove that or refute
  that.  Is that correct?
  If you manage to prove it - you can generate optimal code.
  If you manage to refute that - the static analyzer will emit an error.
  If you can't determine - you defer to runtime.
 
  Is that correct?

 Basically yes.

 Let me describe here too how it may look with coercive hints.  Instead of
 beginning with the assertion that it must be an int, we make no guess as to
 what it may be(*).  We would use the very same methods you would use to
 prove or refute that it's an int, to determine whether it's an int.  Our
 ability to deduce that it's an int is going to be identical to your ability
 to prove that it's an int.  If we see that it comes from an int type hint,
 from an int typed
