Hi folks,

Thanks to Ilia for getting the ball rolling on scalar type hinting.

It seems there are 3 camps:
  - (C) the type checking camp: "when I say 'int' I mean 'int'". This
is what Ilia's patch does.
  - (H) the type hinting crowd: 'int' is a hint to the user that an
int is expected. This gels well with PHP's weakly typed scalars. I
think few people are in this crowd, but a lot of the (S) crowd are
mistakenly thought to be.
  - (S) the "sensible" middle: 'int' means an integer of course. The
manual is written somewhere between (S) and (H).

I believe I have a solution that caters to each crowd, without being
too complicated.

There are advantages and disadvantages to all of these:

 - The main disadvantage of each system is that it doesn't provide what
the other systems allow. Strong is too strong for many. Weak is too
weak for most.
 - Ilia had a very good point against (H), which is that many
functions return NULL or FALSE, and there are lots of errors when
these are automatically (and silently) converted to 0 or "". (H) will
not catch anything.
 - A strong argument against (C) is that it has no parallel with how
scalars are currently handled in PHP.
 - A (I think weak) argument for (C) is that this is how object type
hinting works.
 - An argument for (H)/(S) is that the manual has been written in this
style, using this syntax.
 - A good argument against (C) is that it cannot be used to hint PHP's
builtin functions.
 - The (C) crowd suggested numeric and scalar to the (H) crowd, but I
don't think they were impressed.
 - I don't think there is a strong case for a strongly typed bool.


Here is the solution:

By default, use (S). The semantics of (S) are roughly provided in a
table at the bottom. The idea is that for ints, we accept "5" and 5,
and fail on "str", FALSE, resource, etc.

Allow a very easy way to get (C) and (H) using '+' and '-'. "+int"
means fail on anything but an int. This is (C). "-int" means "I expect
an int, but I'll take whatever you give me, and cast it to an int".
This is (H). (H) is for those times where neither (C) nor (S) is
suitable, which occurs in the standard library a lot. I hope that it
wouldn't be used much otherwise.

In each case, the function author can expect that if they ask
for X, they will get an X.
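
To make the three modes concrete for 'int', here is a rough userland
sketch. The function name check_int_param and the 'C'/'S'/'H' mode
letters are made up for illustration; nothing here is part of PHP or of
any patch, and FAIL is modelled as an exception rather than an engine
error. The (S) branch follows the "-> int" rows of the table at the
bottom.

```php
<?php
// Hypothetical helper: models what a hinted int parameter might receive
// under each mode. Illustration only, not an engine implementation.
function check_int_param($value, $mode = 'S')
{
    switch ($mode) {
        case 'C': // "+int": fail on anything but an int
            if (is_int($value)) {
                return $value;
            }
            break;

        case 'H': // "-int": take whatever is given, cast with existing rules
            return (int) $value;

        case 'S': // plain "int": ints, reals and numeric strings are accepted
            if (is_int($value)) {
                return $value;
            }
            if (is_float($value) || (is_string($value) && is_numeric($value))) {
                return (int) $value;
            }
            break;
    }

    // Stand-in for FAIL in the table.
    throw new InvalidArgumentException(
        'FAIL: int expected, got ' . gettype($value)
    );
}
```

With this sketch, check_int_param("5") gives 5 under (S) but throws
under (C), while check_int_param(false, 'H') quietly gives 0, which is
exactly the case Ilia objected to.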

I think numeric isn't required anymore, which is good.

Example:

function add_user (+string $name, string $phone_number, int $age, +int
$friend_count, resource $photo) { ... }




We may bikeshed for a while about the choice of +/- vs "strict int"
or "weak int", as well as some of the choices in (S). Let's argue about
the overall idea first, and get to specifics later.

If people like this, I can work on the patch.


Thanks,
Paul


***** This is a suggested semantics for (S) ********

Each line is in the form: "Run-time type -> type hint = result". You
may read "x -> y = z" as "an x passed to a hinted parameter y gives a
z". * means all types I didn't mention explicitly. ?? means reasonable
people may disagree. I would lean towards FAIL in these cases.


array -> array = array
* -> array = FAIL

numeric string -> int = cast to int
real -> int = cast to int
int -> int = int
* -> int = FAIL

int -> numeric = int
real -> numeric = real
string -> numeric = real/int
bool -> numeric = ??
* -> numeric = FAIL

int -> bool = bool
bool -> bool = bool
null -> bool = false
real -> bool = bool
string -> bool = bool
* -> bool = ??


null -> null = null
* -> null = FAIL


array -> scalar = FAIL
int -> scalar = int
bool -> scalar = bool
null -> scalar = null
real -> scalar = real
string -> scalar = string
resource -> scalar = FAIL
object -> scalar = FAIL
MyObj -> scalar = FAIL

* -> mixed = *

int -> real = real
real -> real = real
numeric string -> real = real
* -> real = FAIL

array -> string = FAIL
int -> string = string
bool -> string = FAIL
null -> string = FAIL
real -> string = string
string -> string = string
resource -> string = FAIL
object -> string = __toString() or FAIL


resource -> resource = resource
* -> resource = FAIL

object -> object = object
MyObj -> object = MyObj
* -> object = FAIL

MyObj -> MyObj = MyObj
* -> MyObj = FAIL
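
As an illustration of how one block of the table reads, the
"-> string" rows could be modelled in userland roughly as follows. The
function name check_string_param_s is hypothetical, and FAIL is again
represented by an exception rather than an engine error.

```php
<?php
// Hypothetical helper modelling the "* -> string" rows of the (S) table.
function check_string_param_s($value)
{
    // string -> string = string
    if (is_string($value)) {
        return $value;
    }
    // int -> string = string, real -> string = string
    if (is_int($value) || is_float($value)) {
        return (string) $value;
    }
    // object -> string = __toString() or FAIL
    if (is_object($value) && method_exists($value, '__toString')) {
        return (string) $value;
    }
    // array, bool, null, resource and other objects: FAIL
    throw new InvalidArgumentException(
        'FAIL: string expected, got ' . gettype($value)
    );
}
```

Note that bool and null fail here even though PHP's existing casts
would turn them into "1" or "", which is the difference between (S)
and (H).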



***** This is a suggested semantics for (H) ********

Whatever is passed will be cast to whatever you ask for, using
existing casting rules, even if that's stupid.


***** This is a suggested semantics for (C) ********

If you ask for X, it must be X, except:
object with __toString() -> string = string

Anything else is FAIL (which I believe is an E_RECOVERABLE_ERROR).



-- 
Paul Biggar
paul.big...@gmail.com

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php