Re: [PHP-DEV] [RFC] UString

2015-07-01 Thread Sara Golemon
On Tue, Jun 30, 2015 at 10:36 PM, Joe Watkins pthre...@pthreads.org wrote:
 Another possible issue is engine integration:

 $string = (UString) $someString;
 $string = (UString) someString;

That sounds like a cool idea to discuss as a completely separate,
unrelated RFC, and not specific to UString.

e.g.   $obj = (ClassName)$arg;   /* turns into */ $obj = new ClassName($arg);

So you could use casting with any class which supports single-argument
constructors.

But again, orthogonal to this RFC.

-Sara

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] Improved zend_string API

2015-07-01 Thread Bob Weinand

 On 30.06.2015 at 20:16, François Laupretre franc...@php.net wrote:
 
 From: morrison.l...@gmail.com [mailto:morrison.l...@gmail.com] On behalf of
 
 Just to chime in with my $0.02: I feel that using macros as an
 abstraction in this case is bad practice. I believe that in *most*
 cases using macros as an abstraction is bad practice. Furthermore, there
 isn't any reason that `zend_string_*` functions cannot act as an
 abstraction layer, since zend_strings are passed by pointer.
 
 Agreed. That's why ZSTR_VAL() and ZSTR_LEN() are functions now. Macros don't 
 provide enough isolation. The choice of renaming 'zend_string_' to 'ZSTR_' is 
 just a question of name consistency. The most important point is that these 
 are functions.
 
 zend_strings are passed by pointer and, in theory, this type should be opaque, 
 (void *) for instance.
 
 @Bob - I remember an idea I had, that I should discuss with Dmitry, and which 
 can be implemented without any change to the proposed API. The idea is to 
 return the address of the string instead of the address of the struct. This 
 would allow using this address for the zend_string API and for any other 
 function expecting a plain (char *) address. Z_STRVAL() and '->val' would 
 become useless, of course. The other struct elements would just be below the 
 used address (as it is done in malloc()). In this case, the calling code must 
 consider the address as either an opaque value that can be passed to the 
 zend_string API, or the address of an allocated memory buffer that can be 
 read and written, up to the declared size. This is possible only through an 
 encapsulated API. Compare it with malloc(), as both would use a similar 
 mechanism. When you call malloc() on a system, you don't care about the 
 underlying allocated structure, and it may be very different on different 
 systems. This is the same, a malloc() with a pair of additional features. You 
 don't have to know more about the implementation.

Interesting idea (conceptually). I could even get on board with this; I'm 
just not that sure about the performance impact (alignment of refcount to base 
address etc.).

Also, you have to make sure not to accidentally pass a real char * array to 
something that expects a char * from a zend_string *… which is where I'm not so 
sure if we can do it. It may lose some type safety :-/
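To make the layout under discussion concrete, here is a minimal sketch of a string whose public address points at the characters while the bookkeeping sits just below it, malloc()-style. Everything here (hstr, hstr_header, hstr_init, hstr_len, hstr_release) is a hypothetical name invented for illustration; it is not the zend_string API and not the proposed patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header stored immediately below the character data. */
typedef struct {
    uint32_t refcount;
    uint64_t hash;      /* 0 means "not computed yet" */
    size_t   len;
} hstr_header;

/* Callers only ever see this pointer; it addresses the first character. */
typedef char hstr;

/* Step back from the character data to the hidden header. */
static hstr_header *hstr_head(const hstr *s)
{
    return (hstr_header *)(void *)((char *)s - sizeof(hstr_header));
}

/* Allocate header and characters in one block, return the character address. */
hstr *hstr_init(const char *src, size_t len)
{
    hstr_header *h = malloc(sizeof(hstr_header) + len + 1);
    if (h == NULL) {
        return NULL;
    }
    h->refcount = 1;
    h->hash = 0;
    h->len = len;

    char *val = (char *)(h + 1);
    memcpy(val, src, len);
    val[len] = '\0';
    return val;
}

/* Length is read through a fixed negative offset from the string address. */
size_t hstr_len(const hstr *s)
{
    return hstr_head(s)->len;
}

/* Only release needs the real base address of the allocation. */
void hstr_release(hstr *s)
{
    hstr_header *h = hstr_head(s);
    assert(h->refcount > 0);
    if (--h->refcount == 0) {
        free(h);
    }
}
```

The returned pointer can be handed to strlen() or printf() directly, which is the attraction. The typing concern is visible too: nothing stops a caller from passing an ordinary char buffer to hstr_len(), which would then read whatever bytes happen to sit below it.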

 Working on the allocation scheme just requires storing a new 'allocated 
 size' element, which does not cost much and can avoid a lot of costly 
 [e]realloc() calls. Then new functions may be defined, if needed, to control 
 allocation policy. All of this doesn't require changing the existing API, 
 these are just additions. I really don't understand why you are so sure that 
 any change to the internal representation will require changes to the API. 
 New functions can be added, yes, but we can improve a lot of things while 
 keeping the same API.
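As a rough illustration of the 'allocated size' idea, the sketch below keeps a capacity next to the length and grows geometrically, so most appends do not touch the allocator. The names (cstr, cstr_append, cstr_free) and the doubling policy are assumptions made for the example, not part of any proposal in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    size_t len;   /* bytes in use, excluding the trailing NUL */
    size_t cap;   /* bytes allocated for val, including the NUL */
    char  *val;
} cstr;

/* Counts how often the buffer really had to be reallocated. */
static unsigned cstr_realloc_count = 0;

/* Append n bytes; reallocates only when the stored capacity is exceeded.
 * Size overflow is not handled in this sketch. */
int cstr_append(cstr *s, const char *src, size_t n)
{
    assert(s != NULL);
    size_t needed = s->len + n + 1;

    if (needed > s->cap) {
        size_t new_cap = s->cap ? s->cap : 16;
        while (new_cap < needed) {
            new_cap *= 2;               /* hypothetical growth policy */
        }
        char *p = realloc(s->val, new_cap);
        if (p == NULL) {
            return -1;
        }
        s->val = p;
        s->cap = new_cap;
        cstr_realloc_count++;
    }

    memcpy(s->val + s->len, src, n);
    s->len += n;
    s->val[s->len] = '\0';
    return 0;
}

void cstr_free(cstr *s)
{
    free(s->val);
    s->val = NULL;
    s->len = 0;
    s->cap = 0;
}
```

Starting from a zero-initialized cstr, appending two bytes a hundred times needs only a handful of realloc() calls instead of one per append. The cost is one extra size_t per string, and whether that trade is worth it for short strings is a separate question.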

It's just my experience… I might be wrong, but that's what I experienced 
generally for such low-level structures.

Also wrappers around such low-level structures often tend to be leaky or be 
just such a shitload of individual functions that nobody ever can remember it 
before having worked with it all day for a few months.
(I mean leaky in a sense that you maybe are able to do everything, but not 
quite in the performant way we like to)

 You base all your examples on the fact that zend_string represents a 
 structure. I don't assume anything at this level. Maybe we'll find that 
 performance is better with a 2-level storage, storing fixed-length 
 information in a pre-allocated array, for example, and storing the strings 
 elsewhere. I leave it as open as possible, while you prefer constraints just 
 because you cannot imagine today how it can evolve tomorrow.

That one would not even be an issue with current zend_string *. zstr->val is a 
char *. Whether the char pointer is now zend_string * + 24 or elsewhere doesn't 
matter and user code also shouldn't rely on *that*.

 About hash values, nobody said we should automatically reset the hash value 
 any time something is written. And you're wrong: we don't end up controlling 
 the hash value manually. We control it through two well-defined methods. This 
 is not low-level control, not the same as using 'zstr->h', for instance. It 
 is part of the API, nothing shocking there. And nothing says that someone 
 won't find some way to make hash management 'smarter', without doing millions 
 of useless operations. There may be new operations but, once again, the 
 existing ones will remain unchanged.
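The message does not name the two methods, so the following is only a guess at their shape: one call that returns the hash and computes it on first use, and one that discards the cached value after the buffer was written. All names (cached_str, cs_hash, cs_forget_hash) are hypothetical, and the DJB-style function is a stand-in for whatever hash the engine uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    unsigned long h;        /* cached hash, 0 = not computed */
    size_t        len;
    char          val[32];
} cached_str;

/* Makes the number of real hash computations observable. */
static unsigned hash_computations = 0;

/* Stand-in hash function; never returns 0 so 0 can mean "no cache". */
static unsigned long compute_hash(const char *p, size_t n)
{
    unsigned long h = 5381;
    hash_computations++;
    while (n--) {
        h = h * 33 + (unsigned char)*p++;
    }
    return h ? h : 1;
}

/* Store a short string and start with an empty hash cache. */
void cs_set(cached_str *s, const char *src)
{
    size_t n = strlen(src);
    assert(n < sizeof(s->val));
    memcpy(s->val, src, n + 1);
    s->len = n;
    s->h = 0;
}

/* First method: return the hash, computing it lazily. */
unsigned long cs_hash(cached_str *s)
{
    if (s->h == 0) {
        s->h = compute_hash(s->val, s->len);
    }
    return s->h;
}

/* Second method: the caller declares that the contents changed. */
void cs_forget_hash(cached_str *s)
{
    s->h = 0;
}
```

With this shape the caller never touches the hash field directly, yet writes do not pay for an automatic reset; forgetting the hash stays an explicit, documented step.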
 
 As a conclusion, the zend_string API I propose provides some isolation, but 
 you will be glad to know that it is not as advanced as I'd like, mostly for 
 historical reasons. As an example, I am sure we will be annoyed by the 
 'persistent' argument to init/alloc/realloc. For init and alloc, it would be 
 better to have a flag mask, allowing other flags to be defined in the future. 
 We'll need this, 

Re: [PHP-DEV] [RFC] UString

2015-07-01 Thread Aaron Piotrowski

 On Jul 1, 2015, at 1:06 PM, Sara Golemon poll...@php.net wrote:
 
 On Tue, Jun 30, 2015 at 10:36 PM, Joe Watkins pthre...@pthreads.org wrote:
 Another possible issue is engine integration:
 
$string = (UString) $someString;
$string = (UString) someString;
 
 That sounds as a cool idea to discuss as a completely separate,
 unrelated RFC, and not specific to UString.
 
 e.g.   $obj = (ClassName)$arg;   /* turns into */ $obj = new ClassName($arg);
 
 So you could use casting with any class which supports single-argument
 constructors.
 
 But again, orthogonal to this RFC.
 
 -Sara
 
 

Expanding on this idea, a separate RFC could propose a magic __cast($value) 
static method that would be called for code like below:

$obj = (ClassName) $scalarOrObject; // Invokes ClassName::__cast($scalarOrObject);

This would allow UString to implement casting a string to a UString and allow 
users to implement such behavior with their own classes.

However, I would not implement such casting syntax for UString only. Being able 
to write $ustring = (UString) $string; without the ability to do so for other 
classes would be unusual and confusing in my opinion. If an RFC adding such 
behavior was implemented, UString could be updated to support casting.

Obviously a UString should be castable to a scalar string using (string) 
$ustring. If performance is a concern, UString::__toString() should cache the 
result so multiple casts of the same object are quick.

Aaron Piotrowski



RE: [PHP-DEV] [RFC] UString

2015-07-01 Thread Anatol Belski
Hi,

 -----Original Message-----
 From: Aaron Piotrowski [mailto:aa...@icicle.io]
 Sent: Wednesday, July 1, 2015 9:00 PM
 To: Sara Golemon
 Cc: pthre...@pthreads.org; internals@lists.php.net
 Subject: Re: [PHP-DEV] [RFC] UString
 
 
 Expanding on this idea, a separate RFC could propose a magic __cast($value)
 static method that would be called for code like below:
 
 $obj = (ClassName) $scalarOrObject; // Invokes ClassName::__cast($scalarOrObject);
 
 This would allow UString to implement casting a string to a UString and allow
 users to implement such behavior with their own classes.
 
 However, I would not implement such casting syntax for UString only. Being able
 to write $ustring = (UString) $string; without the ability to do so for other
 classes would be unusual and confusing in my opinion. If an RFC adding such
 behavior was implemented, UString could be updated to support casting.
 
 Obviously a UString should be able to be cast to a scalar string using (string)
 $ustring. If performance is a concern, UString::__toString() should cache the
 result so multiple casts to the same object are quick.
 
 
One way of doing this is already there, thanks to
https://wiki.php.net/rfc/operator_overloading_gmp . Consider

$n = gmp_init(42); var_dump($n, (int)$n);

However, the other way round could be done on a case-by-case basis, IMHO.
While it could make sense for class vs. scalar, casting class to class is
quite an unpredictable thing.

While users could implement it, how is it handled with arbitrary objects?
How would it map properties, would those classes need to implement the same
interface, et cetera? We're not in C at this point, where we would just
force a block of memory to be interpreted as we want.

Regards

Anatol






RE: [PHP-DEV] Improved zend_string API

2015-07-01 Thread François Laupretre
 From: Bob Weinand [mailto:bobw...@hotmail.com]

 Interesting idea (conceptually). I could even get on board with this; I'm
 just not that sure about the performance impact (alignment of refcount to
 base address etc.).

zend_string *s;

s = zend_string_alloc(256, 0);
...
ZSTR_SET_LEN(s, snprintf(s, ZSTR_LEN(s)+1, format, args...)); /* Overflow-protected */
...
hash = ZSTR_HASH(s);
...
char *p = estrndup(s, ZSTR_LEN(s)); /* Result is a pure 'char *', not a zend_string */
...
ZSTR_RELEASE(s);

Isn't it nice?

You're right, refcount alignment is the serious issue to solve. Performance 
needs to be tested of course, but we compute '->val' much more often than we 
use the struct base address (for realloc/free only, except when accessing gc). 
Accessing length and hash using an offset has the same cost as before.

 Also, you have to make sure not to accidentally pass a real char * array to
 something that expects a char * from a zend_string *… which is where I'm
 not so sure if we can do it. It may lose some type safety :-/

The zend_string type would remain, even if typedef-ed to 'char'. So, 
'zend_string *' declarations would remain, they wouldn't become 'char *' (just 
a question of declarations because it would work too). In debug mode, we also 
can add a marker in the structure to detect when we receive an invalid address, 
as it is sometimes done in memory management libs.
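A debug marker of that kind could look roughly like the sketch below. The magic value, the struct layout and the names (mstr, mstr_is_valid, and so on) are all hypothetical; a real build would presumably compile the check out when debugging is off.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MSTR_MAGIC 0x5A535452u   /* arbitrary marker, "ZSTR" in ASCII */

/* Hypothetical hidden header, placed just below the characters. */
typedef struct {
    uint32_t magic;
    uint32_t refcount;
    size_t   len;
} mstr_header;

typedef char mstr;   /* still a distinct name in declarations */

mstr *mstr_init(const char *src, size_t len)
{
    mstr_header *h = malloc(sizeof(mstr_header) + len + 1);
    if (h == NULL) {
        return NULL;
    }
    h->magic = MSTR_MAGIC;
    h->refcount = 1;
    h->len = len;

    char *val = (char *)(h + 1);
    memcpy(val, src, len);
    val[len] = '\0';
    return val;
}

/* Debug check: does the memory below s carry our marker?
 * memcpy avoids an unaligned read if s is not one of our strings. */
int mstr_is_valid(const mstr *s)
{
    uint32_t magic;
    memcpy(&magic,
           s - sizeof(mstr_header) + offsetof(mstr_header, magic),
           sizeof(magic));
    return magic == MSTR_MAGIC;
}

size_t mstr_len(const mstr *s)
{
    assert(mstr_is_valid(s));
    const mstr_header *h =
        (const mstr_header *)(const void *)(s - sizeof(mstr_header));
    return h->len;
}

void mstr_free(mstr *s)
{
    assert(mstr_is_valid(s));
    mstr_header *h = (mstr_header *)(void *)(s - sizeof(mstr_header));
    h->magic = 0;    /* poison the marker so a later reuse is caught too */
    free(h);
}
```

This is a debugging aid, not a guarantee: the check itself reads bytes below the pointer, so a foreign pointer near the start of a mapping can still fault, and random data can match the marker by chance.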

 Also wrappers around such low-level structures often tend to be leaky or be
 just such a shitload of individual functions that nobody ever can remember it
 before having worked with it all day for a few months.
 (I mean leaky in a sense that you maybe are able to do everything, but not
 quite in the performant way we like to)

I agree, that's often the case but, even when performance imposes low-level 
control, an appropriate abstraction layer often allows for a cleaner future. 
What I fear most is not this, it's having an idea that is impossible to 
implement because the people designing the API did not think broadly enough at 
the time. PHP's history is full of such changes, which required a lot of time 
and energy just because everyone had been working for years with too 
short-sighted APIs. The art of APIs is to find the best compromise between 
usability, performance, and extensibility, knowing there's no ideal solution. 
And one additional difficulty is that many people think it is easy!

 Meh, that persistent/non-persistent is annoying me a bit too sometimes.
 I don't disagree that we should change that. We could e.g. use GC_TYPE()
 and add our custom flag here. But maybe it's there to be able to tell compiler
 what branch in perealloc() will be taken, so that that branch can be compiled
 out.
 Abstraction APIs are a powerful tool, but they come at a price…

Actually, the information is already stored in the struct, there's nothing to 
add. The zend_string_release/free() functions even use it to determine which 
kind of memory they are freeing. It is not even consistent, because only 
realloc/extend/truncate use this 'useless' arg. Actually, the only reason given 
is to optimize compilation. I know that everything has a price, but I think 
that's going too far on the performance side. An argument with a single allowed 
value is not an argument, IMO. Remember, if the wrong value is given, your 
program crashes! Anyway, I'd first like to measure the performance gains/losses 
of such choices.
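A small sketch of what dropping the argument could look like: the flag is given once, as a mask, at allocation time, and realloc/free read it back from the struct. The names (fstr, FSTR_PERSISTENT, fstr_alloc, ...) are invented for the example, and both 'allocators' are plain malloc() with counters so the chosen branch is observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define FSTR_PERSISTENT (1u << 0)   /* hypothetical flag: outlives the request */

typedef struct {
    uint32_t flags;   /* set once at allocation time */
    size_t   len;
    char     val[];   /* C99 flexible array member */
} fstr;

/* Stand-ins for two allocators; counters show which path was taken. */
static unsigned persistent_frees = 0;
static unsigned request_frees = 0;

/* A flag mask instead of a lone 'persistent' int leaves room for more flags. */
fstr *fstr_alloc(size_t len, uint32_t flags)
{
    fstr *s = malloc(sizeof(fstr) + len + 1);
    if (s == NULL) {
        return NULL;
    }
    s->flags = flags;
    s->len = len;
    s->val[0] = '\0';
    return s;
}

/* No 'persistent' argument: the flags travel with the string. */
fstr *fstr_realloc(fstr *s, size_t len)
{
    assert(s != NULL);
    fstr *p = realloc(s, sizeof(fstr) + len + 1);
    if (p == NULL) {
        return NULL;
    }
    p->len = len;
    p->val[len] = '\0';
    return p;
}

/* The release path picks its branch from the stored flag. */
void fstr_free(fstr *s)
{
    assert(s != NULL);
    if (s->flags & FSTR_PERSISTENT) {
        persistent_frees++;
    } else {
        request_frees++;
    }
    free(s);
}
```

The caller can no longer pass a value that disagrees with how the string was allocated. What is lost is the compile-time constant that lets the compiler drop one branch, which is exactly the performance trade-off that would need measuring.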

 But issue is a bit now that with 7.1 we don't want to do major API changes.
 We're *allowed* to, but we shouldn't.

While the policy on BC breaks at the PHP level is now quite well-defined, it 
is less clear at the C level. During 5.x, we saw several macros and functions 
disappear without notice, and nobody seemed very surprised about it. IMO, the 
hardest part is to get a consensus on such changes.

 I don't feel controlled by Zend. The only major appearance from Zend was in
 scalar types discussion in form of Zeev. Err yeah, and maybe the PHP 7 name.
 Hah.
 Yes, it's maybe Zend which pushed Dmitry to optimize PHP to death, I have
 no idea. But that's a good thing, stop ranting against that :-P

I don't say I'm controlled by Zend. I say we shouldn't have given phpng carte 
blanche with so little supervision and control from the community.

Regards

François





Re: [PHP-DEV] [RFC] UString

2015-07-01 Thread Aaron Piotrowski

 On Jul 1, 2015, at 2:25 PM, Anatol Belski anatol@belski.net wrote:
 
 Expanding on this idea, a separate RFC could propose a magic __cast($value)
 static method that would be called for code like below:
 
 $obj = (ClassName) $scalarOrObject; // Invokes ClassName::__cast($scalarOrObject);
 
 This would allow UString to implement casting a string to a UString and allow
 users to implement such behavior with their own classes.
 
 However, I would not implement such casting syntax for UString only. Being able
 to write $ustring = (UString) $string; without the ability to do so for other
 classes would be unusual and confusing in my opinion. If an RFC adding such
 behavior was implemented, UString could be updated to support casting.
 
 Obviously a UString should be able to be cast to a scalar string using (string)
 $ustring. If performance is a concern, UString::__toString() should cache the
 result so multiple casts to the same object are quick.


 Hi,
 
 One way doing this is already there thanks
 https://wiki.php.net/rfc/operator_overloading_gmp . Consider
 
 $n = gmp_init(42); var_dump($n, (int)$n);
 
 However the other way round - could be done on case by case basis, IMHO.
 Where it could make sense for class vs scalar, casting class to class is a
 quite unpredictable thing.
 
 While users could implement it, how is it handled with arbitrary objects?
 How would it map properties, would those classes need to implement the same
 interface, et cetera? We're not in C at this point, where we would just
 force a block of memory to be interpreted as we want.
 
 Regards
 
 Anatol

Hello,

I was thinking that the __cast() static method would examine the parameter 
given, then either use that value to build and return a new object, or return 
null (which would then result in the engine throwing an Error saying that 
$scalarOrObject could not be cast to ClassName). It was just a suggestion to 
see what others thought, because someone suggested supporting casting syntax 
such as $ustring = (UString) $scalarString. I don’t really care for either 
method though (__cast() or enabling casting just for UString), as they don't 
offer any advantage over writing new UString($string) or 
UString::fromString($string).

Aaron Piotrowski



Re: [PHP-DEV] [RFC] UString

2015-07-01 Thread Andreas Heigl
Hi Joe.

On 01.07.15 at 07:36, Joe Watkins wrote:
 [..]
 
 Another possible issue is engine integration:
 
 $string = (UString) $someString;
 $string = (UString) someString;
 
 These aren't very different to 'new UString', but for an integrated
 solution, kind of expected to work.

Why would that be expected behaviour? I mean I can't do

$date = (DateTime) $timestring;

after all, can I? But I can use

$date = new DateTime($timestring);

Just my 2 cents.

Cheers

Andreas
-- 
  ,,,
 (o o)
+-ooO-(_)-Ooo-+
| Andreas Heigl   |
| mailto:andr...@heigl.org  N 50°22'59.5 E 08°23'58 |
| http://andreas.heigl.org   http://hei.gl/wiFKy7 |
+-+
| http://hei.gl/root-ca   |
+-+





Re: [PHP-DEV] [RFC] UString

2015-07-01 Thread Joe Watkins
Morning,

 Why would that be expected behaviour? I mean I can't do

$date = (DateTime) $timestring;

No, but you also can't do:

 $string = (string) $datetime;

But you can do:

$string = (string) $ustring;

Where $ustring is an instance of UString.

Even if you never write $string = (string) $ustring, the engine will
perform the same action all the time, whenever you pass a UString to
anything expecting a string.

It feels like a complete implementation should support both casts.

Cheers
Joe

On Wed, Jul 1, 2015 at 7:38 AM, Andreas Heigl andr...@heigl.org wrote:

 Hi Joe.

 On 01.07.15 at 07:36, Joe Watkins wrote:
  [..]
 
  Another possible issue is engine integration:
 
  $string = (UString) $someString;
  $string = (UString) someString;
 
  These aren't very different to 'new UString', but for an integrated
  solution, kind of expected to work.

 Why would that be expected behaviour? I mean I can't do

 $date = (DateTime) $timestring;

 after all, can I? But I can use

 $date = new DateTime($timestring);

 Just my 2 Cent.

 Cheers

 Andreas
 --
   ,,,
  (o o)
 +-ooO-(_)-Ooo-+
 | Andreas Heigl   |
 | mailto:andr...@heigl.org  N 50°22'59.5 E 08°23'58 |
 | http://andreas.heigl.org   http://hei.gl/wiFKy7 |
 +-+
 | http://hei.gl/root-ca   |
 +-+




Re: [PHP-DEV] Fix division by zero to throw exception (round 2)

2015-07-01 Thread Andrea Faulds
Hi Bob,

 On 2 Jul 2015, at 01:26, Bob Weinand bobw...@hotmail.com wrote:
 
 On 29.06.2015 at 19:14, Andrea Faulds a...@ajf.me wrote:
 
 Hmm. Using Error might make some sense given it used to raise E_WARNING. I 
 think DivisionByZeroError sounds like a good idea.
 
 Hey,
 
 I just committed that to master…

Great!

 But I noticed that intdiv(PHP_INT_MIN, -1) isn't very well suited for a 
 DivisionByZeroError.
 
 What do you think about adding an ArithmeticError for that case (and making 
 DivisionByZeroError a subclass of it)?
 That ArithmeticError could then be reused for negative bitshifts, which would 
 also settle the question of what to do with those.

Well, that specific case is integer overflow. Normally in PHP we just upgrade 
to float instead of throwing an error in these situations, but for intdiv() I 
didn’t think that made sense (it’s *integer* division). So, maybe OverflowError 
would be a better name. But we don’t really do overflow errors anywhere else 
that I can think of, so the more general ArithmeticError might be fine.

Thanks.
--
Andrea Faulds
http://ajf.me/




