> -----Original Message-----
> From: Lee davis [mailto:leedavi...@gmail.com] 
> Sent: Monday, June 20, 2011 9:12 AM
> To: Robert Eisele
> Cc: internals@lists.php.net
> Subject: Re: [PHP-DEV] foreach() for strings
> 
> I think this would be quite a useful feature, and I am in favor of it.
> However, I think caution should be taken when shifting array utilities out
> of their remit and allowing them to manipulate / traverse other data types.
> You may see the floodgates opening for more requests to adapt array functions
> for other uses.
> 
> Say for instance..
> 
> Could we also use current(), next() and key() for iteration of strings?
> 
> $string = 'string';
> while (($char = current($string)) !== false) // strict check so a '0' character does not end the loop
> {
>     echo key($string);  // Would output the offset position I assume 0,1,2 etc??
>     echo $char;         // outputs each letter of string
>     next($string);
> }
> 
> Lee
> 
> On Mon, Jun 20, 2011 at 12:27 PM, Robert Eisele <rob...@xarg.org> wrote:
> 
> > foreach() has many uses: looping over arrays, over objects, and over
> > objects implementing the Iterator interface. I think it's also quite
> > intuitive to use foreach() for strings, too.
> >
> > If you want to implement a parser in PHP, you have to use for + strlen()
> > with substr() or $x[$i] to address one character of the string. We could
> > overdo the functionality of foreach() by implementing it for LVALs, too,
> > in order to access single bits, but this is really uncommon, even if the
> > way of thinking could be that foreach() gives a single attribute of each
> > value, no matter if it's a complex object with the iterator interface or
> > a primitive. What do you think about this one? My point of view is that
> > foreach() is very useful, which was acknowledged by many people in the
> > comments on my article.
> >
> > I think adding features like this would persuade some PHP users to
> > upgrade to 5.4.
> >
> > Robert
> >

Doing this with an explicit iterator object is a fine idea. The syntax becomes 
something like:

foreach(new TextIterator($s, 'UTF8') as $pos=>$c)
{
    ...
}
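
To make that concrete, here is a minimal sketch of what such an iterator could
look like. TextIterator is hypothetical (it is not part of PHP), and this
version assumes PHP 7.4+ with the mbstring extension loaded; it also uses the
encoding name 'UTF-8' as mbstring spells it. Treat it as an illustration of
the idea, not a proposed implementation.

```php
<?php
// Hypothetical TextIterator: NOT a PHP built-in. Sketch only.
// Assumes PHP 7.4+ and the mbstring extension.
final class TextIterator implements IteratorAggregate
{
    private string $text;
    private string $encoding;

    public function __construct(string $text, string $encoding = 'UTF-8')
    {
        // Reject input that is not valid in the declared encoding,
        // rather than silently iterating garbage.
        if (!mb_check_encoding($text, $encoding)) {
            throw new InvalidArgumentException("Text is not valid $encoding");
        }
        $this->text = $text;
        $this->encoding = $encoding;
    }

    // Yields character position => character (not byte offsets).
    public function getIterator(): Generator
    {
        $chars = mb_str_split($this->text, 1, $this->encoding);
        foreach ($chars as $pos => $char) {
            yield $pos => $char;
        }
    }
}

// Usage: "\u{EF}" is U+00EF (i with diaeresis), two bytes in UTF-8.
foreach (new TextIterator("na\u{EF}ve", 'UTF-8') as $pos => $c) {
    echo $pos, ': ', $c, PHP_EOL;
}
```

The encoding is stated at the point of iteration, and the string itself
carries no extra state.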

On the other hand, I think that trying to support iteration without using an 
iterator object to mediate would be a disaster, and I'm opposed to doing 
something like that because:
1. The code just looks wrong. PHP developers are generally insulated from the 
char-arrayness of strings. In addition, since PHP isn't type-safe, the code 
becomes highly ambiguous. Is the code iterating an array, or a string? It is 
very hard to tell just by looking. It may be convenient to write, but it's 
certainly not convenient to read or maintain later. On the other hand, with a 
mediating iterator object, the intent becomes obvious, and the code is highly 
readable.
2. The odds of iterating any given string are slim at best. Supporting current, 
key, next, etc. would require the string object internally to get bloated with 
additional unnecessary data that is almost never used. This bloat isn't a 
single int either. For optimal performance it would need to consist of no less 
than two size_t (char position and binary position), and one encoding indicator.
3. Iteration cannot work without knowing which encoding to use for the string. 
Is it UTF-8? UTF-16? UTF-7? Binary, or some single-byte encoding? Some other 
exotic wide encoding? Without an iterator object in the middle, there is no way 
to specify this encoding. Always treating the string as binary would also be a 
mistake: byte-wise iteration is almost never the correct behavior for text, even 
though it may often appear to behave correctly with simple (ASCII-only) inputs.
4. I've had simple mistakes caught numerous times when foreach complains about 
getting a scalar rather than an array. So far, it has been exactly right every 
time. Allowing strings to be iterated would, in the name of convenience, 
increase the probability of stupid mistakes evading detection. Even worse, the 
code itself would look logically correct until the developer finally realizes 
that they have a string and not an array. Errors like this are probably far 
more common in most projects than the need to iterate a string, so making this 
change hurts debugging in the common case, for the sake of syntactic sugar in 
the rare case. Not a good trade.
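
Points 3 and 4 can be illustrated with a short script. This is only a sketch:
the sample string and variable names are made up for the example, the
character split assumes the input is UTF-8, and the exact wording of the
foreach warning differs between PHP versions, so it is captured rather than
quoted.

```php
<?php
// Point 3: bytes and characters are not the same thing.
$text = "\u{E9}t\u{E9}"; // e-acute, t, e-acute: 3 characters, 5 bytes in UTF-8

// Byte-wise access, which is what $x[$i] gives you.
$bytes = [];
for ($i = 0; $i < strlen($text); $i++) {
    $bytes[] = bin2hex($text[$i]);
}

// Character-wise split; the /u modifier makes PCRE treat the input as UTF-8.
$chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);

echo implode(' ', $bytes), PHP_EOL; // c3 a9 74 c3 a9
echo count($chars), PHP_EOL;        // 3

// Point 4: today foreach flags a string where an array was expected.
$warnings = [];
set_error_handler(function (int $severity, string $message) use (&$warnings): bool {
    $warnings[] = $message; // collect instead of printing
    return true;
});

$rows = 'not an array'; // e.g. a function returned a string by mistake
foreach ($rows as $row) {
    // Never runs; PHP raises a warning instead.
}

restore_error_handler();
echo count($warnings), PHP_EOL;     // 1
```

If strings were iterable, the second loop would quietly run once per byte and
the mistake would surface much later, if at all.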

John Crenshaw
Priacta, Inc.

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
