Re: [PHP-DEV] Suggestion

2016-08-31 Thread Robert Williams
On Aug 31, 2016, at 11:49, Yasuo Ohgaki  wrote:
> 
> I remember an argument that "function" is useful to "grep functions".
> This is true, but we have tokenizer and tokenizer does better job.
> e.g. It excludes functions inside comments.
> 
> It may be time to consider simplifying things.

Perhaps, but I would typically be doing something like this when I’ve opened a 
PHP file in a basic text editor and am trying to find where a function is 
declared rather than used. The tokenizer is of no help in this use-case. Plus, 
I happen to like having a consistent item to lock onto visually — always hated 
missing that in languages that don’t have it. (For the same reason, I much 
prefer the function’s return type at the end rather than at the beginning of 
the line.)
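
As a rough illustration of the tokenizer point quoted above, the sketch below uses token_get_all() to list named function declarations while ignoring one that only appears in a comment. The sample source and the helper name declaredFunctionNames() are made up for this example, and the helper does not handle functions that return by reference.

```php
<?php
// Hypothetical sample source: one real declaration, one inside a comment,
// and one anonymous function (which has no name to report).
$source = <<<'SRC'
<?php
// function commentedOut() {}
function realOne() {}
$f = function () {};
SRC;

// Illustrative helper (not part of PHP): names of declared functions.
function declaredFunctionNames(string $source): array
{
    $names  = [];
    $tokens = token_get_all($source);
    $count  = count($tokens);

    for ($i = 0; $i < $count; $i++) {
        if (!is_array($tokens[$i]) || $tokens[$i][0] !== T_FUNCTION) {
            continue;
        }
        // Move past whitespace following the "function" keyword.
        $j = $i + 1;
        while ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
            $j++;
        }
        // A T_STRING here is the declared name; closures have "(" instead.
        if ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_STRING) {
            $names[] = $tokens[$j][1];
        }
    }
    return $names;
}

print_r(declaredFunctionNames($source));
```

A plain text search for "function" would also match the commented-out line, which is the difference the quoted message describes. None of this helps when the file is simply open in a basic editor.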

--
Bob Williams


signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: [PHP-DEV] [RFC] [Discussion] Third-party editing of RFCs

2016-05-12 Thread Robert Williams
This would be great if everyone just wanted to state their stance and be done 
with it. It reminds me of the election pamphlets that my state sends out to 
inform voters of what the upcoming ballot measures are and what various folks’ 
for/against arguments are. But those arguments are collected in advance and 
there is only a single edition printed, so there are no direct responses. I’m 
not sure how well this format would work with the back-and-forth that usually 
happens in RFC discussions.

Will folks need to summarize (and respond to) all the arguments they want to 
address in their addition, and keep updating it as new arguments come in to 
which they want to respond? Will they be able to link to others’ comments and 
respond to them that way? And if they can link, what will stop these sections 
from becoming piles of spaghetti? Would folks wait until near the end of the 
discussion period to make their additions to avoid repeat visits, and how would 
that affect the discussion?

-Bob




Re: [PHP-DEV] PHP 7.1 - Address PHPSadness #28?

2015-09-18 Thread Robert Williams
> On Sep 18, 2015, at 10:27, Lester Caine  wrote:
> 
> All I am saying is that 'exists()' is simply part of the toolkit that
> goes WITH extract(). There is a suitable tool in arrays and in objects
> so why not complete the toolkit in straight variables. The names are a
> mess between the three for many reasons and producing a complete new set
> of function names has been another call, but there is a simple hole here
> in a style of coding which there seems little logical reason NOT to fill.

Exactly, there’s clearly a gap here. Further, enough people want it that a few 
have shown up on this list, which means there are probably many, many 
thousands, or even millions, of people out in the wild that want it. I’m not 
familiar enough with PHP’s internals to say for sure, but I suspect it’s not 
terribly hard to implement (it’s just isset() without the extra null check). 
So… why not? Like anything else in the language, people don’t have to use it if 
they don’t want to, but it’s good to have tools. Yes, even tools that can be 
abused.
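
To make the gap concrete, here is a small sketch of how the check can be approximated today with get_defined_vars(). The helper name variable_exists() is hypothetical, and it needs the caller to hand over its own scope, which is part of why this is clumsier than a built-in construct would be.

```php
<?php
// Hypothetical helper: true if $name is defined in the given scope,
// even when its value is null. $scope should be the result of
// get_defined_vars() taken at the call site.
function variable_exists(string $name, array $scope): bool
{
    return array_key_exists($name, $scope);
}

function demo(): array
{
    $a = null;          // defined, explicitly null
    // $b is never assigned in this scope

    return [
        'isset_a'  => isset($a),
        'exists_a' => variable_exists('a', get_defined_vars()),
        'isset_b'  => isset($b),
        'exists_b' => variable_exists('b', get_defined_vars()),
    ];
}

var_dump(demo());
```

isset() reports false for both variables, while the existence check separates the null one from the one that was never assigned.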

-Bob




Re: [PHP-DEV] PHP 7.1 - Address PHPSadness #28?

2015-09-16 Thread Robert Williams
On Sep 16, 2015, at 06:00, Rowan Collins  wrote:
> 
> Absolutely. However, in order to need a dynamic check of variable existence, 
> you must also be using non-existence as a second sentinel value. You must be 
> saying "if the variable has never been assigned to, that means state X; if 
> it's been assigned a null value, that means state Y". My argument is that 
> using undefined variables like that is a bad choice of sentinel value, and if 
> you need more than one sentinel value in the first place, you probably need a 
> more complex data type (a struct-like object with separate state and value 
> properties, for instance).

I agree it’s bad practice to use “undefined” as a valid sentinel “value”, but 
it’s something that needs to be checked in error handling sometimes. As an 
example, go back to template systems that just define a bunch of variables 
within the view’s scope: the view code needs to verify that it actually got 
needed variables before it uses them so that it can fail gracefully if it 
didn't. This is just defensive programming, not bad architecture (at least on 
the view’s part…), and while it can be avoided by having the template system 
use arrays or objects, if you don’t control the template system as the view 
writer, then you work with what you’re given.

-Bob




Re: [PHP-DEV] PHP 7.1 - Address PHPSadness #28?

2015-09-16 Thread Robert Williams
On Sep 16, 2015, at 11:28, Stanislav Malyshev  wrote:
> 
>> 1 PHP defines null to include variables that have "not been set"
> 
> No, not really, PHP does not define that.

It does according to the docs:

"A variable is considered to be null if […] it has not been set to any value 
yet."

http://php.net/manual/en/language.types.null.php 


I think that’s more sloppy documentation (declared versus defined) as opposed 
to literal truth, but….

>> PHP also defines null to include variables that have been unset()
> 
> No, PHP does not define anything like that. In fact, unset() means the
> variable is destroyed and does not exist (of course, excepting refcount
> etc. issues). Null does not feature in this story at all.

I’ll refer again to the above documentation:

"A variable is considered to be null if […] it has been unset()."

This one can’t be chalked up to sloppy writing. Here, something is just plain 
wrong.

All of the above does touch on the gap, though. Ignoring the docs and looking 
at actual behavior of PHP, it’s clear that an undefined variable behaves 
differently from one assigned null. One can easily tell if a variable is null, 
but one can’t easily tell if a variable is just not defined. That strikes me as 
an oversight especially given the fact that one can use unset() to make a 
variable once again undefined. Put another way, we can create and destroy, but 
we can’t identify existence. It’s a small gap in language design. There are 
workarounds, but they’re workarounds.

> That's not a correct description. Correct description is that if you ask
> for a variable and it is not defined, null value is substituted.

That happens in the end, but not before an error is thrown. I want a 
one-keyword way to know that a variable is not defined without having to trip 
an error.

> That's your opinion. Mine, for example, is that it is a very useful
> feature. In any case, PHP is implemented this way, and there's no way to
> change it while the result remaining PHP language. If it makes PHP
> unsuitable for you, sorry, but that's what PHP is. I don't think it
> makes a lot of sense to discuss changing basic semantics of PHP engine -
> I don't think this is going to happen.

At this point, I don’t think anyone is proposing changing the semantics of the 
engine, just filling a gap that lets one determine whether a variable has 
already been defined -- like one can already do for constants, functions, 
classes, array elements, and object properties. You may or may not have use for 
the functionality, but other people do, and adding a clean way to determine how 
PHP would regard a given variable doesn’t strike me as in any way promoting bad 
code even if you don’t personally use it.

-Bob




Re: [PHP-DEV] PHP 7.1 - Address PHPSadness #28?

2015-09-16 Thread Robert Williams
On Sep 16, 2015, at 11:14, Rowan Collins <rowan.coll...@gmail.com> wrote:
> 
> Robert Williams wrote on 16/09/2015 18:37:
>> The docs suggest that uninitialized variables are null, and the above makes 
>> it sound like that’s what you’re stating, too. But they’re not: they don’t 
>> exist at all, they’re uninitialized.
> 
> As soon as you access them, you will get the value null. You can do 
> everything with them that you could do with a variable initialised to null.

As soon as you access them, you get an error, and then you get null. Yeah, one 
could register an error handler, but that’s a lot of fuss just to see if a 
variable really exists or just exists because you accessed it, and it pulls you 
out of context. Basically, there’s no (clean, very simple) way to identify that 
you’re about to step in a pile of poo before you step in it.

>> If they were null, then PHP wouldn’t spit out errors about undefined 
>> variable accesses
> 
> Once again - PHP does not spit out errors when you access an undefined 
> variable. It spits out Notices in case you've done it by mistake. Maybe 
> you're thinking of JavaScript?

In our shop, notices are most definitely errors. And notices aren’t graceful 
error handling. I think PHP also regards them as errors (thus the ability 
to catch them with an error handler) - it’s just a lower grade of error.

>> I don’t think it’s a huge gap in the language, but it’s a definite gap, and 
>> it’s one that many less-experienced programmers fall right into with safety 
>> checks.
> 
> There is no safety check required. If you care about eliminating the Notices, 
> then initialise all your variables properly, and you will never have an 
> uninitialised variable (so a runtime check for one would be pointless). If 
> you don't care about eliminating the Notices, then simply treat uninitialised 
> variables as Null, and they will work absolutely fine.


That’s great in theory, but in practice, programmers’ code must frequently 
collaborate with other people’s code, often code from outside their 
organization. In these cases, they don’t have the control to just fix the 
problem on the other side. And in some cases, like with config files that set 
bare variables, there’s really nothing wrong with the way it’s done that needs 
to be fixed. Some folks might prefer to use a different model, but that’s just 
a preference.

Adding an exists() function doesn’t go against the foundation of PHP. It just 
fills a gap. Again, I can define variables, and I can destroy them, but I can’t 
tell if they’ve already been defined. In a dynamic language like PHP, I really 
don’t understand how folks can *not* see that as an oversight. Further, we can 
do this for everything else… constants, array elements, object properties, 
functions, and even classes, so why not variables? There's even precedent 
for controlling behavior based on variable existence in extract(), so PHP 
itself does it in ways that directly affect the user, but the user can’t do it. 
Yes, there are workarounds for the gap. There are almost always workarounds in 
a sufficiently developed language, but that doesn’t mean there isn’t a gap.
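
For comparison, the sketch below lines up the existence checks PHP already has for other kinds of names, followed by extract() with EXTR_SKIP, which decides what to import based on whether a variable already exists. The constant, array, and class names are placeholders chosen for the example.

```php
<?php
// Existence checks PHP already offers; each one reports true here
// even though the stored value is null.
define('EXAMPLE_FLAG', null);           // hypothetical constant
$settings = ['timeout' => null];        // hypothetical array

class ExampleBox                        // hypothetical class
{
    public $value = null;
}

$checks = [
    'constant'  => defined('EXAMPLE_FLAG'),
    'array key' => array_key_exists('timeout', $settings),
    'property'  => property_exists('ExampleBox', 'value'),
    'function'  => function_exists('strlen'),
    'class'     => class_exists('ExampleBox'),
];
var_dump($checks);

// extract() with EXTR_SKIP consults variable existence internally:
// it leaves the existing $foo alone and only creates $bar.
function extractDemo(): array
{
    $foo = 'original';
    $created = extract(['foo' => 'new', 'bar' => 1], EXTR_SKIP);
    return [$created, $foo, $bar];
}
var_dump(extractDemo());
```

There is no counterpart in that first list for a plain variable, which is the hole being described.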

-Bob




Re: [PHP-DEV] PHP 7.1 - Address PHPSadness #28?

2015-09-16 Thread Robert Williams
> On Sep 16, 2015, at 14:57, Stanislav Malyshev  wrote:
> 
>> I’ll refer again to the above documentation:
>> 
>> "A variable is considered to be null if […] it has been unset()."
> 
> You are confusing two things.
> 
> 1. The variable has value of null.
> 2. The variable does not exist, so when you try to get its value,
> there's nothing to give you, but we don't want to produce fatal error
> because of such trifle, so we substitute null instead.

Nothing confused here. If the variable doesn’t exist, then it doesn’t exist, 
and accessing it rightly produces an error (albeit a low-grade error). PHP 
gives you a token null to be nice, but the fact is, you still committed an 
error by trying to access something that doesn’t exist. I’m fine with all of 
that, but I also would like to have the option of easily knowing about the 
problem before I hit it.

And, although you may not care whether the provenance of a variable’s null lies 
in buggy code or in explicit programmer/user decision, I consider that an 
extremely important distinction. It’s the difference between, “What the hell, 
delete anyway. They can always do the work again.” and, “Okay, we’ve got a 
signed delete order from the user, so delete away.”

>> This one can’t be chalked up to sloppy writing. Here, something is just
>> plain wrong.
> 
> I'm afraid the something which is wrong here is your understanding of
> how undefined variables and nulls work :) "Considered to" is not the
> same as "is". It's a substitution, not identity.

I understand it just fine, thank you, but I disagree with your forgiveness of 
the docs. An unset variable is not null. Rather, it’s completely undefined, and 
PHP yells at you for just that reason if you try to access it. Then, it turns 
around and tries to appease you by giving you null. If it changed it to null on 
access and never threw the error, your statement would be more accurate, but 
that’s not what happens.

For the record, I’d actually prefer that PHP throw a fatal error if you access 
an undefined variable because it almost always means that your code is working 
with junk data. Right along with that, however, I’d like a way to easily tell 
that my data is bad (i.e., a variable hasn’t been defined) so that I can 
gracefully handle the problem and avoid the fatal.
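
The stricter behaviour can be approximated in userland today. The following is only a sketch, assuming an error handler that promotes every diagnostic to an ErrorException; the variable name $neverDefined is a placeholder, and the exact message text differs between PHP versions.

```php
<?php
// Turn every PHP diagnostic into an ErrorException for the duration
// of the check. This is a sketch, not a recommendation for production.
set_error_handler(function (int $severity, string $message, string $file, int $line) {
    throw new ErrorException($message, 0, $severity, $file, $line);
});

try {
    $copy = $neverDefined;          // hypothetical undefined variable
    $outcome = 'no diagnostic raised';
} catch (ErrorException $e) {
    $outcome = 'caught: ' . $e->getMessage();
} finally {
    restore_error_handler();
}

echo $outcome, "\n";
```

This still means tripping the error first and reacting afterwards, rather than asking up front whether the variable exists.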

-Bob




Re: [PHP-DEV] PHP 7.1 - Address PHPSadness #28?

2015-09-16 Thread Robert Williams
On Sep 16, 2015, at 14:09, Rowan Collins  wrote:
> I can certainly sympathise with wanting to run without any notices - they are 
> generally hints for writing better code. In the vast majority of cases, 
> though, that means working out *why* the variable is undefined, otherwise you 
> might as well just use the @ operator to say "I don't care about this notice".

Agree completely, and for that reason, we also ban @ except in those rare cases 
where PHP forces the issue (that’s for other discussions...). Indeed, this 
often comes up with short-lived issues that are resolved quickly. For example, 
a config file is missing a variable, so the user is told what it is, what kind 
of value to supply, and they go off to fix it, or the user is warned and the 
app goes on with a default. But, all this happens without throwing errors and 
without having to detour through the global error handler. In other cases, the 
fix may not be so available, like with the template system that creates 
variables in the view’s scope; here, the goal is just to gracefully catch the 
condition and then define a default (and maybe complain to the author).

> This still implies that there's a common problem to fix, which I just don't 
> believe. The number of places where the following all hold true must be 
> vanishingly small:
> 
> - the third-party code is dumping bare variables into your scope (globals, or 
> a file-level include are the only mechanisms I can think of)

This seems to be pretty common with template systems, especially older ones. 
It’s also common with config files.

> - they do so unreliably, in the sense that different variables will be set 
> under different circumstances

More common than different variables in different circumstances is the missing 
variable. Consider the user who’s setting up an instance of an application that 
uses a config file with bare variables — very easy to forget something, or type 
the name wrong, or whatever. Or the programmer writing a plugin for a poorly 
documented application who wants to be sure the runtime environment of the 
plugin is what’s expected. Really, a lot of the same situations where one might 
need to check the existence of an array element or object property, except that 
the item to be checked happens to be a plain old variable.

> - you/they have used null as a value which needs to be positively asserted, 
> not taken as the default state, so that isset() does not meet your needs

Well, it’s not so important that null represents a non-default state, as that 
it represents any state. Often, null could just signal that a default should be 
used, but with an existence check, the programmer can be confident that 
something/someone explicitly chose that option. In the case of a config, for 
example, you can be sure the user chose a particular value (which may be the 
default) versus having missed it altogether. There are cases where that 
knowledge is very important to safe programming.

Consider that you’ve refactored a delete function to add support for 
non-permanent deletion. You’ve done this by adding an optional parameter:

function DeleteFile($file, $permanentlyDelete = null) {}

If $permanentlyDelete is null, the default action specified in a config is 
used. The problem is, when the param is null, there’s no way for the function 
to know for sure what the programmer of the caller intended. There are two 
possibilities:

1 The programmer intentionally passed null because the config should be used
2 The programmer forgot to consider that case after the refactor, so null is 
automatically being used

Ultimately, the function just follows orders, of course, but doing it this way 
is dangerous because it’s all too easy to hit situation 2, which means files 
could be permanently deleted in error. As the programmer doing the refactor, 
it’s better to make the parameter mandatory, which forces a revisit to every 
caller to make the true/false/null decision explicit.

In this case, the extra bit of intelligence comes from forcing the caller to 
pass something. If the param is optional, we have no idea. It’s the same thing 
with exists() on a variable: it gives us extra intelligence about *why* a 
variable is null, whether it’s because of an actual decision somewhere or 
because of an error or oversight. And it provides that intelligence without 
fussing with PHP errors.
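
A minimal sketch of the two signatures discussed above, assuming PHP 7.1 or later for nullable types. The function names, the DEFAULT_PERMANENT_DELETE constant, and the returned strings are invented for the example; nothing is actually deleted.

```php
<?php
// Hypothetical config default, standing in for "the default action
// specified in a config".
const DEFAULT_PERMANENT_DELETE = false;

// Optional parameter: a forgotten argument silently becomes null.
function deleteFileOptional(string $file, ?bool $permanentlyDelete = null): string
{
    $permanent = $permanentlyDelete ?? DEFAULT_PERMANENT_DELETE;
    return ($permanent ? 'permanently delete ' : 'move to trash ') . $file;
}

// Mandatory but nullable: the caller must write true, false, or null.
function deleteFileExplicit(string $file, ?bool $permanentlyDelete): string
{
    return deleteFileOptional($file, $permanentlyDelete);
}

echo deleteFileOptional('report.txt'), "\n";        // caller's intent unknown
echo deleteFileExplicit('report.txt', null), "\n";  // explicit "use the config"

try {
    deleteFileExplicit('report.txt');               // forgotten argument
} catch (ArgumentCountError $e) {
    echo "rejected: too few arguments\n";
}
```

With the mandatory form, a null that reaches the function was written by someone on purpose.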

> - you have no way of presetting the variable to some other terminal value to 
> detect when it is explicitly set

If the variables are injected before your code runs, it’s too late. At that 
point, if you try to set a magic sentinel value, you’re just overwriting the 
real value if there is one. It would work, however, if you can pre-initialize 
before the injection:

$foo = 'hope this gets overwritten';
require('/some/config.php');
if ($foo === 'hope this gets overwritten') {
    // uh-oh: the config never set $foo
}

I’m not really a fan of this pattern, though. If there’s going to be a 

Re: [PHP-DEV] PHP 7.1 - Address PHPSadness #28?

2015-09-16 Thread Robert Williams
On Sep 16, 2015, at 06:44, Rowan Collins  wrote:
> 
> Can you give an example of code where you do not know this until the code 
> runs - i.e. where "is this variable set?" is something you can hang business 
> logic on?
> 
> Somewhere where it would make sense to write something like this, if the 
> exists() function were available for plain variables:
> 
> if ( exists($a) ) {
> ...
> } elseif ( is_null($a) ) {
> ...
> } else {
> ...
> }

Sure. It’s not common with controlling business logic (fortunately), but it is 
very common in error-handling code. I already mentioned the templates, but an 
even more common scenario is config files that are supposed to just define a 
bunch of standalone variables, which is a very common pattern. If you’re 
writing code that relies on those variables, or you’re writing the code that’s 
supposed to pull that file in to begin with, it’s smart to make sure all the 
variables actually got set, and if not, display an error to the user to go fix 
the config. Sometimes, isset() works for this if combined with a null check, 
but it fails in cases where null is an acceptable value.
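
Here is one way such a check might look with today's tools, as a sketch only. The loader name loadConfig() and the config variable names are hypothetical; the throwaway file exists just so the example runs on its own. Note that a config defining $configPath or $requiredNames would clobber the loader's own variables.

```php
<?php
// Hypothetical loader: includes a config file that defines bare
// variables and reports any required names it failed to define.
// Names that are defined as null count as present.
function loadConfig(string $configPath, array $requiredNames): array
{
    require $configPath;
    $defined = get_defined_vars();

    $missing = array_diff($requiredNames, array_keys($defined));
    if ($missing) {
        throw new RuntimeException('Config is missing: ' . implode(', ', $missing));
    }
    return array_intersect_key($defined, array_flip($requiredNames));
}

// Write a throwaway config so the example is self-contained.
$path = tempnam(sys_get_temp_dir(), 'cfg');
file_put_contents($path, "<?php\n\$dbHost = 'localhost';\n\$cacheTtl = null;\n");

$config = loadConfig($path, ['dbHost', 'cacheTtl']);   // cacheTtl is null but present

try {
    loadConfig($path, ['dbHost', 'apiKey']);            // apiKey was never set
    $error = null;
} catch (RuntimeException $e) {
    $error = $e->getMessage();
}
unlink($path);

var_dump($config, $error);
```

An isset()-based check would have reported cacheTtl as missing even though the config set it deliberately.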

-Bob




Re: [PHP-DEV] PHP 7.1 - Address PHPSadness #28?

2015-09-16 Thread Robert Williams
On Sep 16, 2015, at 06:54, Rowan Collins  wrote:
> 
> I want to pull this out for a bit more attention: one of the crucial 
> questions in this thread is whether the language is wrong, or just the 
> documentation. There's one particularly wonky passage someone found in the 
> manual that makes it sound like uninitialised variables have an intrinsic type 
> when accessed, rather than just referring the reader to the rules on casting 
> null to any given type; when I have time, I will find and reword it.


The docs suggest that uninitialized variables are null, and the above makes it 
sound like that’s what you’re stating, too. But they’re not: they don’t exist 
at all, they’re uninitialized. If they were null, then PHP wouldn’t spit out 
errors about undefined variable accesses because it would see them as the same 
thing as a (perfectly legal) null variable access. Differentiating these 
various cases is currently a little wonky, requiring some extra boilerplate 
code and/or ugly-ish workarounds (like calling get_defined_vars()). I don’t 
think it’s a huge gap in the language, but it’s a definite gap, and it’s one 
that many less-experienced programmers fall right into with safety checks.

-Bob




Re: [PHP-DEV] PHP 7.1 - Address PHPSadness #28?

2015-09-15 Thread Robert Williams
On Sep 14, 2015, at 16:06, Stanislav Malyshev  wrote:
> 
>>> No. There's no reason for null to exist if isset returns true on 
>>> null. If one doesn't understand that, one should not be using
>>> null at all.
>> 
>> Nonsense.
> 
> Oh, thank you! That's a good start for a polite argument.

Apologies, no offense intended, but you do realize that you essentially said 
that any programmer who doesn't agree with what you profess is the One True 
Way, is doing it wrong? This type of statement has been made several times in 
this thread, and it's annoyingly dismissive. Given how PHP has evolved, and how 
many architectural inconsistencies there are, I'm not so sure there is a One 
True Way.

>> It just means that one isn’t using null the way you do.
> 
> No, it means one isn't using null the way it is intended to be used in
> the language.

It really seems to me that PHP is fundamentally confused in how it wants to 
treat null. It clearly has the concept of an undefined variable: variable 
existence is referenced throughout the docs along with null-ness (e.g., the 
docs for isset() say it returns true if the variable, "is set and is not null" 
and if the variable, "exists and has value other than null"), it throws 
undefined errors for using variables that aren't defined, and it even has an 
unset() construct that leads to an undefined variable that fails isset(). Yet, 
it also defines undefined variables as being null -- which is it? Even 
is_null() is caught up in this: it will throw an undefined error if passed an 
undefined variable while also returning true. Why an error for a value that 
supposedly fits the definition of null?
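
The behaviour described above can be observed with a short script. This is a sketch that counts diagnostics through a temporary error handler instead of printing them; the variable names are placeholders.

```php
<?php
// Count diagnostics instead of printing them.
$diagnostics = 0;
set_error_handler(function () use (&$diagnostics) {
    $diagnostics++;
    return true;
});

// Case 1: $neverSet is never assigned.
// Case 2: assigned, then destroyed with unset().
$wasUnset = 1;
unset($wasUnset);
// Case 3: explicitly assigned null.
$explicitNull = null;

$issetResults = [isset($neverSet), isset($wasUnset), isset($explicitNull)];
$afterIsset   = $diagnostics;

$isNullResults = [is_null($neverSet), is_null($wasUnset), is_null($explicitNull)];
$afterIsNull   = $diagnostics;

restore_error_handler();

var_dump($issetResults);             // false, false, false
var_dump($isNullResults);            // true, true, true
var_dump($afterIsset, $afterIsNull); // 0, then 2
```

isset() answers quietly but gives the same answer for all three cases; is_null() also gives one answer for all three, and complains about the two undefined ones while doing so.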

>> functions do. As I said, they’re misguided, right up there with 
>> register_globals. (They lead to similar security bugs, too.)
> 
> Please do not bring "security" as some superman-argument, expecting
> that if you say "security" your arguments suddenly become so much
> stronger. There's no special security problem here besides the trivial
> fact that if you write buggy code in security-sensitive context, you
> get security issue. So please leave security out of it.

Re-read what I said. I didn't say any of this was a security bug. I said it can 
lead to security bugs. And that is, I think, undeniably true because null 
handling in PHP is clearly confusing to a lot of folks, and any confusing API 
(buggy or not) will tend to lead to application-level security bugs if that API 
is often used in security contexts. Hasn't a bunch of work recently been 
completed with hashing and encryption APIs for this very reason? Same idea.

I also mentioned the security issue parenthetically, as a side-note, so please 
don't accuse me of overplaying the security card. My main point there was that 
isset() and empty() were ill-conceived.

And again, who's doing the bashing here? Ease off a bit -- I'm really not 
trying to attack you at all. 

>> If a variable that’s declared yet defined as null can’t “exist” as 
>> decided by isset(), why can an array element that’s declared yet 
>> defined as null “exist” as decided by array_key_exists()? Is the
> 
> Because these are two different functions doing different things.
> That's like asking why strlen returns string length but fopen does not
> return file length.

Clearly, they're different functions, but I think they're doing something 
different only so far as they operate in different contexts. Their purposes are 
closely related, IMHO, which is why they keep being paired together in this 
thread and why people keep using one when they should be using the other. If we 
allow that, then it would make sense for them to function similarly.

Okay, so to bring this home:

1 PHP defines null to include variables that have "not been set"
2 PHP also defines null to include variables that have been unset()
3 Variables can be explicitly set to null

All of these are null. And yet:

4 Calling isset(), which is documented to return false on null, on 1, 2, and 3 
returns false for all three, so it cannot tell 3 apart from 1 and 2.

How's this consistent?

Returning false on 1 and 2 fits the idea that PHP distinguishes between 
undefined variables and null variables. This supposition is also supported by 
numerous mentions of these concepts in the docs and by the various undefined 
error messages that PHP produces. It goes off the tracks, however, when it 
also returns false for 3.

IMHO, PHP went south in its design when it tried to declare undefined variables 
as null. They're not; they're undefined. Beyond isset/empty/is_null, however, 
the rest of PHP seems to accept this. Even the three deviant functions tacitly 
acknowledge the issue in their documentation, e.g. with isset()'s existence 
requirement as part of a compound statement ("exists and has value other than 
null"). PHP falls short, however, in not providing a way to independently test 
each part of that compound statement.

Ideally, we would reverse course on the undefined-is-null declaration then 
adjust 

Re: [PHP-DEV] PHP 7.1 - Address PHPSadness #28?

2015-09-14 Thread Robert Williams
I really don’t understand the resistance to this type of change, other than 
knowing that a fix will necessarily be messy. The fact is, PHP distinguishes 
between a variable that has been declared but defined to null and one that 
hasn’t been declared. The value of the first may safely be assigned, compared, 
referenced, and so on, while attempting any of those things on the second one 
results in an error message. The question of whether relying on those 
differences is philosophically acceptable is a moot point: they *are* 
different, period, and the rest of the language should acknowledge and support 
that — just as do most other languages with a similar design (e.g., JavaScript).

Beyond that, the philosophical debate is clearly settled in the real world of 
code. One glance through the comments on the isset() documentation page, not to 
mention the bug reports and the discussions on the lists, shows that people 
have long struggled with this. Based on code I’ve seen from fairly experienced 
developers, the confusion around this has probably led to millions of 
production bugs, many of them in the areas of validation and security. When a 
particular aspect of a language's design promotes bugs to such an extent, it 
needs to be revisited.

I’m not sure what a good solution is, however. Changing isset() will have 
consequences: making it return true for null will magically fix a lot of bugs 
out there, but it’ll also break code where the programmer understood how it 
really works. Adding a parameter to control it is ugly. Extending defined() to 
variables might be a good choice, as it’s already doing for constants exactly 
what’s needed for variables; the only downside is that it would be better-named 
declared() than defined(), but that’s the case for constants, too, so I can 
live with it. For now, isset() and empty() can continue to work as-is but 
perhaps with a notice and deprecation when used with undeclared variables.

--
Bob W.




Re: [PHP-DEV] PHP 7.1 - Address PHPSadness #28?

2015-09-14 Thread Robert Williams
On Sep 14, 2015, at 13:23, Stanislav Malyshev  wrote:
> 
> No. There's no reason for null to exist if isset returns true on null.
> If one doesn't understand that, one should not be using null at all.


Nonsense. It just means that one isn’t using null the way you do. You’re saying 
people shouldn’t be programming to distinguish between 
declared-but-defined-null and undeclared. But the fact is, from a user-land POV, 
null *is* a value, and it’s frequently used where a variable needs to be 
defined with a sentinel value that clearly flags the variable as not having an 
otherwise valid value. The alternative is to use magic values as sentinels 
instead — e.g., 0 or  for an integer, or empty-string for a string — but 
that causes all sorts of bugs when those values are only rarely legitimate 
versus never legitimate (see: Y2K). Magic values also tend to make code less 
readable.

Programmers have been using null like this for decades, e.g., C strings. It’s 
also quite common in databases, where, at least with MySQL, null happens to 
translate directly to PHP null when data are queried. And frankly, I see 
nothing wrong with recognizing that a variable exists in the symbol table but 
has no real value versus one that doesn’t even exist. It’s like the difference 
between having a car in your garage that is out of gas versus not having a car 
at all.

And again, PHP itself clearly distinguishes the two states everywhere but in 
isset() and empty(). IMHO, isset() and empty() are both misguided (beyond the 
null problem) because they attribute meaning to certain values of user land 
data, and they bundle several checks together. And they do all that without any 
domain knowledge controlling what values are or are not valid. If I’m counting 
eggs in my basket, zero is most definitely a valid value, while it’s not if I’m 
looking at the weight of a specific egg. PHP shouldn’t be making that decision 
for me in blanket form. Yet, that’s exactly what these functions do. As I said, 
they’re misguided, right up there with register_globals. (They lead to similar 
security bugs, too.)

Incidentally, what’s the difference, philosophically, between these two:

$foo = null;
var_dump(isset($foo)); //false

$foo = ['bar' => null];
var_dump(array_key_exists('bar', $foo)); //true

If a variable that’s declared yet defined as null can’t “exist” as decided by 
isset(), why can an array element that’s declared yet defined as null “exist” 
as decided by array_key_exists()? Is the symbol table really that different 
from an array? It gets weirder yet where the two start to overlap in user land:

$foo = null;
var_dump(isset($foo)); //false
var_dump(array_key_exists('foo', $GLOBALS)); //true

$array = ['foo' => null];
$bar = null;
extract($array);
print_r(get_defined_vars()); //now have both $foo and $bar in scope

I don’t understand how anyone can see this as being consistent and 
self-explanatory, especially if we assume the words ‘exists’ and ‘set’ to have 
their traditional programming meanings (‘set’ meaning to assign a value and 
‘exists’ meaning there’s an in-scope entry with the given name in the symbol 
table).

--
Bob W.




Re: [PHP-DEV] How does the PHP Ghost one-liner work?

2015-01-30 Thread Robert Williams
On Jan 30, 2015, at 12:05, Patrick Schaaf <p...@bof.de> wrote:
>> % php -r '$e="0";for($i=0;$i<2500;$i++){$e="0$e";} gethostbyname($e);'
> 
> What a funny way to say gethostbyname(str_repeat(0, 2501));

Wow, I somehow missed the interpolation of $e into the value… self-slap. 
Guess I was too focused on looking to the loop as the important part when 
really, it’s just stupid code, as you point out, probably written by someone 
who knows little about PHP.
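
For anyone else who misses it, the equivalence can be checked without touching gethostbyname() at all. This is just a sketch of the string-building part of the one-liner:

```php
<?php
$e = "0";
for ($i = 0; $i < 2500; $i++) {
    // Interpolation: each pass prepends one "0" to the current value of $e.
    $e = "0$e";
}

var_dump(strlen($e));                     // int(2501)
var_dump($e === str_repeat("0", 2501));   // bool(true)
```

So the loop is only a roundabout way of producing a 2501-character hostname.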

With that in mind, there is obviously no unintended side-effect at work here. 
Sorry for wasting everyone’s time… as you were.

--
Bob Williams
Business Unit Information Officer and
Senior Vice President of Software Development
Newtek Business Services Corp.
(602) 263-0300 x12458 | http://www.thesba.com/



Notice: This communication, including attachments, may contain information that 
is confidential. It constitutes non-public information intended to be conveyed 
only to the designated recipient(s). If the reader or recipient of this 
communication is not the intended recipient, an employee or agent of the 
intended recipient who is responsible for delivering it to the intended 
recipient, or if you believe that you have received this communication in 
error, please notify the sender immediately by return e-mail and promptly 
delete this e-mail, including attachments without reading or saving them in any 
manner. The unauthorized use, dissemination, distribution, or reproduction of 
this e-mail, including attachments, is prohibited and may be unlawful. If you 
have received this email in error, please notify us immediately by e-mail or 
telephone and delete the e-mail and the attachments (if any).


[PHP-DEV] How does the PHP Ghost one-liner work?

2015-01-30 Thread Robert Williams
A PHP one-liner is being bandied about as one test of the recently discovered 
Ghost vulnerability in gethostbyname(). Taken from:

http://ma.ttias.be/quick-tests-ghost-gethostbyname-vulnerability-cve-2015-0235/

Here it is:

% php -r '$e="0";for($i=0;$i<2500;$i++){$e="0$e";} gethostbyname($e);'

What’s not being discussed is how it works. From the naive viewpoint of a PHP 
end-user, I’d expect this one-liner to have the same effect:

% php -r '$e="0$e"; gethostbyname($e);'

But it doesn’t. Can someone familiar with PHP’s internals explain why this code 
triggers the overflow, and whether it will actually do so reliably?

More importantly, does this indicate any problems with PHP? It seems like the 
loop should just be optimized away to a single assignment, but even if the 
engine isn’t smart enough to do that, I’d still expect that the same few bytes 
of memory at the same memory address would simply get set to the same value 
over and over. This code suggests that’s not the case, though, that there are 
side-effects. Also, just by lowering the counter to 2499, I get a completely 
different outcome on one particular server:

*** glibc detected *** double free or corruption (out): 0x00acce20 ***
Aborted

FWIW, here’s some C that was provided to more directly check for the 
vulnerability:

#include <netdb.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

#define CANARY "in_the_coal_mine"

struct {
  char buffer[1024];
  char canary[sizeof(CANARY)];
} temp = { "buffer", CANARY };

int main(void) {
  struct hostent resbuf;
  struct hostent *result;
  int herrno;
  int retval;

  /*** strlen (name) = size_needed - sizeof (*host_addr) - sizeof (*h_addr_ptrs) - 1; ***/
  size_t len = sizeof(temp.buffer) - 16*sizeof(unsigned char) - 2*sizeof(char *) - 1;
  char name[sizeof(temp.buffer)];
  memset(name, '0', len);
  name[len] = '\0';

  retval = gethostbyname_r(name, &resbuf, temp.buffer, sizeof(temp.buffer), &result, &herrno);

  if (strcmp(temp.canary, CANARY) != 0) {
    puts("vulnerable");
    exit(EXIT_SUCCESS);
  }
  if (retval == ERANGE) {
    puts("not vulnerable");
    exit(EXIT_SUCCESS);
  }
  puts("should not happen");
  exit(EXIT_FAILURE);
}



--
Bob Williams
Business Unit Information Officer and
Senior Vice President of Software Development
Newtek Business Services Corp.
(602) 263-0300 x12458 | http://www.thesba.com/



Re: [PHP-DEV] Fix incorrect ternary '?' associativity for 7.0?

2014-12-15 Thread Robert Williams
On Dec 14, 2014, at 23:50, Leon Sorokin <leeon...@gmail.com> wrote:

> On 12/14/2014 10:45 PM, Robert Williams wrote:
>
>> I strongly suspect far more code would be *fixed* if the ternary operator 
>> were changed to match what other languages do.
>
> If you have 'incorrectly' functioning code today that results in passing
> unit tests and a correctly functioning business, then a sudden change to
> the behavior of this code would necessarily result in failing unit tests
> and an incorrectly functioning business.

What world is this that you live in where every line of code that’s written is 
fully unit-tested, where functional bugs in large, highly complex applications 
are both obvious and immediately apparent? In my world, I’ve inherited millions 
of lines of legacy code written seemingly to defy the possibility of unit 
testing, where there are large chunks of code that may run once every several 
years, and where many types of logic bugs are simply undetectable unless a team 
of auditors on the business side is double-checking every result of the code. 
Sure, I also have a million or two lines of newer code that is heavily 
unit-tested, but even that code has bugs.

Given that we have this bug to begin with (and yes, it’s a bug), as well as 
many of the others that have worked their way into the PHP code base, it 
strikes me that PHP itself is written in my world, not yours. Hey, reality 
bites.

Also, code that is thoroughly unit-tested is not the code we need to worry 
about for the very reasons you espouse. If the ternary behavior is changed, the 
one or two bugs that may be introduced in every several hundred K LOC will 
become immediately apparent on first test-run and probably be fixed in 30 
minutes or less. It’s the crappy code that we have to worry about, the code 
that’s broken and no one even knows about it. In these cases, I maintain, 
fixing ternary would only improve the code’s functioning.

-Bob

--
Bob Williams
SVP, Software Development
Newtek Business Services, Inc.
“The Small Business Authority”
http://www.thesba.com/



Re: [PHP-DEV] Fix incorrect ternary '?' associativity for 7.0?

2014-12-14 Thread Robert Williams
Some thoughts from user land…

On the concern of breaking code out there that relies on the current behavior, 
I strongly suspect far more code would be *fixed* if the ternary operator were 
changed to match what other languages do. I hate to admit it, but my own shop 
is a good example. We have a particular application that has thousands of 
business rules in the form of boolean expressions written using the ternary 
operator. When I took over the application, every one of them that was 
non-trivial (i.e., 95% of them) was wrong because the developers who wrote them 
didn’t understand how the operator works. It’s still a problem. If the operator 
were fixed in v7 — really fixed, not just made non-associative — all those 
rules would magically work; that alone would present a strong impetus for us to 
migrate that app.
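
To illustrate the kind of rule that goes wrong, here is a small sketch (the values are invented, not taken from our application). Explicit parentheses are used to spell out both groupings so that the result does not depend on how any particular PHP version parses the unparenthesized form:

```php
<?php
$n = 1;

// Grouping PHP has historically applied to
//   $n == 1 ? 'one' : $n == 2 ? 'two' : 'other'
// (left-associative): the first ternary's result becomes the
// condition of the second one.
$phpGrouping = (($n == 1) ? 'one' : ($n == 2)) ? 'two' : 'other';

// Grouping used by C, Java, JavaScript and most other languages
// (right-associative): an if / else-if / else chain.
$otherGrouping = ($n == 1) ? 'one' : (($n == 2) ? 'two' : 'other');

var_dump($phpGrouping);   // string(3) "two"
var_dump($otherGrouping); // string(3) "one"
```

The developer almost always means the second grouping, which is why the rules read correctly and still produce the wrong answer.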

And the problem isn’t restricted to “those” guys that worked on that 
application back in the dark ages. The folks working here now still struggle 
with ternary if they haven’t used it recently. They’ve had enough code flagged 
in reviews to know that it’s problematic, but the usual solution is to rewrite 
it with switch or if-else, or to go crazy with parentheses. Either way, their 
code won’t break if the operator is fixed.

I’ve also interviewed a lot of PHP developers, and I usually ask about the 
ternary operator. Depending on their experience level, my hope is that they 
either A) know all about it and can use it without fear, but are respectful of 
the confusion it can cause others, or B) they don’t really understand it, but 
they know it can bite them so that they’re careful if they encounter it or feel 
the need to use it. The vast majority of the time, however, I get C) they think 
they know all about it and have no fear of using it — but their understanding 
is completely wrong. This is across the board, all experience levels from 
junior guys with a year under their belts to senior guys with 10+ years.

So, it seems that very few PHP developers actually understand how the ternary 
operator works in PHP. And those who do, because they rarely use it in 
nested form, usually either avoid nesting even when it makes sense or 
use parentheses to avoid having to think too hard. Either way, their code is 
probably safe, and even if it’s not, nested use of ternary is so rare in most 
code bases that a manual review is not too troublesome.

As for the vast majority of developers who don’t understand the operator: the 
code everyone here is so worried about breaking is largely written by these 
folks, and *it’s already broken*. Fixing the ternary operator now will only 
help most of this code, while making it non-associative will break the code in 
a different way while also breaking the code of those who do understand the 
operator. Fixing it now, or changing it now to fix it later - either way, 
working code that doesn’t rely on parentheses needs to be adjusted. 
Fortunately, the nested ternary is a rare beast (in most apps, anyway), but 
even so, most folks would like to do that review only once. And if it’s to 
adjust code to work more sanely, the way most other languages do it, well, it 
stings a lot less in that case.

So my opinion, as a manager of millions of lines of closed-source code that I 
know no one else will fix for me, is to make changes to the ternary operator 
just once, and make that change one that fixes it to fit most people’s 
expectations. That’s the path that would be most beneficial to me.

--
Bob Williams
SVP, Software Development
Newtek Business Services, Inc.
“The Small Business Authority”
http://www.thesba.com/


Re: [PHP-DEV] [RFC] Loosening heredoc/nowdoc scanner

2014-08-30 Thread Robert Williams
If the syntax of heredocs/nowdocs is to be loosened, the biggest aspect I’d 
like to see addressed is indentation. I can certainly see how it would be nice 
to loosen the restrictions around the post-closing-token newline to allow 
easier use in places like array definitions, but I’ve never run into that 
problem myself. I have, however, been annoyed by the indentation limitations, 
so much so that I could probably count with my fingers the number of times that 
I’ve used the construct in the couple million lines of PHP I’ve written — even 
though I *want* to use it about once a week. I just find the side-effect on 
code formatting, when used in a container structure (class, function, method, 
loop, whatever), to be more than I can handle. Look at this simple example from 
the docs to see what I mean:

http://php.net/manual/en/language.types.string.php#example-89

One of the key benefits of consistent indentation is that it allows very rapid 
visual navigation of code by nearly eliminating the need for reading until 
you’ve zoned in on the right section of code. To illustrate this very well, 
just try to visually identify the code structure in this old-school BASIC code:

http://www.atariarchives.org/basicgames/showpage.php?page=3

That this restriction warrants italicized text inside a pink warning box in the 
docs suggests both that lots of people bump into it and that it runs counter to 
the expectations of the language.

Now, perhaps I’ve not felt constrained by the newline restriction simply 
because I so rarely get to use the construct at all because of the indentation 
restriction. That’s actually probably the case. So that idea is important to 
address, but I think it’s pointless to address without also addressing 
indentation.

What if we could do something like this:

function foo() {
    $string = <<<
        THEEND
        This is the document text. Any
        whitespace appearing in a column
        that’s before the starting token
        is automatically ignored for all
        lines.
    THEEND;

A few particulars:

* If the starting token appears immediately adjacent to the <<< sequence, then 
parsing is done according to existing rules (perhaps with the changes in the 
RFC). This both maintains BC and avoids issues where more or less whitespace is 
ignored when, for example, the variable is renamed.

* Extending the original proposal, the closing token can be indented without 
concern. Align it with the starting token, with the assignment line, whatever.

* The closing token could not appear within the text.
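
The whitespace rule described above can be approximated in user land, which may help show what the output would look like. This is only a sketch: strip_indent() is a hypothetical helper, not anything in PHP, and it takes the starting token's column as an explicit argument rather than detecting it from the source.

```php
<?php
// Hypothetical helper: removes up to $columns leading spaces or tabs
// from every line, mimicking "ignore whitespace before the starting token".
function strip_indent(string $text, int $columns): string
{
    $lines = explode("\n", $text);
    foreach ($lines as $i => $line) {
        // Greedy match, capped at $columns characters of leading whitespace.
        $lines[$i] = preg_replace('/^[ \t]{0,' . $columns . '}/', '', $line);
    }
    return implode("\n", $lines);
}

// Text as it would appear when indented six columns inside a function body.
$doc = "      This is the document text.\n"
     . "        This line keeps two extra spaces.\n"
     . "      Last line.";

echo strip_indent($doc, 6), "\n";
```

Indentation beyond the token's column survives, so nested structure inside the text is preserved.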

I’ve not given this solution deep thought, so I’m sure there are problems I’m 
missing. But if there’s a good solution to the indentation restrictions, I 
think it would be a huge win. And with the AST-based parser, there may be 
solutions that are possible now that were previously impossible, which makes 
this a good time to reconsider the problem.

--
Bob Williams
SVP, Software Development
Newtek Business Services, Inc.
“The Small Business Authority”
http://www.thesba.com/


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP-DEV] [VOTE][RFC] intdiv()

2014-07-30 Thread Robert Williams
They don’t necessarily need to be symbols. Pascal, for example, uses ‘/’ for 
floating-point division, ‘div’ for integer division, ‘mod’ for modulus, and 
‘rem’ for remainder. For example:

20 / 8 = 2.5
20 mod 8 = 4

In PHP, we already have precedent for non-symbol operators like ‘and’, 
‘or’, and ‘instanceof’, so it wouldn’t feel too out of place for less commonly 
used operators.
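
For comparison, here are the same operations in PHP as a quick sketch. It assumes a version where the intdiv() function being voted on here is available (PHP 7 or later):

```php
<?php
// Assumes PHP 7 or later, where intdiv() exists.
var_dump(20 / 8);        // float(2.5)
var_dump(intdiv(20, 8)); // int(2)
var_dump(20 % 8);        // int(4)
```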

Just a thought :).

--
Robert E. Williams, Jr.
Senior Vice President of Software Development
Newtek Business Services, Inc. -- The Small Business Authority
https://www.newtekreferrals.com/rewjr
http://www.thesba.com/



Re: [PHP-DEV] [RFC] Scalar Type Hinting With Casts (re-opening)

2014-07-23 Thread Robert Williams
On Jul 14, 2014, at 10:13, Andrea Faulds <a...@ajf.me> wrote:
> We are using Hack’s syntax (int, float, bool, string, no 
> integer/double/boolean aliases).


On 20 Jul 2014, at 14:11, Andrea Faulds <a...@ajf.me> wrote:
> The patch actually warns you if you try to do this now:
>
> function foo(double $foo) {}
> foo(1.0);
>
> If you use one of the non-existent aliases (double), and pass the type that 
> alias is supposed to hint for (a float), the error message notes you might be 
> looking for the actual type hint (`float`).


It seems odd to me to not support the aliases. Since I can do this:

   $foo = (integer)$bar;

I would expect to be able to do this:

   function foo(integer $param) {}

Also, do int, float, and numeric accept numbers in octal, hex, scientific 
notation, etc.? I don’t believe there are any examples in the RFC that 
intentionally or accidentally show what happens with, say, 0x2f as a value.


--
Bob Williams
SVP, Software Development
Newtek Business Services, Inc.
“The Small Business Authority”
http://www.thesba.com/



Re: [PHP-DEV] [RFC] Scalar Type Hinting With Casts (re-opening)

2014-07-23 Thread Robert Williams
On Jul 23, 2014, at 11:37, Andrea Faulds <a...@ajf.me> wrote:

> Aliases mean inconsistency. We shouldn’t unnecessarily have multiple names 
> for the same thing, just one. Also, for every alias we support, another 
> reserved word is added. Hence we only allow one set of names. This is also 
> Facebook’s approach with Hack, which removes the aliases entirely for type 
> casting. I might propose we deprecate/remove the aliases in a future RFC.

I agree the aliases, by definition, represent inconsistency. However, that 
inconsistency is already in the language, and those keywords already exist. By 
not supporting the aliases in new functionality that’s so closely related, the 
inconsistency is itself inconsistent.

Although I don’t have voting rights, I would completely support a separate RFC 
to remove the aliases in PHP 6/7. But in the meantime, as long as they’re 
still in the language, I think they need to be fully supported anywhere their 
“real” counterparts can be used. Hack didn’t just remove them in some cases, it 
removed them across the board. And I think that’s the right approach: they’re 
supported across the board until they’re no longer supported, and then they’re 
not supported anywhere.

> So, yes it does permit non-decimal numbers, but it’s a bug I need to fix.

I’m not sure I follow. If I have function foo(int $bar) {}, what happens in 
these cases:

   foo(0x2f);
   foo('0x2f');

Related, is “numeric” basically the union of “int” and “float” (as appears to 
be the case from the chart in the RFC), or is it something more along the lines 
of is_numeric()? There could be consistency issues lurking here, too.
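
The reason the two calls above are different questions can be sketched as follows. This assumes PHP 7 or later and only shows how the language treats a hex literal versus a hex string in general; it says nothing about what the RFC's patch does with them.

```php
<?php
// Assumes PHP 7 or later; foo() mirrors the signature in the question above.
function foo(int $bar): int
{
    return $bar;
}

// A hex literal is already an ordinary integer by the time foo() sees it.
var_dump(0x2f === 47); // bool(true)
var_dump(foo(0x2f));   // int(47)

// The string form is another matter: is_numeric() rejects hex strings in PHP 7+.
var_dump(is_numeric('0x2f')); // bool(false)

// Getting 47 out of the string takes an explicit conversion.
var_dump(hexdec('2f')); // int(47)
```

What a typed parameter should do when handed the string form is exactly the part that needs to be pinned down.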


Regards,
Bob

--
Robert E. Williams, Jr.
Senior Vice President of Software Development
Newtek Business Services, Inc. -- The Small Business Authority
https://www.newtekreferrals.com/rewjr
http://www.thesba.com/



Re: [PHP-DEV] [RFC] Skipping parameters take 2

2013-09-02 Thread Robert Williams
On Sep 2, 2013, at 15:54, Lester Caine <les...@lsces.co.uk> wrote:

> Parameter hashes are what we have been converting everything TO because it was
> supposed to be the 'proper way to do it' a few years back.

If you have lots of parameters to pass in, the better solution is to use an 
object, which lets you formally define what is required to be passed in. The 
parameter hash, or the parameter block as it was known in decades past, is a 
compromise solution that was widely used before OOP really caught on. In this 
context, it is essentially the same thing as a simple object but without the 
formal definition.

The lack of formal definition is what makes it a terrible idea because it 
obfuscates the parameters of the function. If you hand off the code to someone 
else to use, they must look at the implementation to see how it works to 
figure out what to pass in. Oh, there may be separate docs, but they can't be 
auto-generated, and we all know how well code docs are kept up-to-date when 
they're not auto-generated. Plus, PHPDoc doesn't support parameter blocks, 
which means that IDEs can't offer the same level of assistance with 
code-completion that they offer for both objects and straight parameters -- 
another huge downside.

Parameters were invented as an abstraction around passing raw, untyped and 
unnamed stacks of data. Parameter blocks take us back to that.
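
A small sketch of the difference follows. The SearchOptions class, its fields, and both search functions are invented for illustration, and the scalar type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical example: the class, its fields, and both functions are illustrative only.
final class SearchOptions
{
    public $query;
    public $limit;
    public $caseSensitive;

    public function __construct(string $query, int $limit = 10, bool $caseSensitive = false)
    {
        $this->query = $query;
        $this->limit = $limit;
        $this->caseSensitive = $caseSensitive;
    }
}

// Parameter hash: the accepted keys are only discoverable by reading the body.
function searchWithHash(array $params): string
{
    $limit = isset($params['limit']) ? $params['limit'] : 10;
    return $params['query'] . ' (limit ' . $limit . ')';
}

// Parameter object: the signature and the class definition document the inputs.
function searchWithObject(SearchOptions $options): string
{
    return $options->query . ' (limit ' . $options->limit . ')';
}

echo searchWithHash(['query' => 'ternary']), "\n";
echo searchWithObject(new SearchOptions('ternary', 25)), "\n";
```

With the object, a misspelled or missing field fails at construction, and an IDE can complete the property names. With the hash, the caller finds out only by reading searchWithHash().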

Back to the topic, I like what Stas has proposed. Further, I don't see named 
parameters as replacing the utility of default parameters. For long parameter 
lists, named parameters would usually make more sense to use, I think, but for 
medium lists, I think the default keyword is much cleaner because it doesn't 
require doubling or tripling the length of what you type for the parameter list 
to ensure you're getting the default value for a parameter. And where named 
parameters are overkill, it keeps the focus of anyone reading the code firmly 
on the values, not the parameter names. In other words, the two features 
complement each other, with one or the other being better in different contexts.

I don't get a vote, but if I did, I'd say implement what Stas has put forth, 
and if named parameters can come into the picture at some point, implement 
that, too.


--
Bob Williams



Re: [PHP-DEV] Older style frameworks ...

2012-08-25 Thread Robert Williams
On Aug 25, 2012, at 17:24, Guillaume Rossolini <g.rossol...@gmail.com> wrote:

> What you say is true, versions get old. But as Lester pointed out, they
> work. That is why some computer systems that have been outdated for years
> are still functioning today.  It is hard to make a case for rewriting code
> that already works, don't you think?

Actually, no. In fact, if anything, I think it's just the opposite: it's hard 
to make a case for letting code stagnate. No matter how stable an app may seem, 
there will come a day when it will suffer a serious blow. It might be that a 
critical component has a major security breach, or a business requirement comes 
along that forces you to play nice with another system in such a way that it 
can only be done with newer software, or the underlying hardware breaks and 
cannot be replaced, or the company grows and the app can no longer scale... 
whatever. The world changes, and it's going to take your app with it one way or 
the other.

The term is technical debt, and like financial debt, you must deal with it, you 
must keep ahead of it. If you're pouring all your resources into new code, then 
you're piling up technical debt.

Yes, refactoring code to modernize it isn't always fun, but if you do it 
regularly, then the debt remains low and you settle into making many small 
changes at a constant pace over time, rather than making big changes too 
quickly. Other benefits come from regularly revisiting code, too, e.g., fixing 
hidden bugs, increasing performance, reworking algorithms and flows with the 
benefit of experience and hindsight, etc.

How much maintenance, as a share of overall work, is right for an app varies 
with the overall quality of the code and how rapidly the app is evolving. It 
may be 20% or 80%, you have to decide for yourself. But if all you're doing is 
adding code without ever refactoring code, you're setting yourself up for a 
real mess when the sh** hits the fan, and you're doing your company (or 
customers, or clients) a terrible disservice.

BTW, before you blow this off as an unrealistic view, consider my own 
situation. A few years ago, I took over management of a few million lines of 
code that had largely come to be via the "throw it together and never touch it 
again" method. Most of it was written in the PHP 4 era (some in the PHP 3 era), 
and it had things like dependencies on pre-1.0 beta releases of PEAR libraries 
that had long since lived a full life and died in unmaintained peace. These 
were mission critical applications, so we couldn't just toss them out or stop 
new development. At the same time, they were starting to become problematic, 
with serious performance issues and security flaws.

Our approach was simple: whenever we added anything, we took the time to do it 
the right way, which almost always included refactoring old code that was 
involved. And we spent most of our time fixing things, pulling dependencies on 
third-party libraries (like PEAR), and so forth. Over time, things have 
improved dramatically, and now most of that old code at least runs under 5.3 
comfortably. Our newest apps specifically target 5.3 and will be switched to 
5.4 as soon as Zend gets its act together and releases Zend Server with 5.4. 
Perhaps most interesting, the business users are really starting to understand 
the importance of staying on top of technical debt, since they were bitten so 
badly by not doing so and are now starting to reap the benefits of doing so.

My point is, I feel your pain. But you have to power through that, get the code 
cleaned up, and then get it on a maintenance schedule that keeps it that way. 
The rest of the world is not going to slow down for you, so stop waiting for 
that to happen and just jump in.


Regards,
Bob



Re: [PHP-DEV] bug #49510

2012-07-15 Thread Robert Williams
On Jul 14, 2012, at 22:58, Stas Malyshev <smalys...@sugarcrm.com> wrote:

> The question is - should we apply it to 5.3/5.4? It is a behavior
> change, even though current code behavior does not match documented
> behavior.

I understand the concerns of BC, but I don't understand our general reluctance 
to break BC to fix a bug. What bug fix is *not* going to also be a behavior 
change? So long as everything is well-documented, I think most bug fixes should 
go through right away.

And in cases like this, where there is a clear deviation from the 
documentation, it's all the more important to fix it, IMHO, because a lot of 
folks have probably written code that relies on the documented behavior such 
that their code is now broken and they don't even realize it. Those who 
identified the deviation in testing and incorporated a workaround probably also 
notated their code as such so that it's an easy fix later when the bug is fixed 
- which takes us back to simply documenting fixes very well.

The only time I think holding a bug fix should be a consideration is when the 
docs aren't clear on the correct behavior, and the behavior is extensively 
relied upon. By itself, extensive use is not an excuse, IMHO. Other than in the 
embedded world, code should never be a write-it-and-forget-it affair... things 
change in the language, things change in the OS, features are added that are 
useful, under-used features are removed, security issues are fixed, 
requirements change, etc., etc. This industry is all about change, and I think 
most reasonable people are okay with bug fixes that affect BC so long as 
they're well-documented; they may grumble a bit, but they properly recognize it 
as a necessary evil. Plus, that's why automated testing is pushed so hard :-).

Those programmers who have code where bug fixes will extensively break things 
without their knowing it have code that's already a maintenance nightmare, and 
they probably aren't doing regular PHP upgrades until such time as they get 
their code under control. Similarly, those who have code that may be fairly 
lean but is not well-maintained also probably aren't doing regular PHP 
upgrades. So who, exactly, are we servicing by withholding bug fixes? All we're 
really doing is making it that much harder to upgrade to future major versions 
by turning them as much into giant collections of accumulated, BC-breaking bug 
fixes as they are collections of cool new features.


--
Bob Williams




Re: [PHP-DEV] is_numeric_string an hexadecimal numbers (123 == 0x7B)

2012-04-17 Thread Robert Williams
On Apr 17, 2012, at 5:39, Hartmut Holzgraefe hartmut.holzgra...@gmail.com wrote:

> Same here, i never even knew that this worked in a string context
> until recently. Autocast/comparison rules are already complicated
> enough as they are documented now, and i failed to find anything
> in the manual that would actually say that hex in a string
> context is supposed to work at all ...

Would this end up changing the behavior of the userland is_numeric() function? 
The behavior actually is documented under that function:

Finds whether the given variable is numeric. Numeric strings consist of [...]. 
Hexadecimal notation (0xFF) is allowed too but only without sign, decimal and 
exponential part.

If so, although this does technically break BC in that case, I for one will not 
miss it. The only effect this will have on our code is to make validation of 
numeric input much easier and less error-prone.

--
Bob Williams

Sent from my iPad




Re: [PHP-DEV] is_numeric_string an hexadecimal numbers (123 == 0x7B)

2012-04-17 Thread Robert Williams
On 4/17/12 08:17, Nikita Popov nikita@googlemail.com wrote:


> The last one is more problematic. It is explicitly documented as
> accepting hexadecimal numbers. In my eyes it too should not accept
> them, but I could imagine that people rely on this.

This always struck me as mistaken design. Why accept hex or decimal, but
not the other bases that PHP knows about? I can see a small number of
scenarios where having it accept hex input is definitely useful, but I
suspect that the vast majority of cases out there where it's used are in
validation routines expecting straightforward, base-10 numbers. And I know
that, of all such cases I've seen (and I've seen quite a few, since one of
our interview test questions implicitly covers it), most programmers are
blissfully ignorant of the hex support and unwittingly allow bad user data
to slip into their applications to become trusted data. Not good.

As I mentioned in my last message, I wouldn't be bothered if this behavior
were simply removed. I think it would affect a small number of people
knowingly relying on the feature, while it would fix probably many
thousands of bugs out there lurking in less-aware programmers' code. Even
better, though, might be to add a flag parameter that would give the
programmer explicit control over its behavior, including which bases to
allow (and including the bases currently MIA).
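
To make the flag idea concrete, here is a rough userland sketch. Everything
in it is an assumption for illustration: the NUMERIC_* constants and the
isNumericInBases() function do not exist in PHP, and the sketch only covers
unsigned integer strings, not floats or signed values.

```php
<?php
// Hypothetical flags; none of these names exist in PHP itself.
const NUMERIC_DECIMAL = 1;
const NUMERIC_HEX     = 2;
const NUMERIC_OCTAL   = 4;
const NUMERIC_BINARY  = 8;

// Returns true when $value is a string written in one of the allowed bases.
// Only base 10 is accepted unless the caller opts in to more.
function isNumericInBases($value, $bases = NUMERIC_DECIMAL)
{
    if (!is_string($value)) {
        return false;
    }

    // One anchored pattern per base; the D modifier stops "$" from
    // matching before a trailing newline.
    $patterns = array(
        NUMERIC_DECIMAL => '/^[0-9]+$/D',
        NUMERIC_HEX     => '/^0[xX][0-9a-fA-F]+$/D',
        NUMERIC_OCTAL   => '/^0[0-7]+$/D',
        NUMERIC_BINARY  => '/^0[bB][01]+$/D',
    );

    foreach ($patterns as $flag => $pattern) {
        if (($bases & $flag) && preg_match($pattern, $value) === 1) {
            return true;
        }
    }

    return false;
}

var_dump(isNumericInBases('123'));                                 // decimal only
var_dump(isNumericInBases('0x7B'));                                // hex not requested
var_dump(isNumericInBases('0x7B', NUMERIC_DECIMAL | NUMERIC_HEX)); // hex opted in
```

With a default of base 10 only, a validation routine would reject hex input
unless the programmer asked for it explicitly.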

-Bob

--
Robert E. Williams, Jr.
Associate Vice President of Software Development
Newtek Business Services, Inc. -- The Small Business Authority
https://www.newtekreferrals.com/rewjr
http://www.thesba.com/










Re: [PHP-DEV] Return Type Hinting for Methods RFC

2011-12-23 Thread Robert Williams
On Dec 22, 2011, at 18:59, Will Fitch will.fi...@gmail.com wrote:

> Would you prefer to allow methods with type hinted return values to return 
> null at will, or add a marker noting that it *may* return null?

My preference would be to have a marker, and when null is not allowed, if the 
function tries to return it (or fails to return anything at all), then an error 
is raised or exception thrown. This behavior would be great for those cases 
where you're trying to protect against situations that theoretically should be 
impossible, much like the role of assertions. The marker then would handle 
those situations where you want to explicitly allow null return values based on 
routine inputs.
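
For illustration only, the marker behavior described above can be sketched
with the nullable return types that later PHP versions provide. This assumes
PHP 7.1 or later, which is not what the RFC under discussion proposed, and
the UserDirectory class and its methods are made up for the example.

```php
<?php
// Assumes PHP 7.1+; UserDirectory is a hypothetical class.
final class UserDirectory
{
    private $names = array(1 => 'alice');

    // The "?" marker says null is an expected result for routine inputs.
    public function findName(int $id): ?string
    {
        return $this->names[$id] ?? null;
    }

    // No marker: a null return is treated as a bug and raises a TypeError.
    public function requireName(int $id): string
    {
        return $this->names[$id] ?? null;
    }
}

$directory = new UserDirectory();
var_dump($directory->findName(2)); // allowed: the declaration permits null

try {
    $directory->requireName(2);
} catch (TypeError $e) {
    echo "TypeError: null returned where the declaration does not allow it\n";
}
```

The second method behaves much like an assertion: the "impossible" case
fails loudly at the point of return rather than leaking a null to the caller.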

A common problem with PHP is that you just don't know what a function might do 
if you don't look over its code or its docs. The marker makes one part of its 
behavior explicit, thus abbreviating the guessing game. (Not to mention when 
the docs themselves errantly fail to mention that a function can return types 
other than the obvious.)

In fitting with the PHP way, perhaps it would make sense that the marker 
indicates not just that null may be returned, but that any type may be 
returned. This would allow, say, returning false or -1 instead of null. Or 
maybe it's better just to allow indication of multiple types and have the 
marker be just for null.

> public ArrayIterator getIterator()

I would really, really prefer to always have the 'function' keyword be present. 
It offers something to scan for when quickly reviewing code, it makes it easier 
to do search-and-replace operations on functions, and it allows text editors 
that don't have the full-blown lexical analyzer of an IDE to still be able to 
pick out all the functions and offer the user an easy navigation feature.
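
As a rough illustration of the lightweight-tooling point, the sketch below
lists declared function names by keying on the 'function' keyword with a
regular expression. The helper name findDeclaredFunctions() and the sample
class are made up for this example, and a regex like this is only an
approximation: it will also match declarations that sit inside comments or
strings.

```php
<?php
// Hypothetical helper: collect the names that follow the 'function' keyword.
function findDeclaredFunctions($source)
{
    preg_match_all(
        '/\bfunction\s+&?\s*([A-Za-z_][A-Za-z0-9_]*)\s*\(/',
        $source,
        $matches
    );

    // $matches[1] holds the captured names in order of appearance.
    return $matches[1];
}

$source = <<<'SRC'
class Example
{
    public function getIterator() { return new ArrayIterator(array()); }
    private static function helper($x) { return $x; }
}
SRC;

print_r(findDeclaredFunctions($source));
```

Without the keyword, a tool like this would have to recognize every possible
return type and modifier combination to find the same declarations.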

I do, however, quite like the idea of putting the return type at the end of the 
function declaration. I've always disliked the way C and its derivatives stick 
the return type at the beginning (along with an ever-increasing list of other 
keywords), since it makes it harder to quickly scan the code by forcing a more 
thorough mental parsing instead of just letting you snap your eye to a known 
position. As for an operator suggestion, one precedent I can think of is from 
Pascal, which uses a colon:

function GetName(): string

This puts the result type at the end where you always know right where to look 
for it without mental parsing, and it reads naturally in that the effect (the 
result) is listed after the cause (the function). And, at least for English 
writers, the colon's purpose is intuitive because of its use in English 
grammar. Finally, PHP doesn't already use the single colon for anything that I 
can think of off-hand.

I'd also like to comment on the use of type checking in PHP. I completely agree 
that having more broad checking available in the language would be a great 
thing. However, I also understand the criticisms against it. What if, instead 
of specifying strict scalar types, like int, one could specify a type class 
(not in the OOP sense)? The concept has already been alluded to, but I don't 
think anyone has run with the idea yet. I'm thinking here of things like PHP's 
filter functions or the character classes in regular expressions. So you might 
specify a type of 'digits', which would allow anything that consists only of 
the numbers 0-9, or which could losslessly (by which I mean reversible to the 
same starting value) be cast to such a beast, equivalent to this monstrosity 
that I frequently find myself using:

if (!\ctype_digit((string)$parameterValue)) {
   ...
}

(I think it was Stas that mentioned using is_numeric() for things like this, 
but I find that function virtually useless since it unconditionally allows 
oddities like hex values that you typically don't want to allow. The other 
alternative, is_int(), forces the very type of strict checking--and thus, 
calling-side casting--that we all wish to avoid.)

Allowing specifications of types that are more flexible than the base scalars 
would enable type checking but retain the advantages that a dynamic language 
offers.
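
A userland approximation of such a 'digits' type class might look like the
sketch below. The function names isDigits() and requireDigits() are
hypothetical, and the rule chosen here (non-negative integers, or non-empty
strings made up only of 0-9) is just one reading of "losslessly castable";
floats and booleans are rejected outright.

```php
<?php
// Hypothetical check for a 'digits' type class.
function isDigits($value)
{
    if (is_int($value)) {
        // A non-negative int round-trips through its string form unchanged.
        return $value >= 0;
    }

    // ctype_digit() returns false for the empty string.
    return is_string($value) && ctype_digit($value);
}

// Hypothetical guard for use at the top of a function body.
function requireDigits($value, $name)
{
    if (!isDigits($value)) {
        throw new InvalidArgumentException("$name must consist only of digits");
    }

    return (string)$value;
}

var_dump(isDigits('123'));  // digits only
var_dump(isDigits(42));     // non-negative integer
var_dump(isDigits('0x7B')); // hex notation is not digits
```

The caller can pass either 42 or '42' without casting, which is the
flexibility that a strict int check would take away.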

That said, I suspect that no one is talking about this option because it's been 
discussed a million times in the past and deemed a bad and/or unworkable 
solution for whatever reasons. :-)


--
Bob Williams


Re: [PHP-DEV] Return Type Hinting for Methods RFC

2011-12-23 Thread Robert Williams
On 12/23/11 13:34, Will Fitch will.fi...@gmail.com wrote:


> There's still the matter of whether to allow null to be returned,
> regardless of the situation, or to use another token to identify that
> it could return null. I'd like to know what others think. I see Stas'
> argument that you'll still have to check, but I'm not so sure that is
> such a bad thing.

I see it as a very bad thing, for two reasons:

1) Unconditionally allowing null to be returned takes away an element of
control. You can't get away from error handling, but it's nice to be able
to handle errors how you want. Having nulls thrown at you at any time
means you have to be ready to handle them at any time, rather than
handling them off in a separate area where you have taken the time to
properly prepare for them. This makes for a lot more redundant code
unrelated to the core functionality of the code, and it kills much of the
utility of things like fluent interfaces.

2) With type-hinted parameters, the choice has already been made not to
allow null values at any time. Rather, the programmer must explicitly
allow them in the parameter declaration. Doing the same with return types
would provide an important bit of consistency.
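
To illustrate the parameter side of point 2, here is a small sketch. It
assumes PHP 7.1 or later, where the opt-in is spelled with a leading "?" and
a violation raises a TypeError; at the time of this thread the opt-in was
written as a plain null default (DateTime $date = null). The function names
are made up for the example.

```php
<?php
// Assumes PHP 7.1+. Both function names are hypothetical.

// No opt-in: passing null is rejected before the body runs.
function formatDate(DateTime $date)
{
    return $date->format('Y-m-d');
}

// Explicit opt-in: the "?" and the null default let callers pass null.
function formatOptionalDate(?DateTime $date = null)
{
    return $date === null ? 'n/a' : $date->format('Y-m-d');
}

echo formatOptionalDate(null), "\n";
echo formatDate(new DateTime('2011-12-23')), "\n";

try {
    formatDate(null);
} catch (TypeError $e) {
    echo "TypeError: null passed to a parameter that does not allow it\n";
}
```

A return type marker that works the same way would let the caller of
formatDate() chain on the result without a null check.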


Regards,

Bob

--
Robert E. Williams, Jr.
Associate Vice President of Software Development
Newtek Business Services, Inc. -- The Small Business Authority
https://www.newtekreferrals.com/rewjr
http://www.thesba.com/






