On 07/04/2024 23:50, Jordan LeDoux wrote:
> By a "scalar" value I mean a value that has the same semantics for reading, writing, copying, passing-by-value, passing-by-reference, and passing-by-pointer (how objects behave) as the integer, float, or boolean types.


Right, in that case, it might be more accurate to talk about "value types", since arrays are not generally considered "scalar", but have those same behaviours. And Ilija recently posted a draft proposal for "data classes", which would be objects, but also value types: https://externals.io/message/122845


> As I mentioned in the discussion about a "scalar arbitrary precision type", the idea of a scalar in this meaning is a non-trivial challenge, as the zval can only store a value that is treated in this way of 64 bits or smaller.


Fortunately, that's not true. If you think about it, that would rule out not only arrays, but any string longer than 8 bytes!

The way PHP handles this is called "copy-on-write" (COW), where multiple variables can point to the same zval until one of them needs to write to it, at which point a copy is transparently created.


> The pointer for this value would fit in the 64 bits, which is how objects work, but that's also why objects have different semantics for scope than integers. Objects are potentially very large in memory, so we refcount them and pass the pointer into child scopes, instead of copying the value like is done with integers.


Objects are not the only thing that is refcounted. In fact, in PHP 4.x and 5.x, *every* zval used a refcount and COW approach; changing some types to be eagerly copied instead was one of the major performance improvements in the "PHP NG" project which formed the basis of PHP 7.0. You can actually see this in action here: https://3v4l.org/oPgr4

This is all completely transparent to the user, as are a bunch of other memory/speed optimisations, like interned string literals, packed arrays, etc.

So, there may be performance gains if we can squeeze values into the zval memory, but it doesn't need to affect the semantics of the new type.



> In general I would say that libbcmath is different enough from other backends that we should not expect any work on a BCMath implementation to be utilized in other implementations. It *could* be that we are able to do that, but it should not be something people *expect* to happen because of the technical differences.
>
> Some of the broader language design choices would be transferable though. For instance, the standard names of various calculation functions/methods are something that would remain independent, even with the differences in the implementation.


Yes, that makes sense. Even if we don't have an interface, it would be annoying if one class provided $foo->div($bar), and another provided $foo->dividedBy($bar).


> For money calculations, scale is always likely to be a more useful configuration. For mathematical calculations (such as machine learning applications, which I would say is the other very large use case for this kind of capability), precision is likely to be the more useful configuration. Other applications that I have personally encountered include: simulation and modeling, statistical distributions, and data analysis. Most of these can be done with fair accuracy without arbitrary precision, but there are certainly types of applications that would benefit from or even require arbitrary precision in these spaces.


This probably relates quite closely to Arvid's point that for a lot of uses, we don't actually need arbitrary precision, just something that can represent small-to-medium decimal numbers without the inaccuracies of binary floating point. That some libraries can be used for both purposes is not necessarily evidence that we could ever "bless" one for both use cases and make it a single native type.
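
To make the distinction concrete, here is a minimal sketch of "scale" versus "precision", and of the binary floating point inaccuracy mentioned above. It uses Python's decimal module purely for illustration (CPython's C implementation of that module is built on libmpdec); the helper names with_scale and with_precision are made up for this example and are not a proposed PHP API.

```python
from decimal import Decimal, ROUND_HALF_EVEN, localcontext

# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so the sum is not exactly equal to 0.3.
float_sum = 0.1 + 0.2

# Decimal arithmetic on the same literals is exact.
decimal_sum = Decimal("0.1") + Decimal("0.2")


def with_scale(value, scale):
    """Round to a fixed number of digits after the decimal point (hypothetical helper)."""
    quantum = Decimal(1).scaleb(-scale)  # scale=2 gives Decimal("0.01")
    return value.quantize(quantum, rounding=ROUND_HALF_EVEN)


def with_precision(value, digits):
    """Round to a fixed number of significant digits (hypothetical helper)."""
    with localcontext() as ctx:
        ctx.prec = digits
        return +value  # unary plus re-rounds to the current context precision


large = Decimal(2000) / Decimal(3)   # 666.666...
small = Decimal("0.000123456")

print(float_sum == 0.3)              # False
print(decimal_sum == Decimal("0.3")) # True
print(with_scale(large, 2))          # 666.67
print(with_precision(large, 4))      # 666.7
print(with_scale(small, 2))          # 0.00
print(with_precision(small, 4))      # 0.0001235
```

A fixed scale suits money because the smallest unit is known in advance, but it discards small values entirely; a fixed precision keeps the significant digits wherever the decimal point happens to fall.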


> My intuition at the moment is that a single number-handling API would be challenging to do without an actual proposed implementation on the table for MPDec/MPFR.


I think it would certainly be wise to experiment with how each library can interface to the language as an extension, before spending the extra time needed to integrate it as a new zval type.


> But even with these extensions available in PHP, they are barely used by developers at all because (at least in part) of the enormous difference between PECL and PIP. For PHP, I do not think that extensions are an adequate substitute like PIP modules are for Python.


Yes, this is something of a problem. On the plus side, a library doesn't need to be incorporated into the language to be widely installed, because we have the concept of "bundled" extensions; and in practice, Linux distributions add a few "popular" PECL extensions to their list of installable binary packages. On the minus side, even making it into the "bundled" list doesn't mean it's installed by default everywhere, and userland libraries spend a lot of effort polyfilling things which would ideally be available by default.


> This is, essentially, the thesis of the research and work that I have done in the space since joining the internals mailing list.


Thanks, there's some really useful perspective there.

Regards,

--
Rowan Tommins
[IMSoP]