The free-for-all that is CPAN namespaces is particularly hard to navigate, and I would be enthusiastic if we could create a hierarchy that makes it easy to understand the relationship between them.

I would recommend against using the "top level" space, Number::Binary, as that would reserve a namespace which seems to cover a very broad topic, binary numbers, for just encoding/decoding.  It would also imply similar functionality for a parallel Number::Decimal, or some other numbering system.

What about using the existing Number::Encode hierarchy, and naming the module Number::Encode::Binary?  It's not a bad fit for what you're doing, and since the precedent for Number::Encode is already set, the module would be easier to find.  Write the API so that it is easy to incorporate additional formats, either via subclassing by other submodules in the N::E::B namespace, or by adding them to the N::E::B module itself.

Number::Encode is probably ripe for adoption, and could be repurposed for a top level "documentation of namespace" module, maintaining the existing code for any possible existing users, but relegating it to a historical (albeit working) footnote.



On 6/3/21 9:39 AM, Timothe Litt wrote:

Hmmm, Number::Binary doesn't seem to be taken, at least according to MetaCPAN.

What about $enc = Number::Binary->encoder("name"); $dec = Number::Binary->decoder("name");?

You could also make it easy to get inverse functions by accepting an object; e.g. $dec = Number::Binary->decoder($enc);

e.g. $e80 = Number::Binary->encoder("intel80")->encode(2.71828);

    printf( "%s: %s\n", Number::Binary->name("intel80"), Number::Binary->decoder("intel80")->decode($e80));

    >> Intel extended floating point: 2.71828
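
A minimal sketch of the encoder()/decoder() interface suggested above, including the idea of getting the inverse by passing an object. The package name Sketch::Number::Binary, the format table, and the choice to use a single class for both directions are assumptions made for illustration. Only binary32 and binary64 are included, since pack() already handles them, provided the platform's native floats are IEEE 754.

```perl
use strict;
use warnings;

package Sketch::Number::Binary;    # hypothetical namespace, not a CPAN module
use Scalar::Util qw(blessed);

# Hypothetical format table: printable name and pack() template.
my %format = (
    binary32 => { name => 'IEEE 754 single precision', template => 'f>' },
    binary64 => { name => 'IEEE 754 double precision', template => 'd>' },
);

# Accept either a format name or an existing encoder/decoder object.
sub _lookup {
    my ($class, $spec) = @_;
    my $key = blessed($spec) ? $spec->{key} : $spec;
    my $fmt = $format{$key} or die "Unknown format '$key'\n";
    return bless { key => $key, %$fmt }, $class;
}

# In this sketch one object can do both directions, so these are aliases.
sub encoder { my ($class, $spec) = @_; return $class->_lookup($spec) }
sub decoder { my ($class, $spec) = @_; return $class->_lookup($spec) }

# Works as an object method or as a class method taking a format name.
sub name {
    my ($invocant, $key) = @_;
    return $invocant->{name} if ref $invocant;
    my $fmt = $format{$key} or die "Unknown format '$key'\n";
    return $fmt->{name};
}

sub encode { my ($self, $value) = @_; return pack $self->{template}, $value }
sub decode { my ($self, $bytes) = @_; return scalar unpack($self->{template}, $bytes) }

package main;

my $enc = Sketch::Number::Binary->encoder('binary64');
my $dec = Sketch::Number::Binary->decoder($enc);    # inverse obtained from the encoder
printf "%s: %s\n", $enc->name, $dec->decode($enc->encode(2.71828));
```

A real implementation would probably keep separate encoder and decoder classes, and would need hand-written bit manipulation for formats that pack() does not support.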

Supporting BigXXX for the native value seems important, not only for the large floats, but some ints.

E.g. Double precision integers (128 bit, or 64 bit on a 32-bit platform - or 72 bit on a 36-bit platform)

You're the expert; whether the encoders figure out whether bignum was used in the caller, or the decoder constructors take a "use_big" option, or the result of decode is specified as a possibly overloaded object that can do math, or ... I leave to you.

Besides finessing bignums, making the output of decode an object allows for a path to info methods - besides name, sizes common to all, maybe things like "smallest number of bits that are needed for this value" (e.g. would this fit in a smaller format without loss of precision?)  Not sure what the right list would be, but once there's one, it will probably grow.
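
One way the "decode returns an object" option could look, sketched for an unsigned 128-bit integer. The function names are hypothetical; Math::BigInt is the core module, and its overloading is what lets the caller do arithmetic on the result. The bits_needed() helper illustrates the kind of info method mentioned above.

```perl
use strict;
use warnings;
use Math::BigInt;

# Decode a big-endian unsigned 128-bit integer into a Math::BigInt object.
sub decode_uint128_be {
    my ($bytes) = @_;
    die "Expected 16 bytes\n" unless length($bytes) == 16;
    # Math::BigInt->new() accepts a hexadecimal string with a "0x" prefix.
    return Math::BigInt->new("0x" . unpack("H*", $bytes));
}

# Smallest number of bits that can hold a non-negative value.
sub bits_needed {
    my ($n) = @_;
    return length($n->as_bin) - 2;    # as_bin() returns a string like "0b1011"
}

my $max = decode_uint128_be("\xff" x 16);
print "bits needed: ", bits_needed($max), "\n";
print "max + 1:     ", $max + 1, "\n";    # overloaded "+" keeps full precision
```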

Yes, small can get interesting too - e.g. saturating 8-bit bytes packed in something bigger...

Sounds like fun.  Hope this helps.  Good luck.

On 03-Jun-21 08:52, Peter John Acklam wrote:
Thanks for the feedback!

I see your point regarding the Math:: namespace and agree that it
isn't the best. Alas, Number::Encode is already taken. I suggested
Number::Pack and Number::Unpack because they aren't taken and because
of the module's similar functionality to pack() and unpack().

I agree that there should be a simple wrapper and that the format
should not be a part of the module name specified by the user. I also
agree about not assuming floating point. Actually, one of the use
cases is encoding/decoding unsigned 24 bit integers, which are used by
ImageMagick when reading/writing PAM (portable arbitrary map) images.
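
For illustration, the unsigned 24 bit case can be handled with pack() by going through the 32 bit form. This is only a sketch; the function names are made up here and big-endian byte order is assumed.

```perl
use strict;
use warnings;

# Encode an unsigned 24-bit integer as three big-endian bytes.
sub encode_uint24_be {
    my ($n) = @_;
    die "Value out of range for 24 bits\n"
        unless $n =~ /\A[0-9]+\z/ && $n < 2**24;
    return substr(pack("N", $n), 1);     # drop the high byte of the 32-bit form
}

# Decode three big-endian bytes back into an unsigned integer.
sub decode_uint24_be {
    my ($bytes) = @_;
    die "Expected 3 bytes\n" unless length($bytes) == 3;
    return unpack("N", "\0" . $bytes);   # pad to 32 bits, then unpack
}

my $bytes = encode_uint24_be(0x123456);
printf "%s -> %d\n", unpack("H*", $bytes), decode_uint24_be($bytes);
```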

There is also Data::IEEE754, but I think the Data:: namespace is too
general. I will only be dealing with numbers.

Peter

On Thu, 3 Jun 2021 at 13:22, Timothe Litt <tlhack...@cpan.org> wrote:
I'd be a bit careful about assuming floating point - will someone want to 
pack/unpack BCD? Or PDP-10 Gfloat (well, OK that's a floating format)?  Or...

I don't like Math:: - it implies that it does arithmetic (or calculus, or 
statistics, or - more than a conversion).

And I'd rather not have a format name encoded in the module that the user calls.

How about Number::Encode->new("name") & Number::Decode->new("name")?

Let "name" get to a subclass, so other formats can be supported just by adding a module - 
e.g. "Number::Encode::BCD" could be require'd if *->new('bcd') is called.  Obviously, you'd 
implement IEEE754, Intel80, and whatever else...

Define the API for the subclasses - encode(), decode(), perhaps some info 
functions (e.g. a printable name, perhaps exponent and fraction range/#bits, 
...)

Then someone who wants Number::Decode::VAX_DFLOAT just calls 
Number::Decode->new('vax_dfloat') - after writing it.
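
A rough sketch of that dispatch, assuming a hypothetical Sketch::Number::Encode namespace so as not to collide with the existing module. The wrapper maps the format name to a subclass and require's it only if it is not already loaded. The BCD subclass is defined inline here so the example runs as a single file; normally it would live in its own .pm file.

```perl
use strict;
use warnings;

package Sketch::Number::Encode;    # hypothetical wrapper class

sub new {
    my ($class, $name) = @_;
    # Map a format name such as 'bcd' to a subclass name.
    (my $suffix = uc $name) =~ s/[^A-Z0-9_]//g;
    my $impl = "${class}::$suffix";
    # Load the subclass from disk only if it is not already defined.
    unless ($impl->can('encode')) {
        (my $file = "$impl.pm") =~ s{::}{/}g;
        eval { require $file; 1 } or die "Unknown number format '$name'\n";
    }
    return bless { name => $name }, $impl;
}

package Sketch::Number::Encode::BCD;    # one pluggable format
our @ISA = ('Sketch::Number::Encode');

sub description { return 'Packed binary-coded decimal' }

# Encode a non-negative integer as packed BCD, two digits per byte.
sub encode {
    my ($self, $n) = @_;
    die "BCD sketch handles non-negative integers only\n"
        unless $n =~ /\A[0-9]+\z/;
    $n = "0$n" if length($n) % 2;    # pad to an even number of digits
    return pack "H*", $n;
}

package main;

my $enc = Sketch::Number::Encode->new('bcd');
printf "%s: %s\n", $enc->description, unpack("H*", $enc->encode(1234));
```

Adding a new format then means adding one module under the wrapper's namespace; callers only ever see the wrapper and the format name.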

Some of these can get interesting if you want to decode and actually do math - 
presumably you'll support Math::BigXxx / bignum? (binary128 and VAX H_Floating 
are, IIRC, about 34 decimal digits)

And some program that reads archived data can have a description language that is simply "name"  
"format" "byte offset" "length", and not worry about what module handles what format.  In 
fact, such a program might appreciate the trivial modules Number::Encode::INTEGER32 (and perhaps the less obvious 
Number::Encode::INTEGER32_ONESCOMPLEMENT)...
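
To make the description-language idea concrete, here is a small sketch. The format names, the field layout, and the decoder table are all invented for the example; in the scheme above the decoders would come from the wrapper module rather than from a local hash. The binary32 entry assumes the platform's native single-precision float is IEEE 754.

```perl
use strict;
use warnings;

# Hypothetical format names mapped to decoders.
my %decoder = (
    uint16_be   => sub { return scalar unpack("n",  $_[0]) },
    int32_be    => sub { return scalar unpack("l>", $_[0]) },
    binary32_be => sub { return scalar unpack("f>", $_[0]) },
);

# The record description: name, format, byte offset, length.
my @layout = (
    [ id    => 'uint16_be',   0, 2 ],
    [ delta => 'int32_be',    2, 4 ],
    [ scale => 'binary32_be', 6, 4 ],
);

# Decode one record without knowing which code handles which format.
sub decode_record {
    my ($bytes) = @_;
    my %record;
    for my $field (@layout) {
        my ($name, $format, $offset, $length) = @$field;
        my $decode = $decoder{$format} or die "No decoder for '$format'\n";
        $record{$name} = $decode->(substr($bytes, $offset, $length));
    }
    return \%record;
}

my $rec = decode_record("\x01\x00" . "\xff\xff\xff\xfe" . "\x3f\xc0\x00\x00");
printf "%s=%s\n", $_, $rec->{$_} for sort keys %$rec;
```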

I suspect there are better names for the format, but the idea is to export a 
simple wrapper so the next format can be added by anyone, and the callers don't 
have to know too much.

FWIW.


On 03-Jun-21 06:23, Peter John Acklam wrote:

I also plan to implement the 80 bit "extended precision" format, which
is not IEEE 754 compatible. Perhaps the best and simplest is
Number::Pack and Number::Unpack?

Peter

On Thu, 3 Jun 2021 at 11:43, Peter John Acklam <pjack...@gmail.com> wrote:

Hi

I am working on two modules for encoding and decoding numbers as per IEEE754. 
The pack() function can encode and decode the formats binary32 (single 
precision) and binary64 (double precision). My module can also handle binary128 
(quad precision), binary16 (half precision), bfloat16 (not an IEEE754 format, 
but it follows the IEEE754 pattern), and a few other formats.
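
For reference, this is roughly what the built-in route looks like for the two formats pack() covers. It is a sketch that assumes the platform's native float and double are IEEE 754 binary32 and binary64, which pack() itself does not guarantee.

```perl
use strict;
use warnings;

# "f" and "d" are the native single and double formats; the ">" modifier
# (Perl 5.10 and later) forces big-endian byte order.
my $single = pack "f>", 1.5;
my $double = pack "d>", 1.5;

printf "binary32: %s\n", unpack("H*", $single);    # 3fc00000
printf "binary64: %s\n", unpack("H*", $double);    # 3ff8000000000000

# Decoding uses the same templates with unpack().
my $value = unpack "d>", $double;
print "decoded:  $value\n";                         # 1.5
```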

My question is about the namespace. Is Math::IEEE754::Encoder (and 
...::Decoder) OK? Or is Number::IEEE754::Encoder better? Or any other?

Here is an example showing how I use it:

my $encoder = Math::IEEE754::Encoder -> new("binary16");
my $bytes = $encoder -> (3.14159265358979);  # = "\x42\x48"

my $decoder = Math::IEEE754::Decoder -> new("binary16");
my $number = $decoder -> ($bytes);               # = 3.140625

The reason for returning an anonymous function rather than implementing the 
function directly, is speed. There are some constants involved, and I don't 
want to compute them for each function call.
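
The closure approach described above might look something like the following for a binary16 decoder. This is a guess at the general shape, not the module's actual code; the function name is made up, big-endian input is assumed, and the sign of zero is not preserved.

```perl
use strict;
use warnings;

# Hypothetical factory: returns a closure that decodes big-endian binary16.
sub make_binary16_decoder {
    # Constants computed once, when the decoder is built.
    my $exp_bits  = 5;
    my $frac_bits = 10;
    my $bias      = 2 ** ($exp_bits - 1) - 1;    # 15
    my $exp_max   = 2 ** $exp_bits - 1;          # 31
    my $frac_div  = 2 ** $frac_bits;             # 1024
    my $inf       = 9**9**9;
    my $nan       = $inf / $inf;

    return sub {
        my ($bytes) = @_;
        my $bits = unpack "n", $bytes;           # 16-bit big-endian unsigned
        my $sign = ($bits >> 15) ? -1 : 1;
        my $exp  = ($bits >> $frac_bits) & $exp_max;
        my $frac = $bits & ($frac_div - 1);

        # All exponent bits set: infinity or NaN.
        return $frac ? $nan : $sign * $inf if $exp == $exp_max;
        # Exponent zero: zero or a subnormal number (no implicit leading 1).
        return $sign * ($frac / $frac_div) * 2 ** (1 - $bias) if $exp == 0;
        # Normal number.
        return $sign * (1 + $frac / $frac_div) * 2 ** ($exp - $bias);
    };
}

my $decode = make_binary16_decoder();
print $decode->("\x42\x48"), "\n";    # 3.140625
```

The closure captures the precomputed constants, so each call only does the bit extraction and the final arithmetic.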

Cheers,
Peter John Acklam (PJACKLAM)

