Just from skimming some of the relevant docs (not having written a
driver for Apache Ignite before), some thoughts:
* It does indeed look like there is enough info, both as documentation
and example code, to write codecs and drivers for Ignite
* The formats and protocols look rather baroque, with significant
historical baggage -- it's going to take quite a bit of work to get
a fully compliant driver, though it does look like a smaller subset
could be built to match just a particular need
* There is a strong Java flavor to everything; there is some impedance
mismatch with Raku (such as the Char array
<https://ignite.apache.org/docs/latest/binary-client-protocol/data-format#char-array>
type, which is an array of UTF-16 code units that doesn't
necessarily contain valid decodable text)
* There seems to be a tension in the design between the desire to
support a schema-less/plain-data mode and a schema/object mode; Raku
easily has the metaobject protocol chops to make the latter possible
without invoking truly deep magic, but it does require somewhat more
advanced knowledge to write
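
To make the Char array mismatch concrete, here is a minimal sketch in Python (chosen only because it is easy to run; the same issue shows up in Raku). It assumes the layout is a one-byte type code, a 4-byte little-endian length, and then that many 2-byte little-endian code units. The type code value 18 and the helper names are assumptions for illustration, so check them against the linked docs before relying on them.

```python
import struct

CHAR_ARRAY = 18  # assumed type code; verify against the Ignite data format docs


def decode_char_array(buf):
    """Return the raw UTF-16 code units without trying to decode them as text."""
    type_code, length = struct.unpack_from("<bi", buf, 0)
    if type_code != CHAR_ARRAY:
        raise ValueError(f"unexpected type code {type_code}")
    return list(struct.unpack_from(f"<{length}H", buf, 5))


def units_to_text(units):
    """View code units as text, or return None if they are not valid UTF-16."""
    raw = struct.pack(f"<{len(units)}H", *units)
    try:
        return raw.decode("utf-16-le")
    except UnicodeDecodeError:
        return None


# "hi" followed by a lone high surrogate (0xD83D):
# a legal array of code units, but not legal text.
payload = struct.pack("<bi3H", CHAR_ARRAY, 3, 0x68, 0x69, 0xD83D)
units = decode_char_array(payload)
```

The point of keeping `decode_char_array` and `units_to_text` separate is that the driver can always hand back the code units, and only the caller decides whether to treat them as a string.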
So in short: It looks doable, but quite a fair chunk of work depending
on how complete you need it to be, and some decisions need to be made
about how pedantically to support their Java-flavored APIs.
On 1/3/22 7:39 PM, Piper H wrote:
Glad to hear these suggestions, @Geoffrey.
I also have a question: this product has a clear binary protocol. Do
you know how to port it to Perl or Perl 6?
https://ignite.apache.org/docs/latest/binary-client-protocol/binary-client-protocol
I was using their Python/Ruby clients, but there is no Perl version.
Thanks.
Piper
On Tue, Jan 4, 2022 at 11:15 AM Geoffrey Broadwell <g...@sonic.net> wrote:
I love doing binary codecs for Raku[1]! How you approach this
really depends on what formats and protocols you want to create
Raku modules for.
The first thing you need to be able to do is test if your codec is
correct. It is notoriously easy to make a tiny mistake in a
protocol implementation and (especially for binary protocols) miss
it entirely because it only happens in certain edge cases.
If the format or protocol in question is open and has one or more
public test suites, you're in good shape. Raku gives a lot of
power for refactoring tests to be very clean, and I've had good
success doing this with several formats.
If there is no public test suite, but you can find RFCs or other
detailed specs, you can often bootstrap a bespoke test suite from
the examples in the spec documents. Failing that, sometimes you
can find sites (even Wikipedia, for the most common formats) that
have known-correct examples to start with, or have published
reverse engineering of files or captured data.
If the format is truly proprietary, you'll be getting lots of
reverse engineering practice of your own. 😉
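
As a rough sketch of bootstrapping a test suite from spec examples, the snippet below (Python, purely for illustration) pairs a handful of hex encodings with their expected values and runs a tiny decoder against them. The vectors are integer examples from RFC 8949, Appendix A (CBOR). The decoder only handles CBOR integers, and the names `decode_cbor_int`, `SPEC_VECTORS`, and `run_vectors` are made up for this example.

```python
def decode_cbor_int(data):
    """Decode a single CBOR integer (major types 0 and 1 only)."""
    initial = data[0]
    major, info = initial >> 5, initial & 0x1F
    if major not in (0, 1):
        raise ValueError("not an integer item")
    if info < 24:
        value = info                      # value is stored in the initial byte
    elif info <= 27:
        size = 1 << (info - 24)           # 1, 2, 4, or 8 bytes follow
        value = int.from_bytes(data[1:1 + size], "big")
    else:
        raise ValueError("invalid additional info for an integer")
    return value if major == 0 else -1 - value


# (hex encoding, expected value) pairs copied from RFC 8949, Appendix A.
SPEC_VECTORS = [
    ("00", 0), ("17", 23), ("1818", 24), ("1903e8", 1000),
    ("1a000f4240", 1000000), ("20", -1), ("3863", -100), ("3903e7", -1000),
]


def run_vectors():
    """Return (hex, expected, got) for every vector that fails."""
    failures = []
    for hex_str, expected in SPEC_VECTORS:
        got = decode_cbor_int(bytes.fromhex(hex_str))
        if got != expected:
            failures.append((hex_str, expected, got))
    return failures
```

Keeping the vectors in a plain table makes it cheap to paste in more examples as you work through the spec.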
Now that you have some way of testing correctness, you'll want to
be able to diagnose the incorrect bits. Make sure you have some
way of presenting easily-readable text expansions of the binary
format, because just comparing raw buffer contents can be rather
tedious (though I admit to having found bugs in a public test
suite by spending so much time staring at the buffers I could tell
they'd messed up a translation in a way that made the test always
pass). If the format or protocol has an official text
translation/diagnostic/debug format -- CBOR, BSON, Protobuf, etc.
all have these -- so much the better, you should support that
format as soon as practical.
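
Even before supporting an official diagnostic format, a small amount of tooling helps a lot. The sketch below (Python, for illustration only; `hexdump` and `first_difference` are hypothetical helper names) prints a buffer with offsets and printable characters, and reports where two buffers first diverge.

```python
def hexdump(buf, width=8):
    """Render a buffer as 'offset  hex bytes  printable text', one row per line."""
    pad = width * 3 - 1                   # width of a full row of hex pairs
    lines = []
    for offset in range(0, len(buf), width):
        chunk = buf[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk).ljust(pad)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:04x}  {hex_part}  {text_part}")
    return "\n".join(lines)


def first_difference(expected, actual):
    """Return the offset of the first differing byte, or None if identical."""
    for i, (e, a) in enumerate(zip(expected, actual)):
        if e != a:
            return i
    if len(expected) != len(actual):
        return min(len(expected), len(actual))   # one buffer is a prefix
    return None
```

In a failing test, printing `hexdump(expected)`, `hexdump(actual)`, and the offset from `first_difference` is usually enough to see which field went wrong.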
Once you get down to the nitty-gritty of writing the codec, I find
it is very important to make it work before making it fast. There
is a lot of room for tuning Raku code, but it is WAY easier to get
things going in the right direction by starting off with idiomatic
Raku -- given/when, treating the data buffer as if it were a normal
Array (Positional really), and so on.
With every protocol feature you add, make sure that new tests
start passing, and (I find at least) that you write the encoding
and decoding bits at the same time, so you can check that you can
round-trip data successfully. For the love of all that is good,
don't implement any obscure features before the core features
are rock solid and pass the test suite with nary a hiccup.
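
A minimal round-trip check might look like the sketch below (Python, for illustration). The wire format here is hypothetical: a 4-byte little-endian length followed by UTF-8 bytes. It is not taken from any particular protocol, and the function names are invented for the example.

```python
import struct


def encode_string(text):
    """Encode as a 4-byte little-endian length followed by UTF-8 bytes."""
    data = text.encode("utf-8")
    return struct.pack("<i", len(data)) + data


def decode_string(buf, offset=0):
    """Return (text, next_offset), raising ValueError on a bad length."""
    (length,) = struct.unpack_from("<i", buf, offset)
    start = offset + 4
    end = start + length
    if length < 0 or end > len(buf):
        raise ValueError("truncated or corrupt string")
    return buf[start:end].decode("utf-8"), end


def round_trips(value):
    """True if decoding the encoding gives back the value and uses every byte."""
    encoded = encode_string(value)
    decoded, consumed = decode_string(encoded)
    return decoded == value and consumed == len(encoded)


# Include the boring cases and the awkward ones (empty, non-ASCII, long).
SAMPLES = ["", "ascii", "naïve", "日本語", "emoji 😉", "x" * 1000]
```

Writing both directions together means every new feature gets a `round_trips` check from the start, alongside the fixed test vectors.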
After that, when you think you're ready to optimize, write
performance /tests/ first. Make sure you test with data that will
both use your codec in a typical manner, and also test out all the
odd corners. You're looking for things that seem weirdly slow;
this usually indicates a thinko like copying the entire buffer
each time you read a byte from it, or somesuch.
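
Here is a small sketch of what such a performance test can look like (Python, for illustration; all names are hypothetical). It times a deliberately bad reader that re-slices the buffer after every byte against one that just tracks a position, and reports the cost per byte so that growth with input size stands out.

```python
import timeit


def checksum_copying(buf):
    """Deliberately slow: re-slices the remaining buffer after every byte."""
    total = 0
    while buf:
        total = (total + buf[0]) & 0xFFFFFFFF
        buf = buf[1:]                 # copies everything that is left
    return total


def checksum_indexed(buf):
    """Same result, but walks the buffer by position without copying."""
    total = 0
    for pos in range(len(buf)):
        total = (total + buf[pos]) & 0xFFFFFFFF
    return total


def time_per_byte(func, size, repeat=3):
    """Best-of-N seconds per byte; size should be a multiple of 256, at least 256."""
    data = bytes(range(256)) * (size // 256)
    best = min(timeit.repeat(lambda: func(data), number=1, repeat=repeat))
    return best / len(data)
```

Calling `time_per_byte` for each function at a few sizes (say 4096 and 65536) is the interesting part: if the per-byte cost climbs as the input grows, something is quadratic, and that is the kind of weirdly slow behavior worth chasing first.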
Once you've got the obvious performance kinks worked out, come by
and ask again, and we can give you further advice from there. Or
heck, just come visit us on IRC (#raku at Libera.chat), and we'll
be happy to help. (Do stick around for a while though, because
traffic varies strongly by time of day and day of week.)
Best Regards,
Geoff (japhb)
[1] I'm a bit of a nut for it, really. In the distant past, I
wrapped C libraries to get the job done, but more recently I've
done them as plain Raku code (and sometimes NQP, the language that
Rakudo is written in).
I've written several binary format codecs for Raku:
* https://github.com/japhb/CBOR-Simple
* https://github.com/japhb/BSON-Simple
* https://github.com/japhb/Terminal-ANSIParser
* https://github.com/japhb/TinyFloats
Modified or tuned others:
* https://github.com/samuraisam/p6-pb/commits?author=japhb
* https://github.com/japhb/serializer-perf
* (Lots of stuff spread across various Cro
<https://github.com/croservices> repositories)
Added a spec extension for an existing standardized format (CBOR):
* https://github.com/japhb/cbor-specs/blob/main/capture.md
And I think I forgot a few things. 😁