On Tue, May 17, 2022 at 5:30 AM Thierry Arnoux <[email protected]>
wrote:

> Eight seconds is respectable, we should try to add this new verifier to
> our set.mm CI.
>
> Mario's suggestion is actually the way metamath-knife works: this avoids
> both to allocate and to copy each token. However, Typescript seems to
> create a new String every time even a substring or a slice is taken, so if
> you want to get this kind of speedup in TS, you'd probably need to use shared
> array buffers
> <https://microsoft.github.io/PowerBI-JavaScript/interfaces/_node_modules_typedoc_node_modules_typescript_lib_lib_es2017_sharedmemory_d_.sharedarraybuffer.html>,
> which `slice` method, I believe, does not copy. That structure shall also
> allow use in multi-threading, which is another way to get speedup.
>

Note that even if subslicing creates a new string, it would still be
worthwhile to do the slicing on the fly since these strings only exist for
a short time: many of them are discarded (like comment tokens or structural
tokens like $c or $.) and only math tokens and label tokens are retained
for longer. GC generally does better if you have lots of short lived
allocations than if you have them all in flight simultaneously (which is
what happens if you first tokenize the entire file and then go through the
array of tokens).


> _
> Thierry
>
>
> On 17/05/2022 16:16, Antony Bartlett wrote:
>
> Thierry Arnoux writes:
>
> > Congrats, a new verifier is always a good news!
>
> Thanks! :-)
>
> > What's the current running time?
>
> It should be respectable.  It's verifying set.mm for me in about eight
> seconds on the MacBook Pro I'm using.  Unfortunately XCode is currently
> broken on said device so I'm unable to build other verifiers in order to
> provide a fair comparison, other than to make the obvious statement that
> this is not, and never will be, a competitor for metamath-knife.
>
> Mario's suggestion could potentially shave off maybe a couple of seconds
> from that, but I won't know until I try.
>
>     Best regards,
>
>         Antony
>
>
>
> On Mon, May 16, 2022 at 7:41 PM Mario Carneiro <[email protected]> wrote:
>
>> The pop/reverse does seem unnecessary. I would just keep track of the
>> position in the array, possibly using a little structure containing the
>> array and the current position with some functions to get the current token
>> and move to the next. In fact, you can use this interface to parse the next
>> token on demand, so that you don't need to precalculate the array of tokens
>> (which involves millions of tiny allocations); you just hold on to the
>> input string and pull off tokens as you need.
>>
>> On Mon, May 16, 2022 at 10:45 AM Thierry Arnoux <[email protected]>
>> wrote:
>>
>>> Congrats, a new verifier is always a good news!
>>>
>>> And Typescript is a very relevant language.
>>>
>>> Interestingly enough, Raph Levien (& al)'s Python verifier has the exact
>>> same "trick" with reversing the token buffer
>>> <https://github.com/david-a-wheeler/mmverify.py/blob/82491f94d4f216cdd2aed214a28126da5489bb36/mmverify.py#L107>,
>>> then popping from it. This had me wonder
>>> <https://github.com/david-a-wheeler/mmverify.py/pull/7#discussion_r864428713>
>>> .
>>>
>>> What's the current running time?
>>>
>>> In any case I would suggest adding it to our set.mm CI!
>>>
>>> _
>>> Thierry
>>>
>>>
>>> On 16/05/2022 21:01, Antony Bartlett wrote:
>>>
>>> Hi,
>>>
>>> I'm writing to announce I have completed my port of the checkmm verifier
>>> from C++ to TypeScript.  The test suite passes.
>>>
>>> If you have nodejs installed (and tens of millions of developers
>>> worldwide do) you can verify an mm file in your current working directory
>>> with a command, e.g.
>>>
>>> npx checkmm set.mm
>>>
>>> Though naturally it's more efficient to install my checkmm package first.
>>>
>>> Link to package and source code repositories here
>>> https://www.npmjs.com/package/checkmm
>>> https://github.com/Antony74/checkmm-ts
>>>
>>>
>>> (that's the announcement done, just a few remarks to follow).
>>>
>>> The O(n^2) problem I mentioned previously turned out to be due to the
>>> JavaScript Array.shift operation (which removes an item from the front of
>>> an array) being an O(n) operation which was used n times.  Glauco reports
>>> previously having a similarly unexpected problem with Array.slice.  The
>>> takeaway from this is to make sure you properly understand the capabilities
>>> of whatever containers you decide to use, and in particular to be wary of a
>>> JavaScript array that might be a million items or more long.
>>>
>>> I just reversed the array, by the way - which is also an O(n) operation,
>>> but I only had to do it once, then my tokens could be removed in the
>>> desired order with Array.pop which is an O(1) operation.
>>>
>>> What enabled me to track down this problem in about three quarters of an
>>> hour was knowing my way around the Visual Studio Code debugger - that's
>>> what I didn't have in 2019 when I first encountered this difficulty.
>>> Debuggers have their detractors (Linus Torvalds is probably the best
>>> known), and in the context investigating performance issues they're
>>> sometimes called the "poor man's profiler", but personally I find a
>>> debugger to be an indispensable general purpose tool
>>>
>>> I did this port because I like Eric Schmidt's C++ source code for
>>> checkmm.  I alway respect a job well done by source code, but actually it's
>>> quite rare for me to like the source code itself.  As such my port should
>>> be pretty faithful to the original.  Hopefully you should almost be able to
>>> step through the two source files line-by-line and treat it like a Rosetta
>>> Stone between C++ and TypeScript.
>>>
>>> C++: https://github.com/Antony74/checkmm-ts/blob/main/cpp/checkmm.cpp
>>> TypeScript:
>>> https://github.com/Antony74/checkmm-ts/blob/main/src/checkmm.ts
>>>
>>> It should be said though that reading code written with C++'s Standard
>>> Template Library containers is a bit of a specialised skill.  It's never
>>> let me down with unexpectedly slow operations not immediately obvious from
>>> the documentation like JavaScript has above, but with that caveat aside, I
>>> anticipate my TypeScript port being found to be slightly easier to read.
>>>
>>>     Best regards,
>>>
>>>         Antony
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Metamath" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/metamath/CAJ48g%2BD5uC0tesStx%2Bj-bJe1Zmw4bkx1pqG-puoHpAWosX7k0Q%40mail.gmail.com
>>> <https://groups.google.com/d/msgid/metamath/CAJ48g%2BD5uC0tesStx%2Bj-bJe1Zmw4bkx1pqG-puoHpAWosX7k0Q%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Metamath" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/metamath/af9b7cfc-19eb-fbd1-65a6-10494e7e6dd4%40gmx.net
>>> <https://groups.google.com/d/msgid/metamath/af9b7cfc-19eb-fbd1-65a6-10494e7e6dd4%40gmx.net?utm_medium=email&utm_source=footer>
>>> .
>>>
>> --
> You received this message because you are subscribed to the Google Groups
> "Metamath" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/metamath/CAJ48g%2BBeKYGRpg9S__TUKb9f6B0u97EpTaDut55VCKDtCApF5Q%40mail.gmail.com
> <https://groups.google.com/d/msgid/metamath/CAJ48g%2BBeKYGRpg9S__TUKb9f6B0u97EpTaDut55VCKDtCApF5Q%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
>
> --
> You received this message because you are subscribed to the Google Groups
> "Metamath" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/metamath/4770daf0-39e2-a32f-f04e-34e8a6d9a2fb%40gmx.net
> <https://groups.google.com/d/msgid/metamath/4770daf0-39e2-a32f-f04e-34e8a6d9a2fb%40gmx.net?utm_medium=email&utm_source=footer>
> .
>

-- 
You received this message because you are subscribed to the Google Groups 
"Metamath" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/metamath/CAFXXJSv9w3%3DehrBjd1EHX_z%2BZ9Fu5ownMxwu6X9gZBuPyoBXTA%40mail.gmail.com.

Reply via email to