On Tuesday, 16 January 2024 at 16:56:04 UTC, Siarhei Siamashka
wrote:
On Tuesday, 16 January 2024 at 15:50:35 UTC, H. S. Teoh wrote:
Unfortunately there seems to be some discrepancy between the
output I got and the prescribed output in your repository. For
example, in your output the number 1556/0 does not have an
encoding, but isn't "1 Mai 0" a valid encoding according to
your dictionary and the original problem description?
You are not allowed to emit "1" as the first token in the
output as long as there are any dictionary word matches at that
position. The relevant paragraph from the problem statement:
==snip==
Encodings of phone numbers can consist of a single word or of
multiple
words separated by spaces. The encodings are built word by word
from
left to right. If and only if at a particular point no word at
all from
the dictionary can be inserted, a single digit from the phone
number can
be copied to the encoding instead. Two subsequent digits are
never
allowed, though. To put it differently: In a partial encoding
that
currently covers k digits, digit k+1 is encoded by itself if
and only if,
first, digit k was not encoded by a digit and, second, there is
no word
in the dictionary that can be used in the encoding starting at
digit k+1.
==snip==
I also spent a bit of time trying to figure out this nuance
when implementing my solution. It doesn't make much sense
visually (no back-to-back digits in the output either way), but
that's how it is.
Exactly, this is one of the things that make this problem a bit
annoying to solve :)
@"H. S. Teoh" you implemented the solution as a Trie!! Nice,
that's also what I did when I "participated" in the study. Here's
[my Trie solution in
Java](https://github.com/renatoathaydes/prechelt-phone-number-encoding/blob/fastest-implementations-print-or-count/src/java/Main.java).
These are basically the two common approaches to the problem: a
Trie or a numeric-based table. According to the study, people who
use scripting languages almost always go with the numeric
approach, while people coming from lower level languages tend to
use a data structure like Trie (if they don't know Trie, they
come up with something similar which is fascinating), which is
harder to implement but more efficient in general.
Can I ask you why didn't you use the [D stdlib
Trie](https://dlang.org/phobos/std_uni.html#codepointTrie)? Not
sure that would've worked, but did you consider that?