On Tuesday, 30 May 2017 at 10:54:49 UTC, Solomon E wrote:
I ran into a Rosetta code solution in D that had obvious errors. It's like the author or the previous editor wasn't even trying to do it right, like a protest against how many detailed rules the task had. I assumed that's not the way we want to do things in D.
...
Does anyone have any thoughts about this? Did I do right by D?

I'd say the previous version (by bearophile) suited the task much better, but neither is perfect.

As a general note, consider the following paragraph of the problem statement:

"Some of the commatizing rules (specified below) are arbitrary, but they'll be a part of this task requirements, if only to make the results consistent amongst national preferences and other disciplines."

This literally means that, while the real world has complex rules for commatizing numbers, the task is kept simple by enforcing a strict set of them. The minute concerns of the Real World, like "Current New Zealand dollar format overrides old Zimbabwe dollar format", are irrelevant to the formal problem being solved. The example inputs section ("Strings to be used as a minimum") may be misleading, but that's what they are: examples, not general rules. By the way, as it's a wiki page, the problem statement text could also be improved ;) .

Why? For example, look at the Indian numbering system, where commatizing is visibly different (https://en.wikipedia.org/wiki/Indian_numbering_system) - and we don't know whether the string should use it or not without the context. Or consider that hexadecimal numbers are usually split into groups of four digits, not three - and we don't know whether a [0-9]+ number is decimal or hexadecimal without the context. See, trying to provide an ultimate solution to real-world commatizing, while keeping it a single function without the context, can't possibly succeed.

What can be done, then? Well, the page authors already did the difficult part for us: they extracted the essence of a complex real-world problem into a small set of formal rules, which are now the formal problem statement. Now comes the easy part: to do exactly what is asked in the problem statement. The flexibility comes from having function parameters. If we have a solution to a formal problem, using it for the real-world version of the problem is either just specifying the right parameters (hopefully), or changing the function if the real world gets too complex for it. In the latter case, the shorter and more readable the existing solution is, the faster we can change the function to suit our real-world case.
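
To illustrate how far parameters alone can go, here is a minimal sketch in D. The helper name groupDigits and its signature are made up for this example and are not taken from either Rosetta Code version; it only groups a bare string of digits and ignores the rest of the task's rules.

```d
import std.stdio : writeln;

// Hypothetical helper (not from the Rosetta Code entry): inserts `sep`
// between groups of `step` characters, counting groups from the right.
string groupDigits(string digits, size_t step = 3, string sep = ",")
{
    assert(step > 0, "step must be positive");
    string result;
    foreach (i, char c; digits)
    {
        // Number of characters from this one to the end, inclusive.
        immutable remaining = digits.length - i;
        if (i > 0 && remaining % step == 0)
            result ~= sep;
        result ~= c;
    }
    return result;
}

void main()
{
    writeln(groupDigits("1234567"));          // 1,234,567
    writeln(groupDigits("DEADBEEF", 4, " ")); // DEAD BEEF
    writeln(groupDigits("1234567", 3, "."));  // 1.234.567
}
```

The decimal and hexadecimal cases differ only in the arguments passed. The Indian grouping (12,34,567) cannot be expressed with one fixed step, so that would be a case of changing the function rather than its arguments.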

-----

Now, where is the old version wrong? Turns out it just calls the function with default parameters for every line of input - which is wrong since the first two input lines need to be handled specially. Well, that's what the function parameters are for. To have a correct solution, we have to use custom parameters for the first two lines of input. The function itself is fine.

Your solution addresses this problem by special-casing the inputs inside the function, perhaps because of the misleading inputs section in the problem statement. That's the wrong approach. First, it introduces the magic numbers 33 and 36 into the code, which is bad programming practice (see here: https://en.wikipedia.org/wiki/Magic_number_(programming)#Unnamed_numerical_constants). Second, it's plain wrong. According to the problem statement, we don't have these rules for every possible line of >33 standalone decimals, or >36 characters in total. We just have to call our function with a concrete set of custom parameters for one concrete example, and another set of parameters for another example. That's to demonstrate that our function accepts and makes proper use of custom parameters! Special-casing example inputs inside the function is not a solution: if we go down this path, the perfect solution would be a bunch of "if" statements for every possible example input producing the respective example outputs, and an empty function for all other possible inputs.

So, how do we call with special parameters? Currently, we can look at every other language except C# as inspiration: ALGOL 68, J, Java, Perl 6, Phix, Racket, and REXX. Your solution also has a good way to check example inputs: a unittest block. It even shows one of D's strengths compared to other languages. And there, you do use custom parameters to check that the function works. A good approach would be to put all the examples in the unittest instead of reading them from a file. This way, the program will be immediately usable and runnable: no need to create an additional arbitrarily-named file just to test it.
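
A rough sketch of that layout in D follows. Everything in it is an assumption made for illustration: the commatize body is a simplified stand-in (it handles only the first run of digits beginning with a nonzero digit and does not implement all of the task's rules), the parameter names are invented, and the test strings are short made-up stand-ins for the task's own examples, which would go in their place.

```d
import std.ascii : isDigit;

// Simplified stand-in, NOT bearophile's code: commatizes the first run of
// digits at or after index `start` (0-based) that begins with a nonzero digit.
string commatize(string text, size_t start = 0, size_t step = 3,
                 string ins = ",")
{
    assert(step > 0, "step must be positive");

    // Locate the first nonzero digit at or after `start`.
    size_t begin = start;
    while (begin < text.length
           && (!isDigit(text[begin]) || text[begin] == '0'))
        ++begin;
    if (begin >= text.length)
        return text; // nothing to commatize

    // Extend to the end of that digit run.
    size_t end = begin;
    while (end < text.length && isDigit(text[end]))
        ++end;

    // Rebuild the run, inserting `ins` every `step` digits from the right.
    string grouped;
    foreach (i; begin .. end)
    {
        if (i > begin && (end - i) % step == 0)
            grouped ~= ins;
        grouped ~= text[i];
    }
    return text[0 .. begin] ~ grouped ~ text[end .. $];
}

unittest
{
    // Default parameters: only the first number is changed.
    assert(commatize("1000 and 2000") == "1,000 and 2000");

    // Custom parameters are chosen by the caller, per example,
    // instead of being special-cased inside the function.
    assert(commatize("pi=3.1415926535", 5, 5, " ") == "pi=3.14159 26535");
    assert(commatize("Z$100000000000000 notes", 0, 3, ".")
           == "Z$100.000.000.000.000 notes");
}

void main() {}
```

Compiling with the -unittest switch (for example, dmd -unittest -run followed by the file name) runs the checks before main, so no input file is needed.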

-----

All in all, the only thing I'd change in bearophile's solution is to remove the file reading loop, add the unittest block from your solution instead, and place all the examples there. Printing the result does not seem imperative on Rosettacode, and there are at least some entries in D which already use unittest for checking the problem requirements (for example, https://rosettacode.org/wiki/Sorting_algorithms/Cocktail_sort#D).

Lastly, please note that Rosettacode supports multiple versions in a single language (example: http://rosettacode.org/wiki/99_Bottles_of_Beer#D). As bearophile's version certainly has its merits, I strongly suggest keeping it available, either merged with your current version to produce the right solution, or as a second version.

Ivan Kazmenko.
