In private mail i've gotten a request to explain my solution and what my "searcher" does in these postmortem solutions. I decided answer it here in case other people are interested.
Let's start with the basic solution as I have it on the scoreboard The first observation is that input may looks like: xxx billion yyy thousand zzz so i'd like at some point to e.g. collect the yyy, then when I see the "thousand", I want to multiply it by 1000 and add it to the xxx billion I already have. After some experimenting and counting lengths, i decided i want to keep the "working value" in one variable, so at the moment i see the "thousand", the working variable will contain: xxx billion yyy and i want xxx billion yyy thousand which i can get by adding 999*yyy, and the yyy i can get with accumulator%1000 Shortly after that I realised this can handle "hundred" too, even if one hundred does not "finish" a group. so a basic core of my solution will be: if(thousand|million|billion|hundred) { $a += (number-1) * ($a%1000) } else { do normal < 1000 collection } The do_normal will also contain a $a+= (to add the xxx to the accumulator), so in more golfish style we get: $a+=$a%1000*(number-1)||normal_stuff which assumes that $a%1000 is never zero in the thousand|million|billion|hundred case. Which is ok since the input is things like "one thousand", not things like "thousand" or "zero thousand". It also assumes that (number-1) *IS* zero in the "normal number" case. number-1 is respectively 9 nines, 6 nines, 3 nines or 2 nines. How to generate that ? Like I said, I was originally only considering billion, million and thousand, so I wanted to generate 3,2 and 1 times 999. And I noticed that billion and million had two l's, so I decided to go count unique characters, and came up with y/abl// to get the 3,2,1. Unfortunately "eleven" and "twelve" also contain the letter l. These two however also contain "e" and "v", so we can compensate for that if we subtract at least 1, so: y/abl/-/e/ However, now I also want to handle "hundred". 99 is not a multiple of 3, so instead of 3,2,1 i want 9,6,3 for the basic three cases. so multiply by 3 and get ( now add the 9 multiplier too): 9x(3*y/abl/-3*/e/) In this form it's clear that that "3" may also be a bigger number (it's ok if eleven and twelve lead to a negative number instead of to 0). The "hundred" contains a "e", so it will do the /e/ subtraction. That's good, since we don't want "hundred" to lead to a multiple of 3, so we can change the "3" to get some other values. Since we are going to subtract at least 4, the 3*y/// must lead to at least 6, so we need characters that are unique to "hundred" appearing at least twice in total. So that leads us to consider "d". However, "thousand" also has a "d". Oh ! But that's just as useable as "a",so we get: 9x(3*y/dbl/-4*/e/) (And no other number like seventeen or twenty or whatever will make this expression non-zero) Next step is to handle the normal stuff (everything < 100). There is quite a bit of regularity there. - if it ends on "ty" (in fact, if it contains "y" at all), it's a number times 10. We can get that by e.g. putting a "0" at the end of our number. - if it ends on "teen" it's some number+10. In fact, if it contains "teen" at all. No ! Even "tee" is enough. And if we consider "ten" a special case, even "te" is good enough. And the number in number+10 will be a single digit in all cases. So we can get that by simply putting "1" in front - All other numbers are either single digit (zero, one, etc), or we have (again) the exceptional cases of "eleven" and "twelve". Fortunately both are easy to recogize from e.g. the "l" in there. So that leads to something like: - If the middle term leads to 1 and 2 for "eleven" and "twelve": /te|l/.(handle_simple_number).0 x/y/ - If the middle term leads to 11 and 2 for "eleven" and "twelve": /te|lv/.(handle_simple_number).0 x/y/ (other cases exist but do not seem interesting) So let's look at the simple numbers: zero one two three four five six seven eight nine (ten) (eleven) (twelve) They are not unique in the first letter, but they ARE unique in the first two letters. But not from the third letter on, since we want to get e.g. 5 for both "five" and "fifty", and 2 for both "two" and "twenty". Twelve is interesting in that if we look only at the first two chars, it always is the same as "two", so will give us a 2 (that's why I only considered formulas that map 12 to 2 above). Ok, so the next step is a search for an expression that takes *exactly* the first two chars from a string, and returns the correct number for them. One obvious method is to make a little table, and just look it up. E.g. in one of my early attempts I used /..//2*$-[zeontwthfofisiseeiniQQel!~$&] However, it's clear to see that the constants string is very long. It would be nice if we could somehow combine the first two chars so the lookup table is twice as short and doesn't need the /2. Rick went for pack(H2,$_) as the combiner, I went for matching the first char with /./ and then doing bitoperations on $& and the first char of $' to create a unique character for each of the simple numbers. The first idea is to simply use $& & $', but unfortunately that does not give a different char for each word. So I used some bitflipping to make them unique. In the end I went for: ($& ^"o")& $' Many other ways are possible, unfortunately whatever way I chose, the lookup table alwayus contained non \w characters, so i could not drop the quotes around the lookup string. So I just went for the lowest tie-break. I'll use V instead of o here since it leads to more readable chars. My initial versions were: /./+index(')"4 &37%.Q!',$&^V&$') These leave out the "zero" lookup to make the table one shorter. They also map "ten" to a non-possible char, so I can use /te|lv/ instead of /tee|lv/. It would be nice if "eleven" automatically mapped to "1", since then I can simply leave out the last two characters. But in fact, we can ! The y/dbl// allows us to replace the l by something else. "l" is the second letter of "eleven", so the combiner for "eleven" will change, but it's only the fourth letter in "twelve", so that one will not change. And we can choose this something else not to exist in any simple number and such that it will combine for "eleven" to the same character as "one" does. For better tiebreaker i also write the index as a $+ lookup, and get: y/dbl/\xe4/ and /te|\xe4/././*$+['^A^S^\^I^O^Z^V^L^G'!~($&^o&$')] The final step is to commify the resulting number, where i used the suboptimal $_=$a;{s/\B..\d\b/,$&/&&redo All that combined with a little tie break improvement leads to: -lp040 $^+=$^%1e3*(9x(3*y/dbl/\xe4/-4*/e/))||/te|\xe4/././*$+['^A^S^\^I^O^Z^V^L^G'!~($&^o&$')].$[x/y/}$_=$^;{s/\B..\d\b/,$&/&&redo Which was my final submission. However, i still was not very happy with how i went from "to first chars" to corresponding number, and i had noticed that in fact "crypt" sort of does what i want: crypt($pass, $salt) takes the first two chars from $salt, then does some des-like encryption on $pass (modified by the salt-chars) and then returns a string representation of this result. So the idea was to find some interesting $pass, such that if I somehow make a number out of the resultstring the same mapping is done. Making a number is not so hard with things like: vec(crypt($pass, $_), $o, $b) unpack(N, crypt($pass, $_)) unfortunately these numbers will be all over the map (especially in the unpack("N", ..) case). So to improve the chances to find a good $pass, I decided to add a modulo so that more output numbers can give the result we want. While the challenge was still running I decided to concentrate on vec(crypt($pass, $_), $o, 4)%$m So I wrote a program that tries all $pass strings consisting of \w chars and starting with a-zA-Z_ or - as $pass. It then does the crypt with "nine" and sees on which positions ($o) the result is exactly 9, and if there is such a case, continues with 8,7 and 6. After that point we can make 5 either directly or with 15%10, so it also starts keeping track of possible modulos. Unfortunately "crypt" is pretty slow, even if you code this program in C. So while I had the idea a few days before the challenge ended, my searcher found nothing while it was still running. A few minutes after the end I found this very lucky hit: k= zrQ0W,o=16,b= 4,m=10:0,1,2,3,4,5,6,7,8,9,0,8 which you can check with the following perl code: perl -wle 'print join",", map vec(crypt(zrQ0W,$_),16,4)%10, qw(zero one two three four five six seven eight nine ten eleven)' And it indeed prints: 0,1,2,3,4,5,6,7,8,9,0,8 And we still have that trick to modify "eleven" in the y/dbl// up our sleeves, so we can use that fixup that last 8 to a 1. So we get: (adding a few Rick/Bob imporvements): $^+=$^%1e3*(9x(3*y/dbl/Q//-4*/e/))||/te|Q//.vec(crypt(zrQ0W,$_),16,4)%10 .e./y/}$_=$^;{s/\B(?=(...)*$)/,/g Unfortunately this string has two big drawbacks: - the offset is 16, while it would be nice to need only one digit. - The 8 at the end causes us to need that Q as compensation. Another char lost After that I got interested in how fast you can make crypt, and in particular started looking at "john the ripper", http://www.openwall.com/john/ Unfortunately that program does not have a direct "crypt", but generates a kind of intermediate code. Then you also convert your target password string to that intermediate code and then do the compare. So I wrote some code to undo that, and for the rest stole the implementation from that code wholesale, and put that in a program that does a smart search over the possible $pass values (smart in the sense that it tries to give up on a particular key as fast as possible so it won't do useless crypts). Once I had that, I started searching for cases that would not have a two-digit offset or that would not need to fixup the "eleven" case. Unfortunately none of these exist in the 5-char $pass space. And to gain in the 6-char space, BOTH must be true. I finished searching the 6-char space a few days ago, and no, none exist where both are true, though this one was close: k= _IEhBI,o=11,b= 8,m=10:0,1,2,3,4,5,6,7,8,9,0,1 (This is not yet completely certain. I have in fact not yet searched the space consisting of pure numbers like -1.2e-4 or 123456) So next i decided to try to get rid of the modulo at the end completely. Since the 6-char space was already completely searched, I'm now trying for vec(crypt($pass, $_), $o, 4) in the 7-char space. The search-space is so big, I didn't really expect success, but in fact after three hours I ran into: k=DjcSj30,o=12,b= 4,m=16:0,1,2,3,4,5,6,7,8,9,0,5 Which again has both drawback (offset two digits and eleven is wrong), but still gains because it has no modulo this time: $^+=$^%1e3*(9x(3*y/dbl/W/-4*/e/))||/te|W/.vec(crypt(DjcSj30,$_),12,4).e./y/}$_=$^;{s/\B(?=(...)*$)/,/g Chances are good a string exists with a one-digit offset and not needing fixup for "eleven", but, if so, I haven't found it yet. I plan to keep running this program in the background as my private strtol@HOME though. If you are interested in the C-code of the current searcher, you can find it at: http://www.xs4all.nl/~thospel/ASIS/zlet5.tar.gz The current version is for x86-mmx, though it's easy enough to get it to work on other processors by compiling john the ripper and replacing arch.h by the processor specific arch.h file john will give you. This program might contain the fastest known existing implementation of full forward crypt (the pure crypt part does 530000 crypts/second on my Athlon XP1800+). I'm very interested if anyone knows anything faster (I'm in fact wondering about bryDES). (the openssl fcrypt does 120000/sec, glibc ultra-fast-crypt does 40000/sec) Compile it with some high optimisation, put a start string in a file (e.g. I used A000000) and run it with that filename as argument (it will update that file ever so often to mark where it will start now). On my athlon XP1800 it works through a 5-char block in about 1 hour. (the code I had during the challenge needed about 12 hours for that) The code does about 440000 crypts/second, so the overhead is reasonable. It can't be made much faster unless I improve the crypt().