On Sat, Aug 07, 2010 at 07:25:20PM +0100, Luke Pebody wrote: > If it does work, step 3 can be simplified. By the time you reach step > 3, there are no substrings of the form XX or XYX. Thus it must be of > the form XYZXYZXYZ...
Hey, thanks! I've been worrying about the "If it does work" part, too, and spotting this pattern made it much easier to construct a proof. It's a bit verbose, but here it is. First, I'll import rules 1 through 4 from the contest analysis, and then prove a few more rules. Rule 5: If the next letter to be printed is X, it is never useful to push a letter different from X. A sequence like: push Y, push X, ..., pop that X can always be rewritten to push X, ..., pop that X, push Y which has the same cost and leaves the stack the same. The operations in the elided part cannot depend on the Y because it is hidden by the X. Rule 6: A sequence of pops is only useful if it leaves the next letter to be printed at the top of the stack (except at the end, when emptying the stack). Proof: A sequence of pops can only be followed by another pop, by a print (which satisfies rule 6), or by a push. If it's a push, then we have a sequence like: pop, push X, ..., pop that X which can be rewritten to push X, ..., pop that X, pop which has the same cost and leaves the stack the same. The correctness of my Step 1 follows directly from Rule 1. If there is an optimal solution for the output of Step 1, then an optimal solution for the input of Step 1 can be constructed by just adding print statements. The solutions will have the same number of pushes. The proof of correctness of Step 2 requires some definitions. Let's look at a single iteration of Step 2. - The input string, which has at least one substring matching XYX, is called StringXYX - The output string, which has the leftmost XYX replaced by just X, is called StringX - Suppose we have an optimal sequence for printing StringX, which follows rules 1-6 above. Call it SequenceX. - From SequenceX we can construct a sequence for printing StringXYX, which follows rules 1-6 above and has exactly one more push than SequenceX. Call it SequenceXYX. Step 2 is correct if SequenceXYX is optimal. Now, let's look at SequenceX right after it prints the X in question. To get a sequence for StringXYX we have to add "print Y" and "print X" after it, together with stack manipulations to make it work. The options for this are limited by rules 1-6: before each print statement we can either push the required letter, or pop down to the required letter. That leaves 4 possibilities to examine. Option 1: push Y, print Y, pop Y (leaving X), print X This is the straightforward option. It adds one push, as expected. There's no way to make any further savings because the state at the boundaries of the XYX in SequenceXYX is identical to the state before and after the X in SequenceX, which is optimal. Option 2: push Y, print Y, push X, print X This would create an XYX sequence on the stack, which according to Rule 4 is never necessary (it simplifies to Option 1), so we can disregard this case. Option 3: pop down to some Y, print Y, push X, print X This works if there is an Y already on the stack. Just like Option 1 it has one more push than SequenceX. It leaves the stack a little shorter, so other parts of SequenceXYX have to be modified to account for this. Can those modifications reduce the number of pushes again, so that the final cost of SequenceXYX is less than in Option 1? No, because the sequence of pops done here can have no effect until the X that was pushed is popped again, so an equivalent modification can be constructed for SequenceX that does the same sequence of pops at that point. And SequenceX is optimal, so that modification will not save a push. Option 4: pop down to some Y, print Y, pop down to some X, print X This works if an Y and an X are available on the stack. Unlike the other options it does not add a push, but it does leave the stack shorter so other parts of SequenceXYX have to be modified to account for this. Those modifications will result in at least one more push elsewhere. Proof: we're inserting this sequence just after a print X, so the stack must have the form X ... Y ... X here. If we look back along the operations of SequenceX that led to this, we will find a place where it pushes that second X instead of poppping down to the X already on the stack. Since a push adds to the cost, and SequenceX is optimal, this must mean that popping instead of pushing there would have required at least one extra push elsewhere. SequenceXYX does those pops in a different place but the same logic as in Option 3 applies. Conclusion: all options for constructing SequenceXYX out of SequenceX involve at least one extra push, and there is an optimal solution that adds exactly one push. This proves the correctness of Step 2. The proof of correctness of step 3 depends on Luke Pebody's observation that at this point the string is of the form XYZXYZXYZ... and his calculation of (2n div 3) + 1 pushes for a string of length n. Let's consider the string XYZXYZ. Due to rules 5 and 6 and the fact that the stack starts empty, there must be an optimal solution that starts with push X, print X, push Y, print Y, push Z, print Z. Now we have XYZ left to print, and XYZ on the stack but in the wrong order. We can get only one of XYZ by popping so we have to push two of them, for a total of 5 pushes. The options are: pop, pop (leaving X), print X, push Y, print Y, push Z, print Z push X, print X, pop, pop (leaving Y), print Y, push Z, print Z push X, print X, push Y, print Y, pop, pop (leaving Z), print Z All three leave XYZ on the stack, so if another XYZ sequence is added then the same logic applies: it takes two more pushes and will leave XYZ on the stack. This makes Step 3 valid for lengths divisible by 3. (It is possible to do variations, like using three pushes for the second XYZ and only one push for the third, but that just moves the pushes around. Because of the order of letters on the stack it always takes two pops to save one push, so at most a third of the letters can be done with pops.) Now consider strings that end on X or on XY. To match the calculation, a string that ends on XY should take one more push than the one without it, and a string that ends on X should take no extra pushes. We can use an extra rule here. Rule 7: As a consequence of Rule 6, it is never useful to pop the stack empty except at the end, which means the letter that is pushed first will still be on the stack when the last letter is printed. If the string ends on X then Rule 7 implies that the final X does not require a push. Thus, the cost of this string is the same as the cost of the string without the final X. If the string ends on XY then there is a solution that only uses one extra push: use the strategy above, that leaves XYZ on the stack, and then use pops to get X or Y and use a push for the other one. It is not possible to do better than this: if both the X and the Y are done with pops, then the stack must have the form X ... Y ... X here (an X on top because we print it and then pop it, and an X at the bottom due to Rule 7). By the same reasoning as for Option 4 in Step 2, arranging the stack to allow these pops would mean using at least one extra push elsewhere. This proves the correctness of Step 3 for string lengths >= 6. Correctness for shorter strings is easy to verify and I'll skip it here. So, there's my proof for the linear solution. Feel free to poke holes in it, and feel especially free to awe me by restating the whole thing in two paragraphs :) -- Richard Braakman (competing as Amtep) -- You received this message because you are subscribed to the Google Groups "google-codejam" group. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/google-code?hl=en.
