[gcj] Re: Round 1B 2012 amateur analysis (while we wait for the real one)

Pedro Osório Sun, 06 May 2012 03:20:16 -0700

Amit:

The insight is: imagine that contestant i gets a fraction Yi of the
public votes, the worst case scenario is if the rest of the public
votes are divided perfectly to allow all contestants to get the same
final score as our contestant (if there are any public votes left we
can distribute a small fraction to each of the other contestants and
they will all surpass contestant i). Thus, we compute the sum of
fractions (Sy(Yi)) that the other contestants need to equal contestant
i if he gets a fraction Yi.


If Sy(Yi) + Yi is smaller than 1, we still have public votes left and
we can distribute them equally to all other contestants and contestant
i will be the last, so we need to increase Yi. On the other hand, if
Sy(Yi) + Yi is larger than 1, the other contestants need more votes
than the ones available to equal contestant i, so we can have a
smaller Yi.

Now we have a strictly increasing function (Sy(Yi) + Yi increases as a
function of Yi), and we want to find Yi such that Sy(Yi) + Yi == 1,
typical binary search problem =)

On May 6, 7:49 am, Amit Jain <[email protected]> wrote:
> Hi tutufan
>
> It's kind of weird question but I could resist to ask it. How could you
> come up with binary search algorithm for this vote distribution.
> Any resource which I'm missing since I believe this kind of application of
> binary search is not any text book.
>
> P.S. You have motivated me with your analysis with the eye of amateur (I
> thought in this manner for Problem 1)
>
> Thanks
> Amit Jain
>
>
>
>
>
>
>
> On Sun, May 6, 2012 at 8:45 AM, tutufan <[email protected]> wrote:
> > I'm thinking that Yui Yu Ho had the key observation on this above.  Using
> > those numbers, I think you can liken this to a Birthday Paradox problem
> > where there are about 2^40 possible days of the year, and 2^500 people at
> > the party.  The wikipedia page has some approximations that suggest that
> > the one hits a 50% of finding a matching pair at around 26 million or so,
> > if I've plugged it in right.
>
> > On Saturday, May 5, 2012 9:51:59 PM UTC-5, [email protected] wrote:
>
> >> Hmm, I thought I had figured out that randomly generating two subsets
> >> could be as bad as a 1 in k chance of having the same sum, where k is
> >> the number of distinct sums...this seemed too unlikely to work in
> >> time.
>
> >> Though perhaps I made an arithmetic error.  More likely there is some
> >> other observation we can make about the distribution of the sums and
> >> their multiplicities beyond what the simple pigeonhole principle tells
> >> us.  That worst case was based on the power means inequality so it
> >> happens when all sums have the same multiplicity, which for cases I
> >> tried is definitely not the case.  I didn't figure out how to
> >> formalize that though.
>
> >> On May 5, 6:08 pm, Lewin Gan <[email protected]> wrote:
> >> > For A, binary search was the way I approached it as well. This was a
> >> > pretty straightforward trick, since you want to see if it is possible
> >> > to distribute the rest of the votes such that all of them go over some
> >> > amount.
>
> >> > For B, you can notice that you want to get to the square as early as
> >> > possible. That is, you can always wait for the water level to be the
> >> > appropriate level. That gives a BFS approach using a priorityqueue, by
> >> > popping off the square with the least time elapsed, and expanding that
> >> > edge.
>
> >> > I wasn't able to solve C, but most people solved the small using brute
> >> > force (O(2^20)). As for the large, it's impossible to construct a set
> >> > of 500 with all distinct subset sums (i.e. there are a total of 2^500
> >> > possible subset sums, but the maximum value we can obtain is 10^12 *
> >> > 500, thus, by pigeonhole, there must be at least some subset sums that
> >> > are equal). Thus, since there is such high probability of collisions,
> >> > randomly generating subsets of a fixed size until finding a match is a
> >> > valid approach (i.e. if you fix the size to be x, there are a total of
> >> > (50 ncr x) subsets, so choose x small enough for an exponential
> >> > solution to run in time, but large enough so your probability of not
> >> > finding a match decreases quickly enough).
>
> >> > Those are pretty short descriptions, but they give the basic ideal of
> >> > the solution.
>
> >> > On May 5, 4:08 pm, tutufan <[email protected]> wrote:
>
> >> > > The official analysis hasn't been posted (yet).  Sometimes they don't
> >> post
> >> > > one, so I'm not sure whether we get one this time.
>
> >> > > In any case, I'm impatient and interested in any insights, so here's
> >> some
> >> > > half-baked speculation.  (Disclosure: I didn't successfully answer
> >> any of
> >> > > these, so all of this might be crap.)
>
> >> > > (spoilers, of course)
>
> >> > > For A, I initially conjectured that it was sufficient when looking at
> >> a
> >> > > contestant to consider only the "other" contestant having the lowest
> >> score,
> >> > > and to take enough votes from them to guarantee that they cannot beat
> >> us.
> >> > >  This is false, though, as input like "0 0 100" shows.
>
> >> > > The next idea was that maybe it's sufficient to claim 1/N of 2X total
> >> > > score, imagining a sort of global leveling (imagine a landscape
> >> filling
> >> > > with water).  That doesn't work either, as some high scorers are like
> >> > > "peaks" that will stick up, ruining the simplicity of this
> >> calculation.
>
> >> > > Later, I had the idea of imagining that I had X buckets of water (or
> >> > > whatever) to distribute, with the goal of ensuring that my candidate
> >> > > doesn't end up with less than anyone else.  So, for example, if we
> >> have "0
> >> > > 2 5 98", we will have 105 buckets to distribute (based on audience
> >> vote).
> >> > >  We first distribute two buckets to "0", to bring it up to the next
> >> level,
> >> > > 2.  Then we distribute three buckets to each of the first two, to
> >> bring
> >> > > them all up to five.  This leaves us with 97 buckets, which we split
> >> across
> >> > > the first three, bringing them up to (5 + 105/3) each.  This
> >> indicates the
> >> > > proportion that my candidate must get not to lose. (Corner case: with
> >> one
> >> > > candidate, the answer is 0.)  I think this might be correct, but
> >> didn't
> >> > > have time to code it.
>
> >> > > The #1 rank solution to this problem was submitted in 4:03/4:34, and
> >> to me
> >> > > it uses a quite impressive insight.  The idea is that since we're
> >> trying to
> >> > > estimate a proportion, and we have a reasonably easy way to check our
> >> > > estimates, and furthermore that the result needs to be really close,
> >> but
> >> > > not exact, that we can simply run a binary search to converge on the
> >> > > answer.  (This also requires some sort of convexity property or
> >> something,
> >> > > which happens to hold here.)  Amazing.  Also amazing, or more like
> >> > > incomprehensible, is being able to read the problem, conceive the
> >> solution,
> >> > > type it, compile it, and go through the small/large submit cycles in
> >> under
> >> > > five minutes.  Is this the work of an AI?  :-)
>
> >> > > For B, this looks like a reasonably straightforward search.  We know
> >> > > there's no point in visiting any square more than once.  Search depth
> >> can
> >> > > be bounded by the best result found at any given point.  Perhaps A*
> >> or IDA*
> >> > > could be used to some benefit, but probably it's fast enough without.
> >>  It
> >> > > would be nice to figure out the set of initially reachable squares
> >> and then
> >> > > chain backwards from the end square, but I don't think this is
> >> possible,
> >> > > because of the need to count the clock forward.
>
> >> > > For C, I assumed that the small input case could be easily
> >> brute-forced,
> >> > > and that the large case could not, and required some sort of number
> >> theory
> >> > > insight.  (Why else would this problem be worth so many points?)  It
> >> would
> >> > > seem that the large case would take a spectacular amount of time to
> >> brute
> >> > > force (O(2^500)), but apparently this was not the case.  I believe
> >> the
> >> > > reason is that for the inputs given, a solution was always possible,
> >> and
> >> > > could be found within the alotted time (because the solution space
> >> was
> >> > > dense with solutions).  Indeed, at least one solution doesn't even
> >> have any
> >> > > provision for printing "Impossible".  Needless to say I'm slightly
> >> annoyed
> >> > > I didn't catch this.  So, I'm wondering, is this actually a provable
> >> > > property of sets of distinct whole numbers <= 10**12 having size 500,
> >> or
> >> > > was it just luck that these all had a quick solution?  (I also saw
> >> that a
> >> > > number of solvers made use of a random number generator--did this
> >> actually
> >> > > matter?)
>
> >> > > Mike- Hide quoted text -
>
> >> > - Show quoted text -
>
> >  --
> > You received this message because you are subscribed to the Google Groups
> > "Google Code Jam" group.
> > To view this discussion on the web visit
> >https://groups.google.com/d/msg/google-code/-/0iK_ZxzCv84J.
>
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to
> > [email protected].
> > For more options, visit this group at
> >http://groups.google.com/group/google-code?hl=en.

-- 
You received this message because you are subscribed to the Google Groups 
"Google Code Jam" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/google-code?hl=en.

[gcj] Re: Round 1B 2012 amateur analysis (while we wait for the real one)

Reply via email to