On Thu, Mar 27, 2014 at 1:18 AM, Slaunger <[email protected]> wrote:
> I am working on solving a recent recreational mathematical problem on > Project Euler <http://projecteuler.net> . I have a solution, which works > fine for small N up to 10^5 but it takes too long to compute for the actual > problem, where N is of the order 2*10^7. The problem is nested loops, and I > am hoping to avoid one level in the loop by using clever numpy magic (as I > have done often before). However, there is one step, which I cannot figure > out how to do using numpy operations alone, and I am hoping for some help > > The subproblem is that I have in principle k = 1, ..., N sets of boolean > arrays > f_k and g_k each of length N. > > For each k I need to find the number of elements i where both f_k[i] and > g_k[i] are True and sum that up over all N values of k. > > A problem of the order 4*10^14 if I just do it brute force. This takes way > too long (there is a one minute rule). > > However, after a lot of thinking and by using some properties of the f_k > and > g_k I have managed to construct using only pure numpy function and only a > single loop over k, arrays > > f_k_changes_at > g_k_changes_at > > which contain the indices i at which the functions change it boolean value > from True to False or False to True. > > It so happens that the number of changes is only a small fraction of N, the > fraction decreases with larger N, so the size of these changes_at arrays > contains perhaps only 1000 elements instead of 10000000 for each k, a > significant reduction of complexity. > > Now, my problem is to figure out for how many values of i both f_k and g_k > are True given the changes_at arrays. > > As this may be a little hard to understand here is a specific example of > how > these arrays can look like for k = 2 and N = 150 > > f_2_changes_at = [ 2 3 39 41 58 59 65 66 93 102 145] > g_2_changes_at = [ 2 94 101 146 149] > > with the boundary condition that f_2[0] = g_2[0] = False > > Which expands to > i f_2 g_2 f_2 and g_2 > 0 F F F > 1 F F F <- > 2 T T T <- > 3 F T F > 4 F T F > ... > 38 F T F <- > 39 T T T > 40 T T T <- > 41 F T F > 42 F T F > ... > 57 F T F <- > 58 T T T <- > 59 F T F > 60 F T F > ... > 64 F T F <- > 65 T T T <- > 66 F T F > ... > 92 F T F <- > 93 T T T <- > 94 T F F > ... > 100 T F F <- > 101 T T T <- > 102 F T F > ... > 144 F T F <- > 145 T T T <- > 146 T F F > 147 T F F > 148 T F F <- > 149 T T T <- > > With the sum of elements fulfilling the condition being (see arrows) > > (2 - 1) + (40 - 38) + (58 - 57) + (65 - 64) + (93 - 92) + (101 - 100) + > (145 > - 144) + (149 - 148) = > 1 + 2 + 1 + 1 + 1 + 1 + 1 + 1 = 9 > > So, is there a numpy recipe for doing the equivalent process without > expanding it into the full arrays? > > I have tried looping over each element in the changes_at arrays and build > up > the sums, but that is too inefficient as I then have an inner for loop > containing conditional branching code > > Thanks in advance, Slaunger > > > > -- > View this message in context: > http://numpy-discussion.10968.n7.nabble.com/Is-there-a-pure-numpy-recipe-for-this-tp37077.html > Sent from the Numpy-discussion mailing list archive at Nabble.com. > _______________________________________________ > NumPy-Discussion mailing list > [email protected] > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Can you provide a link to the problem itself? -- JD
_______________________________________________ NumPy-Discussion mailing list [email protected] http://mail.scipy.org/mailman/listinfo/numpy-discussion
