I think you'll find the algorithm below to be a lot faster, especially if the arrays are big. Checking each array index against the list of included or excluded elements is must slower than simply creating a secondary array and looking up whether the elements are included or not.
b = np.array([ [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 7, 8, 9, 10, 11], [5, 6, 1, 0, 2, 7, 3, 8, 1, 4, 9, 2, 10, 5, 3, 4, 11, 0, 0, 1, 2, 3, 4, 5] ]) a = [1,2,3,7,8] keepdata = np.ones(12, dtype=np.bool) keepdata[a] = False w = np.where(keepdata[b[0]] & keepdata[b[1]]) newindex = keepdata.cumsum()-1 c = newindex[b[:,w[0]]] Cheers, Rick On 2 February 2014 20:58, Mads Ipsen <mads.ip...@gmail.com> wrote: > Hi, > > I have run into a potential 'for loop' bottleneck. Let me outline: > > The following array describes bonds (connections) in a benzene molecule > > b = [[0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 7, > 8, 9, 10, 11], > [5, 6, 1, 0, 2, 7, 3, 8, 1, 4, 9, 2, 10, 5, 3, 4, 11, 0, 0, 1, > 2, 3, 4, 5]] > > ie. bond 0 connects atoms 0 and 5, bond 1 connects atom 0 and 6, etc. In > practical examples, the list can be much larger (N > 100.000 connections. > > Suppose atoms with indices a = [1,2,3,7,8] are deleted, then all bonds > connecting those atoms must be deleted. I achieve this doing > > i_0 = numpy.in1d(b[0], a) > i_1 = numpy.in1d(b[1], a) > b_i = numpy.where(i_0 | i_1)[0] > b = b[:,~(i_0 | i_1)] > > If you find this approach lacking, feel free to comment. > > This results in the following updated bond list > > b = [[0, 0, 4, 4, 5, 5, 5, 6, 10, 11] > [5, 6, 10, 5, 4, 11, 0, 0, 4, 5]] > > This list is however not correct: Since atoms [1,2,3,7,8] have been > deleted, the remaining atoms with indices larger than the deleted atoms > must be decremented. I do this as follows: > > for i in a: > b = numpy.where(b > i, bonds-1, bonds) (*) > > yielding the correct result > > b = [[0, 0, 1, 1, 2, 2, 2, 3, 5, 6], > [2, 3, 5, 2, 1, 6, 0, 0, 1, 2]] > > The Python for loop in (*) may easily contain 50.000 iteration. Is there > a smart way to utilize numpy functionality to avoid this? > > Thanks and best regards, > > Mads > > -- > +---------------------------------------------------------+ > | Mads Ipsen | > +----------------------+----------------------------------+ > | G?seb?ksvej 7, 4. tv | phone: +45-29716388 | > | DK-2500 Valby | email: mads.ip...@gmail.com | > | Denmark | map : www.tinyurl.com/ns52fpa | > +----------------------+----------------------------------+ _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion