On 7 Dec 2004, [EMAIL PROTECTED] wrote:

>  I have two lists names x and seq. 
>
> I am trying to find element of x in element of seq. I
> find them. However, I want to print element in seq
> that contains element of x and also the next element
> in seq. 
[...]
> 3. TRIAL 3:
> I just asked to print the element in seq that matched
> element 1 in X.  It prints only that element, however
> I want to print the next element too and I cannot get
> it. 
>>>> for ele1 in x:
>       for ele2 in seq:
>               if ele1 in ele2:
>                       print ele2
>
[...]
>>>> len(x)
> 4504
>>>> x[1:10]
> ['454:494', '319:607', '319:608', '322:289',
> '322:290', '183:330', '183:329', '364:95', '364:96']
>>>> len(seq)
> 398169
>>>> seq[0:4]
> ['>probe:HG-U95Av2:1000_at:399:559;
> Interrogation_Position=1367; Antisense;',
> 'TCTCCTTTGCTGAGGCCTCCAGCTT',
> '>probe:HG-U95Av2:1000_at:544:185;
> Interrogation_Position=1379; Antisense;',
> 'AGGCCTCCAGCTTCAGGCAGGCCAA']
[...]
> How Do I WANT:
>
> I want to print get an output like this:
>
>
>>probe:HG-U95Av2:1000_at:399:559;
> Interrogation_Position=1367; Antisense;'
> TCTCCTTTGCTGAGGCCTCCAGCTT
>
>>probe:HG-U95Av2:1000_at:544:185;
> Interrogation_Position=1379; Antisense;
> AGGCCTCCAGCTTCAGGCAGGCCAA

Hi, you have already had some replies on how to do it, but IMO there
are two other possibilities:

(a) Turn seq into a dictionary whose keys are the parts of the strings
    that are matched against the entries of list x.  Since seq is long,
    that may be much faster.

    import re

    def list_to_dict(lst):
        # lst holds (header, sequence) pairs
        d = {}
        # capture the 4th and 5th colon-separated fields, e.g. '399:559'
        reg = re.compile(':.+?:.+?:(.+?:.+?);')
        for val1, val2 in lst:
            key = reg.search(val1).group(1)
            d[key] = val1 + val2
        return d

    seq_dic = list_to_dict(zip(seq[::2], seq[1::2]))
    for key in x:
        val = seq_dic.get(key)
        if val: print val

    The function above uses a regular expression to extract the part of
    the string you are interested in and uses it as the key in a
    dictionary.  To find the corresponding list entries
    `zip(seq[::2], seq[1::2])' is used; seq[::2] is the first, the
    third, the fifth ... entry of the list and seq[1::2] is the second,
    the fourth, the sixth ... entry of the list.  zip() packs each such
    pair together in a tuple.
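
    To make the pairing and the lookup concrete, here is a minimal
    sketch of the same idea written for Python 3 (print() as a
    function).  The seq entries are copied from the question, with the
    wrapped header lines joined; the x list is hypothetical, and its
    first key was chosen only so that one lookup succeeds.  Storing the
    header and the sequence as a tuple and skipping headers that do not
    match are my own assumptions, made so that each one can be printed
    on its own line.

```python
import re

# Pattern from the message: capture the "row:col" pair before the first ';'.
PROBE_KEY = re.compile(r':.+?:.+?:(.+?:.+?);')

def list_to_dict(pairs):
    """Map each probe's 'row:col' key to its (header, sequence) pair."""
    d = {}
    for header, sequence in pairs:
        match = PROBE_KEY.search(header)
        if match:  # assumption: silently skip headers that do not fit
            d[match.group(1)] = (header, sequence)
    return d

# First four entries of seq as shown in the question.
seq = [
    '>probe:HG-U95Av2:1000_at:399:559; Interrogation_Position=1367; Antisense;',
    'TCTCCTTTGCTGAGGCCTCCAGCTT',
    '>probe:HG-U95Av2:1000_at:544:185; Interrogation_Position=1379; Antisense;',
    'AGGCCTCCAGCTTCAGGCAGGCCAA',
]
x = ['399:559', '454:494']  # hypothetical search keys

# seq[::2] are the headers, seq[1::2] the sequences; zip() pairs them up.
seq_dic = list_to_dict(zip(seq[::2], seq[1::2]))

for key in x:
    pair = seq_dic.get(key)
    if pair:
        print(pair[0])  # the header line
        print(pair[1])  # the sequence that follows it
```

    Building the dictionary costs one pass over seq; after that each
    key from x is a single lookup instead of a scan of the whole list.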

(b) If you care about memory, iterate over seq with izip (from
    itertools).

    import re
    from itertools import izip

    reg = re.compile(':.+?:.+?:(.+?:.+?);')
    for val1, val2 in izip(seq[::2], seq[1::2]):
        if reg.search(val1).group(1) in x:
            print val1, val2

    izip() is used here instead of zip() (it does not create the whole
    list at once).  Also, no dictionary is used.  Only testing will
    show which is better for you.
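
    A rough Python 3 sketch of the same streaming idea follows.  In
    Python 3 the built-in zip() is already lazy, so izip is not needed.
    Two things here are my own assumptions rather than part of the
    suggestion above: itertools.islice replaces the slices
    (seq[::2] and seq[1::2] each copy half of the list), and the keys
    are put into a set so that each membership test does not scan a
    4504-element list.  The function name matching_records is
    hypothetical.

```python
import re
from itertools import islice

# Pattern from the message: capture the "row:col" pair before the first ';'.
PROBE_KEY = re.compile(r':.+?:.+?:(.+?:.+?);')

def matching_records(seq, wanted):
    """Yield (header, sequence) pairs whose 'row:col' key is in wanted."""
    wanted = set(wanted)                 # assumption: set instead of list lookup
    headers = islice(seq, 0, None, 2)    # 1st, 3rd, 5th ... entry, no copy
    sequences = islice(seq, 1, None, 2)  # 2nd, 4th, 6th ... entry, no copy
    for header, sequence in zip(headers, sequences):
        match = PROBE_KEY.search(header)
        if match and match.group(1) in wanted:
            yield header, sequence

# First four entries of seq as shown in the question.
seq = [
    '>probe:HG-U95Av2:1000_at:399:559; Interrogation_Position=1367; Antisense;',
    'TCTCCTTTGCTGAGGCCTCCAGCTT',
    '>probe:HG-U95Av2:1000_at:544:185; Interrogation_Position=1379; Antisense;',
    'AGGCCTCCAGCTTCAGGCAGGCCAA',
]
x = ['544:185', '454:494']  # hypothetical search keys

hits = list(matching_records(seq, x))
for header, sequence in hits:
    print(header)
    print(sequence)
```

    Nothing is stored besides the set of keys, so memory use stays
    small; the price is a full pass over seq on every run.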


   Karl
-- 
Please do *not* send copies of replies to me.
I read the list

_______________________________________________
Tutor maillist  -  [EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/tutor
