On 16.07.2014 10:04, jarod...@libero.it wrote:
Hi there!!!
I have a file with this data:
['uc002uvo.3 ', 'uc001mae.1']
['uc010dya.2 ', 'uc001kko.2']
['uc003ejx.2 ', 'uc010yfr.1']
['uc001bhk.2 ', 'uc003eib.2']
['uc001znc.2 ', 'uc001efn.2']
['uc002ycq.2 ', 'uc001vnh.2']
['uc001odf.1 ', 'uc002mwd.2']
['uc010jkn.1 ', 'uc010luk.1']
['uc003uhf.3 ', 'uc010tqd.1']
['uc002rue.3 ', 'uc001tex.2']
['uc011dtt.1 ', 'uc001lkv.1']
['uc003yyt.2 ', 'uc003mkl.2']
['uc003pkv.2 ', 'uc003ytw.2']
['uc010bhz.2 ', 'uc002kbt.1']
['uc001wnj.2 ', 'uc009wtj.1']
['uc011lyh.1 ', 'uc003jvb.2']
['uc002awj.1 ', 'uc009znm.1']
['uc010bft.2 ', 'uc002cxz.1']
['uc011mar.1 ', 'uc001lvb.1']
['uc001oxl.2 ', 'uc002lvx.1']

I want to remove everything after the dots, so that I get a file with
this output:

['uc002uvo ', 'uc001mae']
['uc010dya ', 'uc001kko']
...

I tried to use a regular expression, but I get strange output:

with open("non_annotati.csv") as p:
     for i in p:
         lines= i.rstrip("\n").split("\t")

lines is not the best variable name. Why not use:
           gene1, gene2 = i.rstrip("\n").split("\t")

         mit = re.sub(r'(\.\d$)','',lines[0])
         mit2 = re.sub(r'(\.\d$)','',lines[1])
         print mit,mit2


While Danny has pointed out the actual reason why your code is not working with this specific input data, it's generally not a good idea to make overly specific assumptions about input formatting by specifying '\n' and '\t' explicitly when all you want to do is eliminate whitespace:

>>> help(s.split)
Help on built-in function split:

split(...) method of builtins.str instance
    S.split(sep=None, maxsplit=-1) -> list of strings

    Return a list of the words in S, using sep as the
    delimiter string.  If maxsplit is given, at most maxsplit
    splits are done. If sep is not specified or is None, any
    whitespace string is a separator and empty strings are
    removed from the result.

>>> s='uc002uvo.3 \tuc001mae.1\r\n'  # Windows line breaks
>>> s.split()
['uc002uvo.3', 'uc001mae.1']
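
To make the difference concrete, here is a small sketch that runs both approaches on the same sample line. The variable names explicit and generic are mine, just for illustration:

```python
s = 'uc002uvo.3 \tuc001mae.1\r\n'  # same sample line as above, Windows line break

# Explicit separators: only '\n' and '\t' are consumed, so the space before
# the tab and the '\r' from the line break stay attached to the fields.
explicit = s.rstrip("\n").split("\t")
print(explicit)

# Bare split(): any run of whitespace separates fields and none of it is kept.
generic = s.split()
print(generic)
```

With stray whitespace at the end of a field, a pattern anchored with $ directly after the digit has nothing to match, which is why clean fields matter here.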

I also agree with Joel that re is overkill here. In fact, your current regexp will fail with two-digit numbers after the dot, though I don't know whether such names can occur in your data.
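
As a rough sketch of what I mean (not the only way to do it): plain string methods are enough to drop the suffix, and if you do want to stay with re, \d+ instead of \d handles longer numbers. The helper name strip_version and the two-digit sample 'uc001mae.10' are made up for illustration and are not taken from your data:

```python
import re

def strip_version(name):
    """Return name without its last '.suffix' (hypothetical helper)."""
    base, dot, _version = name.rpartition(".")
    # rpartition gives ('', '', name) when there is no dot at all,
    # so fall back to the unchanged name in that case.
    return base if dot else name

# Made-up sample line with a two-digit version on the second name.
line = 'uc002uvo.3 \tuc001mae.10\r\n'
fields = line.split()

# No regular expression needed.
genes = [strip_version(f) for f in fields]

# The original pattern only accepts a single digit before the end.
single_digit = [re.sub(r'(\.\d$)', '', f) for f in fields]

# One or more digits before the end.
any_digits = [re.sub(r'\.\d+$', '', f) for f in fields]

print(genes, single_digit, any_digits)
```

Note that strip_version removes whatever follows the last dot, digits or not, so it assumes every dot in your names introduces a version number.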

Best,
Wolfgang

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
