On Nov 3, 9:55 pm, Matt <macma...@gmail.com> wrote:
> Hi All,
>
> I am trying to concatenate several hundred files based on their filename..  
> Filenames are like this:
>
> Q1.HOMOblast.fasta
> Q1.mus.blast.fasta
> Q1.query.fasta
> Q2.HOMOblast.fasta
> Q2.mus.blast.fasta
> Q2.query.fasta
> ...
> Q1223.HOMOblast.fasta
> Q1223.mus.blast.fasta
> Q1223.query.fasta
>
> All the Q1's should be concatenated together in a single file = 
> Q1.concat.fasta.. Q2's go together, Q3's and so on...
>
> I envision something like
>
> for file in os.listdir("/home/matthew/Desktop/pero.ngs/fasta/final/"):
>         if file.startswith("Q%i"):
>            concatenate...
>
> But I can't figure out how to iterate this process over Q-numbers 1-1223
>
> Any help appreciate.

I haven't tested this, so it may have a typo or something, but it's often
much cleaner to gather your information, massage it, and then use it
than it is to gather it and use it in one go.


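import os
from collections import defaultdict

# Directory taken from the original question
mydir = "/home/matthew/Desktop/pero.ngs/fasta/final/"
filedict = defaultdict(list)

# Gather: group filenames by the text before the first dot (Q1, Q2, ...)
for fname in sorted(os.listdir(mydir)):
    if fname.startswith('Q') and '.' in fname:
        filedict[fname.split('.', 1)[0]].append(fname)

# Use: each prefix now maps to the list of files that belong together
for prefix, fnames in filedict.items():
    # print(prefix, fnames)
    concatenate...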
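
One way the concatenate step could be filled in is sketched below. The
function names group_by_prefix and concatenate_groups are made up for
illustration, and the Q<n>.concat.fasta output name is taken from the
question. It assumes every input file ends with a newline (otherwise the
last line of one file runs into the first line of the next) and it skips
any existing *.concat.* files so that a re-run does not fold old output
back in.

```python
import os
import shutil
from collections import defaultdict


def group_by_prefix(directory):
    """Map each prefix (text before the first dot) to its sorted filenames."""
    groups = defaultdict(list)
    for fname in sorted(os.listdir(directory)):
        # Skip earlier outputs so a re-run does not fold them back in.
        if fname.startswith("Q") and "." in fname and ".concat." not in fname:
            groups[fname.split(".", 1)[0]].append(fname)
    return groups


def concatenate_groups(directory):
    """Write <prefix>.concat.fasta for every group; return the output names."""
    written = []
    for prefix, fnames in sorted(group_by_prefix(directory).items()):
        out_name = prefix + ".concat.fasta"
        with open(os.path.join(directory, out_name), "wb") as out:
            for fname in fnames:
                # Copy the raw bytes of each input file onto the output.
                with open(os.path.join(directory, fname), "rb") as src:
                    shutil.copyfileobj(src, out)
        written.append(out_name)
    return written
```

Calling concatenate_groups(mydir) would then write one file per Q-number.
Grouping on the text before the first dot means Q1 and Q12 stay separate,
which a plain startswith("Q1") test would not give you.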

HTH,
Pat
-- 
http://mail.python.org/mailman/listinfo/python-list
