On 2017-05-23 19:29, Mahmood Naderan via Python-list wrote:
> There are some text files ending with _chunk_i where 'i' is an
> integer. For example,
>
> XXX_chunk_0
> XXX_chunk_1
> ...
>
> I want to concatenate them in order. Thing is that the total number
> of files may be variable. Therefore, I can not specify the number
> in my python script. It has to be "for all files ending with
> _chunk_i".
>
> Next, I can write
>
> with open('final.txt', 'w') as outf:
>     for fname in filenames:
>         with open(fname) as inf:
>             for line in inf:
>                 outf.write(line)
>
>
> How can I specify the "filenames"?

Does the *file* or the *filename* end in _chunk_i? If it's the
filename and they come in already in order, you can just skip the
ones that don't match:

  # inside the "with open('final.txt', 'w') as outf:" block
  for fname in filenames:
      # assumes every name contains at least one '_';
      # otherwise the unpacking raises ValueError
      *_, chunk, i = fname.split('_')
      if chunk == "chunk" and i.isdigit():
          with open(fname) as inf:
              for line in inf:
                  outf.write(line)

If they're not sorted, you'd have to sort & filter them first. I'd
recommend a sorting & filtering generator:

  import re

  # raw string so that \d reaches the regex engine unchanged
  interesting_re = re.compile(r'chunk_(\d+)$', re.I)

  def filter_and_sort(filenames):
      yield from sorted((
          fname
          for fname in filenames
          if interesting_re.search(fname)
          ),
          # sort on the trailing integer, so chunk_10 follows chunk_9
          key=lambda v: int(v.rsplit('_', 1)[-1])
          )

  for fname in filter_and_sort(filenames):
      with open(fname) as inf:
          for line in inf:
              outf.write(line)

If the "chunk_i" is *content* in the file, it's a good bit more work:
search through all the files for the data, note which file contains
which tag, then reopen/seek(0) each file and write them out in order
(you'd also have to consider the edge case where a file has more than
one "chunk_i" that straddles other files).

-tkc
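
The snippets above take `filenames` as given, which still leaves the original question open. Here is a minimal sketch, assuming the suffix is in the filename and that all the chunks sit in a single directory. It lists the directory with pathlib, keeps the names that match the same regex, and sorts on the captured integer. The names CHUNK_RE, find_chunks, and concatenate are made up for this example, and the default directory "." is only a guess at where the files live.

```python
import re
from pathlib import Path

# Same pattern as in the reply: the name must end in "chunk_<digits>".
CHUNK_RE = re.compile(r'chunk_(\d+)$', re.I)


def find_chunks(directory="."):
    """Return the files in `directory` named ..._chunk_<i>, ordered by i."""
    matches = []
    for path in Path(directory).iterdir():
        m = CHUNK_RE.search(path.name)
        if m and path.is_file():
            matches.append((int(m.group(1)), path))
    # Numeric sort, so chunk_10 lands after chunk_9 rather than after chunk_1.
    matches.sort(key=lambda pair: pair[0])
    return [path for _, path in matches]


def concatenate(directory=".", out_name="final.txt"):
    """Write every chunk found in `directory`, in order, to `out_name`."""
    chunks = find_chunks(directory)
    with open(out_name, "w") as outf:
        for path in chunks:
            with open(path) as inf:
                for line in inf:
                    outf.write(line)
    return chunks
```

Calling `concatenate()` with no arguments would pick up however many chunk files exist in the current directory, so the script never needs to know the count in advance. Files without the suffix, including final.txt itself, are ignored.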