Re: [Tutor] Creating one file out of all the files in a directory

Evert Rol Thu, 11 Nov 2010 04:45:01 -0800

> I'm trying to create a script to do the following. I have a directory
> containing hundreds of text files. I need to create a single file with
> the contents of all the files in the directory. Within that file,
> though, I need to create marks that indicate the division between the
> contents of each file that has wound up in that single file.
> 
> I got this far but I'm stumped to continue:
> 
> ----------------- code--------
> import os
> path = '/Volumes/DATA/MyPath'
> os.chdir(path)
> file_names = glob.glob('*.txt')


You don't use file_names any further (and note that glob is used here without 
an "import glob" at the top). Depending on whether you want files from 
subdirectories or not, you can use either the os.walk loop below or file_names.
In the latter case, your loop simply becomes:
for file in file_names:
  f = open(file, 'r')
  etc

Though I would use filename instead of file, since file is a built-in name in 
Python 2 and the loop variable would shadow it:
>>> file
<type 'file'>
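
If you do go the os.walk route, one thing to keep in mind is that os.walk 
yields bare file names, so each name has to be joined with its directory before 
it can be opened. A minimal sketch, assuming you only want '*.txt' files; the 
helper name iter_text_files is made up for illustration:

```python
import os

def iter_text_files(top):
    """Yield the full path of every .txt file under top, subdirectories included."""
    for subdir, dirs, files in os.walk(top):
        for filename in sorted(files):
            if filename.endswith('.txt'):
                # os.walk gives bare names; join with subdir to get an openable path
                yield os.path.join(subdir, filename)

# Possible usage (path taken from the original post):
# for filename in iter_text_files('/Volumes/DATA/MyPath'):
#     ...
```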


> for subdir, dirs, files in os.walk(path):
>    for file in files:
>        f = open(file, 'r')
>        text = f.readlines()

Since you don't care about lines in your files, but just the entire file 
contents, you could also simply use
data = f.read()


>        f.close()
>        f = open(file, 'a')

You're opening the same file you just read from and appending to it. Since you 
do that for every file, each input file just gets a separator tacked onto its 
own end, which doesn't make much sense, imho.
But see further down.

>        f.write('\n\n' + '________________________________' + '\n')

So close ;-).
What you're missing is the next write statement:
f.write(data)

(or 
f.write(''.join(text))
which shows why read() is nicer in this case: readlines() returns a list, not 
just a single string).
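
To make the difference concrete, here is a small sketch that writes a 
throwaway file and reads it back both ways. The temporary file and its two 
lines of content are just assumptions for the demonstration:

```python
import os
import tempfile

# Create a small throwaway file so the comparison can actually run.
handle, sample_path = tempfile.mkstemp(suffix='.txt')
os.close(handle)
f = open(sample_path, 'w')
f.write('first line\nsecond line\n')
f.close()

f = open(sample_path, 'r')
text = f.readlines()   # a list of strings, one per line
f.close()

f = open(sample_path, 'r')
data = f.read()        # a single string holding the whole file
f.close()

os.remove(sample_path)

# Joining the list gives back the same string that read() returned.
same = (''.join(text) == data)
```

So text has to be joined before it can be passed to write(), while data can 
be written as it is.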

>        f.close()

But actually, you can open and close the output file outside the entire loop; 
just give it a different name. E.g., before the first loop:
outfile = open('outputfile', 'w')

and in the loop:
    outfile.write(data)

and after the loop, of course:
outfile.close()

In this case, though, there's one thing to watch out for: glob or os.walk may 
pick up your newly created (still empty) output file, so you should either put 
the combined file in a different directory (best practice) or insert an 
if-statement to check that the file name != 'outputfile'.
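
Putting the pieces together, something along these lines should work. It is 
only a sketch: the function name combine_files, its parameters and the 
SEPARATOR constant are mine, and it assumes you want the glob variant (no 
subdirectories) with the separator written before each file's contents:

```python
import glob
import os

SEPARATOR = '\n\n' + '________________________________' + '\n'

def combine_files(source_dir, output_path, pattern='*.txt'):
    """Write every file matching pattern in source_dir into output_path."""
    output_abs = os.path.abspath(output_path)
    outfile = open(output_path, 'w')          # opened once, before the loop
    for filename in sorted(glob.glob(os.path.join(source_dir, pattern))):
        # Skip the output file itself in case it lives in source_dir.
        if os.path.abspath(filename) == output_abs:
            continue
        f = open(filename, 'r')
        data = f.read()
        f.close()
        outfile.write(SEPARATOR)              # mark the start of this file
        outfile.write(data)
    outfile.close()                           # closed once, after the loop

# Possible usage (the output location is an assumption):
# combine_files('/Volumes/DATA/MyPath', '/Volumes/DATA/combined.txt')
```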


Finally, depending on the version of Python you're using, there are nice things 
you can do with the 'with' statement, which has a big advantage in case of file 
I/O errors: the file gets closed for you even if an error is raised (and you're 
not checking for any read errors at the moment).
See e.g. http://effbot.org/zone/python-with-statement.htm (bottom part for an 
example) or Google around.
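
As a rough sketch of what that looks like for a single input file (the helper 
name copy_into is made up; 'with' needs Python 2.6 or later, or 2.5 with 
"from __future__ import with_statement"):

```python
def copy_into(filename, outfile):
    """Write the contents of filename to the already-open outfile."""
    with open(filename, 'r') as f:
        # f is closed automatically when this block is left,
        # whether normally or because of an exception
        outfile.write(f.read())
```

Inside the loop you would then call copy_into(filename, outfile) instead of 
the separate open/read/close lines.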

Cheers,

  Evert


> ------------
> 
> What's missing here is obvious. This iterates over all the files and
> creates the mark for the division at the end of each file. There is
> nothing, however, to pipe the output of this loop into a new file.
> I've checked the different manuals I own plus some more on the
> internet but I can't figure out how to do what's left.
> 
> I could get by with a little help from my Tutor friends.
> 
> Josep M.
> _______________________________________________
> Tutor maillist  -  [email protected]
> To unsubscribe or change subscription options:
> http://mail.python.org/mailman/listinfo/tutor
