Re: exclude binary files from os.walk

2005-02-25 Thread Bengt Richter
On Wed, 26 Jan 2005 18:25:09 -0500, "Dan Perl" <[EMAIL PROTECTED]> wrote: > >"rbt" <[EMAIL PROTECTED]> wrote in message >news:[EMAIL PROTECTED] >> Is there an easy way to exclude binary files (I'm working on Windows XP) >> from the file list returned by os.walk()? >> >> Also, when reading files

Re: exclude binary files from os.walk

2005-01-27 Thread Alex Martelli
Craig Ringer <[EMAIL PROTECTED]> wrote: > That's not really safe when dealing with utf-8 files though, and IIRC > with UCS2 or UCS4 as well. The Unicode BOM itself might (I'm not sure) > qualify as ASCII. Nope, both bytes in the BOM have the high-order bit set -- they're 0xFF and 0xFE -- so the
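
The point about the BOM bytes can be checked directly. The following is a minimal sketch, not code from the thread: the helper name `all_high_bit` is made up for illustration, and the BOM constants come from the standard `codecs` module.

```python
import codecs


def all_high_bit(data: bytes) -> bool:
    """Return True if every byte in data has bit 7 set (value >= 0x80)."""
    return all(b & 0x80 for b in data)


# The UTF-16 BOM is the byte pair 0xFF 0xFE (little-endian order),
# so neither byte falls in the 7-bit ASCII range.
print(codecs.BOM_UTF16_LE, all_high_bit(codecs.BOM_UTF16_LE))

# The UTF-8 BOM (0xEF 0xBB 0xBF) also consists only of high-bit bytes.
print(codecs.BOM_UTF8, all_high_bit(codecs.BOM_UTF8))

# The UTF-32 little-endian BOM is 0xFF 0xFE 0x00 0x00: its two NUL bytes
# have bit 7 clear, although the leading 0xFF still gives it away.
print(codecs.BOM_UTF32_LE, all_high_bit(codecs.BOM_UTF32_LE))
```

Under a "non-ascii means some byte is above 0x7F" definition, a file that starts with any of these BOMs would therefore be flagged by its first byte.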

Re: exclude binary files from os.walk

2005-01-27 Thread Alex Martelli
rbt <[EMAIL PROTECTED]> wrote: > Grant Edwards wrote: > > On 2005-01-26, rbt <[EMAIL PROTECTED]> wrote: > > > >>Is there an easy way to exclude binary files (I'm working on > >>Windows XP) from the file list returned by os.walk()? > > > > Sure, assuming you can provide a rigorous definition of '

Re: exclude binary files from os.walk

2005-01-27 Thread Mark McEahern
The OP wrote:
> Is there an easy way to exclude binary files (I'm working on Windows XP) from the file list returned by os.walk()?
Sure, piece of cake:
    #!/usr/bin/env python
    import os
    def textfiles(path):
        include = ('.txt', '.csv',)
        for root, dirs, files in os.walk(path):
            for name in
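
The snippet above is cut off in the archive, so the rest of the original function is not known. As a separate illustration of the same idea (filtering `os.walk()` results by an extension whitelist), here is a minimal sketch. The names `TEXT_EXTENSIONS` and `text_files_by_extension` are hypothetical and the case-insensitive comparison is an assumption, not something the post states.

```python
import os

# Hypothetical whitelist; adjust to whatever extensions count as "text".
TEXT_EXTENSIONS = ('.txt', '.csv')


def text_files_by_extension(path, extensions=TEXT_EXTENSIONS):
    """Yield paths under `path` whose extension is in the whitelist."""
    for root, dirs, files in os.walk(path):
        for name in files:
            # Compare in lower case, since file names on Windows
            # often differ only in case (README.TXT vs readme.txt).
            if os.path.splitext(name)[1].lower() in extensions:
                yield os.path.join(root, name)


if __name__ == '__main__':
    for found in text_files_by_extension('.'):
        print(found)
```

This only looks at names, so it inherits the weakness raised elsewhere in the thread: a renamed file passes or fails the filter regardless of its contents.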

Re: exclude binary files from os.walk

2005-01-26 Thread Craig Ringer
On Wed, 2005-01-26 at 17:32 -0500, rbt wrote: > Grant Edwards wrote: > > On 2005-01-26, rbt <[EMAIL PROTECTED]> wrote: > > > > > >>Is there an easy way to exclude binary files (I'm working on > >>Windows XP) from the file list returned by os.walk()? > > > > > > Sure, assuming you can provide a

Re: exclude binary files from os.walk

2005-01-26 Thread Grant Edwards
On 2005-01-26, Larry Bates <[EMAIL PROTECTED]> wrote: > There's no definitive way of telling a file is "non-ascii". > Bytes in a binary file define perfectly good ascii characters. As long as bit 7 is a 0. Traditional ASCII only allows/defines the values 0x00 through 0x7f. If that's what is m
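
The "bit 7 is a 0" criterion is simple to state in code. A minimal sketch, with `is_seven_bit` as a made-up helper name:

```python
def is_seven_bit(data: bytes) -> bool:
    """Return True if every byte lies in the ASCII range 0x00 through 0x7F."""
    return all(b < 0x80 for b in data)


print(is_seven_bit(b'plain text\r\n'))   # only values below 0x80
print(is_seven_bit(b'caf\xe9'))          # 0xE9 has bit 7 set
print(is_seven_bit(b'MZ\x00\x01\x02'))   # control bytes, yet all below 0x80
```

The last call illustrates the objection quoted above: a run of bytes can satisfy the 7-bit test without being anything a person would call text.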

Re: exclude binary files from os.walk

2005-01-26 Thread Dan Perl
"rbt" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > Is there an easy way to exclude binary files (I'm working on Windows XP) > from the file list returned by os.walk()? > > Also, when reading files and you're unsure as to whether or not they are > ascii or binary, I've always th

Re: exclude binary files from os.walk

2005-01-26 Thread [EMAIL PROTECTED]
you might want to look up the 'isascii' function... i.e. - can be represented using just 7-bits. -- http://mail.python.org/mailman/listinfo/python-list

Re: exclude binary files from os.walk

2005-01-26 Thread Larry Bates
There's no definitive way of telling a file is "non-ascii". Bytes in a binary file define perfectly good ascii characters. Windows depends on file extensions to try to keep track of the "type" of data in a file, but that isn't foolproof. I can rename a plain ascii file with a .EXE extension. We

Re: exclude binary files from os.walk

2005-01-26 Thread rbt
Grant Edwards wrote:
> On 2005-01-26, rbt <[EMAIL PROTECTED]> wrote:
>> Is there an easy way to exclude binary files (I'm working on Windows XP) from the file list returned by os.walk()?
> Sure, assuming you can provide a rigorous definition of 'binary files'. :)
non-ascii

Re: exclude binary files from os.walk

2005-01-26 Thread Grant Edwards
On 2005-01-26, rbt <[EMAIL PROTECTED]> wrote: > Is there an easy way to exclude binary files (I'm working on > Windows XP) from the file list returned by os.walk()? Sure, assuming you can provide a rigorous definition of 'binary files'. :) > Also, when reading files and you're unsure as to whet

exclude binary files from os.walk

2005-01-26 Thread rbt
Is there an easy way to exclude binary files (I'm working on Windows XP) from the file list returned by os.walk()? Also, when reading files and you're unsure as to whether or not they are ascii or binary, I've always thought it safer to 'rb' on the read, is this correct... and if so, what's the
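
One way to combine the two parts of the question is to open each file from `os.walk()` in 'rb' mode and keep only those whose bytes all fall in the 7-bit range. This is a minimal sketch under that "non-ascii" definition, not an answer taken from the thread. The names `looks_ascii`, `ascii_files` and `CHUNK_SIZE` are hypothetical, and skipping unreadable files is an assumption.

```python
import os

CHUNK_SIZE = 8192  # arbitrary read size, chosen only for this sketch


def looks_ascii(path, chunk_size=CHUNK_SIZE):
    """Return True if every byte in the file is in the range 0x00-0x7F."""
    # 'rb' returns the raw bytes, with no newline translation or decoding.
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return True  # reached end of file without a high-bit byte
            if any(b >= 0x80 for b in chunk):
                return False


def ascii_files(top):
    """Yield paths under `top` whose contents pass looks_ascii()."""
    for root, dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            try:
                if looks_ascii(path):
                    yield path
            except OSError:
                # Unreadable files (permissions, locks) are skipped here.
                continue


if __name__ == '__main__':
    for found in ascii_files('.'):
        print(found)
```

Reading in chunks lets the check stop at the first high-bit byte instead of loading a large file whole. An empty file passes the test, which may or may not be what is wanted.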