Re: Stripping ASCII codes when parsing

2005-10-17 Thread Tony Nelson

In article <[EMAIL PROTECTED]>, David Pratt <[EMAIL PROTECTED]> wrote:
> This is very nice :-) Thank you Tony. I think this will be the way to
> go. My concern ATM is where it will be best to unicode. The data after
> this will go into dict and a few processes and into database. Because

Re: Stripping ASCII codes when parsing

2005-10-17 Thread Erik Max Francis

David Pratt wrote:
> I am working with a text format that advises to strip any ascii control
> characters (0 - 30) as part of parsing data and also the ascii pipe
> character (124) from the data. I think many of these characters are
> from a different time. Since I have never seen most of these

Re: Stripping ASCII codes when parsing

2005-10-17 Thread David Pratt

This is very nice :-) Thank you Tony. I think this will be the way to go. My concern ATM is where it will be best to unicode. The data after this will go into dict and a few processes and into database. Because input source if not explicit encoding, I will have to assume ISO-8859-1 I bel

Re: Stripping ASCII codes when parsing

2005-10-17 Thread David Pratt

Hi Steve. My plan is to parse the data, removing the control characters, and validate the data as records are being added to a dictionary. I am going to Unicode after this step but before it gets into storage (in which case I think the translate method could work well). The encoding itself is
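
For readers of the archive, the translate-based approach mentioned above can be sketched roughly as follows. This is not code from the thread: the names DELETE_TABLE and clean_field are hypothetical, the syntax is modern Python 3, ISO-8859-1 is only the encoding the poster says he would assume, and decoding before stripping is a choice made for this sketch rather than something the thread settles.

```python
# Hypothetical sketch, not code posted in the thread.
# Code points 0-30 (the range the format description gives) plus the
# pipe character (124) are mapped to None, which makes str.translate
# delete them.
DELETE_TABLE = {code: None for code in list(range(31)) + [ord("|")]}


def clean_field(raw_bytes, encoding="iso-8859-1"):
    """Decode one raw field and drop the unwanted characters.

    The default encoding is an assumption taken from the discussion;
    pass a different one if the input source declares it explicitly.
    """
    text = raw_bytes.decode(encoding)
    return text.translate(DELETE_TABLE)


cleaned = clean_field(b"caf\xe9|\x07 ok")  # bell and pipe are removed
```

Because the table is built once, the same mapping can be reused for every record as it is added to the dictionary.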

Re: Stripping ASCII codes when parsing

2005-10-17 Thread Tony Nelson

In article <[EMAIL PROTECTED]>, David Pratt <[EMAIL PROTECTED]> wrote:
> I am working with a text format that advises to strip any ascii control
> characters (0 - 30) as part of parsing data and also the ascii pipe
> character (124) from the data. I think many of these characters are
> from a

Re: Stripping ASCII codes when parsing

2005-10-17 Thread Steve Holden

David Pratt wrote:
[about ord(), chr() and stripping control characters]
> Many thanks Steve. This is good information. I think this should work
> fine. I was doing a string.replace in a cleanData() method with the
> following characters but don't know if that would have done it. This
> contains

Re: Stripping ASCII codes when parsing

2005-10-17 Thread David Pratt

Many thanks Steve. This is good information. I think this should work fine. I was doing a string.replace in a cleanData() method with the following characters but don't know if that would have done it. This contains all the control characters that I really know about in normal use. ord(c) < 32
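
As an illustration of the ord(c) < 32 test mentioned here, a filter along these lines would work. It is a sketch only, not the code discussed in the thread, and strip_controls is a hypothetical name. Note the assumption it makes: testing for code points below 32 also removes character 31 as well as tab, carriage return and newline, which is slightly wider than the 0 - 30 range quoted from the format description.

```python
# Hypothetical sketch, not code posted in the thread.
def strip_controls(text):
    """Drop characters with code points below 32 and the pipe (124)."""
    return "".join(c for c in text if ord(c) >= 32 and c != "|")


cleaned = strip_controls("name\x00|va\x1flue\r\n")
```

A generator with join avoids the chain of string.replace calls, since each character is examined exactly once.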

Re: Stripping ASCII codes when parsing

2005-10-17 Thread Steve Holden

David Pratt wrote:
> I am working with a text format that advises to strip any ascii control
> characters (0 - 30) as part of parsing data and also the ascii pipe
> character (124) from the data. I think many of these characters are
> from a different time. Since I have never seen most of these

8 matches
