David Pratt wrote:
I am working with a text format that advises to strip any ascii control
characters (0 - 30) as part of parsing data and also the ascii pipe
character (124) from the data. I think many of these characters are
from a different time. Since I have never seen most of these
Many thanks Steve. This is good information. I think this should work
fine. I was doing a string.replace in a cleanData() method with the
following characters but don't know if that would have done it. This
contains all the control characters that I really know about in normal
use. ord(c) < 32
David Pratt wrote:
[about ord(), chr() and stripping control characters]
Many thanks Steve. This is good information. I think this should work
fine. I was doing a string.replace in a cleanData() method with the
following characters but don't know if that would have done it. This
contains
In article [EMAIL PROTECTED],
David Pratt [EMAIL PROTECTED] wrote:
I am working with a text format that advises to strip any ascii control
characters (0 - 30) as part of parsing data and also the ascii pipe
character (124) from the data. I think many of these characters are
from a
Hi Steve. My plan is to parse the data removing the control characters
and validate the data as records are being added to a dictionary. I am
converting to Unicode after this step but before it gets into storage (in
which case I think the translate method could work well).
The encoding itself is
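
For readers following the thread, the translate approach mentioned above could look roughly like the sketch below. It assumes Python 3 and byte-string input, and the names clean_data and STRIP_BYTES are made up for illustration. The format description quoted in the thread gives the control range as 0 - 30; the sketch removes 0 through 31 (all ASCII control characters, including tab and newline) plus the pipe, so adjust the range if your format really means something narrower.

```python
# Assumption: Python 3, raw input held as bytes. All names are hypothetical.
# Bytes 0-31 are the ASCII control characters; b"|" is byte 124 (the pipe).
STRIP_BYTES = bytes(range(32)) + b"|"


def clean_data(raw: bytes) -> bytes:
    """Return a copy of raw with control characters and pipes deleted."""
    # With table=None, bytes.translate only deletes the listed bytes
    # and leaves every other byte unchanged.
    return raw.translate(None, STRIP_BYTES)


print(clean_data(b"name|\x00value\x1f\n"))  # b'namevalue'
```

A single translate call makes one pass over the data, whereas a chain of replace calls makes one pass per character to remove, which is probably why translate was suggested here.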
This is very nice :-) Thank you Tony. I think this will be the way to
go. My concern ATM is where it will be best to convert to Unicode. After
this the data will go into a dict, through a few processes, and into a
database. Because the input source has no explicit encoding, I will have
to assume ISO-8859-1
I
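
One possible ordering for the question above, sketched under the assumption of Python 3 and an input with no declared encoding: strip the unwanted bytes first, then decode once at the boundary so that the dict, the later processing steps and the database only ever see text. The function name to_text and the sample bytes are invented for illustration.

```python
# Assumption: Python 3; to_text and its default encoding are hypothetical.
def to_text(raw: bytes, encoding: str = "iso-8859-1") -> str:
    """Strip control characters and pipes, then decode to a str."""
    # Delete bytes 0-31 and the pipe (124) while the data is still bytes.
    cleaned = raw.translate(None, bytes(range(32)) + b"|")
    # ISO-8859-1 maps every byte value to a code point, so this decode
    # cannot raise; it can still be the wrong guess for the real encoding.
    return cleaned.decode(encoding)


record = to_text(b"caf\xe9|\x07 au lait")
print(record)  # café au lait
```

Decoding early keeps the encoding guess in one place, so if the source later turns out to use a different encoding only this function needs to change.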
In article [EMAIL PROTECTED],
David Pratt [EMAIL PROTECTED] wrote:
This is very nice :-) Thank you Tony. I think this will be the way to
go. My concern ATM is where it will be best to convert to Unicode. After
this the data will go into a dict, through a few processes, and into a
database. Because the input