On Mon, 31 Oct 2011 20:44:45 -0400, Terry Reedy wrote:
[...]
def is_ascii_text(text):
    for c in text:
        if c not in LEGAL:
            return False
    return True
If text is 3.x bytes, this does not work ;-). OP did not specify bytes
or unicode or Python version.
The OP
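Terry's point can be made concrete: in Python 3, iterating over bytes yields ints, so `c not in LEGAL` raises TypeError when LEGAL is a str. Below is a minimal sketch that accepts either type by comparing code points. LEGAL is the definition posted elsewhere in the thread; the name LEGAL_CODES and the isinstance dispatch are assumptions made for illustration, not something anyone in the thread posted.

```python
# LEGAL as posted in the thread: ASCII 32-127 plus newline, CR, tab, formfeed.
LEGAL = ''.join(chr(n) for n in range(32, 128)) + '\n\r\t\f'

# Hypothetical helper: the same characters as integer code points.
LEGAL_CODES = frozenset(ord(c) for c in LEGAL)


def is_ascii_text(text):
    """Return True if every character (or byte) of text is legal."""
    if isinstance(text, str):
        codes = (ord(c) for c in text)
    else:
        # bytes and bytearray already iterate as ints in Python 3.
        codes = iter(text)
    return all(n in LEGAL_CODES for n in codes)
```

With this, `is_ascii_text('hello\n')` and `is_ascii_text(b'hello\n')` take the same path after the first branch, at the cost of an `ord()` call per character for str input.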
On Mon, 31 Oct 2011 22:12:26 -0400, Dave Angel wrote:
I would claim that a well-written (in C) translate function, without
using the delete option, should be much quicker than any python loop,
even if it does copy the data.
I think you are selling short the speed of the Python interpreter.
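Neither claim comes with numbers in these excerpts, so the disagreement is best settled by measuring. The following is a rough timing sketch, assuming Python 3 and str input. The names loop_check and translate_check, the sample text, and the repeat count are all made up for illustration; results will vary by interpreter version and by how early an illegal character appears.

```python
import timeit

LEGAL = ''.join(chr(n) for n in range(32, 128)) + '\n\r\t\f'
# Map every legal character to a space (no deletion involved).
TABLE = str.maketrans(LEGAL, ' ' * len(LEGAL))


def loop_check(text):
    # The pure-Python loop from the thread.
    for c in text:
        if c not in LEGAL:
            return False
    return True


def translate_check(text):
    # After translation only illegal characters are non-space,
    # so stripping spaces leaves '' exactly when the text is clean.
    return not text.translate(TABLE).strip(' ')


if __name__ == '__main__':
    sample = 'The quick brown fox\n' * 5000  # hypothetical all-legal input
    for func in (loop_check, translate_check):
        seconds = timeit.timeit(lambda: func(sample), number=20)
        print(func.__name__, round(seconds, 4))
```

An all-legal sample is the worst case for the loop, since it cannot return early. A sample with an illegal character near the start would favour it.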
Steven D'Aprano wrote:
On Mon, 31 Oct 2011 22:12:26 -0400, Dave Angel wrote:
I would claim that a well-written (in C) translate function, without
using the delete option, should be much quicker than any python loop,
even if it does copy the data.
I think you are selling short the speed
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote:
LEGAL = ''.join(chr(n) for n in range(32, 128)) + '\n\r\t\f'
MASK = ''.join('\01' if chr(n) in LEGAL else '\0' for n in range(128))
# Untested
def is_ascii_text(text):
    for c in text:
        n = ord(c)
        if n =
On Mon, Oct 31, 2011 at 6:32 PM, Patrick Maupin pmau...@gmail.com wrote:
On Oct 31, 5:52 pm, Ian Kelly ian.g.ke...@gmail.com wrote:
For instance, split() will split on vertical tab,
which is not one of the characters the OP wanted.
That's just the default behavior. You can explicitly specify
On 01/11/2011 18:54, Duncan Booth wrote:
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote:
LEGAL = ''.join(chr(n) for n in range(32, 128)) + '\n\r\t\f'
MASK = ''.join('\01' if chr(n) in LEGAL else '\0' for n in range(128))
# Untested
def is_ascii_text(text):
    for c in text:
pyt...@bdurham.com, 31.10.2011 20:54:
Wondering if there's a fast/efficient built-in way to determine
if a string has non-ASCII chars outside the range ASCII 32-127,
CR, LF, or Tab?
I know I can look at the chars of a string individually and
compare them against a set of legal chars using
MRAB pyt...@mrabarnett.plus.com wrote:
On 01/11/2011 18:54, Duncan Booth wrote:
Steven D'Aprano steve+comp.lang.pyt...@pearwood.info wrote:
LEGAL = ''.join(chr(n) for n in range(32, 128)) + '\n\r\t\f'
MASK = ''.join('\01' if chr(n) in LEGAL else '\0' for n in range(128))
# Untested
def
On 11/1/2011 2:56 AM, Steven D'Aprano wrote:
On Mon, 31 Oct 2011 20:44:45 -0400, Terry Reedy wrote:
[...]
def is_ascii_text(text):
    for c in text:
        if c not in LEGAL:
            return False
    return True
If text is 3.x bytes, this does not work ;-). OP did not specify
Wondering if there's a fast/efficient built-in way to determine
if a string has non-ASCII chars outside the range ASCII 32-127,
CR, LF, or Tab?
I know I can look at the chars of a string individually and
compare them against a set of legal chars using standard Python
code (and this works fine),
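For reference, the set comparison the question describes can be written as a single call, which keeps the per-character work inside the set implementation. This is a sketch assuming Python 3 str input; the name has_only_legal_chars is hypothetical, and LEGAL here follows the question's wording (ASCII 32-127 plus CR, LF and Tab, so no formfeed).

```python
# Legal characters as described in the question.
LEGAL = frozenset(chr(n) for n in range(32, 128)) | frozenset('\r\n\t')


def has_only_legal_chars(text):
    """Return True if text contains nothing outside LEGAL."""
    # issuperset() accepts any iterable, so the str is consumed directly.
    return LEGAL.issuperset(text)
```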
On 10/31/2011 03:54 PM, pyt...@bdurham.com wrote:
Wondering if there's a fast/efficient built-in way to determine
if a string has non-ASCII chars outside the range ASCII 32-127,
CR, LF, or Tab?
I know I can look at the chars of a string individually and
compare them against a set of legal chars
On 10/31/2011 05:47 PM, Dave Angel wrote:
On 10/31/2011 03:54 PM, pyt...@bdurham.com wrote:
Wondering if there's a fast/efficient built-in way to determine
if a string has non-ASCII chars outside the range ASCII 32-127,
CR, LF, or Tab?
I know I can look at the chars of a string individually
On Mon, Oct 31, 2011 at 4:08 PM, Dave Angel d...@davea.name wrote:
I was wrong once again. But a simple combination of translate() and
split() methods might do it. Here I'm suggesting that the table replace all
valid characters with space, so the split() can use its default behavior.
That
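A sketch of the translate-then-split idea, assuming Python 3 str input. The table name TO_SPACE and both function names are made up here. The second function uses split() with its default behaviour, as suggested, and is included only to show the vertical-tab problem raised elsewhere in the thread: default split() treats every whitespace character as a separator, so illegal whitespace slips through.

```python
LEGAL = ''.join(chr(n) for n in range(32, 128)) + '\n\r\t\f'
# Every legal character is replaced by a space; illegal ones are untouched.
TO_SPACE = str.maketrans(LEGAL, ' ' * len(LEGAL))


def is_ascii_text(text):
    blanked = text.translate(TO_SPACE)
    # split(' ') separates on the space character only, so any
    # non-empty piece must contain an illegal character.
    return not any(blanked.split(' '))


def is_ascii_text_default_split(text):
    # Default split() also swallows '\x0b' and other whitespace,
    # which makes this variant too permissive.
    return not text.translate(TO_SPACE).split()
```

For example, `is_ascii_text('a\x0bb')` is False, while `is_ascii_text_default_split('a\x0bb')` is True.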
On Mon, 31 Oct 2011 17:47:06 -0400, Dave Angel wrote:
On 10/31/2011 03:54 PM, pyt...@bdurham.com wrote:
Wondering if there's a fast/efficient built-in way to determine if a
string has non-ASCII chars outside the range ASCII 32-127, CR, LF, or
Tab?
I know I can look at the chars of a string
On 10/31/2011 3:54 PM, pyt...@bdurham.com wrote:
Wondering if there's a fast/efficient built-in way to determine if a
string has non-ASCII chars outside the range ASCII 32-127, CR, LF, or Tab?
I presume you also want to disallow the other ascii control chars?
I know I can look at the chars
On 10/31/11 18:02, Steven D'Aprano wrote:
# Define legal characters:
LEGAL = ''.join(chr(n) for n in range(32, 128)) + '\n\r\t\f'
# everybody forgets about formfeed... \f
# and are you sure you want to include chr(127) as a text char?
def is_ascii_text(text):
for c in text:
On Mon, Oct 31, 2011 at 4:08 PM, Dave Angel d...@davea.name wrote:
Yes. Actually, you don't even need the split() -- you can pass an
optional deletechars parameter to translate().
On Oct 31, 5:52 pm, Ian Kelly ian.g.ke...@gmail.com wrote:
That sounds overly complicated and error-prone.
Not
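The deletechars form mentioned above belongs to Python 2's str.translate(). As a sketch of the equivalent in Python 3, assuming nothing beyond the standard library: bytes.translate() takes a delete argument, and str.translate() deletes any character whose code point maps to None. The names LEGAL_BYTES, DELETE_LEGAL and both function names are illustrative only.

```python
# Same legal set as LEGAL in the thread, expressed as bytes.
LEGAL_BYTES = bytes(range(32, 128)) + b'\n\r\t\f'

# Iterating bytes yields ints, so this builds {code_point: None}.
DELETE_LEGAL = dict.fromkeys(LEGAL_BYTES)


def is_ascii_bytes(data):
    # Delete every legal byte; anything left over is illegal.
    return not data.translate(None, LEGAL_BYTES)


def is_ascii_str(text):
    # Mapping a code point to None deletes that character.
    return not text.translate(DELETE_LEGAL)
```

Both versions still build a copy of the input, but the copy is empty for clean text and contains only the offending characters otherwise.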
On 10/31/2011 7:02 PM, Steven D'Aprano wrote:
On Mon, 31 Oct 2011 17:47:06 -0400, Dave Angel wrote:
On 10/31/2011 03:54 PM, pyt...@bdurham.com wrote:
Wondering if there's a fast/efficient built-in way to determine if a
string has non-ASCII chars outside the range ASCII 32-127, CR, LF, or
Tab?
On 10/31/2011 08:32 PM, Patrick Maupin wrote:
On Mon, Oct 31, 2011 at 4:08 PM, Dave Angel d...@davea.name wrote:
Yes. Actually, you don't even need the split() -- you can pass an
optional deletechars parameter to translate().
On Oct 31, 5:52 pm, Ian Kelly ian.g.ke...@gmail.com wrote:
That
On Oct 31, 9:12 pm, Dave Angel d...@davea.name wrote:
I would claim that a well-written (in C) translate function, without
using the delete option, should be much quicker than any python loop,
even if it does copy the data.
Are you arguing with me? I was agreeing with you, I thought, that