Nick Maclaren wrote:
> You can convert them to things that are sort of NFA/DFA
> hybrids,
If you could express it as an NFA, then you could
(in principle) convert it to a DFA. So whatever it's
using can't be an NFA either.
--
Greg
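
[Editorial illustration of the point above, not part of the original message.] The claim that anything expressible as an NFA can in principle be converted to a DFA rests on the standard subset construction. The sketch below is a minimal version of it. The transition-table layout, the function names, and the example automaton for (a|b)*ab are all assumptions chosen for illustration; none of it is taken from Python's RE code.

```python
from collections import deque


def nfa_to_dfa(nfa, start, accepting):
    """Subset construction.

    nfa maps (state, symbol) -> set of next states; the symbol None
    denotes an epsilon move. This layout is a hypothetical one chosen
    for the sketch.
    """
    def closure(states):
        # Follow epsilon moves until no new state is reached.
        seen = set(states)
        stack = list(states)
        while stack:
            s = stack.pop()
            for t in nfa.get((s, None), ()):
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        return frozenset(seen)

    alphabet = {sym for (_, sym) in nfa if sym is not None}
    start_set = closure({start})
    dfa = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for sym in alphabet:
            target = set()
            for s in current:
                target |= nfa.get((s, sym), set())
            if not target:
                continue  # no transition: the DFA rejects here
            nxt = closure(target)
            dfa[(current, sym)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    # A DFA state accepts if it contains any accepting NFA state.
    dfa_accepting = {s for s in seen if s & accepting}
    return dfa, start_set, dfa_accepting


def dfa_accepts(dfa, start_set, dfa_accepting, text):
    """Run the DFA over text, one transition per character."""
    state = start_set
    for ch in text:
        state = dfa.get((state, ch))
        if state is None:
            return False
    return state in dfa_accepting


# Hypothetical NFA for (a|b)*ab: state 0 loops, 0 -a-> 1, 1 -b-> 2.
EXAMPLE_NFA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
}
EXAMPLE_DFA = nfa_to_dfa(EXAMPLE_NFA, 0, {2})
```

Each DFA state is a set of NFA states, so the construction can blow up exponentially in the worst case, which is why "in principle" matters. A backreference has no fixed finite set of states to track, so it cannot be written in a table like this at all.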
James Y Knight <[EMAIL PROTECTED]> wrote:
>
> > Firstly, things like backreferences are an absolute no-no. They
> > are not regular, and REs with them in cannot be converted to DFAs.
> > That could be 'solved' by a parser that kicked out such constructions,
> > but it would get screams from many u
"Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
>
> Your specification was "For Unicode, whatever people agree!"
>
> I would not call that "Unicode-based".
Can we drop this, please? I am happy to agree that I was being unclear
(it is a common failing of mine), but I did prov
James Y Knight wrote:
> On Aug 8, 2007, at 3:47 PM, Nick Maclaren wrote:
> > Firstly, things like backreferences are an absolute no-no. They
> > are not regular, and REs with them in cannot be converted to DFAs.
>
> People keep saying things like this as if GNU grep and tcl's regular
> expressio
On Aug 8, 2007, at 3:47 PM, Nick Maclaren wrote:
> Firstly, things like backreferences are an absolute no-no. They
> are not regular, and REs with them in cannot be converted to DFAs.
> That could be 'solved' by a parser that kicked out such constructions,
> but it would get screams from many user
Martin v. Löwis wrote:
> I know the term "printable character", which is what I read
> in definitions of the isprint() routine. "printing character"
> I never heard before.
Hmmm... I guess this means your brain is using a
part-of-speech-sensitive word->technical_meaning
mapping.
Perhaps this will
Nick Maclaren schrieb:
>> The relevance is that your specification of "printing character"
>> as "isprint returns true" is nearly useless, as it only applies
>> to byte-oriented characters.
>
> Eh? That's ALL I used it to specify! I used a Unicode-based
> specification for Unicode.
Your specifi
"Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
>
> There is no problem for isalnum: it will just go away if
> byte-oriented characters go away. Fortunately, we have a
> replacement for the Unicode case.
As we do for isprint.
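
[Editorial illustration, not part of the original message.] One way a Unicode counterpart to isprint() could be specified is by general category. The sketch below assumes that "printing" means any character outside the control, format, surrogate, private-use, unassigned, and line/paragraph separator categories. That category list and the name is_printing are assumptions made for this example; they are not the specification proposed in the thread.

```python
import unicodedata

# Assumption: these general categories are treated as non-printing.
# Cc control, Cf format, Cs surrogate, Co private use, Cn unassigned,
# Zl line separator, Zp paragraph separator.
NON_PRINTING = {"Cc", "Cf", "Cs", "Co", "Cn", "Zl", "Zp"}


def is_printing(ch):
    """Return True if the single character ch counts as printing
    under the category-based assumption above."""
    return unicodedata.category(ch) not in NON_PRINTING
```

Like C's isprint(), this sketch counts the ordinary space as printing, since U+0020 is in category Zs.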
> The relevance is that your specification of "pr
>> In the mediate term, locale-based testing will go away/be not
>> implementable (in particular, Py3k won't have a byte-oriented
>> character string type, so we can't use isprint). In general,
>> isprint is unsuitable since it doesn't support multi-byte
>> character sets.
>
> Well, iswprint isn't
"Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
>
> >> Before discussing the escape, I'd like to see a specification of
> >> it first - what characters precisely would classify as "printing"?
> >
> > For basic ASCII and locale-based testing, whatever isprint() says.
> > Just
>> Before discussing the escape, I'd like to see a specification of
>> it first - what characters precisely would classify as "printing"?
>
> For basic ASCII and locale-based testing, whatever isprint() says.
> Just as for isalpha().
In the mediate term, locale-based testing will go away/be not
i
On 8-Aug-07, at 12:47 PM, Nick Maclaren wrote:
>
>>> The other approach, which is to stick to true regular expressions,
>>> and wholly or partially convert to DFAs, has already been rendered
>>> impossible by even the limited Perl/PCRE extensions that Python
>>> has adopted.
>>
>> Impossible? Sur
I am not on "Python 3000", so am restricting.
Mike Klaas <[EMAIL PROTECTED]> wrote:
>
> > I have needed to push my stack to teach REs (don't ask), and am
> > taking a look at the RE code. I may be able to extend it to support
> > RFE 694374 and (more importantly) atomic groups and possessive
> >
[ I would appreciate not getting private copies as well. ]
"Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
>
> Before discussing the escape, I'd like to see a specification of
> it first - what characters precisely would classify as "printing"?
For basic ASCII and locale-bas
On 8-Aug-07, at 2:28 AM, Nick Maclaren wrote:
> I have needed to push my stack to teach REs (don't ask), and am
> taking a look at the RE code. I may be able to extend it to support
> RFE 694374 and (more importantly) atomic groups and possessive
> quantifiers. While I regard such things as revo
> Further to the above, I found the Unicode sources, have rebuilt
> the files, but it involved some fairly serious hacking to the
> building mechanism and I have had to disable the Unicode 3.2
> support. And, of course, that means that 4 of the tests fail.
>
> This area needs addressing, not leas
> My second one is about Unicode. I really, but REALLY regard it as
> a serious defect that there is no escape for printing characters.
> Any code that checks arbitrary text is likely to need them - yes,
> I know why Perl and hence PCRE doesn't have that, but let's skip
> that. That is easy to ad
Nick Maclaren schrieb:
> Further to the above, I found the Unicode sources, have rebuilt
> the files, but it involved some fairly serious hacking to the
> building mechanism and I have had to disable the Unicode 3.2
> support. And, of course, that means that 4 of the tests fail.
>
> This area nee
Further to the above, I found the Unicode sources, have rebuilt
the files, but it involved some fairly serious hacking to the
building mechanism and I have had to disable the Unicode 3.2
support. And, of course, that means that 4 of the tests fail.
This area needs addressing, not least because Py
I have needed to push my stack to teach REs (don't ask), and am
taking a look at the RE code. I may be able to extend it to support
RFE 694374 and (more importantly) atomic groups and possessive
quantifiers. While I regard such things as revolting beyond belief,
they make a HELL of a difference t