Re: Regex Python Help

2015-03-24 Thread Skip Montanaro
On Tue, Mar 24, 2015 at 1:13 PM, gdot...@gmail.com wrote:

 SyntaxError: Missing parentheses in call to 'print'


It appears you are attempting to use a Python 2.x print statement with
Python 3.x. Try changing the last line to

print(line.rstrip())

Skip
-- 
https://mail.python.org/mailman/listinfo/python-list


[issue2636] Adding a new regex module (compatible with re)

2015-03-18 Thread Evgeny Kapun

Changes by Evgeny Kapun abacabadabac...@gmail.com:


--
nosy: +abacabadabacaba

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue2636
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



Re: regex help

2015-03-13 Thread Thomas 'PointedEars' Lahn
Larry Martell wrote:

 I need to remove all trailing zeros to the right of the decimal point,
 but leave one zero if it's whole number. For example, if I have this:
 
 
14S,5.,4.5686274500,3.7272727272727271,3.3947368421052630,5.7307692307692308,5.7547169811320753,4.9423076923076925,5.7884615384615383,5.13725490196
 
 I want to end up with:
 
 
14S,5.0,4.56862745,3.7272727272727271,3.394736842105263,5.7307692307692308,5.7547169811320753,4.9423076923076925,5.7884615384615383,5.13725490196
 
 I have a regex to remove the zeros:
 
 '0+[,$]', ''
 
 But I can't figure out how to get the 5. to be 5.0.
 I've been messing with the negative lookbehind, but I haven't found
 one that works for this.

First of all, I find it unlikely that you really want to solve your problem 
with regular expressions.  Google “X-Y problem”.

Second, if you must use regular expressions, the most simple approach is to 
use backreferences.

Third, you need to show the relevant (Python) code.

http://www.catb.org/~esr/faqs/smart-questions.html

-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.


Re: regex help

2015-03-13 Thread Tim Chase
On 2015-03-13 12:05, Larry Martell wrote:
 I need to remove all trailing zeros to the right of the decimal
 point, but leave one zero if it's whole number. 
 
 But I can't figure out how to get the 5. to be 5.0.
 I've been messing with the negative lookbehind, but I haven't found
 one that works for this.

You can do it with string-ops, or you can resort to regexp.
Personally, I like the clarity of the string-ops version, but use
what suits you.

-tkc

import re
input = [
'14S',
'5.',
'4.5686274500',
'3.7272727272727271',
'3.3947368421052630',
'5.7307692307692308',
'5.7547169811320753',
'4.9423076923076925',
'5.7884615384615383',
'5.13725490196',
]

output = [
'14S',
'5.0',
'4.56862745',
'3.7272727272727271',
'3.394736842105263',
'5.7307692307692308',
'5.7547169811320753',
'4.9423076923076925',
'5.7884615384615383',
'5.13725490196',
]


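def fn1(s):
    # String operations: strip trailing zeros, then restore one after a bare dot.
    if '.' in s:
        s = s.rstrip('0')
        if s.endswith('.'):
            s += '0'
    return s

def fn2(s):
    # Regexp: keep the dot and at least one digit, drop the zeros after them.
    # A bare '5.' has no digit after the dot, so it is left unchanged.
    return re.sub(r'(\.\d+?)0+$', r'\1', s)

for fn in (fn1, fn2):
    for i, o in zip(input, output):
        v = fn(i)
        print("%s: %s -> %s [%s]" % (v == o, i, v, o))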


Re: regex help

2015-03-13 Thread MRAB

On 2015-03-13 16:05, Larry Martell wrote:

I need to remove all trailing zeros to the right of the decimal point,
but leave one zero if it's whole number. For example, if I have this:

14S,5.,4.5686274500,3.7272727272727271,3.3947368421052630,5.7307692307692308,5.7547169811320753,4.9423076923076925,5.7884615384615383,5.13725490196

I want to end up with:

14S,5.0,4.56862745,3.7272727272727271,3.394736842105263,5.7307692307692308,5.7547169811320753,4.9423076923076925,5.7884615384615383,5.13725490196

I have a regex to remove the zeros:

'0+[,$]', ''

But I can't figure out how to get the 5. to be 5.0.
I've been messing with the negative lookbehind, but I haven't found
one that works for this.


Search: (\.\d+?)0+\b
Replace: \1

which is:

re.sub(r'(\.\d+?)0+\b', r'\1', string)
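For anyone wanting to try this substitution, here is a minimal sketch. The helper name strip_trailing_zeros and the sample line (a shortened variant of the data above, with 5.000 added) are made up for illustration; only the pattern comes from the message.

```python
import re

def strip_trailing_zeros(text):
    # \.\d+? lazily keeps the dot plus at least one digit;
    # 0+\b then consumes the zeros that end the number.
    return re.sub(r'(\.\d+?)0+\b', r'\1', text)

# Hypothetical sample, shortened from the data in the thread.
sample = '14S,5.000,4.5686274500,3.3947368421052630,5.13725490196'
print(strip_trailing_zeros(sample))
# 14S,5.0,4.56862745,3.394736842105263,5.13725490196
```

Because the pattern requires at least one digit after the dot, a field that is already a bare 5. passes through unchanged and would have to be padded separately.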



Re: regex help

2015-03-13 Thread Larry Martell
On Fri, Mar 13, 2015 at 1:29 PM, MRAB pyt...@mrabarnett.plus.com wrote:
 On 2015-03-13 16:05, Larry Martell wrote:

 I need to remove all trailing zeros to the right of the decimal point,
 but leave one zero if it's whole number. For example, if I have this:


 14S,5.,4.5686274500,3.7272727272727271,3.3947368421052630,5.7307692307692308,5.7547169811320753,4.9423076923076925,5.7884615384615383,5.13725490196

 I want to end up with:


 14S,5.0,4.56862745,3.7272727272727271,3.394736842105263,5.7307692307692308,5.7547169811320753,4.9423076923076925,5.7884615384615383,5.13725490196

 I have a regex to remove the zeros:

 '0+[,$]', ''

 But I can't figure out how to get the 5. to be 5.0.
 I've been messing with the negative lookbehind, but I haven't found
 one that works for this.

 Search: (\.\d+?)0+\b
 Replace: \1

 which is:

 re.sub(r'(\.\d+?)0+\b', r'\1', string)

Thanks! That works perfectly.


Re: regex help

2015-03-13 Thread Cameron Simpson

On 13Mar2015 12:05, Larry Martell larry.mart...@gmail.com wrote:

I need to remove all trailing zeros to the right of the decimal point,
but leave one zero if it's whole number. For example, if I have this:

14S,5.,4.5686274500,3.7272727272727271,3.3947368421052630,5.7307692307692308,5.7547169811320753,4.9423076923076925,5.7884615384615383,5.13725490196

I want to end up with:

14S,5.0,4.56862745,3.7272727272727271,3.394736842105263,5.7307692307692308,5.7547169811320753,4.9423076923076925,5.7884615384615383,5.13725490196

I have a regex to remove the zeros:

'0+[,$]', ''

But I can't figure out how to get the 5. to be 5.0.
I've been messing with the negative lookbehind, but I haven't found
one that works for this.


Leaving aside the suggested non-greedy match, you can rephrase this: strip 
trailing zeroes _after_ the first decimal digit. Then you can consider a number 
to be:


 digits
 point
 any digit
 other digits to be right-zero stripped

so:

 (\d+\.\d)(\d*[1-9])?0*\b

and keep .group(1) and .group(2) from the match.

Another way of considering the problem.

Or you could two step it. Strip all trailing zeroes. If the result ends in a 
dot, add a single zero.
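A rough sketch of the regex variant described above, assuming the fields arrive as one comma-separated string. The names NUMBER, keep_groups and strip_zeros are illustrative and not from the original post.

```python
import re

# digits, point, one digit that is always kept, then optional digits
# ending in a non-zero digit; trailing zeros are matched but not captured.
NUMBER = re.compile(r'(\d+\.\d)(\d*[1-9])?0*\b')

def keep_groups(match):
    # group(2) is None when nothing but zeros follows the first decimal.
    return match.group(1) + (match.group(2) or '')

def strip_zeros(text):
    return NUMBER.sub(keep_groups, text)

print(strip_zeros('14S,5.000,4.5686274500,3.10'))
# 14S,5.0,4.56862745,3.1
```

A callable replacement is used here so the sketch does not depend on how a template such as r'\1\2' treats a group that did not take part in the match, which was an error on older Python versions.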


Cheers,
Cameron Simpson c...@zip.com.au

C'mon. Take the plunge. By the time you go through rehab the first time,
you'll be surrounded by the most interesting people, and if it takes years
off of your life, don't sweat it. They'll be the last ones anyway.
   - Vinnie Jordan, alt.peeves


Re: regex help

2015-03-13 Thread Steven D'Aprano
Larry Martell wrote:

 I need to remove all trailing zeros to the right of the decimal point,
 but leave one zero if it's whole number. 


def strip_zero(s):
    if '.' not in s:
        return s
    s = s.rstrip('0')
    if s.endswith('.'):
        s += '0'
    return s


And in use:

py> strip_zero('-10.2500')
'-10.25'
py> strip_zero('123000')
'123000'
py> strip_zero('123000.')
'123000.0'


It doesn't support exponential format:

py> strip_zero('1.230e3')
'1.230e3'

because it isn't clear what you intend to do under those circumstances.


-- 
Steven



regex help

2015-03-13 Thread Larry Martell
I need to remove all trailing zeros to the right of the decimal point,
but leave one zero if it's whole number. For example, if I have this:

14S,5.,4.5686274500,3.7272727272727271,3.3947368421052630,5.7307692307692308,5.7547169811320753,4.9423076923076925,5.7884615384615383,5.13725490196

I want to end up with:

14S,5.0,4.56862745,3.7272727272727271,3.394736842105263,5.7307692307692308,5.7547169811320753,4.9423076923076925,5.7884615384615383,5.13725490196

I have a regex to remove the zeros:

'0+[,$]', ''

But I can't figure out how to get the 5. to be 5.0.
I've been messing with the negative lookbehind, but I haven't found
one that works for this.
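A small sketch of why the pattern above misbehaves, using a made-up two-field string: inside a character class, $ is a literal dollar sign rather than an end-of-string anchor, and the comma is consumed along with the zeros. The lookahead variant is one possible repair, not something proposed in the post.

```python
import re

fields = '4.500,3.10'   # hypothetical sample

# [,$] matches a literal ',' or '$', so the comma is deleted too and
# the zero at the very end of the string is never matched.
print(re.sub(r'0+[,$]', '', fields))        # 4.53.10

# A lookahead checks for ',' or end of string without consuming it.
print(re.sub(r'0+(?=,|$)', '', fields))     # 4.5,3.1
```

The lookahead version still turns 5.000 into 5. and would also strip the zeros from a whole number such as 100, which is the part of the problem the question is about.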


[issue22364] Improve some re error messages using regex for hints

2015-03-01 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Could anyone please make a review? This patch is a prerequisite of other 
patches.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22364
___



[issue23532] add example of 'first match wins' to regex | documentation?

2015-02-27 Thread Matthew Barnett

Matthew Barnett added the comment:

Not quite all. POSIX regexes will always look for the longest match, so the 
order of the alternatives doesn't matter, i.e. x|xy would give the same result 
as xy|x.
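The difference can be seen from the Python side with a two-line check. This is only an illustration of Python's re behaviour; it does not run a POSIX engine.

```python
import re

# Python's re tries alternatives left to right and takes the first
# one that succeeds at the current position.
print(re.match(r'x|xy', 'xy').group())   # x
print(re.match(r'xy|x', 'xy').group())   # xy
```

Per the comment above, a POSIX leftmost-longest engine would report xy for both patterns.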

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23532
___



[issue23532] regex | behavior differs from documentation

2015-02-26 Thread Rick Otten

Changes by Rick Otten rottenwindf...@gmail.com:


--
components: Regular Expressions
nosy: Rick Otten, ezio.melotti, mrabarnett
priority: normal
severity: normal
status: open
title: regex | behavior differs from documentation
type: behavior
versions: Python 2.7

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23532
___



[issue23532] regex | behavior differs from documentation

2015-02-26 Thread Mark Shannon

Mark Shannon added the comment:

This looks like the expected behaviour to me.
re.sub matches the leftmost occurrence and the regular expression is greedy so 
(x|xy) will always match xy if it can.

--
nosy: +Mark.Shannon

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23532
___



[issue23532] regex | behavior differs from documentation

2015-02-26 Thread Rick Otten

Rick Otten added the comment:

Can the documentation be updated to make this more clear?

I see now where the clause "As the target string is scanned, ..." is describing 
what you have listed here.

I and a coworker both read the description several times and missed that.  I 
thought it first tried 'incorporated' against the whole string, then tried 
' inc' against the whole string, etc.  When actually it was trying each, 
'incorporated' and ' inc' and the others, against the first position of the 
string.  And then again for the second position.

Since I want to force the order against the whole string before trying the next 
one for my particular use case, I'll do a series of re.subs instead of trying 
to do them all in one.  It makes sense now and is easy to fix.
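The series-of-substitutions idea might look like the sketch below. The function name strip_suffixes is hypothetical, and treating each suffix as a literal string via re.escape is an assumption about the use case.

```python
import re

def strip_suffixes(name, suffixes):
    # One pass per suffix: each suffix is tried against the whole
    # string before the next one is considered.
    for suffix in suffixes:
        name = re.sub(re.escape(suffix), '', name)
    return name

suffixes = ['incorporated', ' inc', 'llc', 'corporation', 'corp', ' co']
print(repr(strip_suffixes('rwo incorporated', suffixes)))   # 'rwo '
```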

Thanks for looking at it and explaining what is happening more clearly.  It was 
really not obvious.  I tried at least 100 variations and wasn't seeing the 
pattern.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23532
___



[issue23532] regex | behavior differs from documentation

2015-02-26 Thread Matthew Barnett

Matthew Barnett added the comment:

@Mark is correct, it's not a bug.

In the first example:

It tries to match each alternative at position 0. Failure.
It tries to match each alternative at position 1. Failure.
It tries to match each alternative at position 2. Failure.
It tries to match each alternative at position 3. Success. ' inc' matches.

In the second example:

It tries to match each alternative at position 0. Failure.
It tries to match each alternative at position 1. Failure.
It tries to match each alternative at position 2. Failure.
It tries to match each alternative at position 3. Failure.
It tries to match each alternative at position 4. Success. 'incorporated' 
matches. ('inc' is a later alternative; it's considered only if the earlier 
alternatives have failed to match at that position.)
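The position-by-position scan can be checked with re.search, which reports where the first match starts. The patterns and the string are the ones from the report; the variable names are only for illustration.

```python
import re

text = 'rwo incorporated'

first = re.search(r'incorporated| inc|llc|corporation|corp| co', text)
print(first.start(), repr(first.group()))     # 3 ' inc'

second = re.search(r'incorporated|inc|llc|corporation|corp| co', text)
print(second.start(), repr(second.group()))   # 4 'incorporated'
```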

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23532
___



[issue23532] regex | behavior differs from documentation

2015-02-26 Thread Rick Otten

New submission from Rick Otten:

The documentation states that "|" parsing goes from left to right.  This 
doesn't seem to be true when spaces (or \s) are involved.

Example:

In [40]: mystring
Out[40]: 'rwo incorporated'

In [41]: re.sub('incorporated| inc|llc|corporation|corp| co', '', mystring)
Out[41]: 'rwoorporated'

In this case ' inc' was processed before 'incorporated'.
If I take the space out:

In [42]: re.sub('incorporated|inc|llc|corporation|corp| co', '', mystring)
Out[42]: 'rwo '

'incorporated' is processed first.

If I put a space with each, then ' incorporated' is processed first:

In [43]: re.sub(' incorporated| inc|llc|corporation|corp| co', '', mystring)
Out[43]: 'rwo'

And if I use \s instead of a space, \sinc is processed first:

In [44]: re.sub('incorporated|\sinc|llc|corporation|corp| co', '', mystring)
Out[44]: 'rwoorporated'

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23532
___



[issue23532] add example of 'first match wins' to regex | documentation?

2015-02-26 Thread R. David Murray

R. David Murray added the comment:

The thing is, what you describe is fundamental to how regular expressions work. 
 I'm not sure it makes sense to add a specific mention of it to the '|' docs, 
since it applies to all regexes.

--
assignee:  -> docs@python
components: +Documentation -Regular Expressions
nosy: +docs@python, r.david.murray
title: regex | behavior differs from documentation -> add example of 'first 
match wins' to regex |  documentation?

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23532
___



[issue22364] Improve some re error messages using regex for hints

2015-02-20 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

 Messages tend to be abbreviated, so I think that it would be better to just
 omit the article.

I agree, but this came from standard error messages which are not 
consistent. I opened a thread on Python-Dev.

"expected a bytes-like object" and "expected str instance" are standard error 
messages raised in bytes.join and str.join, not in re. We could change them 
though.

 I don't think that the error message bad repeat interval is an improvement
 (Why is it bad? What is an interval?). I think that saying that the min
 is greater than the max is clearer.

Agree. I'll change this in re. What message is better in case of overflow: "the 
repetition number is too large" (in re) or "repeat count too big" (in regex)?

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22364
___



[issue22364] Improve some re error messages using regex for hints

2015-02-18 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Here is a patch for regex which makes some error messages be the same as in re 
with re_errors_2.patch. You could apply it to regex if new error messages look 
better than old error messages. Otherwise we could change re error messages to 
match regex, or discuss better variants.

--
Added file: http://bugs.python.org/file38171/regex_errors.diff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22364
___



[issue22364] Improve some re error messages using regex for hints

2015-02-18 Thread Matthew Barnett

Matthew Barnett added the comment:

Some error messages use the indefinite article:

expected a bytes-like object, %.200s found
cannot use a bytes pattern on a string-like object
cannot use a string pattern on a bytes-like object

but others don't:

expected string instance, %.200s found
expected str instance, %.200s found

Messages tend to be abbreviated, so I think that it would be better to just 
omit the article.

I don't think that the error message "bad repeat interval" is an improvement 
(Why is it bad? What is an interval?). I think that saying that the min is 
greater than the max is clearer.
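To see what the installed re module currently says for such a pattern, a small helper along these lines can be used. The helper name compile_error is hypothetical, and the exact wording of the message depends on the Python version, so none is shown here.

```python
import re

def compile_error(pattern):
    """Return the error message for an invalid pattern, or None if it compiles."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None

print(compile_error(r'x{3,1}'))   # min greater than max: some error message
print(compile_error(r'x{1,3}'))   # None
```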

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22364
___



[issue22364] Improve some re error messages using regex for hints

2015-02-10 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Updated patch addresses Ezio's comments.

--
Added file: http://bugs.python.org/file38080/re_errors_2.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22364
___



[issue22364] Improve some re error messages using regex for hints

2015-02-07 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Here is a patch which unifies and improves re error messages. Added tests for all 
parsing errors. Now the error message always points to the start of the affected 
component, i.e. to the start of a bad escape, group name or unterminated 
subpattern.
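A minimal sketch of how the reported position can be inspected, assuming Python 3.5 or later, where re.error exposes msg and pos attributes. The pattern is a made-up example with an unclosed group.

```python
import re

try:
    re.compile(r'abc(def')
except re.error as exc:
    position = exc.pos
    # pos is the index of the opening parenthesis of the unterminated group
    print(exc.msg, position)
```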

--
stage: needs patch -> patch review
Added file: http://bugs.python.org/file38035/re_errors.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22364
___



[issue22364] Improve some re error messages using regex for hints

2015-02-07 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

re_errors_diff.txt contains differences for all tested error messages.

--
Added file: http://bugs.python.org/file38036/re_errors_diff.txt

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22364
___



[issue23191] fnmatch regex cache use is not threadsafe

2015-01-27 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23191
___



[issue23191] fnmatch regex cache use is not threadsafe

2015-01-27 Thread Roundup Robot

Roundup Robot added the comment:

New changeset fe12c34c39eb by Serhiy Storchaka in branch '2.7':
Issue #23191: fnmatch functions that use caching are now threadsafe.
https://hg.python.org/cpython/rev/fe12c34c39eb

--
nosy: +python-dev

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23191
___



[issue23191] fnmatch regex cache use is not threadsafe

2015-01-27 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
assignee:  -> serhiy.storchaka

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23191
___



[issue23318] (compiled RegEx).split gives unexpected results if () in pattern

2015-01-25 Thread Dave Notman

New submission from Dave Notman:

# Python 3.3.1 (default, Sep 25 2013, 19:30:50)
# Linux 3.8.0-35-generic #50-Ubuntu SMP Tue Dec 3 01:25:33 UTC 2013 i686 i686 i686 GNU/Linux

import re

splitter = re.compile( r'(\s*[+/;,]\s*)|(\s+and\s+)' )
ll = splitter.split( 'Dave  Sam, Jane and Zoe' )
print(repr(ll))

print( 'Try again with revised RegEx' )
splitter = re.compile( r'(?:(?:\s*[+/;,]\s*)|(?:\s+and\s+))' )
ll = splitter.split( 'Dave  Sam, Jane and Zoe' )
print(repr(ll))

Results:
['Dave', '  ', None, 'Sam', ', ', None, 'Jane', None, ' and ', 'Zoe']
Try again with revised RegEx
['Dave', 'Sam', 'Jane', 'Zoe']

--
components: Regular Expressions
messages: 234677
nosy: dnotmanj, ezio.melotti, mrabarnett
priority: normal
severity: normal
status: open
title: (compiled RegEx).split gives unexpected results if () in pattern
type: behavior
versions: Python 3.3

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23318
___



[issue23318] (compiled RegEx).split gives unexpected results if () in pattern

2015-01-25 Thread SilentGhost

SilentGhost added the comment:

Looks like it works exactly as the docs[1] describe:

>>> re.split(r'\s*[+/;,]\s*|\s+and\s+', string)
['Dave', 'Sam', 'Jane', 'Zoe']

You're using capturing groups (parentheses) in your original regex, which 
makes split return the separators as part of the result.

[1] https://docs.python.org/3/library/re.html#re.split
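A short side-by-side sketch of the two behaviours, on a made-up input string and with the character class reduced to a comma for brevity:

```python
import re

text = 'Jane and Zoe, Sam'   # hypothetical input

# Capturing groups: every group is reported for every separator,
# with None for the group that did not take part in the match.
print(re.split(r'(\s*,\s*)|(\s+and\s+)', text))
# ['Jane', None, ' and ', 'Zoe', ', ', None, 'Sam']

# Non-capturing alternation: only the pieces between separators.
print(re.split(r'\s*,\s*|\s+and\s+', text))
# ['Jane', 'Zoe', 'Sam']
```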

--
nosy: +SilentGhost
resolution:  -> not a bug
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23318
___



Re: Python 3 regex?

2015-01-14 Thread Steven D'Aprano
Thomas 'PointedEars' Lahn wrote:

 wxjmfa...@gmail.com wrote:
[...]
 And, why not? compare Py3.2 and Py3.3+ !
 
 What are you getting at?


Don't waste your time with JMF. He is obsessed with a trivial performance
regression in Python 3.3. Unicode strings can be slightly more expensive to
create in Python 3.3 compared to earlier versions, due to a clever memory
optimization which saves up to 50% if your strings are all in the Basic
Multilingual Plane and up to 75% if they are all in Latin-1. Never mind
that for real-world code, that memory saving can often lead to applications
running faster, JMF is obsessed with an artificial benchmark of his own
devising that involves making, and throwing away, thousands of Unicode
strings as fast as possible in such a way as to exercise the worst-case of
the new Unicode model. From this unimportant performance regression, he has
convinced himself that this means that Python 3.3 and beyond is logically
and mathematically in violation of the Unicode standard.
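The memory effect described above can be observed directly. This is a sketch that assumes CPython 3.3 or later, where the flexible string representation (PEP 393) applies; the exact byte counts vary between versions and platforms, so none are shown.

```python
import sys

latin1 = 'a' * 1000             # fits in 1 byte per character
bmp = '\u0394' * 1000           # Greek capital delta, 2 bytes per character
astral = '\U0001F600' * 1000    # outside the BMP, 4 bytes per character

for label, s in (('latin1', latin1), ('bmp', bmp), ('astral', astral)):
    print(label, sys.getsizeof(s))
```

All three strings have the same length, so any difference in the reported sizes comes from the per-character storage width.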

Any time JMF mentions anything to do with Python versions or Unicode or
ASCII or French, he is in full-blown "pi equals 3 exactly" crank territory
and is best ignored.



-- 
Steven



Re: Python 3 regex?

2015-01-14 Thread Rustom Mody
On Tuesday, January 13, 2015 at 10:06:50 AM UTC+5:30, Steven D'Aprano wrote:
 On Mon, 12 Jan 2015 19:48:18 +, Ian wrote:
 
  My recommendation would be to write a recursive descent parser for your
  files.
  
  That way will be easier to write,
 
 I know that writing parsers is a solved problem in computer science, and 
 that doing so is allegedly one of the more trivial things computer 
 scientists are supposed to be able to do, but the learning curve to write 
 parsers is if anything even higher than the learning curve to write a 
 regex.
 
 I wish that Python made it as easy to use EBNF to write a parser as it 
 makes to use a regex :-(
 
 http://en.wikipedia.org/wiki/Extended_Backus-Naur_Form
 
 
 
 -- 
 Steven

There appears to be at least one python package for this
https://pypi.python.org/pypi/iscconf

And for those wanting to use regexes to parse CFGs, the required
reading is:

http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags


Re: Python 3 regex?

2015-01-14 Thread alister
On Wed, 14 Jan 2015 14:02:27 +0100, Thomas 'PointedEars' Lahn wrote:

 wxjmfa...@gmail.com wrote:
 
 Le mardi 13 janvier 2015 03:53:43 UTC+1, Rick Johnson a écrit :
 [...]
 you should find Python's text processing Nirvana
 [...]
 
 I recommend, you write a small application
 
 I recommend you get a real name and do not post using the troll and
 spam-infested Google Groups, but a newsreader instead.
 
 sorting strings composed of latin characters, a sort based on
 diacritical characters
 
 I do not think you need regular expressions for that: you can use
 Unicode collations.
 
 (and eventually, taking into account linguistic specific aspects).
 
 BTDT.  For a translator application, I used Python to sort a dictionary
 of the Latin phonetic transcription of Golic Vulcan whose alphabet is “S
 T P K R L A Sh O U D V Kh E H G Ch I N Zh M Y F Z Th W B” [1].  re
 helped a lot with that because inversely sorting the list by character
 length and turning it into an alternation allowed me to easily find the
 characters in words, and assign numbers to the letters so that I could
 sort the words according to this alphabet.  If anyone is interested, I
 can post the relevant code.
 
 And, why not? compare Py3.2 and Py3.3+ !
 
 What are you getting at?
 
 [1] http://home.comcast.net/~markg61/vlif.htm

Do not waste your/our time with JMF.
He is a resident troll, but unlike some of our other resident trolls I don't 
think he has ever contributed anything useful.



-- 
An actor's a guy who if you ain't talkin' about him, ain't listening.
-- Marlon Brando


Re: Python 3 regex?

2015-01-14 Thread Rustom Mody
On Tuesday, January 13, 2015 at 10:06:50 AM UTC+5:30, Steven D'Aprano wrote:
 On Mon, 12 Jan 2015 19:48:18 +, Ian wrote:
 
  My recommendation would be to write a recursive descent parser for your
  files.
  
  That way will be easier to write,
 
 I know that writing parsers is a solved problem in computer science, and 
 that doing so is allegedly one of the more trivial things computer 
 scientists are supposed to be able to do, 

Solved-CS-problem often is showing that the problem is 
unsolvable :-)

http://blog.reverberate.org/2013/08/parsing-c-is-literally-undecidable.html



Re: Python 3 regex?

2015-01-14 Thread Thomas 'PointedEars' Lahn
wxjmfa...@gmail.com wrote:

 Le mardi 13 janvier 2015 03:53:43 UTC+1, Rick Johnson a écrit :
 [...]
 you should find Python's text processing Nirvana
 [...]
 
 I recommend, you write a small application

I recommend you get a real name and do not post using the troll and
spam-infested Google Groups, but a newsreader instead.

 sorting strings composed of latin characters, a sort based on
 diacritical characters

I do not think you need regular expressions for that: you can use Unicode 
collations.

 (and eventually, taking into account linguistic specific aspects).

BTDT.  For a translator application, I used Python to sort a dictionary of 
the Latin phonetic transcription of Golic Vulcan whose alphabet is “S T P K 
R L A Sh O U D V Kh E H G Ch I N Zh M Y F Z Th W B” [1].  re helped a lot 
with that because inversely sorting the list by character length and turning 
it into an alternation allowed me to easily find the characters in words, 
and assign numbers to the letters so that I could sort the words according 
to this alphabet.  If anyone is interested, I can post the relevant code.
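The technique described above might be sketched as follows. This is not the poster's code, which was not posted; the names ALPHABET, RANK, LETTER and sort_key and the word list are made up for illustration, and only the alphabet order is taken from the message.

```python
import re

ALPHABET = "S T P K R L A Sh O U D V Kh E H G Ch I N Zh M Y F Z Th W B".split()

# Position of each letter in the custom alphabet.
RANK = {letter.lower(): index for index, letter in enumerate(ALPHABET)}

# Longest letters first, so that "sh" is tried before "s" in the alternation.
LETTER = re.compile('|'.join(sorted(map(re.escape, RANK), key=len, reverse=True)))

def sort_key(word):
    # Split the word into alphabet letters and map each to its rank.
    return [RANK[letter] for letter in LETTER.findall(word.lower())]

words = ['shi', 'sa', 'ta', 'ash']   # hypothetical words
print(sorted(words, key=sort_key))   # ['sa', 'ta', 'ash', 'shi']
```

Characters that are not part of the alphabet are simply skipped by findall in this sketch; a real application would need to decide how to treat them.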

 And, why not? compare Py3.2 and Py3.3+ !

What are you getting at?

[1] http://home.comcast.net/~markg61/vlif.htm

-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.


Re: Python 3 regex?

2015-01-13 Thread Rick Johnson
On Tuesday, January 13, 2015 at 11:09:17 AM UTC-6, Rick Johnson wrote:
 [...]
 DO YOU NEED ME TO DRAW YOU A PICTURE?

I don't normally do this, but in the interest of education
i feel i must bear the burdens for which all professional
educators like myself are responsible. 

  https://plus.google.com/114883720122692827712/posts/Nxo3rR7TwQS


Re: Python 3 regex?

2015-01-13 Thread Rick Johnson
On Tuesday, January 13, 2015 at 12:39:55 AM UTC-6, Steven D'Aprano wrote:
 On Mon, 12 Jan 2015 15:47:08 -0800, Rick Johnson wrote:
 [...]
  [...]
  
  Ironic Twist (Reformatted)
  
  Some diabetics, when confronted with hunger, think "I
  know, I'll eat a box of sugar cookies." -- now they have
  two problems!
  

 Not the best of analogies, since there are two forms of
 diabetes. Those with Type 2 diabetes can best manage their
 illness by avoiding sugar cookies. Those with Type 1
 should keep a box of sugar cookies (well, perhaps glucose
 lollies are more appropriate) on hand for emergencies.

You seem to misunderstand the basic distinction between
type1 and type2 diabetes, it's not a mere dichotomy between
hyperglycemia and hypoglycemia that defines a diabetes
diagnosis, NO, Type1 can be simplified as insulin
deficiency and Type2 as insulin resistance -- with both
resulting in the inability of glucose (aka: fuel) to nourish
the cells.

YOUR ASSESSMENT OF MY ANALOGY IS JUST AS WEAK.

Both my and Jamie's analogy present an example of the cruel
irony. The only *DIFFERENCE* is that mine utilizes a
subject matter which requires less study to understand.

One can learn enough about diabetes to draw his own factual
conclusions of my statement from a simple Google search,
however, for regexps, a neophyte would need days, weeks, or
even months of serious study to draw sensible conclusions
of merit.

 In any case, most people with diabetes (or at least those
 who are still alive) are reasonably good at managing their
 illness and wouldn't make the choice you suggest. You have
 missed the point that people who misuse regexes are common
 in programming circles, while diabetics who eat a box of
 sugar cookies instead of a meal are rare.

I believe you could find many diabetics who've eaten poorly
and suffered from the result -- even died! I'm not missing
the point, you are! 

HECK, *I'M* THE ONE WHO *DEFINED* THE POINT.

 To take your analogy to an extreme:

   Some people, when faced with a problem, say I know, I'll cut
   my arm off with a pocketknife! Now they have two problems.

 This is not insightful or useful. Except in the most
 specialized and extreme circumstances, such as being
 trapped in the wilderness with a boulder on your arm,
 nobody would consider this to be good advice.

I'm not giving *advice*, i'm merely drawing parallels. I
think your repeated failures to understand me are a
result of your superficiality. When reading my posts, you
need to learn to: read between the lines. Many of the
writings i author are implicit philosophical statements,
musings, and/or explorations. For me, everything has deeper
meanings, just begging to be *plundered*!

 But using regexes to validate email addresses or parse
 HTML? The internet is full of people who thought that was
 a good idea.

Again, i did not suggest that people have never done
anything stupid with regexps, on the contrary, this list has
borne witness to many of them. My only intention was to point
out the damaging (albeit interesting) effects of propaganda.

MY WHOLE POINT IS ABOUT PROPAGANDA!

THAT'S IT!

DO YOU NEED ME TO DRAW YOU A PICTURE?



Re: Python 3 regex?

2015-01-13 Thread Jussi Piitulainen
alister alister.nospam.w...@ntlworld.com writes:

 On Tue, 13 Jan 2015 04:36:38 +, Steven D'Aprano wrote:
 
  On Mon, 12 Jan 2015 19:48:18 +, Ian wrote:
  
  My recommendation would be to write a recursive descent parser for
  your files.
  
  That way will be easier to write,
  
  I know that writing parsers is a solved problem in computer
  science, and that doing so is allegedly one of the more trivial
  things computer scientists are supposed to be able to do, but the
  learning curve to write parsers is if anything even higher than
  the learning curve to write a regex.
  
  I wish that Python made it as easy to use EBNF to write a parser as it
  makes to use a regex :-(
  
  http://en.wikipedia.org/wiki/Extended_Backus–Naur_Form
 
 I would not say that writing parsers is a solved problem.  there may
 be solutions for a number of specific cases but many cases still
 cause difficulty, as an example I do not think there is a 100%
 complete parser for English (even native English speakers don't
 always get it)

There is no complete characterization of English as a set of character
strings, nor will there ever be. Linguists have a slogan for this: All
Grammars Leak. (They used to write formal grammars to characterize
all and only the well-formed sentences of a language, or to capture
necessary and sufficient conditions, and those grammars turned out
to both over-generate and under-generate.)

Ambiguity doesn't help. In practice, it's not enough to find a parse.
One wants a contextually appropriate parse. Sometimes this requires
genuine understanding and knowledge. Also in practice, one may not be
in the business of rejecting ill-formed sentences: one wants to make
partial sense of even those. So, no, never 100 percent complete or 100
percent correct :)

The solved problem is the unambiguous parsing of formal languages that
are defined by a formal grammar to begin with, like the configuration
file format at hand.


Re: Python 3 regex?

2015-01-13 Thread alister
On Tue, 13 Jan 2015 04:36:38 +, Steven D'Aprano wrote:

 On Mon, 12 Jan 2015 19:48:18 +, Ian wrote:
 
 My recommendation would be to write a recursive descent parser for your
 files.
 
 That way will be easier to write,
 
 I know that writing parsers is a solved problem in computer science, and
 that doing so is allegedly one of the more trivial things computer
 scientists are supposed to be able to do, but the learning curve to
 write parsers is if anything even higher than the learning curve to
 write a regex.
 
 I wish that Python made it as easy to use EBNF to write a parser as it
 makes to use a regex :-(
 
 http://en.wikipedia.org/wiki/Extended_Backus–Naur_Form



I would not say that writing parsers is a solved problem.
there may be solutions for a number of specific cases but many cases 
still cause difficulty, as an example I do not think there is a 100% 
complete parser for English (even native English speakers don't always 
get it)

-- 
Keep the number of passes in a compiler to a minimum.
-- D. Gries


Re: Python 3 regex woes (parsing ISC DHCPD config)

2015-01-13 Thread Thomas 'PointedEars' Lahn
Jason Bailey wrote:

 My script first reads the DHCPD configuration file into memory -
 variable filebody. It then utilizes the re module to find the
 configuration details for the wanted shared network.
 
 The config file might look something like this:
 
 ##
 
 shared-network My-Network-MOHE {
   subnet 192.168.0.0 netmask 255.255.248.0 {
     option routers 192.168.0.1;
     option tftp-server-name 192.168.90.12;
     pool {
       deny dynamic bootp clients;
       range 192.168.0.20 192.168.7.254;
     }
   }
 }
 
 shared-network My-Network-CDCO {
   subnet 192.168.8.0 netmask 255.255.248.0 {
     option routers 10.101.8.1;
     option tftp-server-name 192.168.90.12;
     pool {
       deny dynamic bootp clients;
       range 192.168.8.20 192.168.15.254;
     }
   }
 }
 
 shared-network My-Network-FECO {
   subnet 192.168.16.0 netmask 255.255.248.0 {
     option routers 192.168.16.1;
     option tftp-server-name 192.168.90.12;
     pool {
       deny dynamic bootp clients;
       range 192.168.16.20 192.168.23.254;
     }
   }
 }
 
 ##
 
 Suppose I'm trying to grab the shared network called My-Network-FECO
 from the above config file stored in the variable 'filebody'.
 
 First I have my variable 'shared_network' which contains the string
 My-Network-FECO.
 
 I compile my regex:
 m = re.compile(r"^(shared\-network (" + re.escape(shared_network) + r")
 \{((\n|.|\r\n)*?)(^\}))", re.MULTILINE|re.UNICODE)

This code does not run as posted.  Applying Occam’s Razor, I think you meant 
to post

m = re.compile(r"^(shared\-network ("
  + re.escape(shared_network)
  + r") \{((\n|.|\r\n)*?)(^\}))", re.MULTILINE|re.UNICODE)

(If you post long lines, know where your automatic word wrap happens.)

 I search for regex matches in my config file:
 m.search(filebody)

I find using the identifier “m” for the expression very strange.  Usually I 
reserve “m” to hold the *matches* for an expression on a string.
Consider “r” or “rx” or something else instead of “m” for the expression.

 Unfortunately, I get no matches. From output on the command line, I can
 see that Python is adding extra backslashes to my re.compile string. I
 have added the raw 'r' in front of the strings to prevent it, but to no
 avail.

Python is adding the extra backslashes because you used “r”.  Note that the 
console-printed string representations of strings do not have an “r” in 
front of them.  What you see is what you would have needed to write for 
equivalent code if you had not used “r”.  (Different from some other 
languages, Python does not distinguish between single-quoted and double-
quoted strings with regard to parsing.  Hence the r'…' feature, the triple-
quoted string, and the .format() method.)

You get no matches because you have escaped the HYPHEN-MINUSes (“-”).  You 
never need to escape those characters, in fact you must not do that here 
because r'\-' is not an (unnecessarily) escaped HYPHEN-MINUS, it is a 
literal backslash followed by a HYPHEN-MINUS, a character sequence that does 
not occur in your string.  Outside of a character class you do not need to 
do that, and in a character class you can put it as first or last character 
instead (“[-…]” or “[…-]”).

You have escaped the first HYPHEN-MINUS; re.escape() has escaped the other 
two for you:

| >>> re.escape('-')
| '\\-'

I presume this behavior is because of character classes, and the idea that 
the return value should work at any position in a character class.

ISTM that you cannot use re.escape() here, and you must escape special 
characters yourself (using re.sub()), should they be possible in the file.

I do not see a reason for making the entire expression a group (but for 
making the network name a group).  

You should refrain from parsing non-regular languages with a *single* 
regular expression (multiple expressions or expressions with alternation in 
a loop are usually fine; this can be used for building efficient parsers), 
even though Python’s regular expressions, which are not an exception there, 
are not exactly “regular” in the theoretical computer science sense.  See 
the Chomsky hierarchy and Jeffrey E. F. Friedl’s insightful textbook 
“Mastering Regular Expressions”.

It is possible that there is a Python module for parsing ISC dhcpd 
configuration files already.  If so, you should use that instead.

-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.


Re: Python 3 regex woes (parsing ISC DHCPD config)

2015-01-13 Thread Thomas 'PointedEars' Lahn
Thomas 'PointedEars' Lahn wrote:

 Jason Bailey wrote:
 shared-network My-Network-MOHE {
[…] {

 I compile my regex:
 m = re.compile(r"^(shared\-network (" + re.escape(shared_network) + r")
 \{((\n|.|\r\n)*?)(^\}))", re.MULTILINE|re.UNICODE)
 
 This code does not run as posted.  Applying Occam’s Razor, I think you
 meant to post
 
 m = re.compile(r"^(shared\-network ("
   + re.escape(shared_network)
   + r") \{((\n|.|\r\n)*?)(^\}))", re.MULTILINE|re.UNICODE)
 
 […]
 You get no matches because you have escaped the HYPHEN-MINUSes (“-”).  You
 never need to escape those characters, in fact you must not do that here
 because r'\-' is not an (unnecessarily) escaped HYPHEN-MINUS, it is a
 literal backslash followed by a HYPHEN-MINUS, a character sequence that
 does not occur in your string.  Outside of a character class you do not
 need to do that, and in a character class you can put it as first or last
 character instead (“[-…]” or “[…-]”).
 
 You have escaped the first HYPHEN-MINUS; re.escape() has escaped the other
 two for you:
 
 | >>> re.escape('-')
 | '\\-'
 
 I presume this behavior is because of character classes, and the idea that
 the return value should work at any position in a character class.

It would appear that while my answer is not entirely wrong, the first 
sentence of that section is.  You may escape the HYPHEN-MINUS there, and may 
use re.escape(); it has no effect on the expression because of what I said 
following that sentence.  One must consider that the string is first parsed 
by Python’s string parser and then by Python’s re parser.

So I have presently no specific idea why you get no matches, however

  r'\{((\n|.|\r\n)*?)(^\})'

is not a proper way to match matching braces and everything in-between.

To begin with, the proper expression to match any newline is r'(\r?\n|\r)' 
because the first matching alternative in an alternation, not the longest 
one, wins.  But if you specify re.DOTALL, you can simply use “.” for any 
character (including any newline combination).
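
The alternation order and the effect of re.DOTALL can be tried out as follows. This is only a sketch; the test strings are made up for the purpose.

```python
import re

text = "a\r\nb"

# The first alternative that matches wins, so "\r" alone is taken here
# and the "\n" of the pair is left over.
first_wins = re.match(r"a(\r|\r?\n)", text).group(1)

# With the longer alternative first, the whole "\r\n" pair is consumed.
pair_first = re.match(r"a(\r?\n|\r)", text).group(1)

# With re.DOTALL, "." also matches "\n" ("\r" is matched by "." anyway).
body = re.search(r"\{(.*?)^\}", "{\r\n  x;\r\n}\r\n",
                 re.DOTALL | re.MULTILINE).group(1)

print(repr(first_wins), repr(pair_first), repr(body))
```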
 
 […]
 You should refrain from parsing non-regular languages with a *single*
 regular expression (multiple expressions or expressions with alternation
 in a loop are usually fine; this can be used for building efficient
 parsers), even though Python’s regular expressions, which are not an
 exception there,
 are not exactly “regular” in the theoretical computer science sense.  See
 the Chomsky hierarchy and Jeffrey E. F. Friedl’s insightful textbook
 “Mastering Regular Expressions”.

And for matching matching braces (sic!) with regular expressions, you need a 
recursive one (which is another extension of regular expressions as they are 
discussed in CS).  Or a parser in the first place.  Otherwise you match too 
much with greedy matching

  { { } } { { } }
  ^-------------^

or too little with non-greedy matching

  { { } } { { } }
  ^---^
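
Both cases can be reproduced with the re module; a small sketch using the same brace string:

```python
import re

text = "{ { } } { { } }"

# Greedy: runs to the last "}" in the string.
greedy = re.search(r"\{.*\}", text).group(0)

# Non-greedy: stops at the first "}" it can reach.
lazy = re.search(r"\{.*?\}", text).group(0)

print(repr(greedy))
print(repr(lazy))
```

Neither result is the first balanced group "{ { } }".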

CS regular expressions can be used to describe *regular* languages (Chomsky-
type 3).  Bracket languages are, in general, not regular (see “pumping lemma 
for regular languages”), so for them you need a PDA¹-like extension of CS 
regular expressions (the aforementioned recursive ones), or a PDA 
implementation in the first place.  Such a PDA implementation is part of a 
parser.
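
For plain brace matching the "stack" of such a PDA degenerates into a depth counter. A minimal sketch follows; the function name balanced_block is made up here, and the code assumes that braces never occur inside comments or quoted strings.

```python
def balanced_block(text, start=0):
    """Return the text from the first "{" at or after start up to its matching "}"."""
    begin = text.index("{", start)   # raises ValueError if there is no "{"
    depth = 0
    for pos in range(begin, len(text)):
        if text[pos] == "{":
            depth += 1
        elif text[pos] == "}":
            depth -= 1
            if depth == 0:
                return text[begin:pos + 1]
    raise ValueError("unbalanced braces")


text = "{ { } } { { } }"
print(balanced_block(text))
```

Unlike either regular expression, this returns exactly one balanced group.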


¹  https://en.wikipedia.org/wiki/Pushdown_automaton
-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.


Re: Python 3 regex?

2015-01-13 Thread Rick Johnson
On Monday, January 12, 2015 at 11:34:57 PM UTC-6, Mark Lawrence wrote:
 You snipped the bit that says [normal cobblers snipped].

Oh my, where are my *manners* today? Tell you what, next time when
you're sneaking up behind me with a knife in hand, do be a
friend and tap me on the shoulder first, so i can take the
knife and stab *myself* in the back!

Your pal, Rick.


Re: Python 3 regex?

2015-01-12 Thread Ian

On 12/01/2015 18:03, Jason Bailey wrote:

Hi all,

I'm working on a Python _3_ project that will be used to parse ISC 
DHCPD configuration files for statistics and alarming purposes (IP 
address pools, etc). Anyway, I'm hung up on this one section and was 
hoping someone could provide me with some insight.


My script first reads the DHCPD configuration file into memory - 
variable filebody. It then utilizes the re module to find the 
configuration details for the wanted shared network.



Hi Jason,

If you actually look at the syntax of what you are parsing, it is very 
simple.


My recommendation would be to write a recursive descent parser for your 
files.


That way will be easier to write, much easier to modify and almost 
certainly faster than an RE solution - and it can easily give you all the 
information in the file, thus future-proofing it.
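
A rough sketch of what such a parser could look like. It assumes a simplified grammar (a statement is a run of words ended by ";" or followed by a "{ ... }" block) and ignores comments and quoted strings, which a real dhcpd.conf may contain. The names tokenize and parse_statements are made up for this example.

```python
import re


def tokenize(text):
    # Braces and semicolons are single tokens; anything else is a word.
    return re.findall(r"[{};]|[^\s{};]+", text)


def parse_statements(tokens, pos=0, nested=False):
    """Parse statements from tokens[pos:]; return (statements, next position).

    Each statement is a (header, body) pair; body is None for "words;"
    and a list of statements for "words { ... }".
    """
    statements = []
    words = []
    while pos < len(tokens):
        tok = tokens[pos]
        pos += 1
        if tok == ";":
            statements.append((" ".join(words), None))
            words = []
        elif tok == "{":
            # Recurse for the nested block.
            body, pos = parse_statements(tokens, pos, nested=True)
            statements.append((" ".join(words), body))
            words = []
        elif tok == "}":
            if not nested or words:
                raise ValueError("unexpected '}'")
            return statements, pos
        else:
            words.append(tok)
    if nested or words:
        raise ValueError("unexpected end of input")
    return statements, pos


config = """
shared-network My-Network-FECO {
  subnet 192.168.16.0 netmask 255.255.248.0 {
    option routers 192.168.16.1;
    pool {
      range 192.168.16.20 192.168.23.254;
    }
  }
}
"""
tree, _ = parse_statements(tokenize(config))
print(tree)
```

The result is a nested list, so picking out one shared network and its pools becomes ordinary list handling.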


'Some people, when confronted with a problem, think "I know, I'll use 
regular expressions." Now they have two problems.' -  Jamie Zawinski.


Regards

Ian




Re: Python 3 regex?

2015-01-12 Thread Chris Angelico
On Tue, Jan 13, 2015 at 5:03 AM, Jason Bailey jbai...@emerytelcom.com wrote:
 Unfortunately, I get no matches. From output on the command line, I can see
 that Python is adding extra backslashes to my re.compile string. I have
 added the raw 'r' in front of the strings to prevent it, but to no avail.


Regexes are notoriously hard to debug. Is there any particular reason
you _have_ to use one here? ISTM you could simplify it enormously by
just looking for the opening string:

shared_network = "My-Network-FECO"
network = filebody.split("\nshared-network " + shared_network +
    " {", 1)[1].split("\n}\n")[0]

Assuming your file is always correctly indented, and assuming you
don't have any other instances of that header string, you should be
fine.
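
A self-contained variant of the same idea, as a sketch only. The helper name network_body and the shortened sample file are assumptions; the function returns None instead of raising IndexError when the header is missing.

```python
filebody = """\
shared-network My-Network-CDCO {
  subnet 192.168.8.0 netmask 255.255.248.0 {
    option routers 10.101.8.1;
  }
}

shared-network My-Network-FECO {
  subnet 192.168.16.0 netmask 255.255.248.0 {
    option routers 192.168.16.1;
  }
}
"""


def network_body(filebody, name):
    # Prepend a newline so a block at the very top of the file is found too.
    header = "\nshared-network " + name + " {"
    parts = ("\n" + filebody).split(header, 1)
    if len(parts) < 2:
        return None  # no such shared-network header
    # The block ends at the first "}" that sits alone at the start of a line.
    return parts[1].split("\n}\n", 1)[0]


print(network_body(filebody, "My-Network-FECO"))
```

This still relies on the closing brace of each shared-network block being unindented.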

ChrisA


Re: Python 3 regex woes (parsing ISC DHCPD config)

2015-01-12 Thread Dave Angel

On 01/12/2015 01:20 PM, Jason Bailey wrote:

Hi all,



What changed between 1:03 and 1:20 that made you post a nearly identical 
second message, as a new thread?




Unfortunately, I get no matches. From output on the command line, I can
see that Python is adding extra backslashes to my re.compile string. I
have added the raw 'r' in front of the strings to prevent it, but to no
avail.



What makes you think that?  Please isolate this part of your problem 
with a simple short program, so we can diagnose it.  You're probably 
getting confused between str() and repr().  The latter adds backslash 
escape sequences for good reason, and if you don't understand it, you 
might think the strings are getting corrupted.
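
A short illustration of the difference, as a sketch with a made-up pattern. The doubled backslashes seen on the command line were probably the repr() of the pattern, which is what the interactive interpreter echoes.

```python
pattern = r"shared\-network \{"

# str() shows the characters the string really contains.
as_str = str(pattern)

# repr() shows a literal that could be pasted into source code; it is
# not a raw-string literal, so every backslash appears doubled.
as_repr = repr(pattern)

print(as_str)        # shared\-network \{
print(as_repr)       # 'shared\\-network \\{'
print(len(r"\{"))    # 2: one backslash and one brace
```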



--
DaveA


Re: Python 3 regex?

2015-01-12 Thread Chris Angelico
On Tue, Jan 13, 2015 at 6:48 AM, Ian hobso...@gmail.com wrote:
 My recommendation would be to write a recursive descent parser for your
 files.

 That way will be easier to write, much easier to modify and almost certainly
 faster than an RE solution - and it can easily give you all the information
 in the file thus future proofing it.

Generally, even a recursive descent parser will be overkill. It's
pretty easy to do simple string manipulation to get the info you want;
maybe that means restricting the syntax some, but for a personal-use
script, that's usually no big cost. The example I gave requires that
the indentation be correct, and on this mailing list, I think people
agree that that's not a deal-breaker :)

ChrisA


Re: Python 3 regex woes (parsing ISC DHCPD config)

2015-01-12 Thread Albert-Jan Roskam


- Original Message -

 From: Jason Bailey jbai...@emerytelcom.com
 To: python-list@python.org
 Cc: 
 Sent: Monday, January 12, 2015 7:20 PM
 Subject: Python 3 regex woes (parsing ISC DHCPD config)
 
 Hi all,
 
 I'm working on a Python _3_ project that will be used to parse ISC DHCPD 
 configuration files for statistics and alarming purposes (IP address 
 pools, etc). Anyway, I'm hung up on this one section and was hoping 
 someone could provide me with some insight.
 
 My script first reads the DHCPD configuration file into memory - 
 variable filebody. It then utilizes the re module to find the 
 configuration details for the wanted shared network.
 
 The config file might look something like this:
 
 ##
 
 shared-network My-Network-MOHE {
   subnet 192.168.0.0 netmask 255.255.248.0 {
     option routers 192.168.0.1;
     option tftp-server-name 192.168.90.12;
     pool {
       deny dynamic bootp clients;
       range 192.168.0.20 192.168.7.254;
     }
   }
 }
 
 shared-network My-Network-CDCO {
   subnet 192.168.8.0 netmask 255.255.248.0 {
     option routers 10.101.8.1;
     option tftp-server-name 192.168.90.12;
     pool {
       deny dynamic bootp clients;
       range 192.168.8.20 192.168.15.254;
     }
   }
 }
 
 shared-network My-Network-FECO {
   subnet 192.168.16.0 netmask 255.255.248.0 {
     option routers 192.168.16.1;
     option tftp-server-name 192.168.90.12;
     pool {
       deny dynamic bootp clients;
       range 192.168.16.20 192.168.23.254;
     }
   }
 }
 
 ##
 
 Suppose I'm trying to grab the shared network called 
 My-Network-FECO 
 from the above config file stored in the variable 'filebody'.
 
 First I have my variable 'shared_network' which contains the string 
 My-Network-FECO.
 
 I compile my regex:
 m = re.compile(r"^(shared\-network (" + re.escape(shared_network)
 + r") \{((\n|.|\r\n)*?)(^\}))", re.MULTILINE|re.UNICODE)
 
 I search for regex matches in my config file:
 m.search(filebody)
 
 Unfortunately, I get no matches. From output on the command line, I can 
 see that Python is adding extra backslashes to my re.compile string. I 
 have added the raw 'r' in front of the strings to prevent it, but to no 
 avail.
 
 Thoughts on this?


Will the following work for you? My brain shuts down when I try to read your 
regex, but I believe you also used a non-greedy match.


Python 3.4.2 (default, Nov 20 2014, 13:01:11) 
[GCC 4.7.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> cfg = """shared-network My-Network-MOHE {
...   subnet 192.168.0.0 netmask 255.255.248.0 {
...     option routers 192.168.0.1;
...     option tftp-server-name 192.168.90.12;
...     pool {
...       deny dynamic bootp clients;
...       range 192.168.0.20 192.168.7.254;
...     }
...   }
... }
... 
... shared-network My-Network-CDCO {
...   subnet 192.168.8.0 netmask 255.255.248.0 {
...     option routers 10.101.8.1;
...     option tftp-server-name 192.168.90.12;
...     pool {
...       deny dynamic bootp clients;
...       range 192.168.8.20 192.168.15.254;
...     }
...   }
... }
... 
... shared-network My-Network-FECO {
...   subnet 192.168.16.0 netmask 255.255.248.0 {
...     option routers 192.168.16.1;
...     option tftp-server-name 192.168.90.12;
...     pool {
...       deny dynamic bootp clients;
...       range 192.168.16.20 192.168.23.254;
...     }
...   }
... }"""
>>> import re
>>> re.findall(r"shared\-network (.+) \{?", cfg)
['My-Network-MOHE', 'My-Network-CDCO', 'My-Network-FECO']
>>> 


Re: Python 3 regex?

2015-01-12 Thread Rick Johnson
 'Some people, when confronted with a problem, think I
 know, I'll use regular expressions. Now they have two
 problems.' -  Jamie Zawinski.

This statement is one of my favorite examples of powerful
propaganda, which has scared more folks away from regexps
than even the Upright Citizens Brigade could manage with
their Journey through the center of gas giant #7 and its
resulting aggravated assault on American coinage!

I wonder if Jamie's conclusions are a result of careful
study, or merely, an attempt to resolve his own cognitive
dissonance? Of course, if the latter is true, then i give
him bonus points for his use of the third person to veil his
own inadequacies -- nice Jamie, *very* nice!

Rick it sounds like you're accusing Jamie of cowardice
resulting in sour grapes?

Indeed! The problem with statements like his is that, the
ironic humor near the end *fortifies* the argument so much
that the reader forgets the limits of the harm (quantified
as: some people) and immediately accepts the consequences
as affecting all people who choose to use regexps, or more
generally, accepts the argument as a universal unbounded
truth. Besides, who would want to be a member of a group
for which the individuals are too stupid to know good
choices from bad choices?

HA, PEER PRESSURE, IT'S A POWERFUL THING!

But there is more going on here than just mere sleight of
forked tongue my friends, because, even the most
accomplished propagandist cannot fool *most* of the people.
No, this type of powerful propaganda only succeeds when
the subject matter is both cryptic *AND* esoteric.

For instance, in the following example, i contrive a
similarly ironic statement to showcase the effects of such
propaganda, but one that covers a subject matter in which
laymen either: already understand, or, can easily attain
enough knowledge to appreciate the humor.


#                      Ironic Twist                       #

# Some diabetics, when confronted with hunger, think I    #
# know, I'll eat a box of sugar cookies. -- now they have #
# two problems!'                                          #

Wait a minute Rick! After eating the cookies the
diabetic would no longer be hungry, so how could he
have two problems? Your logic is flawed!

Au Contraire! Read the statement carefully, I said:
When *CONFRONTED* with hunger -- the two problems
(and the eventual side effect) exist at the *MOMENT* the
diabetic considers eating the cookies.

PROBLEM1: Need to eat!
PROBLEM2: Cookies raise glucose too quickly

In this example, even a layman would understand that the
statement is meant to showcase the irony of resolving a
problem (hunger) with a solution (eating a box of cookies)
that results in the undesirable outcome of (hyperglycemia).

And while this statement, and the one about regexps, both
contain a factual underlying truth (basically that
negative side effects should be avoided) the layman will
lack the esoteric knowledge of regexps to confirm the
factual basis for himself, and will instead blindly adopt
the propagandist assertion as truth, based solely on the
humorous prowess of the propagandist.

The most effective propaganda utilizes the subconscious.
You see, the role of propaganda is to modify behavior, and
it is a more prevalent and powerful tool than most people
realize! The propagandist will seek to control you; he'll
use your ignorance against you; but you didn't notice
because he made you laugh!

WHO'S LAUGHING NOW? -- YOU MINDLESS ROBOTS!

But what's so evil about that Rick? He scared away a
few feeble minded folks. SO WHAT!

I argue that we are all feeble minded in any subject we
have not yet mastered. His propaganda (be it intentional or
not) is so powerful that it defeats the neophyte before they
can even begin. Because it gives them the false impression
that regexps are only used by foolish people.

Yes, i'll admit, regexps are very cryptic, but once you
grasp their intricacies, you appreciate the succinctness of
their syntax, because, what makes them so powerful is not
only the extents of their pattern matching abilities, but
their conciseness.



Re: Python 3 regex?

2015-01-12 Thread Chris Angelico
On Tue, Jan 13, 2015 at 10:47 AM, Rick Johnson
rantingrickjohn...@gmail.com wrote:
 WHO'S LAUGHING NOW? -- YOU MINDLESS ROBOTS!

It's very satisfying when mindless robots laugh.

ChrisA


Re: Python 3 regex?

2015-01-12 Thread Rick Johnson
On Monday, January 12, 2015 at 7:55:32 PM UTC-6, Mark Lawrence wrote:
 On 12/01/2015 23:47, Rick Johnson wrote:
  'Some people, when confronted with a problem, think I
  know, I'll use regular expressions. Now they have two
  problems.' -  Jamie Zawinski.
 

 [snip]

 If you wish to use a hydrogen bomb instead of a tooth pick
 feel free, I won't lose any sleep over it.  Meanwhile I'll
 get on with writing code, and for the simple jobs that can
 be completed with string methods I'll carry on using them.
 When that gets too complicated I'll reach for the regex
 manual, knowing full well that there's enough data in
 books and online to help even a novice such as myself get
 over all the hurdles. If that isn't good enough then maybe
 a full blown parser, such as the pile listed here [snip]

Mark, if you're going to quote me, then at least quote me in
a manner that does not confuse the content of my post. The
snippet you posted was not a statement of mine, rather, it
was a quote that i was responding to, and without any
context of my response, what is the point of quoting
anything at all? It would be better to quote nothing and
just say @Rick, than to quote something which does not have
any context.

Every python programmer worth his *SALT* should master the
following three text processing capabilities of Python, and
he should know how and when to apply them (for they all have
strengths and weaknesses):

(1) String methods: Simplistic API, but with limited
capabilities -- but never underestimate the
possibilities!

(2) Regexps: Great for pattern matching with a powerful
and concise syntax, but highly cryptic and unintuitive
for the neophyte (and sometimes even the guru! *wink*).

(3) Parsers: Great for extracting deeper meaning from text,
but if pattern matching is all you need, then why not
use (1) or (2) -- are you scared or uninformed?

We can easily forgive a child who is afraid of the
dark; the real tragedy of life is when men are afraid of
the light. -- Plato

IMHO, if you seek to only match patterns, then string
methods should be your first choice, however, if the pattern
is too difficult for string methods, then graduate to
regexps. If you need to extract deeper meaning from
text, by all means utilize a parser.
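
To make the first two tiers concrete, here is a small sketch. The sample lines are borrowed from the DHCPD thread, and the IPv4 pattern is deliberately loose: it does not check that each number is at most 255.

```python
import re

line = "  option routers 192.168.16.1;"

# (1) String methods: enough when the layout is fixed.
key, _, value = line.strip().rstrip(";").rpartition(" ")

# (2) A regex: worth it once the pattern itself varies,
#     e.g. "every IPv4-looking token on the line".
addresses = re.findall(r"\b\d{1,3}(?:\.\d{1,3}){3}\b",
                       "range 192.168.16.20 192.168.23.254;")

print(key, value, addresses)
```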

But above all, don't fall for these religious teachings
about how regexps are too difficult for mortals -- that's
just hysteria. If you follow the outline i provided above,
you should find Python's text processing Nirvana.



Re: Python 3 regex?

2015-01-12 Thread Mark Lawrence

On 12/01/2015 23:47, Rick Johnson wrote:

'Some people, when confronted with a problem, think I
know, I'll use regular expressions. Now they have two
problems.' -  Jamie Zawinski.




[normal cobblers snipped]

If you wish to use a hydrogen bomb instead of a tooth pick feel free, I 
won't lose any sleep over it.  Meanwhile I'll get on with writing code, 
and for the simple jobs that can be completed with string methods I'll 
carry on using them.  When that gets too complicated I'll reach for the 
regex manual, knowing full well that there's enough data in books and 
online to help even a novice such as myself get over all the hurdles. 
If that isn't good enough then maybe a full blown parser, such as the 
pile listed here http://nedbatchelder.com/text/python-parsers.html ?


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: Python 3 regex?

2015-01-12 Thread Steven D'Aprano
On Mon, 12 Jan 2015 19:48:18 +, Ian wrote:

 My recommendation would be to write a recursive descent parser for your
 files.
 
 That way will be easier to write,

I know that writing parsers is a solved problem in computer science, and 
that doing so is allegedly one of the more trivial things computer 
scientists are supposed to be able to do, but the learning curve to write 
parsers is if anything even higher than the learning curve to write a 
regex.

I wish that Python made it as easy to use EBNF to write a parser as it 
makes to use a regex :-(

http://en.wikipedia.org/wiki/Extended_Backus–Naur_Form



-- 
Steven


Re: Python 3 regex?

2015-01-12 Thread Mark Lawrence

On 13/01/2015 02:53, Rick Johnson wrote:

On Monday, January 12, 2015 at 7:55:32 PM UTC-6, Mark Lawrence wrote:

On 12/01/2015 23:47, Rick Johnson wrote:

'Some people, when confronted with a problem, think I
know, I'll use regular expressions. Now they have two
problems.' -  Jamie Zawinski.




[snip]

If you wish to use a hydrogen bomb instead of a tooth pick
feel free, I won't lose any sleep over it.  Meanwhile I'll
get on with writing code, and for the simple jobs that can
be completed with string methods I'll carry on using them.
When that gets too complicated I'll reach for the regex
manual, knowing full well that there's enough data in
books and online to help even a novice such as myself get
over all the hurdles. If that isn't good enough then maybe
a full blown parser, such as the pile listed here [snip]


Mark, if you're going to quote me, then at least quote me in
a manner that does not confuse the content of my post. The
snippet you posted was not a statement of mine, rather, it
was a quote that i was responding to, and without any
context of my response, what is the point of quoting
anything at all? It would be better to quote nothing and
just say @Rick, than to quote something which does not have
any context.


You snipped the bit that says [normal cobblers snipped].



Every python programmer worth his *SALT* should master the
following three text processing capabilities of Python, and
he should know how and when to apply them (for they all have
strengths and weaknesses):

 (1) String methods: Simplistic API, but with limited
 capabilities -- but never underestimate the
 possibilities!

 (2) Regexps: Great for pattern matching with a powerful
 and concise syntax, but highly cryptic and unintuitive
 for the neophyte (and sometimes even the guru! *wink*).

 (3) Parsers: Great for extracting deeper meaning from text,
 but if pattern matching is all you need, then why not
 use (1) or (2) -- are you scared or uninformed?



String methods, regexes, parsers, isn't that what I've already said 
above?  Why repeat it?



 We can easily forgive a child who is afraid of the
 dark; the real tragedy of life is when men are afraid of
 the light. -- Plato

IMHO, if you seek to only match patterns, then string
methods should be your first choice, however, if the pattern
is too difficult for string methods, then graduate to
regexps. If you need to extract deeper meaning from
text, by all means utilize a parser.



I feel humbled that a great such as yourself is again repeating what 
I've already said.



But above all, don't fall for these religious teachings
about how regexps are too difficult for mortals -- that's
just hysteria. If you follow the outline i provided above,
you should find Python's text processing Nirvana.



My favourite things in programming all go along the lines of DRY and 
KISS, with "Although practicality beats purity" being the most important 
of the lot.  So-called religious teachings never enter into my way of 
doing things.  For example I can't stand code which jumps through hoops 
to avoid using GOTO, whereas nothing is cleaner than (say) GOTO ERROR. 
You'll (plural) find loads of them in CPython.


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: Python 3 regex?

2015-01-12 Thread Steven D'Aprano
On Mon, 12 Jan 2015 15:47:08 -0800, Rick Johnson wrote:

 'Some people, when confronted with a problem, think I know, I'll use
 regular expressions. Now they have two problems.' -  Jamie Zawinski.
 
 I wonder if Jamie's conclusions are a result of careful study, or
 merely, an attempt to resolve his own cognitive dissonance? 

Zawinski is one of the pantheon of geek demi-gods, with Linus, Larry, 
Guido, RMS, and a few others. (Just don't ask me to rank them. I'm not 
qualified.) His comment isn't based on a failure to grok regular 
expressions, but on an understanding that many people use regular 
expressions inappropriately.

Here is more on the context of the famous quote:

http://regex.info/blog/2006-09-15/247


(By the way, the quote actually wasn't original to JZ, he stole it from 
an all but identical quote about awk.)


[...]
 For instance, in the following example, i contrive a similarly ironic
 statement to showcase the effects of such propaganda, but one that
 covers a subject matter in which laymen either: already understand, or,
 can easily attain enough knowledge to appreciate the humor.
 
   Ironic Twist:
   'Some diabetics, when confronted with hunger, think "I know, I'll
   eat a box of sugar cookies." -- now they have two problems!'

Not the best of analogies, since there are two forms of diabetes. Those 
with Type 2 diabetes can best manage their illness by avoiding sugar 
cookies. Those with Type 1 should keep a box of sugar cookies (well, 
perhaps glucose lollies are more appropriate) on hand for emergencies.

http://www.betterhealth.vic.gov.au/bhcv2/bhcarticles.nsf/pages/Diabetes_explained?open

In any case, most people with diabetes (or at least those who are still 
alive) are reasonably good at managing their illness and wouldn't make 
the choice you suggest. You have missed the point that people who misuse 
regexes are common in programming circles, while diabetics who eat a box 
of sugar cookies instead of a meal are rare.

To take your analogy to an extreme:

  Some people, when faced with a problem, say "I know, I'll cut 
  my arm off with a pocketknife!" Now they have two problems.

This is not insightful or useful. Except in the most specialised and 
extreme circumstances, such as being trapped in the wilderness with a 
boulder on your arm, nobody would consider this to be good advice. But 
using regexes to validate email addresses or parse HTML? The internet is 
full of people who thought that was a good idea.


[...]
 Yes, i'll admit, regexps are very cryptic, but once you grasp their
 intricacies, you appreciate the succinctness of their syntax, because,
 what makes them so powerful is not only the extents of their pattern
 matching abilities, but their conciseness.

Even Larry Wall says that regexes are too concise and cryptic:

http://perl6.org/archive/doc/design/apo/A05.html



-- 
Steve


Python 3 regex?

2015-01-12 Thread Jason Bailey

Hi all,

I'm working on a Python _3_ project that will be used to parse ISC DHCPD 
configuration files for statistics and alarming purposes (IP address 
pools, etc). Anyway, I'm hung up on this one section and was hoping 
someone could provide me with some insight.


My script first reads the DHCPD configuration file into memory - 
variable filebody. It then utilizes the re module to find the 
configuration details for the wanted shared network.


The config file might look something like this:

##

shared-network My-Network-MOHE {
  subnet 192.168.0.0 netmask 255.255.248.0 {
    option routers 192.168.0.1;
    option tftp-server-name 192.168.90.12;
    pool {
      deny dynamic bootp clients;
      range 192.168.0.20 192.168.7.254;
    }
  }
}

shared-network My-Network-CDCO {
  subnet 192.168.8.0 netmask 255.255.248.0 {
    option routers 10.101.8.1;
    option tftp-server-name 192.168.90.12;
    pool {
      deny dynamic bootp clients;
      range 192.168.8.20 192.168.15.254;
    }
  }
}

shared-network My-Network-FECO {
  subnet 192.168.16.0 netmask 255.255.248.0 {
    option routers 192.168.16.1;
    option tftp-server-name 192.168.90.12;
    pool {
      deny dynamic bootp clients;
      range 192.168.16.20 192.168.23.254;
    }
  }
}

##

Suppose I'm trying to grab the shared network called My-Network-FECO 
from the above config file stored in the variable 'filebody'.


First I have my variable 'shared_network' which contains the string 
My-Network-FECO.


I compile my regex:
m = re.compile(r"^(shared\-network (" + re.escape(shared_network)
               + r") \{((\n|.|\r\n)*?)(^\}))", re.MULTILINE|re.UNICODE)


I search for regex matches in my config file:
m.search(filebody)

Unfortunately, I get no matches. From output on the command line, I can 
see that Python is adding extra backslashes to my re.compile string. I 
have added the raw 'r' in front of the strings to prevent it, but to no 
avail.


Thoughts on this?

Thanks
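
For what it's worth, the doubled backslashes are most likely just how repr() displays the pattern string, not something Python adds to the regex itself. Below is a minimal sketch of the search. It assumes each shared-network block ends with a closing brace in column 0 while inner braces are indented; the trimmed-down `filebody`, the name `pattern` and the use of re.DOTALL are made up for illustration and are not taken from the original script.

```python
import re

# Trimmed sample config; inner braces are indented, block-closing braces are not.
filebody = """\
shared-network My-Network-CDCO {
  subnet 192.168.8.0 netmask 255.255.248.0 {
    option routers 10.101.8.1;
    pool {
      range 192.168.8.20 192.168.15.254;
    }
  }
}

shared-network My-Network-FECO {
  subnet 192.168.16.0 netmask 255.255.248.0 {
    option routers 192.168.16.1;
    pool {
      range 192.168.16.20 192.168.23.254;
    }
  }
}
"""

shared_network = "My-Network-FECO"

# DOTALL lets "." cross newlines; MULTILINE lets "^" match at each line start.
pattern = re.compile(
    r"^shared-network " + re.escape(shared_network) + r" \{(.*?)^\}",
    re.MULTILINE | re.DOTALL,
)

print(repr(pattern.pattern))  # repr() shows every backslash doubled
print(pattern.pattern)        # print() shows the text the regex engine receives

match = pattern.search(filebody)
block = match.group(1) if match else None
print(block)
```

If the real file indents the closing braces differently, the `^\}` anchor will stop at the wrong brace, and a small brace-counting parser is probably the safer route.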




[issue23191] fnmatch regex cache use is not threadsafe

2015-01-08 Thread M. Schmitzer

New submission from M. Schmitzer:

The way the fnmatch module uses its regex cache is not threadsafe. When 
multiple threads use the module in parallel, a race condition between 
retrieving a - presumed present - item from the cache and clearing the cache 
(because the maximum size has been reached) can lead to KeyError being raised.

The attached script demonstrates the problem. Running it will (eventually) 
yield errors like the following.

Exception in thread Thread-10:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "fnmatch_thread.py", line 12, in run
    fnmatch.fnmatchcase(name, pat)
  File "/home/marc/.venv/modern/lib/python2.7/fnmatch.py", line 79, in fnmatchcase
    return _cache[pat].match(name) is not None
KeyError: 'lYwrOCJtLU'

--
components: Library (Lib)
files: fnmatch_thread.py
messages: 233650
nosy: mschmitzer
priority: normal
severity: normal
status: open
title: fnmatch regex cache use is not threadsafe
type: crash
versions: Python 2.7
Added file: http://bugs.python.org/file37642/fnmatch_thread.py

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue23191
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue23191] fnmatch regex cache use is not threadsafe

2015-01-08 Thread STINNER Victor

STINNER Victor added the comment:

I guess that a lot of stdlib modules are not thread safe :-/ A workaround is to 
protect calls to fnmatch with your own lock.
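
A minimal sketch of that workaround, assuming all callers go through one wrapper. The names `_fnmatch_lock`, `locked_fnmatchcase` and `worker` are hypothetical and are not part of the fnmatch module.

```python
import fnmatch
import threading

# Hypothetical module-level lock; every fnmatch call that may race goes through it.
_fnmatch_lock = threading.Lock()

def locked_fnmatchcase(name, pat):
    """Serialize access to fnmatch's internal pattern cache."""
    with _fnmatch_lock:
        return fnmatch.fnmatchcase(name, pat)

def worker(results, index):
    # Use many distinct patterns so the cache is filled and recycled.
    results[index] = all(
        locked_fnmatchcase("file%d.txt" % n, "file%d.*" % n) for n in range(500)
    )

results = [False] * 8
threads = [threading.Thread(target=worker, args=(results, i)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(all(results))
```

The lock serializes every match, so this trades some concurrency for safety.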

--
nosy: +haypo




[issue23191] fnmatch regex cache use is not threadsafe

2015-01-08 Thread M. Schmitzer

M. Schmitzer added the comment:

Ok, if that is the attitude in such cases, feel free to close this.

--




[issue23191] fnmatch regex cache use is not threadsafe

2015-01-08 Thread STINNER Victor

STINNER Victor added the comment:

It would be nice to fix the issue, but I don't know how it is handled in other 
stdlib modules.

--




[issue23191] fnmatch regex cache use is not threadsafe

2015-01-08 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

It is easy to make fnmatch caching thread safe without locks. Here is a patch.

The problem with fnmatch is that the caching is implicit and a user doesn't know 
that any lock is needed. So either the need for the lock should be explicitly 
documented, or fnmatch should be made thread safe. The second option looks 
preferable to me.

In 3.x fnmatch is thread safe because thread safe lru_cache is used.

--
keywords: +patch
nosy: +serhiy.storchaka
stage:  -> patch review
Added file: http://bugs.python.org/file37643/fnmatch_threadsafe.patch




[issue23191] fnmatch regex cache use is not threadsafe

2015-01-08 Thread M. Schmitzer

M. Schmitzer added the comment:

@serhiy.storchaka: My thoughts exactly, especially regarding the caching being 
implicit. From the outside, fnmatch really doesn't look like it could have 
threading issues.
The patch also looks exactly like what I had in mind.

--




regex tool in the python source tree

2014-12-19 Thread Rustom Mody
I remember seeing here (couple of weeks ago??) a mention of a regex
debugging/editing tool hidden away in the python source tree.

Does someone remember the name/path?

There are of course dozens of online ones... 
Looking for a python native tool


Re: regex tool in the python source tree

2014-12-19 Thread Rustom Mody
On Saturday, December 20, 2014 12:01:10 PM UTC+5:30, Rustom Mody wrote:
 I remember seeing here (couple of weeks ago??) a mention of a regex
 debugging/editing tool hidden away in the python source tree.
 
 Does someone remember the name/path?
 
 There are of course dozens of online ones... 
 Looking for a python native tool

Ok I found redemo here https://docs.python.org/3/howto/regex.html

I should also mention that that link mentions kodos as though it works.
The kodos site http://sourceforge.net/projects/kodos/files/
shows the oldest files from 2002 and the newest from 2006.
Last I tried, it did not work for Python 2.7, let alone 3.x.

If there's anything more up to date, I'd like to know.


[issue2636] Adding a new regex module (compatible with re)

2014-11-14 Thread Mateon1

Mateon1 added the comment:

Well, I am reporting it here, is this not the correct place? Sorry if it is.

--




[issue2636] Adding a new regex module (compatible with re)

2014-11-14 Thread Brett Cannon

Changes by Brett Cannon br...@python.org:


--
nosy: +brett.cannon




[issue2636] Adding a new regex module (compatible with re)

2014-11-14 Thread Matthew Barnett

Matthew Barnett added the comment:

The page on PyPI says where the project's homepage is located:

Home Page: https://code.google.com/p/mrab-regex-hg/

The bug was fixed in the last release.

--




[issue2636] Adding a new regex module (compatible with re)

2014-11-13 Thread Mateon1

Mateon1 added the comment:

Well, I found a bug with this module, on Python 2.7(.5), on Windows 7 64-bit 
when you try to compile a regex with the flags V1|DEBUG, the module crashes as 
if it wanted to call a builtin called ascii.

The bug happened to me several times, but this is the regexp when the last one 
happened. http://paste.ubuntu.com/8993680/

I hope it's fixed, I really love the module and found it very useful to have 
PCRE regexes in Python.

--
nosy: +Mateon1
versions:  -Python 3.5




[issue2636] Adding a new regex module (compatible with re)

2014-11-13 Thread Matthew Barnett

Matthew Barnett added the comment:

@Mateon1: "I hope it's fixed"? Did you report it?

--




[issue22364] Improve some re error messages using regex for hints

2014-11-11 Thread Terry J. Reedy

Terry J. Reedy added the comment:

I already said we should either stick with what we have if better (and gave 
examples, including sticking with 'cannot') or possibly combine the best of 
both if we can improve on both.  13 should use 'bytes-like' (already changed?). 
There is no review button.

--




[issue22364] Improve some re error messages using regex for hints

2014-11-10 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Here is a patch which makes re error messages match regex. It doesn't look to 
me that all these changes are enhancements.

--
keywords: +patch
Added file: http://bugs.python.org/file37167/re_errors_regex.patch




[issue2636] Adding a new regex module (compatible with re)

2014-11-09 Thread Nick Coghlan

Nick Coghlan added the comment:

Thanks for pushing this one forward Serhiy! Your approach sounds like a
fine plan to me.

--




[issue2636] Adding a new regex module (compatible with re)

2014-11-09 Thread Jeffrey C. Jacobs

Jeffrey C. Jacobs added the comment:

If I recall, I started this thread with a plan to update re itself with 
implementations of various features listed in the top post.  If you look at the 
list of files uploaded by me there are some complete patches for re to add 
various features like Atomic Grouping.  If we wish to therefore bring re to 
regex standard we could start with those features.

--




[issue2636] Adding a new regex module (compatible with re)

2014-11-08 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

Here is my (slowly implemented) plan:

0. Recommend regex as an advanced replacement for re (issue22594).

1. Fix all obvious bugs in the re module if this doesn't break backward 
compatibility (issue12728, issue14260, and many already closed issues).

2. Deprecate and then forbid behavior which looks like a bug, doesn't match regex 
in V1 mode and can't be fixed without breaking backward compatibility 
(issue22407, issue22493, issue22818).

3. Unify minor details with regex (issue22364, issue22578).

4. Fork regex and drop all advanced nonstandard features (such as fuzzy 
matching). Too many features make learning and using the module harder. They 
should be in an advanced module (regex).

5. Write benchmarks which cover all corner cases and compare re with regex case 
by case. Optimize the slower module. Currently re is faster than regex for all 
simple examples which I tried (maybe as a result of issue18685), but in the total 
results of the benchmarks (msg109447) re is slower.

6. Maybe implement some standard features which were rejected in favor of this 
issue (issue433028, issue433030). re should conform to at least Level 1 of UTS #18 
(http://www.unicode.org/reports/tr18/#Basic_Unicode_Support).

In the best case, in 3.7 or 3.8 we could replace re with a simplified regex. Or by 
that time re will be free from bugs and warts.

--
nosy: +serhiy.storchaka




[issue2636] Adding a new regex module (compatible with re)

2014-11-08 Thread Antoine Pitrou

Antoine Pitrou added the comment:

 Here is my (slowly implemented) plan:

Exciting. Perhaps you should post your plan on python-dev.

In any case, huge thanks for your work on the re module.

--




[issue2636] Adding a new regex module (compatible with re)

2014-11-08 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

 Exciting. Perhaps you should post your plan on python-dev.

Thank you Antoine. I think all interested core developers are already aware 
of this issue. A disadvantage of posting on python-dev is that this would 
require manually copying the links and maybe the titles of all mentioned issues, 
while here they are available automatically. Oh, I'm lazy.

--




[issue2636] Adding a new regex module (compatible with re)

2014-11-08 Thread Ezio Melotti

Ezio Melotti added the comment:

So you are suggesting to fix bugs in re to make it closer to regex, and then 
replace re with a forked subset of regex that doesn't include advanced 
features, or just to fix/improve re until it matches the behavior of regex?
If you are suggesting the former, I would also suggest checking the coverage 
and bringing it as close as possible to 100%.

--




[issue2636] Adding a new regex module (compatible with re)

2014-11-08 Thread Serhiy Storchaka

Serhiy Storchaka added the comment:

 So you are suggesting to fix bugs in re to make it closer to regex, and then
 replace re with a forked subset of regex that doesn't include advanced
 features, or just to fix/improve re until it matches the behavior of regex?

Depends on what will be easier. Maybe some bugs are so hard to fix that 
replacing re with regex is the only solution. But if a fixed re is simpler and 
faster than a lightened regex and contains all necessary features, there 
will be no need for the replacement. Currently the code of regex looks more high 
level and better structured, but the code of re looks simpler and is much 
smaller. In any case, the closer re and regex are, the easier the 
migration will be.

--




[issue2636] Adding a new regex module (compatible with re)

2014-11-08 Thread Ezio Melotti

Ezio Melotti added the comment:

Ok, regardless of what will happen, increasing test coverage is a worthy goal.  
We might start by looking at the regex test suite to see if we can import some 
tests from there.

--




Re: Regex substitution trouble

2014-11-05 Thread Eugene
massi_...@msn.com wrote:

 Hi everyone,
 
 I'm not really sure if this is the right place to ask about regular
 expressions, but since I'm using Python I thought I could give it a try :-)
 Here is the problem, I'm trying to write a regex in order to substitute
 all the occurrences in the form $"somechars" with another string. This is
 what I wrote:
 
 newstring = re.sub(ur"(?u)(\$\"[\s\w]+\")", subst, oldstring)
 
 This works pretty well, but it has a problem, I would need it also to
 handle the case in which the internal string contains the double quotes,
 but only if preceded by a backslash, that is something like
 $"somechars_with\\"doublequotes". Can anyone help me to correct it?
 
 Thanks in advance!
Hi!

Next snippet works for me:

re.sub(r'\$"([\s\w]+(\\")*[\s\w]+)+"', 'noop', r'$"te\"sts\"tri\"ng"')



[issue22364] Improve some re error messages using regex for hints

2014-11-02 Thread Raymond Hettinger

Raymond Hettinger added the comment:

+1

--
nosy: +rhettinger




[issue22364] Improve some re error messages using regex for hints

2014-11-02 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
dependencies: +Add additional attributes to re.error, Other mentions of the 
buffer protocol




[issue22364] Improve some re error messages using regex for hints

2014-11-01 Thread Ezio Melotti

Ezio Melotti added the comment:

+1 on the idea.

--
stage:  -> needs patch




Regex substitution trouble

2014-10-28 Thread massi_srb
Hi everyone,

I'm not really sure if this is the right place to ask about regular 
expressions, but since I'm using Python I thought I could give it a try :-)
Here is the problem, I'm trying to write a regex in order to substitute all the 
occurrences in the form $"somechars" with another string. This is what I wrote:

newstring = re.sub(ur"(?u)(\$\"[\s\w]+\")", subst, oldstring)

This works pretty well, but it has a problem, I would need it also to handle 
the case in which the internal string contains the double quotes, but only if 
preceded by a backslash, that is something like 
$"somechars_with\\"doublequotes".
Can anyone help me to correct it?

Thanks in advance!


Re: Regex substitution trouble

2014-10-28 Thread Chris Angelico
On Tue, Oct 28, 2014 at 10:02 PM,  massi_...@msn.com wrote:
 I'm not really sure if this is the right place to ask about regular 
 expressions, but since I'm using Python I thought I could give it a try :-)

Yeah, that sort of thing is perfectly welcome here. Same with
questions about networking in Python, or file I/O in Python, or
anything like that. Not a problem!

 Here is the problem, I'm trying to write a regex in order to substitute all 
 the occurrences in the form $"somechars" with another string. This is what I 
 wrote:

 newstring = re.sub(ur"(?u)(\$\"[\s\w]+\")", subst, oldstring)

 This works pretty well, but it has a problem, I would need it also to handle 
 the case in which the internal string contains the double quotes, but only if 
 preceded by a backslash, that is something like 
 $"somechars_with\\"doublequotes".
 Can anyone help me to correct it?

But this is a problem. You can use look-ahead assertions and such to
allow the string \" inside your search string, but presumably the
backslash ought itself to be escapable, in order to make it possible
to have a loose backslash legal at the end of the string. I suggest
that, instead of a regex, you look for a different way of parsing.
What's the surrounding text like?

ChrisA


Re: Regex substitution trouble

2014-10-28 Thread massi_srb
Hi Chris, thanks for the reply. I tried to use lookahead assertions, in 
particular I modified the regex this way:

newstring = re.sub(ur"(?u)(\$\"[\s\w(?=\\)\"]+\")", subst, oldstring) 

but it does not work. I'm absolutely not a regex guru so I'm surely missing 
something. The strings I'm dealing with are similar to formulas, let's say 
something like:

'$["simple_input"]+$["messed_\\"_input"]+10'

Thanks for any help!


Re: Regex substitution trouble

2014-10-28 Thread Chris Angelico
(Please quote enough of the previous text to provide context, and
write your replies underneath the quoted text - don't assume that
everyone's read the previous posts. Thanks!)

On Tue, Oct 28, 2014 at 11:28 PM,  massi_...@msn.com wrote:
 Hi Chris, thanks for the reply. I tried to use look ahead assertions, in 
 particular I modified the regex this way:

 newstring = re.sub(ur"(?u)(\$\"[\s\w(?=\\)\"]+\")", subst, oldstring)

 but it does not work. I'm absolutely not a regex guru so I'm surely missing 
 something.

Yeah, I'm not a high-flying regex programmer either, so I'll leave the
specifics for someone else to answer. Tip, though: Print out your
regex, to see if it's really what you think it is. When you get
backslashes and quotes coming through, sometimes you can get tangled,
even in a raw string literal; sometimes, one quick print(some_re) can
save hours of hair-pulling.
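
To illustrate the print tip with a small sketch: the pattern below assumes the tokens look like $"..." (the archive seems to have dropped the double quotes from the posts), and the names `raw` and `plain` are made up for the example.

```python
import re

raw = r'\$"[\s\w]+"'        # raw literal: backslashes are kept as typed
plain = '\\$"[\\s\\w]+"'    # ordinary literal: every backslash must be doubled

print(raw)                  # the text the regex engine actually receives
print(repr(raw))            # repr() doubles the backslashes for display only
print(raw == plain)

# Inside a raw literal, \" keeps the backslash; in a normal literal it does not.
print(len(r'\"'), len('\"'))

print(re.search(raw, 'x = $"abc def" + 1').group(0))
```

Comparing print() with repr() output is usually enough to tell a display artifact from a pattern that really contains extra backslashes.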

 The strings I'm dealing with are similar to formulas, let's say something 
 like:

 '$["simple_input"]+$["messed_\\"_input"]+10'

 Thanks for any help!

Hmm. This looks like a job for ast.literal_eval with an actual
dictionary. All you'd have to do is replace every instance of $ with a
dict literal; it mightn't be efficient, but it would be safe. Using
Python 2.7.8 as you appear to be on 2.x:

>>> expr = '$["simple_input"]+$["messed_\\"_input"]+10'
>>> values = {"simple_input":123, "messed_\"_input":75}
>>> ast.literal_eval(expr.replace("$",repr(values)))
Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
    ast.literal_eval(expr.replace("$",repr(values)))
  File "C:\Python27\lib\ast.py", line 80, in literal_eval
    return _convert(node_or_string)
  File "C:\Python27\lib\ast.py", line 79, in _convert
    raise ValueError('malformed string')
ValueError: malformed string

Unfortunately, it doesn't appear to work, as evidenced by the above
message. It works with the full (and dangerous) eval, though:

>>> eval(expr.replace("$",repr(values)))
208

Can someone who better knows ast.literal_eval() explain what's
malformed about this? The error message in 3.4 is a little more
informative, but not much more helpful:
ValueError: malformed node or string: <_ast.BinOp object at 0x0169BAF0>
My best theory is that subscripting isn't allowed, though this seems odd.

In any case, it ought in theory to be possible to use Python's own
operations on this. You might have to do some manipulation, but it'd
mean you can leverage a full expression evaluator that already exists.
I'd eyeball the source code for ast.literal_eval() and see about
making an extended version that allows the operations you want.

If you can use something other than a dollar sign - something that's
syntactically an identifier - you'll be able to skip the textual
replace() operation, which is risky (might change the wrong thing). Do
that, and you could have your own little evaluator that uses the ast
module for most of its work, and simply runs a little recursive walker
that deals with the nodes as it finds them.
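
A small sketch of such a walker, under several assumptions: it targets Python 3.9 or later rather than 2.7, it uses the identifier `v` in place of the dollar sign, and it only accepts numbers, the four basic operators and `v["name"]` lookups. The names `evaluate`, `walk` and `_BIN_OPS` are invented for the example.

```python
import ast
import operator

# Whitelist of arithmetic operators the evaluator accepts.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def evaluate(expr, values):
    """Evaluate arithmetic over numbers and v["name"] lookups, nothing else."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        if (isinstance(node, ast.Subscript)
                and isinstance(node.value, ast.Name)
                and node.value.id == "v"):
            key = node.slice  # on Python 3.9+ this is the key expression itself
            if isinstance(key, ast.Constant) and isinstance(key.value, str):
                return values[key.value]
        # Calls, attribute access, other names, etc. all end up here.
        raise ValueError("unsupported expression: %s" % ast.dump(node))
    return walk(ast.parse(expr, mode="eval"))

values = {"simple_input": 123, 'messed_"_input': 75}
expr = 'v["simple_input"] + v["messed_\\"_input"] + 10'
print(evaluate(expr, values))  # prints 208
```

Anything outside the whitelist raises ValueError, so the walker never reaches eval() and the set of allowed operations stays explicit.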

ChrisA


Re: Regex substitution trouble

2014-10-28 Thread TP
On Tue, Oct 28, 2014 at 4:02 AM, massi_...@msn.com wrote:

 Hi everyone,

 I'm not really sure if this is the right place to ask about regular
 expressions, but since I'm using Python I thought I could give it a try :-)
 Here is the problem, I'm trying to write a regex in order to substitute
 all the occurrences in the form $"somechars" with another string. This is
 what I wrote:

 newstring = re.sub(ur"(?u)(\$\"[\s\w]+\")", subst, oldstring)

 This works pretty well, but it has a problem, I would need it also to
 handle the case in which the internal string contains the double quotes,
 but only if preceded by a backslash, that is something like
 $"somechars_with\\"doublequotes".
 Can anyone help me to correct it?

 Thanks in advance!
 --
 https://mail.python.org/mailman/listinfo/python-list


Carefully reading the "Strings" section of "Example Regexes to Match Common
Programming Language Constructs" [1] should (with a bit of effort) solve
your problem, I think. Note the use of the negated character class, for one
thing.

[1] http://www.regular-expressions.info/examplesprogrammer.html


Re: Regex substitution trouble

2014-10-28 Thread Tim
On Tuesday, October 28, 2014 7:03:00 AM UTC-4, mass...@msn.com wrote:
 Hi everyone,
 I'm not really sure if this is the right place to ask about regular 
 expressions, but since I'm using Python I thought I could give it a try :-)
 Here is the problem, I'm trying to write a regex in order to substitute all 
 the occurrences in the form $"somechars" with another string. This is what I 
 wrote:
 newstring = re.sub(ur"(?u)(\$\"[\s\w]+\")", subst, oldstring)
 
 This works pretty well, but it has a problem, I would need it also to handle 
 the case in which the internal string contains the double quotes, but only if 
 preceded by a backslash, that is something like 
 $"somechars_with\\"doublequotes".
 Can anyone help me to correct it?
 Thanks in advance!

You have some good answers already, but I wanted to let you know about a tool 
you may already have which is useful for experimenting with regexps. On 
Windows, the file `redemo.py` is in the Tools/Scripts folder. 

If you're on a Mac, see 
http://stackoverflow.com/questions/1811236/how-can-i-run-redemo-py-or-equivalent-on-a-mac

It has really helped me work on some tough regexps.
good luck,
--Tim


Re: Regex substitution trouble

2014-10-28 Thread MRAB

On 2014-10-28 12:28, massi_...@msn.com wrote:

Hi Chris, thanks for the reply. I tried to use look ahead assertions, in 
particular I modified the regex this way:

newstring = re.sub(ur"(?u)(\$\"[\s\w(?=\\)\"]+\")", subst, oldstring)

but it does not work. I'm absolutely not a regex guru so I'm surely missing 
something. The strings I'm dealing with are similar to formulas, let's say 
something like:

'$["simple_input"]+$["messed_\\"_input"]+10'

Thanks for any help!


Your original post said you wanted to match strings like:

$"somechars_with\\"doublequotes".

This regex will do that:

ur'\$"[^"\\]*(?:\\.[^"\\]*)*"'

However, now you say you want to match:

'$["simple_input"]'

This is different; it has '[' immediately after the '$' instead of '"'.
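
A minimal sketch of how the quoted-string regex above behaves, assuming Python 3 (where the `ur` prefix is written as plain `r`). The sample string and the name QUOTED are made up for illustration and do not come from the thread:

```python
import re

# Opening $", a run of ordinary characters, then any number of
# (backslash + any character, followed by more ordinary characters),
# and finally the closing quote.
QUOTED = re.compile(r'\$"[^"\\]*(?:\\.[^"\\]*)*"')

# Hypothetical input: one plain item and one containing escaped quotes.
sample = r'$"simple" + $"with \"escaped\" quotes" + 10'

matches = QUOTED.findall(sample)
for m in matches:
    print(m)
```

Because a backslash is only ever consumed together with the character that follows it, an escaped quote cannot be mistaken for the closing quote.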



Re: Regex substitution trouble

2014-10-28 Thread Cameron Simpson

On 28Oct2014 04:02, massi_...@msn.com massi_...@msn.com wrote:
I'm not really sure if this is the right place to ask about regular 
expressions, but since I'm using Python I thought I could give it a try :-)

Here is the problem: I'm trying to write a regex in order to substitute all the 
occurrences of the form $"somechars" with another string. This is what I wrote:

newstring = re.sub(ur"(?u)(\$\"[\s\w]+\")", subst, oldstring)

This works pretty well, but it has a problem: I also need it to handle the case in which 
the internal string contains double quotes, but only if preceded by a backslash, that is, 
something like $"somechars_with\\"doublequotes".
Can anyone help me to correct it?


People seem to be making this harder than it should be.

I'd just be fixing up your definition of what's inside the quotes. There seem 
to be 3 kinds of things:


  - not a double quote or backslash
  - a backslash followed by a double quote
  - a backslash followed by not a double quote

Kind 3 is a policy call - take the following character or not? I would go with 
treating it like kind 2 myself.


So you have:

  1 [^\\"]
  2 \\"
  3 \\[^"]

and fold 2 and 3 into:

  2+3 \\.

So your regexp inner becomes:

  ([^\\"]|\\.)*

and the whole thing becomes:

  \$"(([^\\"]|\\.)*)"

and as a raw string:
  
  ur'\$"(([^\\"]|\\.)*)"'


choosing single quotes to be more readable given the double quotes in the 
regexp.
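
As a rough sketch of the substitution this pattern is meant for, assuming Python 3 (so `r'...'` rather than `ur'...'`). The helper name substitute and the sample text are hypothetical, and the inner group is made non-capturing here so that group 1 holds the whole quoted body:

```python
import re

# Inner alternation: either one ordinary character (not backslash, not quote)
# or a backslash followed by any single character.
PATTERN = re.compile(r'\$"((?:[^\\"]|\\.)*)"')

def substitute(text, subst):
    """Replace every $"..." item in text with subst (hypothetical helper)."""
    # A function replacement avoids backslash processing inside subst.
    return PATTERN.sub(lambda m: subst, text)

old = r'$"first" + $"second \"quoted\" part" + 10'
new = substitute(old, 'X')
inner = PATTERN.search(old).group(1)
print(new)
print(inner)
```

Passing a function rather than a string to `sub` is a design choice for the sketch: it keeps the replacement literal even if it happens to contain backslashes.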


Cheers,
Cameron Simpson c...@zip.com.au
--
cat: /Users/cameron/rc/mail/signature.: No such file or directory

Language... has created the word loneliness to express the pain of
being alone. And it has created the word solitude to express the glory
of being alone. - Paul Johannes Tillich


[issue22594] Add a link to the regex module in re documentation

2014-10-28 Thread anupama srinivas murthy

anupama srinivas murthy added the comment:

I have modified the patch and listed the points I know. Could you review it?

--
versions:  -Python 3.4, Python 3.5
Added file: http://bugs.python.org/file37052/regex-link.patch

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue22594
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue22594] Add a link to the regex module in re documentation

2014-10-23 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
components: +Regular Expressions




[issue10076] Regex objects became uncopyable in 2.5

2014-10-23 Thread Serhiy Storchaka

Changes by Serhiy Storchaka storch...@gmail.com:


--
components: +Regular Expressions
versions: +Python 3.4, Python 3.5 -Python 3.1

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue10076
___



[issue22594] Add a link to the regex module in re documentation

2014-10-13 Thread anupama srinivas murthy

anupama srinivas murthy added the comment:

I have added the link and attached the patch below. Could you review it?

Thank you

--
components:  -Regular Expressions
keywords: +patch
nosy: +anupama.srinivas.murthy
Added file: http://bugs.python.org/file36900/regex-link.patch




[issue22594] Add a link to the regex module in re documentation

2014-10-13 Thread Georg Brandl

Georg Brandl added the comment:

currently more bugfree and intended to replace re

The first part is spreading FUD if not explained in more detail.  The second is 
probably never going to happen :(

--
nosy: +georg.brandl




[issue22594] Add a link to the regex module in re documentation

2014-10-11 Thread Ezio Melotti

Ezio Melotti added the comment:

+1

--




[issue22594] Add a link to the regex module in re documentation

2014-10-11 Thread Berker Peksag

Changes by Berker Peksag berker.pek...@gmail.com:


--
stage:  - needs patch




[issue22594] Add a link to the regex module in re documentation

2014-10-11 Thread Tshepang Lekhonkhobe

Changes by Tshepang Lekhonkhobe tshep...@gmail.com:


--
nosy: +tshepang




[issue22594] Add a link to the regex module in re documentation

2014-10-10 Thread Serhiy Storchaka

New submission from Serhiy Storchaka:

The regex module is intended as a replacement for the standard re module. Of 
course we fix re bugs, but for now regex has fewer bugs. Even after all open 
re bugs are fixed, regex will still have more features. It would be good to add 
a link to regex in the re documentation (just as the Tkinter documentation 
links to other GUI libraries).

--
assignee: docs@python
components: Documentation, Regular Expressions
keywords: easy
messages: 228961
nosy: docs@python, ezio.melotti, mrabarnett, pitrou, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Add a link to the regex module in re documentation
type: enhancement
versions: Python 2.7, Python 3.4, Python 3.5




Re: help with regex

2014-10-08 Thread Peter Otten
James Smith wrote:

 I want the last "1"
 I can't get this to work:
 
 >>> pattern=re.compile( "(\d+)$" )
 >>> match=pattern.match( "LINE: 235 : Primary Shelf Number (attempt 1): 1")
 >>> print match.group()

>>> pattern = re.compile("(\d+)$")
>>> match = pattern.search( "LINE: 235 : Primary Shelf Number (attempt 1): 1")
>>> match.group()
'1'


See https://docs.python.org/dev/library/re.html#search-vs-match
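
A small sketch of the difference, assuming Python 3 and the same input line as in the question. The variable names are chosen here for illustration:

```python
import re

line = "LINE: 235 : Primary Shelf Number (attempt 1): 1"
pattern = re.compile(r"(\d+)$")

# match() only tries at the start of the string, which begins with "LINE",
# so it finds nothing.
anchored = pattern.match(line)

# search() scans forward and finds the digits just before the end.
anywhere = pattern.search(line)
last_number = anywhere.group(1)

print(anchored)
print(last_number)
```

Since `match()` can return None, calling `.group()` on its result without a check raises AttributeError, which is a likely cause of the failure reported above.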



Re: help with regex

2014-10-08 Thread Ben Finney
Peter Otten __pete...@web.de writes:

  >>> pattern = re.compile("(\d+)$")
  >>> match = pattern.search( "LINE: 235 : Primary Shelf Number (attempt 1): 1")
  >>> match.group()
 '1'

An alternative way to accomplish the above using the ‘match’ method::

    >>> import re
    >>> pattern = re.compile("^.*:(?: *)(\d+)$")
    >>> match = pattern.match("LINE: 235 : Primary Shelf Number (attempt 1): 1")
    >>> match.groups()
    ('1',)

 See https://docs.python.org/dev/library/re.html#search-vs-match

Right. Always refer to the API documentation for the API you're
attempting to use.

-- 
 \“Without cultural sanction, most or all of our religious |
  `\  beliefs and rituals would fall into the domain of mental |
_o__) disturbance.” —John F. Schumaker |
Ben Finney


