[issue42473] re.sub ignores flag re.M

2020-11-26 Thread Jérôme Laurens

New submission from Jérôme Laurens :

Test code:
```
import re
test='''012345678
 
012345678
'''
pattern = r'^\s+?$'
m = re.search(pattern, test, re.M)
if m:
    print(f'TEST FOUND "{m.span()}"')

def replace(m):
    print(f'TEST REMOVE {m.span()}')
    return ''

test = re.sub(pattern, replace, test, re.M)
m = re.search(pattern, test, re.M)
if m:
    print(f'TEST STILL THERE "{m.span()}"')

print('COMPILE PATTERN FIRST')
pattern_re = re.compile(pattern, re.M)
m = re.search(pattern_re, test)
if m:
    print(f'TEST FOUND "{m.span()}"')

def replace(m):
    print(f'TEST REMOVE {m.span()}')
    return ''

test = re.sub(pattern_re, replace, test)
m = re.search(pattern_re, test)
if m:
    print(f'TEST STILL THERE "{m.span()}"')
```

Actual output:

TEST FOUND "(10, 19)"
TEST STILL THERE "(10, 19)"
COMPILE PATTERN FIRST
TEST FOUND "(10, 19)"
TEST REMOVE (10, 19)

This is an inconsistency between re.search and re.sub. Either this is a bug in 
the code or in the documentation.
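
For reference, the documented signature is re.sub(pattern, repl, string, 
count=0, flags=0), so a fourth positional argument is taken as count, not as 
flags. The sketch below is only an illustration of that reading: the shortened 
test string and the names as_count and as_flags are assumptions, not the 
reporter's data.

```python
import re

# Shortened stand-in for the test string (assumption): a whitespace-only
# line between two identical lines.
text = "012345678\n   \n012345678\n"
pattern = r"^\s+?$"

# What the positional call re.sub(pattern, "", text, re.M) amounts to:
# re.M fills the `count` parameter, so no flag is set and "^" only
# matches at the very start of the string.
as_count = re.sub(pattern, "", text, count=int(re.M))

# Passing the flag by keyword enables multiline matching.
as_flags = re.sub(pattern, "", text, flags=re.M)

print(repr(as_count))  # the text comes back unchanged
print(repr(as_flags))  # the whitespace-only line is emptied
```

Under this reading, re.search and re.sub treat re.M the same way; the 
difference comes from which parameter the positional argument lands in.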

--
components: IO
messages: 381901
nosy: jlaurens
priority: normal
severity: normal
status: open
title: re.sub ignores flag re.M
type: behavior
versions: Python 3.8

___
Python tracker 
<https://bugs.python.org/issue42473>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38514] pathlib's mkdir documentation improvement

2019-10-18 Thread Jérôme Laurens

New submission from Jérôme Laurens :

There are some inconsistencies in the current documentation of pathlib's 
mkdir.

Here is the 3.7 version, annotated and followed by a change proposal.

Path.mkdir(mode=0o777, parents=False, exist_ok=False)
Create a new directory at this given path. If mode is given, it is combined 
with the process’ umask value to determine the file mode and access flags. If 
the path already exists, FileExistsError is raised. <<<<<<<<<<<<<<<<<< NOT 
ALWAYS due to exist_ok.

If parents is true, any missing parents of this path are created as needed; 
they are created with the default permissions without taking mode into account 
(mimicking the POSIX mkdir -p command).

If parents is false (the default), a missing parent raises FileNotFoundError.

If exist_ok is false (the default), FileExistsError is raised if the target 
directory already exists.

If exist_ok is true, FileExistsError exceptions will be ignored (same behavior 
as the POSIX mkdir -p command), but only if the last path component is not an 
existing non-directory file. <<<<<<<<<<<<<<<<<< UNCLEAR: 1) what is an ignored 
exception ? 2) The reference to POSIX should appear at the end, like above, 3) 
the last path component is a string 4) usage of a double negation ignore/is not

- CHANGE 

Path.mkdir(mode=0o777, parents=False, exist_ok=False)
Create a new directory in the file system at this given path.

If mode is given, it is combined with the process’ umask value to determine the 
file mode and access flags.

If parents is false (the default), a missing parent raises FileNotFoundError.

If parents is true, any missing parents of this path are created as needed; 
they are created with the default permissions without taking mode into account 
(mimicking the POSIX mkdir -p command).

If exist_ok is false (the default), FileExistsError is raised if the given path 
already exists in the file system, whether a directory or not.

If exist_ok is true, FileExistsError is raised only if the given path already 
exists in the file system and is not a directory (same behavior as the POSIX 
mkdir -p command).
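
The cases described above can be checked with a small sketch. It runs in a 
temporary directory; the helper attempt, the list outcomes and the path names 
are hypothetical, introduced only for this illustration.

```python
import tempfile
from pathlib import Path

outcomes = []

def attempt(label, path, **kwargs):
    """Call path.mkdir(**kwargs) and record 'ok' or the exception name."""
    try:
        path.mkdir(**kwargs)
        outcomes.append((label, "ok"))
    except OSError as exc:
        outcomes.append((label, type(exc).__name__))

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    nested = base / "a" / "b"          # parent "a" does not exist yet
    attempt("missing parent, parents=False", nested)
    attempt("missing parent, parents=True", nested, parents=True)
    attempt("existing directory, exist_ok=False", nested)
    attempt("existing directory, exist_ok=True", nested, exist_ok=True)

    regular_file = base / "f.txt"      # an existing non-directory
    regular_file.write_text("x")
    attempt("existing file, exist_ok=True", regular_file, exist_ok=True)

for label, result in outcomes:
    print(f"{label}: {result}")
```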

Thanks for reading

JL

--
assignee: docs@python
components: Documentation
messages: 354874
nosy: docs@python, jlaurens
priority: normal
severity: normal
status: open
title: pathlib's mkdir documentation improvement
type: enhancement
versions: Python 3.7

___
Python tracker 
<https://bugs.python.org/issue38514>
___



[issue35957] Indentation explanation is unclear

2019-02-12 Thread Jérôme LAURENS

Jérôme LAURENS  added the comment:

To be more precise, consider the code

def f(x):
       \tx=0 # 7 spaces + one tab
        return x # 8 spaces

In CPython, both indentation levels are 8 and no indentation error is reported 
(this is the case where both tab size and alt tab size are equal).

If a tab counted for 6 spaces instead of 8, the indentation levels would be 12 
and 8, resulting in a mismatch and, according to the documentation, an 
indentation error. This is inconsistent: either the documentation is faulty or 
CPython is.

Actually, CPython accepts a mix of spaces and tabs only when tabs are at 
positions 8, 16, 24...
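
The arithmetic above can be reproduced with a small sketch. The function 
indent_width is hypothetical and only models the "advance to the next multiple 
of the tab size" rule; it is not CPython's tokenizer.

```python
def indent_width(prefix, tabsize=8):
    """Column reached after the leading spaces and tabs of `prefix`."""
    col = 0
    for ch in prefix:
        if ch == "\t":
            # A tab advances to the next multiple of tabsize.
            col = (col // tabsize + 1) * tabsize
        elif ch == " ":
            col += 1
        else:
            break
    return col

seven_spaces_and_tab = " " * 7 + "\t"
eight_spaces = " " * 8

print(indent_width(seven_spaces_and_tab), indent_width(eight_spaces))
print(indent_width(seven_spaces_and_tab, tabsize=6),
      indent_width(eight_spaces, tabsize=6))
```

With a tab size of 8 both lines reach column 8; with a tab size of 6 the first 
line reaches column 12 while the second stays at 8.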

--

___
Python tracker 
<https://bugs.python.org/issue35957>
___



[issue35957] Indentation explanation is unclear

2019-02-10 Thread Jérôme LAURENS

New submission from Jérôme LAURENS :

https://docs.python.org/3/reference/lexical_analysis.html#indentation reads

Point 1:
"Tabs are replaced (from left to right) by one to eight spaces such that the 
total number of characters up to and including the replacement is a multiple of 
eight"

and in the next paragraph

Point 2:
"Indentation is rejected as inconsistent if a source file mixes tabs and spaces 
in a way that makes the meaning dependent on the worth of a tab in spaces"

In point 1, each tab has a single, well-defined space equivalent; in point 2, 
tabs may have different space equivalents. Which one is reliable?

The documentation should state that Point 1 concerns CPython, or at least 
indicate that the 8 may depend on the implementation, which then makes sense 
of point 2.
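
One possible reading of Point 2 is that two indentations are inconsistent when 
their relative order changes with the assumed tab width. The sketch below 
follows that reading only; the function indentation_is_ambiguous and the pair 
of tab sizes it tries are assumptions, not the check CPython performs.

```python
def indentation_is_ambiguous(first, second, tab_sizes=(1, 8)):
    """True if comparing the two prefixes depends on the tab size used."""
    orderings = set()
    for size in tab_sizes:
        a = len(first.expandtabs(size))
        b = len(second.expandtabs(size))
        orderings.add((a > b) - (a < b))  # -1, 0 or 1
    return len(orderings) > 1

# One tab versus eight spaces: equal at size 8, different at size 1.
print(indentation_is_ambiguous("\t", " " * 8))
# Spaces only: the tab size cannot matter.
print(indentation_is_ambiguous(" " * 4, " " * 8))
```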

--
assignee: docs@python
components: Documentation
messages: 335165
nosy: Jérôme LAURENS, docs@python
priority: normal
severity: normal
status: open
title: Indentation explanation is unclear
type: enhancement
versions: Python 3.7

___
Python tracker 
<https://bugs.python.org/issue35957>
___



[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-04-30 Thread Jérôme Laurens

Jérôme Laurens added the comment:

Since the text and tail notions seem tightly coupled, I would vote for a more 
detailed explanation in the text doc and a forward link in the tail 
documentation.



text

The text attribute holds the text between the element's begin tag and the 
next tag or None. The tail attribute holds the text between the element's end 
tag and the next tag or None. For <a><b>1<c>2<d/>3</c></b>4</a> xml data, the 
a element has None for both text and tail attributes, the b element has text 
'1' and tail '4', the c element has text '2' and tail None, the d element has 
text None and tail '3'.

To collect the inner text of an element, see `tostring` with method 'text'.

Applications may store arbitrary objects in this attribute.

tail

The tail attribute holds the text between the element's end tag and the 
next tag or None. See `text` for more details.

Applications may store arbitrary objects in this attribute.


It is very important to mention that the 'text' attribute does not always hold 
a string, contrary to what its name suggests.

BTW, I was not aware of the tostring function with the 'text' method argument. 
The fact is that the documentation reads "Returns an (optionally) encoded 
string containing the XML data", which is misleading because the text is not 
XML data in general. This also needs to be rephrased or simply removed.
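
The values listed in the proposed wording can be checked with a short sketch, 
assuming the sample document is <a><b>1<c>2<d/>3</c></b>4</a>; the name pairs 
is introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<a><b>1<c>2<d/>3</c></b>4</a>")

# Map each tag to its (text, tail) pair, in document order.
pairs = {elt.tag: (elt.text, elt.tail) for elt in root.iter()}

for tag, (text, tail) in pairs.items():
    print(f"{tag}: text={text!r} tail={tail!r}")
```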

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24079
___



[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-04-30 Thread Jérôme Laurens

Jérôme Laurens added the comment:

Erratum

def innertext(elt):
    return (elt.text or '') +''.join(innertext(e)+(e.tail or '') for e in elt)
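
A minimal sketch comparing this function with tostring(..., method='text') and 
with Element.itertext(), assuming the sample document 
<a><b>1<c>2<d/>3</c></b>4</a>; the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

def innertext(elt):
    # Own text, then each child's inner text followed by that child's tail.
    return (elt.text or '') + ''.join(
        innertext(e) + (e.tail or '') for e in elt)

root = ET.fromstring("<a><b>1<c>2<d/>3</c></b>4</a>")
b = root.find("b")

inner = innertext(b)
via_tostring = ET.tostring(b, method="text", encoding="unicode")
via_itertext = "".join(b.itertext())

print(inner)         # inner text of b only
print(via_tostring)  # also carries the tail of b
print(via_itertext)  # same as innertext for this document
```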

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24079
___



[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-04-30 Thread Jérôme Laurens

Jérôme Laurens added the comment:

The tostring(..., method='text') call is not suitable for the inner text 
because it adds the tail of the top element.

A proper implementation would be

def innertext(elt):
    return (elt.text or '') +''.join(innertext(e)+e.tail for e in elt)

which can be included in the doc instead of the mention of the tostring trick

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24079
___



[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-04-29 Thread Jérôme Laurens

New submission from Jérôme Laurens:

The documentation for xml.etree.ElementTree.Element.text reads: "If the 
element is created from an XML file the attribute will contain any text found 
between the element tags."

import xml.etree.ElementTree as ET
root3 = ET.fromstring('<a><b/>TEXT</a>')
print(root3.text)

CURRENT OUTPUT

None

TEXT is between the element's tags but does not appear in the output.

BTW: this is well-formed XML and has nothing to do with tail.
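
As a hedged illustration of where the string ends up, assuming the input is 
the document <a><b/>TEXT</a>: ElementTree attaches text that follows a child's 
end tag to that child's tail, so it can be read there or collected with 
itertext(). The variable names below are hypothetical.

```python
import xml.etree.ElementTree as ET

root3 = ET.fromstring("<a><b/>TEXT</a>")

root_text = root3.text                  # nothing precedes the first child
child_tail = root3[0].tail              # text after the <b/> element
all_text = "".join(root3.itertext())    # every text piece inside <a>

print(root_text, child_tail, all_text)
```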

--
components: XML
messages: 242256
nosy: jlaurens
priority: normal
severity: normal
status: open
title: xml.etree.ElementTree.Element.text does not conform to the documentation
type: behavior
versions: Python 3.4

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24079
___



[issue24072] xml.etree.ElementTree.Element does not catch text

2015-04-28 Thread Jérôme Laurens

New submission from Jérôme Laurens:

Text is not caught in case 3 below.

INPUT

import xml.etree.ElementTree as ET
root1 = ET.fromstring('<a>TEXT</a>')
print(root1.text)
root2 = ET.fromstring('<a>TEXT<b/></a>')
print(root2.text)
root3 = ET.fromstring('<a><b/>TEXT</a>')
print(root3.text)

CURRENT OUTPUT

TEXT
TEXT
None -- ERROR HERE

EXPECTED OUTPUT

TEXT
TEXT
TEXT

--
messages: 242207
nosy: jlaurens
priority: normal
severity: normal
status: open
title: xml.etree.ElementTree.Element does not catch text
type: behavior
versions: Python 3.4

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24072
___