[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-08-18 Thread Stefan Behnel

Stefan Behnel added the comment:

The "can store arbitrary objects" sentence is now duplicated, and still way too 
visible. I have to read three sentences until it tells me what I need to know.

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue24079
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-08-18 Thread Stefan Behnel

Stefan Behnel added the comment:

I think the first two sentences can simply be removed to fix this, without loss 
of readability or information.

--




[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-08-17 Thread Ned Deily

Ned Deily added the comment:

Thanks for all of your contributions on this.  I've committed a version along 
the lines I suggested along with Martin's example.

--
resolution:  -> fixed
stage: commit review -> resolved
status: open -> closed
type: behavior -> 




[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-08-17 Thread Roundup Robot

Roundup Robot added the comment:

New changeset d3cda8cf4d42 by Ned Deily in branch '2.7':
Issue #24079: Improve description of the text and tail attributes for
https://hg.python.org/cpython/rev/d3cda8cf4d42

New changeset ad0491f85050 by Ned Deily in branch '3.4':
Issue #24079: Improve description of the text and tail attributes for
https://hg.python.org/cpython/rev/ad0491f85050

New changeset 17ce3486fd8f by Ned Deily in branch '3.5':
Issue #24079: merge from 3.4
https://hg.python.org/cpython/rev/17ce3486fd8f

New changeset 3c94ece57c43 by Ned Deily in branch 'default':
Issue #24079: merge from 3.5
https://hg.python.org/cpython/rev/3c94ece57c43

--
nosy: +python-dev




[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-08-12 Thread Robert Collins

Robert Collins added the comment:

So it is downplayed but it is still documented as being application usable.

I'll give this another week for Ned to reply, then commit it in the absence of 
a reply: I think it's ok as is. I'd be ok with a tweaked version along the lines 
Ned proposed too: both ways are better than what's in tree today.

--
nosy: +rbcollins




[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-07-31 Thread Martin Panter

Martin Panter added the comment:

I think Ned’s version is an acceptable solution (modulo some punctuation) to 
the original problem, although I do agree with Stefan that downplaying the 
generality would be even better.

Perhaps we could add a qualifier, like “The *text* attribute [normally] holds 
. . .”

--




[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-07-31 Thread Stefan Behnel

Stefan Behnel added the comment:

Could we apply this patch, please?

--




[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-07-31 Thread Ned Deily

Ned Deily added the comment:

I note that the current wording for both text and tail is careful to allow 
for the most general use of the Element class, that is, that it may be used in 
non-XML contexts, for example:

The text attribute can be used to hold additional data associated with the
element. As the name implies this attribute is usually a string but may be any
application-specific object. If the element is created from an XML file the
attribute will contain any text found between the element tags.

The proposed patch downplays that generality.  How about modifying the original 
wording so that the description starts something like:

These attributes can be used to hold additional [...] application-specific 
object.  If the element is created from an XML file, the *text* attribute holds 
either the text between the element's start tag and its first child or end tag, 
or ``None``, and the *tail* attribute holds either the text [...].

--




[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-07-31 Thread Stefan Behnel

Stefan Behnel added the comment:

> The proposed patch downplays that generality.

That is completely intentional. Almost all readers of the documentation will 
first need to understand the difference between text and tail before they can 
go and think about any more advanced use cases that will almost certainly fail 
on their first serialisation attempts. The most important aim of the new 
phrasing is therefore to make that difference clear. Everything else is 
secondary, although still worth mentioning.
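
The serialisation point can be illustrated with a small sketch. It assumes 
CPython's standard xml.etree.ElementTree; the tag name "item" and the value 42 
are made up for the example and do not come from the patch.

```python
import xml.etree.ElementTree as ET

# Hypothetical element; the tag name and value are illustrative only.
elem = ET.Element("item")
elem.text = 42  # accepted: the attribute can hold any object

try:
    ET.tostring(elem)
    serialised = True
except TypeError as exc:
    # The serialiser expects a string and rejects the int.
    serialised = False
    error = exc

# The same value stored as a string serialises without trouble.
elem.text = "42"
xml_bytes = ET.tostring(elem)
```

Storing the object works, but the failure only shows up once the tree is 
written out, which is why the string case deserves the most visible place.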

--




[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-07-06 Thread Martin Panter

Changes by Martin Panter <vadmium...@gmail.com>:


--
stage: patch review -> commit review




[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-06-05 Thread Stefan Behnel

Stefan Behnel added the comment:

Looks good to me.

--




[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-06-03 Thread Martin Panter

Martin Panter added the comment:

Okay, here is a version with most of the wording reverted to Jérôme’s 
suggestion. I only left my itertext() example, and the grouping of text and 
tail together. If there are any more bits that are incorrect or unclear please 
identify them.

--
Added file: http://bugs.python.org/file39606/etree-text.v2.patch




[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-05-29 Thread Stefan Behnel

Stefan Behnel added the comment:

IMHO less clear and less correct than the previous suggestions.

--




[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-05-29 Thread Stefan Behnel

Stefan Behnel added the comment:

Seems like a good idea to explain text and tail in one section, though. 
That makes tail easier to find for those who are not used to this kind of 
split (and that's basically everyone who needs to read the docs in the first 
place).

--




[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-05-12 Thread Martin Panter

Martin Panter added the comment:

Another problem with tostring() is that it seems you have to call it with 
encoding="unicode". Perhaps it would be better to suggest code like 
"".join(element.itertext())?

I would also improve on Jérôme’s version by making the None case more explicit. 
And perhaps both attributes can be defined together, rather than giving a 
half-hearted definition linking between them:

.. attribute:: text
.. attribute:: tail

   The *text* attribute holds any text between the element's begin tag and the 
next tag. The *tail* attribute holds any text between the element's end tag and 
the next tag. These attributes are set to ``None`` if there is no text. For 
example, in the XML data ``<a><b>1<c>2<d/>3</c></b>4</a>``, the *a* element has 
``None`` for both *text* and *tail* attributes, the *b* element has *text* 
``"1"`` and *tail* ``"4"``, the *c* element has *text* ``"2"`` and *tail* 
``None``, the *d* element has *text* ``None`` and *tail* ``"3"``.
   
   To collect the inner text of an element, use :meth:`itertext`, for example 
``"".join(element.itertext())``.
   
   Applications may store arbitrary objects in these attributes.
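
The values described in the proposed wording can be checked with a short 
sketch. It assumes the sample document is <a><b>1<c>2<d/>3</c></b>4</a>; the 
names "parts" and "inner" are only for this illustration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<a><b>1<c>2<d/>3</c></b>4</a>")

# Map each tag to its (text, tail) pair, visiting elements in document order.
parts = {elem.tag: (elem.text, elem.tail) for elem in root.iter()}
for tag, (text, tail) in parts.items():
    print(tag, repr(text), repr(tail))

# Inner text of the whole tree, collected with itertext().
inner = "".join(root.itertext())
print(inner)
```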

--
nosy: +vadmium




[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-04-30 Thread Raymond Hettinger

Raymond Hettinger added the comment:

> this is well formed xml and has nothing to do with tail.

In fact, it does have something to do with tail.
The 'TEXT' is captured as the tail of element b:

>>> root3 = ET.fromstring('<a><b/>TEXT</a>')
>>> root3[0].tail
'TEXT'

--
nosy: +eli.bendersky, rhettinger, scoder




[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-04-30 Thread Stefan Behnel

Stefan Behnel added the comment:

I agree that the wording in the documentation isn't great:


text

The text attribute can be used to hold additional data associated with the 
element. As the name implies this attribute is usually a string but may be any 
application-specific object. If the element is created from an XML file the 
attribute will contain any text found between the element tags.

tail

The tail attribute can be used to hold additional data associated with the 
element. This attribute is usually a string but may be any application-specific 
object. If the element is created from an XML file the attribute will contain 
any text found after the element’s end tag and before the next tag.


Special cases that no-one uses (sticking non-string objects into text/tail) are 
given too much space and the difference isn't explained as needed.

Since the distinction between text and tail is a (great but) rather special 
feature of ElementTree, it needs to be given more room in the docs.

Proposal:


text

The text attribute holds the immediate text content of the element. It 
contains any text found up to either the closing tag if the element has no 
children, or the next opening child tag within the element. For text following 
an element, see the `tail` attribute. To collect the entire text content of a 
subtree, see `tostring`. Applications may store arbitrary objects in this 
attribute.

tail

The tail attribute holds any text that directly follows the element. For 
example, in a document like ``<a>Text<b/>BTail<c/>CTail</a>``, the `text` 
attribute of the ``a`` element holds the string "Text", and the tail attributes 
of ``b`` and ``c`` hold the strings "BTail" and "CTail" respectively. 
Applications may store arbitrary objects in this attribute.
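
A quick way to try the example from this proposal is the sketch below. It 
assumes the sample document is <a>Text<b/>BTail<c/>CTail</a>; the variable 
names are chosen for the illustration only.

```python
import xml.etree.ElementTree as ET

a = ET.fromstring("<a>Text<b/>BTail<c/>CTail</a>")
b, c = a  # iterating an element yields its direct children

print(repr(a.text))                # text before the first child
print(repr(b.text), repr(b.tail))  # empty element, followed by text
print(repr(c.text), repr(c.tail))
```

Note that "BTail" and "CTail" are stored on the children, not on ``a``, even 
though they appear between the tags of ``a``.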


--




[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-04-30 Thread Jérôme Laurens

Jérôme Laurens added the comment:

Since the text and tail notions seem tightly coupled, I would vote for a more 
detailed explanation in the text doc and a forward link in the tail 
documentation.



text

The text attribute holds the text between the element's begin tag and the 
next tag or None. The tail attribute holds the text between the element's end 
tag and the next tag or None. For <a><b>1<c>2<d/>3</c></b>4</a> xml data, the 
a element has None for both text and tail attributes, the b element has text 
'1' and tail '4', the c element has text '2' and tail None, the d element has 
text None and tail '3'.

To collect the inner text of an element, see `tostring` with method 'text'.

Applications may store arbitrary objects in this attribute.

tail

The tail attribute holds the text between the element's end tag and the 
next tag or None. See `text` for more details.

Applications may store arbitrary objects in this attribute.


It is very important to mention that the 'text' attribute does not always hold 
a string, contrary to what its name would suggest.

BTW, I was not aware of the tostring method with the 'text' argument. The fact is 
that the documentation reads "Returns an (optionally) encoded string containing 
the XML data.", which is misleading because the text is not XML data in general. 
This also needs to be rephrased or simply removed.

--




[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-04-30 Thread Jérôme Laurens

Jérôme Laurens added the comment:

Erratum

def innertext(elt):
    return (elt.text or '') + ''.join(innertext(e) + (e.tail or '') for e in elt)

--




[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-04-30 Thread Jérôme Laurens

Jérôme Laurens added the comment:

The tostring(..., method='text') is not suitable for the inner text because 
it adds the tail of the top element.

A proper implementation would be

def innertext(elt):
    return (elt.text or '') + ''.join(innertext(e) + e.tail for e in elt)

that can be included in the doc instead of the mention of the tostring trick
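
The difference can be seen in a small comparison. This is only a sketch: it 
uses a None-safe variant of innertext (a child's tail may be None) and the 
sample document <a><b>1<c>2<d/>3</c></b>4</a>; the result names are 
illustrative.

```python
import xml.etree.ElementTree as ET


def innertext(elt):
    """Concatenate the text inside elt, excluding elt's own tail."""
    return (elt.text or "") + "".join(
        innertext(child) + (child.tail or "") for child in elt
    )


root = ET.fromstring("<a><b>1<c>2<d/>3</c></b>4</a>")
b = root[0]

# tostring() with method="text" appends the tail of the element passed in.
with_tail = ET.tostring(b, encoding="unicode", method="text")

# Both of these stop at the end tag of b.
without_tail = innertext(b)
via_itertext = "".join(b.itertext())

print(with_tail, without_tail, via_itertext)
```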

--




[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-04-29 Thread Jérôme Laurens

New submission from Jérôme Laurens:

The documentation for xml.etree.ElementTree.Element.text reads "If the element 
is created from an XML file the attribute will contain any text found between 
the element tags."

import xml.etree.ElementTree as ET
root3 = ET.fromstring('<a><b/>TEXT</a>')
print(root3.text)

CURRENT OUTPUT

None

TEXT is between the element's tags but does not appear in the output.

BTW: this is well formed xml and has nothing to do with tail.

--
components: XML
messages: 242256
nosy: jlaurens
priority: normal
severity: normal
status: open
title: xml.etree.ElementTree.Element.text does not conform to the documentation
type: behavior
versions: Python 3.4




[issue24079] xml.etree.ElementTree.Element.text does not conform to the documentation

2015-04-29 Thread Ned Deily

Ned Deily added the comment:

(This issue is a followup to your Issue24072.)  Again, while the ElementTree 
documentation is certainly not nearly as complete as it should be, I don't 
think this is a documentation error per se.  The key issue is: with which 
element is each text string associated?  Perhaps this example will help:

>>> root4 = ET.fromstring('<a>ATEXT<b>BTEXT</b>BTAIL</a>')
>>> root4
<Element 'a' at 0x10224c228>
>>> root4.text
'ATEXT'
>>> root4.tail
>>> root4[0]
<Element 'b' at 0x1022ab278>
>>> root4[0].text
'BTEXT'
>>> root4[0].tail
'BTAIL'

As in your original example, any text following the element b is associated 
with b's tail attribute until a new tag is found, pushing or popping the tree 
stack.  While the description of the text attribute does not explicitly state 
this, the tail attribute description immediately following it does.  This is 
also explained in more detail in the ElementTree resources on effbot.org that 
are linked to from the Python Standard Library documentation.  Nevertheless, it 
probably would be helpful to expand the documentation on this point if someone 
is willing to put together a documentation patch for review.

With regard to your comment about "well formed xml", I don't think there is 
anything in the documentation that implies (or should imply) that the 
distinction between the text attribute and the tail attribute has anything 
to do with whether it is well-formed XML.  The tutorial for the third-party 
lxml package, which provides another implementation of ElementTree, goes into 
more detail about why, in general, both text and tail are necessary.

https://docs.python.org/3/library/xml.etree.elementtree.html#additional-resources
http://effbot.org/zone/element.htm#text-content
http://lxml.de/tutorial.html#elements-contain-text
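
As a rough illustration of why both attributes are needed, the sketch below 
builds a small mixed-content fragment by hand. The tag names "p" and "b" and 
the strings are invented for the example.

```python
import xml.etree.ElementTree as ET

# Build <p>Hello <b>world</b>!</p> without parsing anything.
p = ET.Element("p")
p.text = "Hello "  # text before the first child
b = ET.SubElement(p, "b")
b.text = "world"   # text inside the child
b.tail = "!"       # text after the child's end tag, still inside <p>

markup = ET.tostring(p, encoding="unicode")
print(markup)
```

Without tail there would be no place to store the "!" that follows the child 
element, since an element has only one text attribute.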

--
assignee:  -> docs@python
components: +Documentation -XML
nosy: +docs@python, ned.deily
stage:  -> needs patch
versions: +Python 2.7, Python 3.5
