[issue14762] ElementTree memory leak

2012-05-09 Thread Giuseppe Attardi

New submission from Giuseppe Attardi atta...@di.unipi.it:

I confirm the presence of a serious memory leak in ElementTree when using the 
iterparse() function.
Memory use grows disproportionately, to dozens of GB, when parsing a large XML 
file.

For further information, see discussion in:
  
http://www.gossamer-threads.com/lists/python/bugs/912164?do=post_view_threaded#912164
but notice that the comments attributing the problem to the OS are quite off 
the mark.

To replicate the problem, try this on a Wikipedia dump:

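from xml.etree import ElementTree

# 'file' is the path to (or an open file object for) the XML dump
iterparse = ElementTree.iterparse(file)
id = None
for event, elem in iterparse:
    if elem.tag.endswith("title"):
        title = elem.text
    elif elem.tag.endswith("id") and not id:
        id = elem.text
    elif elem.tag.endswith("text"):
        # Python 2.7 print statement; elements are never released here
        print id, title, elem.text[:20]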

--
messages: 160266
nosy: Giuseppe.Attardi
priority: normal
severity: normal
status: open
title: ElementTree memory leak
type: resource usage
versions: Python 2.7

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue14762
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue14762] ElementTree memory leak

2012-05-09 Thread Antoine Pitrou

Changes by Antoine Pitrou pit...@free.fr:


--
nosy: +eli.bendersky, flox




[issue14762] ElementTree memory leak

2012-05-09 Thread Eli Bendersky

Eli Bendersky eli...@gmail.com added the comment:

Can you specify how you import ET? I.e. from the pure Python or the C 
accelerator?
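
For reference, the two import paths being asked about look roughly like this. In Python 2.7 the C accelerator is a separate module, xml.etree.cElementTree; from 3.3 on, xml.etree.ElementTree picks up the accelerator by itself when it is available, and cElementTree was removed in 3.9. This is only a sketch of a common fallback pattern, not the reporter's actual import:

```python
# Prefer the separate C accelerator module where it still exists
# (Python 2.x and Python 3 before 3.9); otherwise use the regular module.
try:
    import xml.etree.cElementTree as ET
except ImportError:
    import xml.etree.ElementTree as ET

# The module name shows which of the two was actually loaded.
print(ET.__name__)
```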

Also, are you aware that each element iterparse returns should be discarded 
with clear() once you are done with it? [see tutorial here: 
http://eli.thegreenplace.net/2012/03/15/processing-xml-in-python-with-elementtree/]
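
A minimal sketch of that pattern, applied to a loop shaped like the one in the report. The SAMPLE document and the iter_pages helper are hypothetical stand-ins (a real Wikipedia dump is namespaced and far larger), and the sketch assumes each record is wrapped in an element whose tag ends in "page":

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical miniature stand-in for a Wikipedia dump.
SAMPLE = b"""<mediawiki>
  <page><title>A</title><id>1</id><revision><id>10</id><text>alpha text</text></revision></page>
  <page><title>B</title><id>2</id><revision><id>20</id><text>beta text</text></revision></page>
</mediawiki>"""

def iter_pages(source):
    """Yield (id, title, text prefix) per page, clearing each page when it ends."""
    page_id = title = None
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag.endswith("title"):
            title = elem.text
        elif elem.tag.endswith("id") and page_id is None:
            # Only the first <id> inside a page is the page id.
            page_id = elem.text
        elif elem.tag.endswith("text"):
            yield page_id, title, (elem.text or "")[:20]
        elif elem.tag.endswith("page"):
            # Drop the page's children, text and attributes so they can be freed.
            elem.clear()
            page_id = title = None

pages = list(iter_pages(io.BytesIO(SAMPLE)))
print(pages)
```

Note that clear() empties an element but does not detach it from its parent, so the root still holds one emptied child per page; on very large inputs that remainder may also need to be dropped.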

--




[issue14762] ElementTree memory leak

2012-05-09 Thread Jesús Cea Avión

Jesús Cea Avión j...@jcea.es added the comment:

Can this be reproduced in 3.2/3.3?

--
nosy: +jcea




[issue14762] ElementTree memory leak

2012-05-09 Thread Giuseppe Attardi

Giuseppe Attardi atta...@di.unipi.it added the comment:

You are right, I should discard the elements.

Thank you.

--




[issue14762] ElementTree memory leak

2012-05-09 Thread Giuseppe Attardi

Changes by Giuseppe Attardi atta...@di.unipi.it:


--
resolution:  -> invalid
status: open -> closed
