[Pywikipedia-bugs] [Maniphest] [Commented On] T199615: filter_unique leaks memory

2018-08-24 Thread gerritbot
gerritbot added a comment. Change 454198 merged by jenkins-bot: [pywikibot/core@master] pagegenerators.py: Avoid applying two uniquifying filters. https://gerrit.wikimedia.org/r/454198

[Pywikipedia-bugs] [Maniphest] [Commented On] T199615: filter_unique leaks memory

2018-08-21 Thread gerritbot
gerritbot added a comment. Change 454198 had a related patch set uploaded (by Dalba; owner: dalba): [pywikibot/core@master] pagegenerators.py: Avoid applying two uniquifying filters. https://gerrit.wikimedia.org/r/454198

[Pywikipedia-bugs] [Maniphest] [Commented On] T199615: filter_unique leaks memory

2018-08-21 Thread Dalba
Dalba added a comment. In T199615#4511856, @matej_suchanek wrote: Another problem I can see is that filter_unique (inside GenFact it is self._filter_unique) is used twice: it's provided to RecentChangesPageGenerator (line 821) and also used for dupfiltergen = self._filter_unique(gensList) (line
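A minimal sketch of the problem described above, using a simplified stand-in for filter_unique rather than the real pywikibot.tools implementation: when an already-uniquified generator is wrapped in a second uniquifying filter, each filter keeps its own container of everything it has seen, so the same items end up held in memory twice.

def filter_unique(iterable, container=None):
    # Simplified stand-in for pywikibot.tools.filter_unique: remember every
    # unseen item in the container and yield it.
    seen = set() if container is None else container
    for item in iterable:
        if item not in seen:
            seen.add(item)
            yield item

inner_seen = set()   # container of the filter handed to the page generator
outer_seen = set()   # container of the dupfiltergen wrapped around gensList

inner = filter_unique(iter(range(1000)), container=inner_seen)
outer = filter_unique(inner, container=outer_seen)
list(outer)

print(len(inner_seen), len(outer_seen))  # 1000 1000: every item is stored twice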

[Pywikipedia-bugs] [Maniphest] [Commented On] T199615: filter_unique leaks memory

2018-08-18 Thread gerritbot
gerritbot added a comment. Change 445854 abandoned by Xqt: [IMPR] Use hash key for unique filter by default. Reason: https://gerrit.wikimedia.org/r/#/c/pywikibot/core/+/451824/ https://gerrit.wikimedia.org/r/445854

[Pywikipedia-bugs] [Maniphest] [Commented On] T199615: filter_unique leaks memory

2018-08-18 Thread gerritbot
gerritbot added a comment. Change 451824 merged by jenkins-bot: [pywikibot/core@master] Use a key for filter_unique where appropriate. https://gerrit.wikimedia.org/r/451824

[Pywikipedia-bugs] [Maniphest] [Commented On] T199615: filter_unique leaks memory

2018-08-09 Thread gerritbot
gerritbot added a comment. Change 451824 had a related patch set uploaded (by Dalba; owner: dalba): [pywikibot/core@master] Use a key for filter_unique where appropriate. https://gerrit.wikimedia.org/r/451824

[Pywikipedia-bugs] [Maniphest] [Commented On] T199615: filter_unique leaks memory

2018-07-15 Thread gerritbot
gerritbot added a comment. Change 445854 had a related patch set uploaded (by Xqt; owner: Xqt): [pywikibot/core@master] [IMPR] Use hash key for filter_unique by default. https://gerrit.wikimedia.org/r/445854

[Pywikipedia-bugs] [Maniphest] [Commented On] T199615: filter_unique leaks memory

2018-07-15 Thread zhuyifei1999
zhuyifei1999 added a comment. In T199615#4425487, @Xqt wrote: We could: use a hash function for the filter_unique key; use a hash function for the filter_unique key by default; use a GeneratorFactory container attribute to hold the seen pages, which could be reused when we have more than one duplicate

[Pywikipedia-bugs] [Maniphest] [Commented On] T199615: filter_unique leaks memory

2018-07-14 Thread Xqt
Xqt added a comment. We could:
- use a hash function for the filter_unique key
- use a hash function for the filter_unique key by default
- use a GeneratorFactory container attribute to hold the seen pages, which could be reused when we have more than one duplicate filter
- use a container which uses disk
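A rough sketch of the first two ideas, assuming filter_unique accepts a key callable (as the later patches suggest) and using a simplified stand-in plus a hypothetical Page class rather than the real pywikibot code: with key=hash the container accumulates small integers instead of references to the page objects, so the pages themselves can be garbage-collected once consumed.

def filter_unique(iterable, container=None, key=None):
    # Simplified stand-in: remember key(item) (or the item itself) in the container.
    seen = set() if container is None else container
    for item in iterable:
        fingerprint = key(item) if key else item
        if fingerprint not in seen:
            seen.add(fingerprint)   # with key=hash this is just an int
            yield item

class Page(object):
    # Hypothetical stand-in for pywikibot.Page, hashed by title.
    def __init__(self, title):
        self.title = title
    def __hash__(self):
        return hash(self.title)
    def __eq__(self, other):
        return self.title == other.title

seen = set()
pages = [Page('A'), Page('B'), Page('A')]
print([p.title for p in filter_unique(pages, container=seen, key=hash)])  # ['A', 'B']
print(seen)  # two integers rather than two Page objects

The obvious trade-off is that two distinct pages whose hashes collide would be treated as duplicates, which is presumably why the key was only applied "where appropriate".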

[Pywikipedia-bugs] [Maniphest] [Commented On] T199615: filter_unique leaks memory

2018-07-14 Thread zhuyifei1999
zhuyifei1999 added a comment. In T199615#4425451, @Xqt wrote: I see that getsizeof() counts the pointers only, but not the Page objects themselves. The underlying implementation of set.__sizeof__ looks weird to me. It doesn't seem to be iterative or recursive at first glance. But yes, the problem

[Pywikipedia-bugs] [Maniphest] [Commented On] T199615: filter_unique leaks memory

2018-07-14 Thread Xqt
Xqt added a comment. No clue where the memory leakage might come from. I see that getsizeof() counts the pointers only, but not the Page objects themselves.
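That matches the documented behaviour of sys.getsizeof: it reports only the object's own, shallow size, so for a set it counts the internal hash table of references and not the objects those references point to. A small self-contained illustration:

import sys

big_strings = {'x' * (10 ** 6) + str(i) for i in range(10)}

# Shallow size of the set itself: roughly the hash table of pointers.
print(sys.getsizeof(big_strings))

# The referenced strings dominate the real memory use.
print(sum(sys.getsizeof(s) for s in big_strings))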

[Pywikipedia-bugs] [Maniphest] [Commented On] T199615: filter_unique leaks memory

2018-07-14 Thread Xqt
Xqt added a comment. Regarding "Long-running tasks may end on MemoryError" due to filter_unique leaking memory: why do you assume that? Try:

from sys import getsizeof
import pwb, pywikibot as py
from pywikibot.tools import filter_unique as f

s = py.Site()
p = py.Page(s, 'Hydraulik')
container = set()
gen =
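The snippet is cut off above. Because getsizeof only reports the container's shallow size, a more direct way to watch the memory pinned by the seen-items container is tracemalloc from the standard library; a minimal sketch, using a simplified stand-in for filter_unique and plain strings instead of real Page instances:

import sys
import tracemalloc

def filter_unique(iterable, container=None):
    # Simplified stand-in for pywikibot.tools.filter_unique.
    seen = set() if container is None else container
    for item in iterable:
        if item not in seen:
            seen.add(item)
            yield item

tracemalloc.start()
container = set()
items = ('x' * 1000 + str(i) for i in range(10000))  # stand-ins for Page objects
for _ in filter_unique(items, container=container):
    pass

current, peak = tracemalloc.get_traced_memory()
print(sys.getsizeof(container))  # shallow size: only the hash table of references
print(current)                   # traced allocations: the container plus the 10000 strings it keeps alive

With real Page objects the gap would typically be larger still, since each page also carries its own attribute dict and related references.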