[issue11740] difflib html diff takes extremely long
Changes by Benjamin Peterson benja...@python.org:

--
resolution:  -> duplicate
status: open -> closed
superseder:  -> dreadful performance in difflib: ndiff and HtmlDiff

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue11740
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue11740] difflib html diff takes extremely long
Filip Gruszczyński grusz...@gmail.com added the comment:

The culprit seems to be Differ._fancy_replace. It contains a nasty quadratic loop with fairly complex internal code. I have done a quick fix that makes the example run in under a second, at the expense of not calling _fancy_replace for longer chunks and using _plain_replace instead.

Another solution for long chunks would be to split them into smaller parts and process each part separately. The quadratic cost would then apply only to the smaller parts, and we could still benefit from the _fancy_helper logic.

--
keywords: +patch
Added file: http://bugs.python.org/file21501/11740.patch
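To make the trade-off in the comment above concrete, here is a minimal sketch of the "skip _fancy_replace for long chunks" idea. It is not the attached patch: the subclass name CappedDiffer, the helper capped_diff, and the cutoff of 100 lines are hypothetical, it is written for Python 3 (yield from), and it relies on the private Differ methods _fancy_replace and _plain_replace, which may change between Python versions.

```python
import difflib


class CappedDiffer(difflib.Differ):
    """Differ that skips intraline matching for large replace blocks."""

    # Hypothetical cutoff; the value used in the attached patch is not shown here.
    max_fancy_lines = 100

    def _fancy_replace(self, a, alo, ahi, b, blo, bhi):
        # The stock method compares lines of a[alo:ahi] against lines of
        # b[blo:bhi] pairwise, so its cost grows with the product of the
        # two block sizes.
        if max(ahi - alo, bhi - blo) > self.max_fancy_lines:
            # Large block: emit plain '-' / '+' lines, without '?' hint lines.
            yield from self._plain_replace(a, alo, ahi, b, blo, bhi)
        else:
            yield from super()._fancy_replace(a, alo, ahi, b, blo, bhi)


def capped_diff(a, b):
    """Return the ndiff-style delta for two lists of lines as a list."""
    return list(CappedDiffer().compare(a, b))
```

Small replace blocks still get the '?' intraline markers; blocks longer than the cutoff are reported as plain deletions and insertions. Note that this sketch only affects code that instantiates the subclass directly; it does not change what HtmlDiff uses internally.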
[issue11740] difflib html diff takes extremely long
New submission from Michael O'Rourke mkoro...@adobe.com:

If you diff the attached files with difflib's HTML diff, it takes 10 minutes or more. In comparison, other diff tools such as windiff and Araxis Merge show the diff within a second.

Example code I'm using is:

    import difflib

    # Quotes around the string literals were lost in the archived message.
    sourceText = open('source.xml', 'rU').readlines()
    targetText = open('target.xml', 'rU').readlines()
    html_diff = difflib.HtmlDiff(tabsize=4)
    result = html_diff.make_file(sourceText, targetText, 'Source', 'Target',
                                 context=True, numlines=10)
    f = open('c:/libdiff_html.html', 'w')
    f.write(result)
    finish()  # as posted; not defined in the snippet

--
components: None
files: Example.zip
messages: 132767
nosy: mkoro...@adobe.com
priority: normal
severity: normal
status: open
title: difflib html diff takes extremely long
type: performance
versions: Python 2.7
Added file: http://bugs.python.org/file21500/Example.zip
[issue11740] difflib html diff takes extremely long
Changes by Filip Gruszczyński grusz...@gmail.com:

--
nosy: +gruszczy
[issue11740] difflib html diff takes extremely long
ysj.ray ysj@gmail.com added the comment:

Reproduced in 3.3.

--
components: +Library (Lib) -None
nosy: +ysj.ray
versions: +Python 3.1, Python 3.2, Python 3.3