Thanks for review! I'm still interested in getting patch in, but I'm not quick - it takes some time to make fun improving my python skill to get code fast AND readable AND more selective in tags pulling.
Unfortunately the patch now is two times longer, but otherwise at least one of the 3 goals above was not reached. I fixed your issues and also reformat code to 80-col limit. Filed this as https://bugzilla.gnome.org/show_bug.cgi?id=761608 Though this patch is "svn-specific step" to making meld comfortably usable with huge svn repositories - it doesn't achieve this goal yet: in my repo meld is still even more slow (many minutes with high cpu usage) in iterating over all files in filesystem in huge repositories while loading "repository status". I'm not filing bug about this problem now, because it is too general - I'm even not sure if it is system-dependent or not. > On 19 December 2015 at 06:52, Vasily Galkin <[email protected]> wrote: > >> I'm using meld with 100000-file svn repository which has svn status -v --xml >> generating ~25MB output running 18 seconds on ~10-year-old machine. (linux, >> svn 1.9.2, python 2.7.10 32 bit) >> Current version of meld uses quite slow xml.etree.ElementTree which >> constructs entire xml tree in python, >> so svn.py _update_tree_state_cache with ElementTree takes >> wall clock 130 seconds, 360 MB resident memory. >> Simply replacing xml.etree.ElementTree with xml.etree.cElementTree reduces >> that to >> wall clock 35 seconds, 195 MB resident memory >> More optimizing with streaming parser based on xml.parsers.expat (attached >> patch) gives >> wall clock 27 seconds, 84 MB resident memory >> >> Nearly same wall clock improvements with 400000-file svn repo on ~4-year-old >> machine (windows7 x64, svn 1.9.2, python 2.7.10 32 bit) >> >> From the user experience reducing for _update_tree_state_cache time reduces >> the time while meld is initially totally unresponsive (UI thread is >> performing task). >> >> Unfortunately I've not pre-1.9 svn, so only svn 1.9 was tested. > > I'm sorry, but I still haven't found the time to test this on older > SVN, so I'm just putting down some very quick notes from looking at > the patch. > > First up, this generally looks good and I'd like to be able to merge it. > > The patch removes our EAGAIN handling, and I'd rather keep it. > Granted, there are excellent chances that it's never hit, and if it is > then there's not much reason to believe that the failure is actually > temporary... but I value keeping it for consistency with the other > plugins. Maybe we should pull that out in to a helper? > > This patch is also less selective when pulling out tags from the XML, > due to the nature of the streaming parser. It would be nice if it > actually only checked under target and changelist tags for the > relevant entry + status lines. > > Finally, there's a few bits where the handling has changed subtly, in > ways that I'm not 100% sure are correct. For example, we now handle > empty path and item tags differently than we did, and the > _tree_meta_cache behaves slightly differently. > > I hope you're still interested in getting this patch in, and I > apologise for the time it's taken for me to get to it. If you'd like > to file a bug in bugzilla, I think we can probably work through any > issues pretty quickly. > > cheers, > Kai _______________________________________________ meld-list mailing list [email protected] https://mail.gnome.org/mailman/listinfo/meld-list
