Re: [Development] Build Hotspots in the Qt build process
Hi Thiago,

On Feb 5, 2014, at 6:18 PM, Thiago Macieira thiago.macie...@intel.com wrote:
> On Wed, 05 Feb 2014, at 11:58:50, Simon Hausmann wrote:
>
> I'm with Simon, this is very interesting stuff!

Thank you! :-)

> I'm wondering if we could make use of your data to figure out what headers we should precompile.

This is a fascinating idea actually. We primarily motivate our work based on refactoring, but as you mentioned, some really hot files just cannot be refactored due to architectural concerns. Perhaps precompiling the hot headers is another way these hotspots can be addressed.

> You said you extract a graph from the build: is there a way to parse that graph to produce a list of headers most commonly included in a given module?

Yes, this should be possible. Within our graph, each header file is a node that is connected by an edge to each object file (another node) that needs to be recompiled when the header changes. A simple measure like the number of edges coming out of a header file will tell you how many objects are recompiled when it changes, which is a kind of proxy for the number of files including that header.

> We've had precompilation support in qmake for a decade, but apparently we use that in exactly 3 modules...

Hmm, it could also be interesting to look into these modules to see if having them precompiled is justified by their hotness?

Kind regards,
-Shane

___
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development
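The measure Shane describes (counting the edges that leave a header node) could be sketched roughly as below. This is only an illustration under stated assumptions: the edge list, the file names, and the `module_of` convention (first path component is the module) are hypothetical and are not taken from the actual dataset or tool.

```python
from collections import defaultdict

# Hypothetical edge list: (header, object file that is rebuilt when the header changes).
# File names are made up for illustration only.
edges = [
    ("corelib/qstring.h", "corelib/qstring.o"),
    ("corelib/qstring.h", "widgets/qwidget.o"),
    ("corelib/qstring.h", "widgets/qlabel.o"),
    ("widgets/qwidget.h", "widgets/qwidget.o"),
    ("widgets/qwidget.h", "widgets/qlabel.o"),
    ("widgets/qlabel.h", "widgets/qlabel.o"),
]


def module_of(path):
    """Assumption: the first path component names the module."""
    return path.split("/", 1)[0]


def fan_out_by_module(edges):
    """Map module -> {header: number of that module's objects rebuilt when the header changes}."""
    dependents = defaultdict(set)
    for header, obj in edges:
        dependents[(module_of(obj), header)].add(obj)
    result = defaultdict(dict)
    for (module, header), objs in dependents.items():
        result[module][header] = len(objs)
    return result


def top_headers(edges, module, n=10):
    """Return up to n (header, count) pairs for one module, highest count first."""
    counts = fan_out_by_module(edges).get(module, {})
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


print(top_headers(edges, "widgets"))
```

Grouping by the module of the object file (rather than of the header) is a design choice made here so that headers from other modules, such as corelib headers pulled into widgets, also show up as precompilation candidates for that module.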
Re: [Development] Build Hotspots in the Qt build process
On Thu, 06 Feb 2014, at 18:37:43, Shane McIntosh wrote:
>> We've had precompilation support in qmake for a decade, but apparently we use that in exactly 3 modules...
>
> Hmm, it could also be interesting to look into these modules to see if having them precompiled is justified by their hotness?

Well, think of me. I'm the QtCore maintainer. Every time I change qstring.h or qbasicatomic.h, EVERYTHING recompiles. Not even ccache can help me there. So if I can get stuff precompiled, it should be a win.

The very least we can do is auto precompilation: for every Qt module, we know which modules it depends on, so we could generate a precompiled header that #includes the master header of each of those dependencies.

But I was hoping for the next step: finding which headers of the *current* module are most in need of precompilation. For example, almost everything in QtWidgets will #include qwidget.h.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center
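As a rough illustration of the "auto precompilation" idea above, the following sketch generates the text of a precompiled header from a module dependency table. The `MODULE_DEPS` table, the function name, and the include-guard naming are assumptions made for this example; this is not how the Qt build system actually does it.

```python
# Hypothetical dependency table: module -> modules it depends on.
MODULE_DEPS = {
    "QtCore": [],
    "QtGui": ["QtCore"],
    "QtWidgets": ["QtCore", "QtGui"],
}


def precompiled_header_text(module, deps=MODULE_DEPS):
    """Build header text that includes the master header of each dependency of `module`."""
    guard = f"{module.upper()}_PCH_H"  # illustrative guard name
    lines = [f"#ifndef {guard}", f"#define {guard}", ""]
    # Only pull in the C++ master headers when compiling C++ sources.
    lines.append("#if defined(__cplusplus)")
    for dep in deps[module]:
        lines.append(f"#include <{dep}/{dep}>")
    lines.append("#endif")
    lines += ["", f"#endif // {guard}"]
    return "\n".join(lines) + "\n"


print(precompiled_header_text("QtWidgets"))
```

The generated file could then, presumably, be handed to qmake through its existing precompiled-header support (the `PRECOMPILED_HEADER` variable together with `CONFIG += precompile_header`); how that would be wired into each module's project file is left open here.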
Re: [Development] Build Hotspots in the Qt build process
Shane McIntosh [mcint...@cs.queensu.ca] wrote:
> Hi Qt developers!
>
> My name is Shane. I’m a PhD student at Queen’s University in Canada. I’ve been working on an approach for detecting build hotspots, i.e., files that not only take a long time to rebuild, but also change often. We think that these files are ideal candidates for refactoring that could shave time off of incremental builds that are really impacting software teams.

That sounds like a good idea in general.

Looking at your list at http://sailhome.cs.queensu.ca/~shane/content/qt_hotspots.txt I wonder a bit how e.g. xmlpatterns can be considered a hotspot that according to your definition changes often. The actual code has not been changed much for a while except for occasional merges and the annual bumps in copyright headers. My gut feeling is that any refactoring there is not worthwhile.

> I’m happy to provide a more detailed Qt dataset when I return to my lab next week.

It would be nice to see some reason (numbers) why files ended up on the list.

Best regards,
Andre'
Re: [Development] Build Hotspots in the Qt build process
Hi Shane,

thanks again for the presentation at FOSDEM. I really enjoyed it:-)

The list of yours seems to be ordered alphabetically. I guess that is not necessarily the order in which we should look at the files:-) This list is rather long and slide 36 of your presentation shows that the files listed trigger between 90s and more than 8000s of rebuild time! That is two orders of magnitude difference, so if we look at this we should start by looking at those in the 8000s range:-)

Could you generate a list of files sorted by hotness when you get back to your lab? Maybe the distance of the file's datapoint from the origin would be a good measure for that? That might need some normalizing though.

I just went over your list and have not been overly surprised with which files are causing long rebuild times: All of them are header files that are widely used in Qt and by Qt applications. E.g. almost the complete set of non-private headers in qtbase/src/corelib is listed. There actually is one .cpp file in the list, which is rather unexpected.

Seeing the same data for Qt webkit would also be cool. That is the part of code that feels like it is taking the longest to build in all of Qt:-)

Best Regards,
Tobias
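Tobias's suggestion of ranking files by the distance of their datapoint from the origin could look something like the sketch below. The sample data is invented, and scaling each axis by its maximum is just one possible normalization (the thread only says that some normalizing might be needed); the function name is hypothetical as well.

```python
import math

# Hypothetical data points: file -> (number of changes, rebuild time in seconds).
files = {
    "a.h": (120, 8000.0),
    "b.h": (300, 90.0),
    "c.cpp": (10, 400.0),
}


def hotness_ranking(points):
    """Rank files by Euclidean distance from the origin after scaling each axis to [0, 1]."""
    # Divide by the per-axis maximum so seconds do not swamp change counts.
    # "or 1" avoids a division by zero when every value on an axis is 0.
    max_changes = max(c for c, _ in points.values()) or 1
    max_seconds = max(s for _, s in points.values()) or 1
    scored = {
        name: math.hypot(c / max_changes, s / max_seconds)
        for name, (c, s) in points.items()
    }
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)


for name, score in hotness_ranking(files):
    print(f"{name}: {score:.3f}")
```

Without the scaling step, the rebuild time (ranging up to thousands of seconds) would dominate the distance and the change frequency would barely influence the order.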
Re: [Development] Build Hotspots in the Qt build process
Hi Shane,

On Monday 3. February 2014 22.39.27 Shane McIntosh wrote:
> Hi Qt developers!
>
> My name is Shane. I’m a PhD student at Queen’s University in Canada. I’ve been working on an approach for detecting build hotspots, i.e., files that not only take a long time to rebuild, but also change often. We think that these files are ideal candidates for refactoring that could shave time off of incremental builds that are really impacting software teams.
>
> We came up with an approach that I presented last weekend at FOSDEM ( slides are available here: http://www.slideshare.net/shanemcintosh/identifying-hotspots-in-software-build-process ). One of the projects that we analyzed was Qt. I bumped into Tobias at FOSDEM and he suggested that I post the list of Qt hotspots here.
>
> So, I’ve made the hotspot list available here: http://sailhome.cs.queensu.ca/~shane/content/qt_hotspots.txt
>
> I’m happy to provide a more detailed Qt dataset when I return to my lab next week.

Interesting stuff :)

How do you determine that a particular header file is the hot spot? (I didn't understand that part from the slide set)

Is that header file commonly included in other files that result in long-building object files?

Simon
Re: [Development] Build Hotspots in the Qt build process
Hi Tobias,

On Feb 5, 2014, at 11:45 AM, Tobias Hunger tobias.hun...@gmail.com wrote:
> The list of yours seems to be ordered alphabetically. I guess that is not necessarily the order in which we should look at the files:-) This list is rather long and slide 36 of your presentation shows that the files listed trigger between 90s and more than 8000s of rebuild time! That is two orders of magnitude difference, so if we look at this we should start by looking at those in the 8000s range:-)
>
> Could you generate a list of files sorted by hotness when you get back to your lab? Maybe the distance of the file's datapoint from the origin would be a good measure for that? That might need some normalizing though.

D’oh! You’re correct! Sorting by “hotness” is a great idea. I will try this out and share the results next week.

> I just went over your list and have not been overly surprised with which files are causing long rebuild times: All of them are header files that are widely used in Qt and by Qt applications. E.g. almost the complete set of non-private headers in qtbase/src/corelib is listed.

True. It really isn’t too surprising that the corelib header files trigger slow rebuilds. Maybe it will be more useful when we sort by hotness. Perhaps there are specific corelib headers that will pop out.

> There actually is one .cpp file in the list, which is rather unexpected.

Yes, that was an interesting one. My notes seem to suggest that the file is used in several test binaries. Perhaps that’s why it’s so slow to rebuild?

> Seeing the same data for Qt webkit would also be cool. That is the part of code that feels like it is taking the longest to build in all of Qt:-)

Thanks for pointing this out. I will kick off a new analysis with Qt webkit included when I return to the lab :-)

Kind regards,
-Shane
Re: [Development] Build Hotspots in the Qt build process
Hi Simon,

On Feb 5, 2014, at 11:58 AM, Simon Hausmann simon.hausm...@digia.com wrote:
> On Monday 3. February 2014 22.39.27 Shane McIntosh wrote:
>> Hi Qt developers!
>>
>> My name is Shane. I’m a PhD student at Queen’s University in Canada. I’ve been working on an approach for detecting build hotspots, i.e., files that not only take a long time to rebuild, but also change often. We think that these files are ideal candidates for refactoring that could shave time off of incremental builds that are really impacting software teams.
>>
>> We came up with an approach that I presented last weekend at FOSDEM ( slides are available here: http://www.slideshare.net/shanemcintosh/identifying-hotspots-in-software-build-process ). One of the projects that we analyzed was Qt. I bumped into Tobias at FOSDEM and he suggested that I post the list of Qt hotspots here.
>>
>> So, I’ve made the hotspot list available here: http://sailhome.cs.queensu.ca/~shane/content/qt_hotspots.txt
>>
>> I’m happy to provide a more detailed Qt dataset when I return to my lab next week.
>
> Interesting stuff :)

Thanks! :-)

> How do you determine that a particular header file is the hot spot? (I didn't understand that part from the slide set)
>
> Is that header file commonly included in other files that result in long-building object files?

Specifically, we have a tool for extracting the build dependency graph from a concrete run of the build system. We then annotate the graph with timing information from each build command. Using this graph, we can figure out how long a rebuild will take if a developer changes a given source file: we add up the times of the edges triggered by that change.

We combine this rebuild time info with change frequency data we extract from the version control system (e.g., Git). The hotspots are then the files that change frequently (info from Git) and also rebuild slowly (info from the build dependency graph).

Of course, we need to pick thresholds to identify the files that are too slow and change too much. For this experiment, we used the median number of changes and a 90-second rebuild time. These thresholds might need some tweaking based on your development culture. For example, maybe 90 seconds is too low for your system?

Kind regards,
-Shane
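The procedure described in this message can be approximated by the toy sketch below. Everything in it is an assumption made for illustration: the graph layout (target mapped to command time and inputs), the file names, the change counts, and the choice of strict "greater than" comparisons against the thresholds. It is not the tool used in the study.

```python
from statistics import median

# Hypothetical build graph: target -> (seconds spent by its build command, its inputs).
build_graph = {
    "a.o": (60.0, ["a.cpp", "common.h"]),
    "b.o": (50.0, ["b.cpp", "common.h"]),
    "app": (20.0, ["a.o", "b.o"]),
}

# Hypothetical change counts, as one might extract from the version control history.
change_counts = {"a.cpp": 3, "b.cpp": 8, "common.h": 40}


def rebuild_cost(source, graph):
    """Sum the command times of every target that transitively depends on `source`."""
    dirty = set()
    frontier = [source]
    while frontier:
        node = frontier.pop()
        for target, (_, inputs) in graph.items():
            if node in inputs and target not in dirty:
                dirty.add(target)
                frontier.append(target)  # a rebuilt target dirties its own dependents
    return sum(graph[target][0] for target in dirty)


def hotspots(graph, changes, min_seconds=90.0):
    """Files changed more often than the median that also trigger a rebuild above min_seconds."""
    change_threshold = median(changes.values())
    return sorted(
        name
        for name, count in changes.items()
        if count > change_threshold and rebuild_cost(name, graph) > min_seconds
    )


print(hotspots(build_graph, change_counts))
```

In this toy graph, changing `common.h` dirties both object files and the final link step, so it is the only file exceeding both thresholds; both thresholds are parameters precisely because, as noted above, they may need tuning per project.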
Re: [Development] Build Hotspots in the Qt build process
Hi Andre,

On Feb 5, 2014, at 12:11 PM, Poenitz Andre andre.poen...@digia.com wrote:
> Shane McIntosh [mcint...@cs.queensu.ca] wrote:
>> Hi Qt developers!
>>
>> My name is Shane. I’m a PhD student at Queen’s University in Canada. I’ve been working on an approach for detecting build hotspots, i.e., files that not only take a long time to rebuild, but also change often. We think that these files are ideal candidates for refactoring that could shave time off of incremental builds that are really impacting software teams.
>
> That sounds like a good idea in general.
>
> Looking at your list at http://sailhome.cs.queensu.ca/~shane/content/qt_hotspots.txt I wonder a bit how e.g. xmlpatterns can be considered a hotspot that according to your definition changes often. The actual code has not been changed much for a while except for occasional merges and the annual bumps in copyright headers. My gut feeling is that any refactoring there is not worthwhile.

Thanks for your perspective! I’m certainly not an expert in Qt development and appreciate your perspective. This could perhaps be due to our use of the median number of changes as the threshold for rate of change. I’ll make the full dataset available when I return to my lab shortly, so we can explore together to find the most appropriate thresholds for Qt.

>> I’m happy to provide a more detailed Qt dataset when I return to my lab next week.
>
> It would be nice to see some reason (numbers) why files ended up on the list.

I’ll definitely include that data :-)

Kind regards,
-Shane
Re: [Development] Build Hotspots in the Qt build process
On Wed, 05 Feb 2014, at 14:17:20, Shane McIntosh wrote:
>> I just went over your list and have not been overly surprised with which files are causing long rebuild times: All of them are header files that are widely used in Qt and by Qt applications. E.g. almost the complete set of non-private headers in qtbase/src/corelib is listed.
>
> True. It really isn’t too surprising that the corelib header files trigger slow rebuilds. Maybe it will be more useful when we sort by hotness. Perhaps there are specific corelib headers that will pop out.

Well, qglobal.h and everything that that file includes are, of course, hot but there isn't much we can do. Those files are:

    qconfig.h \
    qfeatures.h \
    qprocessordetection.h \
    qglobal.h \
    qsystemdetection.h \
    qcompilerdetection.h \
    qtypeinfo.h \
    qsysinfo.h \
    qlogging.h \
    qflags.h \
    qtypetraits.h \
    qatomic.h \
    qbasicatomic.h \
    qatomic_bootstrap.h \
    qgenericatomic.h \
    qatomic_msvc.h \
    qatomic_armv7.h \
    qatomic_armv6.h \
    qatomic_armv5.h \
    qatomic_ia64.h \
    qatomic_mips.h \
    qatomic_x86.h \
    qatomic_cxx11.h \
    qatomic_gcc.h \
    qatomic_unix.h \
    qglobalstatic.h \
    qmutex.h \

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center
Re: [Development] Build Hotspots in the Qt build process
On Wed, 05 Feb 2014, at 11:58:50, Simon Hausmann wrote:
> Hi Shane,
>
> Interesting stuff :)
>
> How do you determine that a particular header file is the hot spot? (I didn't understand that part from the slide set)
>
> Is that header file commonly included in other files that result in long-building object files?

Hello Shane,

I'm with Simon, this is very interesting stuff!

I'm wondering if we could make use of your data to figure out what headers we should precompile. You said you extract a graph from the build: is there a way to parse that graph to produce a list of headers most commonly included in a given module?

We've had precompilation support in qmake for a decade, but apparently we use that in exactly 3 modules...

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center
[Development] Build Hotspots in the Qt build process
Hi Qt developers!

My name is Shane. I’m a PhD student at Queen’s University in Canada. I’ve been working on an approach for detecting build hotspots, i.e., files that not only take a long time to rebuild, but also change often. We think that these files are ideal candidates for refactoring that could shave time off of incremental builds that are really impacting software teams.

We came up with an approach that I presented last weekend at FOSDEM ( slides are available here: http://www.slideshare.net/shanemcintosh/identifying-hotspots-in-software-build-process ). One of the projects that we analyzed was Qt. I bumped into Tobias at FOSDEM and he suggested that I post the list of Qt hotspots here.

So, I’ve made the hotspot list available here: http://sailhome.cs.queensu.ca/~shane/content/qt_hotspots.txt

I’m happy to provide a more detailed Qt dataset when I return to my lab next week.

Kind regards,
-Shane

P.S.: We are conducting a survey on how build performance is impacting developers ( http://is.gd/DbMRTr ). If you could spare 5 minutes to fill out our survey, we’d really appreciate it!