Hi Christian, thanks for your answer. Your first question is interesting. Usually that is the natural reason why two files are changed together. We always expect some kind of structural connection between the classes (e.g. implements, extends, instantiation). However, we found many cases (issues) whose commits contain files that are not "structurally connected".
For example, JMSConduit.java and JMSOldConfigHolder.java are not structurally connected, despite being in the same package. We found that they changed together in 15 commits, but in 18 other commits only JMSConduit.java changed, without JMSOldConfigHolder.java. If you assume a "natural" reason, you can make 18 mistakes, or at least waste your time inspecting JMSOldConfigHolder.java 18 times. Our assumption is that this "real reason" can in fact be many different reasons. Because of this, relying only on structural dependencies does not work well in all situations and can mislead developers.

A simple scenario:
- You are working on an issue and have committed JMSConduit.java. What other files might you need to change to complete this issue?
- Based on the past issues/commits in which JMSConduit.java changed, we collect contextual information that describes the situations in which it changed with or without JMSOldConfigHolder.java, and then we can recommend whether you should inspect that file for changes. We collect this data for all possible combinations involving JMSConduit.java and other files of the system.
- What we are reporting is that in 86% of the cases in which we tested these combinations (you can check all of the combinations on the website), we correctly predicted when both files would change together in a specific issue/commit.

About the practical aspects (what can be done): a researcher from our research group interviewed newcomers, and they said that it is difficult to find the right files to change in their first contributions. Newcomers find it hard to complete issues/pull requests because they do not yet understand the code or the architecture well. Debugging tasks are also not trivial in all projects. In such cases newcomers could use our approach (we are building a tool) to receive recommendations while performing the task.
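To make the JMSConduit numbers above concrete, here is a minimal sketch of the counting behind them. It assumes each commit is represented simply as a set of changed file names. The function name `co_change_stats` and the synthetic history are illustrative only; this is not the actual tool, and it covers only the raw co-change counts, not the contextual information used for the prediction.

```python
def co_change_stats(commits, a, b):
    """Count how often file `a` changed together with, and without, file `b`.

    `commits` is an iterable of sets of file names (an assumed, simplified
    representation of the commit history).
    """
    together = sum(1 for files in commits if a in files and b in files)
    alone = sum(1 for files in commits if a in files and b not in files)
    total = together + alone
    # Fraction of the commits touching `a` that also touched `b`.
    ratio = together / total if total else 0.0
    return together, alone, ratio


# Synthetic history that only reproduces the counts quoted in this message.
commits = (
    [{"JMSConduit.java", "JMSOldConfigHolder.java"}] * 15
    + [{"JMSConduit.java"}] * 18
)

together, alone, ratio = co_change_stats(
    commits, "JMSConduit.java", "JMSOldConfigHolder.java"
)
print(together, alone, round(ratio, 2))  # prints: 15 18 0.45
```

In other words, always recommending JMSOldConfigHolder.java whenever JMSConduit.java changes would have been right in only about 45% of these 33 commits, which is why the raw co-change history alone is not enough.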
On the other hand, suppose you are a core member reviewing a pull request: we could give you a list of files to check, so you can see whether all of them are in the set of commits made for the issue/pull request. Of course, we are not claiming that you should stop using test cases or continuous integration. It is another tool to help during code review tasks. We are working on a prototype; we don't know yet whether we will build a "monitor" as a web service that you could integrate into the issue tracker, or a plugin for some IDE. So the main idea here is to avoid incomplete changes that could cause new bugs to appear, and to avoid wasting time inspecting files or debugging the system to find the files to change for an issue.
