Re: duplicate class names in ASF Java projects
Ralph,

Thanks for the explanation. Is there a strategy for projects that have both log4j and the log4j2 bridge brought in transitively? Hunting down class name conflicts in a multi-layered dependency tree was one of the reasons I got interested in the subject, and I haven't found a satisfying solution to it yet.

Paul

On 3/21/14, 10:24 AM, Ralph Goers wrote:
> In the case of logging-log4j2 the package and class names are duplicated with
> log4j to provide a bridge so that code does not need to be rewritten to
> upgrade. However, if you look at the line counts you will see that they are
> not the same as the classes are very different.
>
> Ralph

-
To unsubscribe, e-mail: community-unsubscr...@apache.org
For additional commands, e-mail: community-h...@apache.org
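One way to approach the hunt described above, offered as a minimal sketch rather than a full answer: list the class entries of every jar on the resolved classpath and report any fully qualified name that appears in more than one jar. The names used here (`DuplicateClassFinder`, `classNamesIn`, `toClassName`, `findDuplicates`) are hypothetical, and the sketch only detects collisions; it does not decide which copy should win at runtime.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

final class DuplicateClassFinder {

    /** Converts a jar entry such as "org/apache/log4j/Logger.class" to a class name. */
    public static String toClassName(String entryName) {
        String withoutSuffix = entryName.substring(0, entryName.length() - ".class".length());
        return withoutSuffix.replace('/', '.');
    }

    /** Lists the fully qualified names of all .class entries in one jar. */
    public static List<String> classNamesIn(String jarPath) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String entry = entries.nextElement().getName();
                if (entry.endsWith(".class")) {
                    names.add(toClassName(entry));
                }
            }
        }
        return names;
    }

    /** Returns class name -> jars containing it, keeping only names found in 2+ jars. */
    public static Map<String, Set<String>> findDuplicates(Map<String, List<String>> classesByJar) {
        Map<String, Set<String>> owners = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : classesByJar.entrySet()) {
            for (String className : e.getValue()) {
                owners.computeIfAbsent(className, k -> new TreeSet<>()).add(e.getKey());
            }
        }
        // Drop every class that lives in exactly one jar.
        owners.values().removeIf(jars -> jars.size() < 2);
        return owners;
    }

    /** Usage: pass the jar paths of the resolved classpath as arguments. */
    public static void main(String[] args) throws IOException {
        Map<String, List<String>> classesByJar = new TreeMap<>();
        for (String jarPath : args) {
            classesByJar.put(jarPath, classNamesIn(jarPath));
        }
        findDuplicates(classesByJar)
                .forEach((className, jars) -> System.out.println(className + " -> " + jars));
    }
}
```

The jar paths can come from whatever resolves the dependency tree, for example the output of Maven's `dependency:build-classpath` goal. Entries such as `module-info.class` will show up in many jars and probably need to be ignored.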
Re: duplicate class names in ASF Java projects
On 3/21/14, 8:58 AM, sebb wrote:
> Note that sanselan was renamed as commons imaging. However the package
> names were also changed so I'm not sure why they are shown as
> duplicates.
>
> sanselan: org.apache.sanselan
> imaging: org.apache.commons.imaging
>
> Perhaps the information has been derived from SVN rather than the
> published releases. In which case I suspect there are a lot of false
> positives. Not all SVN (or Git) source code is part of a release, and
> source code may go through various name changes.

It looks like the rename was committed to sanselan in the source code repo before the project was decommissioned. Glad the rename didn't make it to a release jar. Thanks for the explanation.

Paul
Re: duplicate class names in ASF Java projects
You may want to filter out small files, or common file name conventions: e.g.
https://github.com/apache/accumulo/blob/trunk/maven-plugin/src/it/plugin-test/postbuild.groovy
and
https://github.com/apache/maven-plugins/blob/trunk/maven-invoker-plugin/src/it/script-additional-vars/src/it/groovy/postbuild.groovy
are not the same, but probably were both built from the same example template.

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii

On Fri, Mar 21, 2014 at 12:49 AM, Pawel Slusarz wrote:
> Greetings,
>
> When looking at the Apache SF Java projects as a group, I noticed that a
> large number of projects have duplicate class names, e.g.
> both openejb and tomee have a class named
> jug.client.command.api.AbstractCommand
>
> When edge cases, e.g. test.Foo and tomcat55, tomcat60, tomcat70 get
> eliminated, it still appears that the practice of code sharing by
> drag-drop-modify is quite prevalent. Over 14,000 (out of 165,000)
> classes were shared that way in the ecosystem, and 103 projects (out of
> 300) are affected.
>
> Sometimes a measurement and visualization is all it takes to realize a
> problem and begin fixing it. Below is raw data that can help in
> understanding what is happening and how:
>
> http://pslusarz.github.io/archeology3d/research/apache/conflicting-classes/index.html
>
> Hope this is the right place to engage in this sort of conversation.
>
> Paul Slusarz
>
> PS: Who am I and what's my agenda? I am interested in looking at large
> codebases in search of patterns. I picked Apache SF, because, unlike my
> company code, the data can be independently verified. The issue with
> conflicting class names became apparent as I was trying to identify and
> understand classes that are shared in this ecosystem. Some more
> background on this approach can be found on my blog:
> http://10kftcode.blogspot.com/
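The filtering suggested at the top of this message could look roughly like the sketch below. Everything specific in it is an assumption for illustration: the class name `DuplicateCandidateFilter`, the 200-byte threshold, and the list of conventional file names are not from the thread and would need tuning against the real data.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class DuplicateCandidateFilter {

    // Assumed cut-off: files smaller than this are treated as trivial stubs.
    static final long MIN_BYTES = 200;

    // Assumed examples of names that recur by convention rather than by copying.
    static final Set<String> CONVENTIONAL_NAMES = new HashSet<>(
            Arrays.asList("postbuild.groovy", "prebuild.groovy", "package-info.java"));

    /** Returns true if a file with a duplicated name is worth reporting as a conflict. */
    public static boolean isWorthReporting(String path, long sizeInBytes) {
        // lastIndexOf returns -1 when there is no slash, so substring(0) keeps the whole path.
        String fileName = path.substring(path.lastIndexOf('/') + 1);
        return sizeInBytes >= MIN_BYTES && !CONVENTIONAL_NAMES.contains(fileName);
    }
}
```

Matching on the file name alone is deliberately crude; it removes template-derived scripts like the two `postbuild.groovy` files above but says nothing about whether the remaining duplicates were actually copied.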
Re: duplicate class names in ASF Java projects
In the case of logging-log4j2 the package and class names are duplicated with log4j to provide a bridge so that code does not need to be rewritten to upgrade. However, if you look at the line counts you will see that they are not the same as the classes are very different.

Ralph

On Mar 20, 2014, at 9:49 PM, Pawel Slusarz wrote:
> Greetings,
>
> When looking at the Apache SF Java projects as a group, I noticed that a
> large number of projects have duplicate class names, e.g.
> both openejb and tomee have a class named
> jug.client.command.api.AbstractCommand
>
> When edge cases, e.g. test.Foo and tomcat55, tomcat60, tomcat70 get
> eliminated, it still appears that the practice of code sharing by
> drag-drop-modify is quite prevalent. Over 14,000 (out of 165,000)
> classes were shared that way in the ecosystem, and 103 projects (out of
> 300) are affected.
>
> Sometimes a measurement and visualization is all it takes to realize a
> problem and begin fixing it. Below is raw data that can help in
> understanding what is happening and how:
>
> http://pslusarz.github.io/archeology3d/research/apache/conflicting-classes/index.html
>
> Hope this is the right place to engage in this sort of conversation.
>
> Paul Slusarz
>
> PS: Who am I and what's my agenda? I am interested in looking at large
> codebases in search of patterns. I picked Apache SF, because, unlike my
> company code, the data can be independently verified. The issue with
> conflicting class names became apparent as I was trying to identify and
> understand classes that are shared in this ecosystem. Some more
> background on this approach can be found on my blog:
> http://10kftcode.blogspot.com/
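The distinction drawn in this message, same name but different content, can be checked mechanically by fingerprinting the file contents instead of comparing line counts. The sketch below is one hedged way to do that with SHA-256; `ContentFingerprint`, `sha256Hex`, and `sameContent` are hypothetical names. It only recognises byte-for-byte copies, so a class that was copied and then modified will look just as different as an unrelated bridge class.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class ContentFingerprint {

    /** Returns the SHA-256 digest of the content as 64 lowercase hex characters. */
    public static String sha256Hex(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    /** True when two same-named files are verbatim copies of each other. */
    public static boolean sameContent(byte[] first, byte[] second) {
        return sha256Hex(first).equals(sha256Hex(second));
    }
}
```

Storing the hex digest next to each class name would let identical copies be grouped across many projects without comparing every pair of files.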
Re: duplicate class names in ASF Java projects
On 21 March 2014 04:49, Pawel Slusarz wrote:
> Greetings,
>
> When looking at the Apache SF Java projects as a group, I noticed that a
> large number of projects have duplicate class names, e.g.
> both openejb and tomee have a class named
> jug.client.command.api.AbstractCommand
>
> When edge cases, e.g. test.Foo and tomcat55, tomcat60, tomcat70 get
> eliminated, it still appears that the practice of code sharing by
> drag-drop-modify is quite prevalent. Over 14,000 (out of 165,000)
> classes were shared that way in the ecosystem, and 103 projects (out of
> 300) are affected.

Note that sanselan was renamed as commons imaging. However the package names were also changed so I'm not sure why they are shown as duplicates.

sanselan: org.apache.sanselan
imaging: org.apache.commons.imaging

Perhaps the information has been derived from SVN rather than the published releases. In which case I suspect there are a lot of false positives. Not all SVN (or Git) source code is part of a release, and source code may go through various name changes.

> Sometimes a measurement and visualization is all it takes to realize a
> problem and begin fixing it. Below is raw data that can help in
> understanding what is happening and how:
>
> http://pslusarz.github.io/archeology3d/research/apache/conflicting-classes/index.html
>
> Hope this is the right place to engage in this sort of conversation.
>
> Paul Slusarz
>
> PS: Who am I and what's my agenda? I am interested in looking at large
> codebases in search of patterns. I picked Apache SF, because, unlike my
> company code, the data can be independently verified. The issue with
> conflicting class names became apparent as I was trying to identify and
> understand classes that are shared in this ecosystem. Some more
> background on this approach can be found on my blog:
> http://10kftcode.blogspot.com/