Looking at the Apache SF Java projects as a group, I noticed that a
large number of projects contain duplicate class names; for example,
both openejb and tomee have a class named

Even when edge cases such as test.Foo and the tomcat55, tomcat60 and
tomcat70 variants are eliminated, the practice of sharing code by
drag-drop-modify still appears to be quite prevalent. Over 14,000
classes (out of 165,000) were shared that way in the ecosystem, and 103
projects (out of 300) are affected.
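
For anyone who wants to reproduce this kind of count, here is a minimal
sketch of one way to do it. It is not the tooling behind the numbers
above. The function names, the "test." filter, the rule that strips
trailing digits from project names (so tomcat55 and tomcat60 count as
one family), and the sample data are all assumptions made up for
illustration.

```python
import re
from collections import defaultdict


def project_family(project):
    """Collapse versioned variants (tomcat55, tomcat60) into one family.

    Assumed rule: strip trailing digits from the project name.
    """
    return re.sub(r"\d+$", "", project)


def is_edge_case(class_name):
    """Assumed filter: ignore throwaway names in a top-level 'test' package."""
    return class_name.startswith("test.")


def shared_classes(classes_by_project):
    """Return the class names defined in more than one project family.

    Maps each fully qualified class name to the set of families defining it.
    """
    families = defaultdict(set)
    for project, class_names in classes_by_project.items():
        for name in class_names:
            if not is_edge_case(name):
                families[name].add(project_family(project))
    return {name: fams for name, fams in families.items() if len(fams) > 1}


# Hypothetical sample input: project name -> fully qualified class names.
sample = {
    "alpha": ["org.example.Shared", "test.Foo"],
    "beta": ["org.example.Shared", "test.Foo"],
    "tomcat55": ["org.example.Only"],
    "tomcat60": ["org.example.Only"],
}
result = shared_classes(sample)
```

With this made-up input, only org.example.Shared is reported: test.Foo
is filtered out as an edge case, and the two tomcat variants collapse
into a single family.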

Sometimes a measurement and a visualization are all it takes to
recognize a problem and begin fixing it. Below is raw data that can help
in understanding what is happening and how:

Hope this is the right place to engage in this sort of conversation.

Paul Slusarz

PS: Who am I, and what's my agenda? I am interested in looking at large
codebases in search of patterns. I picked Apache SF because, unlike my
company's code, the data can be independently verified. The issue with
conflicting class names became apparent as I was trying to identify and
understand classes that are shared in this ecosystem. More background
on this approach can be found on my blog:
