Yoram Arnon wrote:
> Would it make sense to create separate components in Jira for the
> applications that run on top of Hadoop, augmenting dfs, mapred, io etc.?
> I'm referring to streaming and distcp for starters, perhaps dfsshell too, and
> maybe other components in the future.
> It will facilitate tracking issues that relate to these components, which are
> somewhat external to the core of Hadoop.

We already have a Jira category for streaming (contrib/streaming).
All categories (except "documentation") correspond to Java packages. So
CopyFiles issues should be in the "util" category, until we decide to
move CopyFiles to another package (fs?). DfsShell is in the dfs
package, so its bugs should go in the dfs component. Perhaps dfs should
be broken into sub-packages.
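
To illustrate the package-to-category rule, here is a rough sketch. The class JiraComponent and its method forClass are hypothetical names made up for this example, and the fully qualified class names are assumed from the package names mentioned above; nothing like this exists in the tree.

```java
public class JiraComponent {
    // Root package of the project; everything below it is a candidate category.
    private static final String ROOT = "org.apache.hadoop.";

    /** Returns the first package segment below the root, e.g. "util". */
    public static String forClass(String className) {
        if (!className.startsWith(ROOT)) {
            throw new IllegalArgumentException("not a Hadoop class: " + className);
        }
        String rest = className.substring(ROOT.length());
        int dot = rest.indexOf('.');
        if (dot < 0) {
            // The class sits directly in the root package, so no category applies.
            throw new IllegalArgumentException("no sub-package: " + className);
        }
        return rest.substring(0, dot);
    }

    public static void main(String[] args) {
        // Assumed class locations, following the packages named in this thread.
        System.out.println(forClass("org.apache.hadoop.util.CopyFiles"));
        System.out.println(forClass("org.apache.hadoop.dfs.DfsShell"));
    }
}
```

Under this rule, moving CopyFiles to the fs package would move its issues to the fs category without any further bookkeeping.
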
I think this is a good approach, as it should help to encourage us to
keep a well-structured package tree. If something deserves a separate
issue category, then it probably deserves a separate package. Util is
the most abused package. I think it should only have stuff that's
analogous to the stuff in java.util--generic data structures, etc. But
it currently has a mish-mash of stuff, including some fs stuff (like
CopyFiles and DiskChecker).
Javadoc also facilitates per-package documentation, by placing a
package.html in the source tree. Thus a well-organized package tree
also facilitates well-organized documentation.
Doug