(Back from vacation now. Thanks again for everyone's thoughts and suggestions.)
On Aug 9, 2011, at 7:34 PM, Jonathan Rochkind wrote:

> Just to play Simplicity Devil's Advocate, and admittedly not having followed
> this whole thread or your whole design:
>
> What if the model was nothing but two entities?
>
> Software
> Person/Group (Yes, used either for an individual or a group of any sort.)
>
> With a directed 'related' relationship between each entity, and reflexive
> (Software -> Person/Group; Software -> Software; Person/Group -> Software;
> Person/Group -> Person/Group).
>
> That 'related' relationship can be annotated with a relationship type from a
> controlled vocabulary, as well as free-entered user tags. The controlled
> vocabulary would include Person/Group *uses* Software; Person/Group
> *develops* Software; Software *component of* Software; Person/Group *member
> of* Person/Group.
>
> People could enter 'tags' on the relationship for anything else they wanted.
> You could develop the controlled vocabulary further organically as you get
> more data and see what's actually needed -- and what people free-tag, if they
> do so.
>
> Additional attributes are likely needed on Software; probably not too many
> more on Person/Group. But to encourage crowd sourcing, you can enter a
> Software record without filling out all of those attributes (it's as easy as
> filling out a simple form) and, if you want, make a couple of relationships
> to other Software or Person/Group entries; or those can be made later by
> other people, if it catches on and people actually edit this.
>
> Things like URLs to software (or people!) home pages can really just be
> entered in a big free-text field -- using wiki syntax, or better yet,
> Markdown.
>
> I think if the success of the project depends on volunteer crowd sourcing,
> you've got to keep things simple and make it as easy as possible to enter
> data in seconds. Really, even without the entering, keeping it simple will
> lead you to a simple interface, which will be usable and more likely to
> catch on.

Interesting model.
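To make sure I'm reading the proposal right, here is a rough sketch of the two-entity model in Python. All of the names here (Entity, Relation, the vocabulary strings) are mine and purely illustrative; they don't come from any existing codebase.

```python
# Sketch of the two-entity model: Software and Person/Group nodes joined
# by directed relationships, each typed from a controlled vocabulary and
# annotated with free-entered user tags. Names are illustrative only.
from dataclasses import dataclass, field

# Controlled vocabulary for relationship types; it could grow organically
# as free tags reveal what people actually need.
VOCABULARY = {"uses", "develops", "component of", "member of"}

@dataclass(frozen=True)
class Entity:
    kind: str   # "software" or "person/group"
    name: str

@dataclass
class Relation:
    source: Entity
    target: Entity
    rel_type: str                             # from VOCABULARY
    tags: list = field(default_factory=list)  # free-entered user tags

    def __post_init__(self):
        # Enforce the controlled vocabulary; anything else goes in tags.
        if self.rel_type not in VOCABULARY:
            raise ValueError(f"unknown relationship type: {self.rel_type}")

# Example: a person develops a piece of software.
alice = Entity("person/group", "Alice")
solr = Entity("software", "Solr")
rel = Relation(alice, solr, "develops", tags=["search", "java"])
```

Written out this way, my consistency worry below becomes concrete: everything outside `rel_type` lives in uncontrolled tags, so two users can describe the same relationship in incompatible ways.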
I'd like to think this through a little more. My first thoughts are that while it might make the user interface and the data model simpler, it would weaken enforcement of consistency in the data itself, which could result in a hodgepodge of data that is difficult to page through. Simplicity in data entry might be bought at the cost of simplicity in search/browse.

On Aug 9, 2011, at 7:50 PM, stuart yeates wrote:

> You may also be interested in the (older?) work at
> http://projects.apache.org/ and http://trac.usefulinc.com/doap For example:
>
> http://projects.apache.org/projects/xindice.html /
> http://svn.apache.org/repos/asf/xml/xindice/trunk/doap_Xindice.rdf
>
> Interoperability with RDF/DOAP lets you build on others' work and lets
> others in turn pick over your work.
>
> At the very least it allows you to suck in the latest and greatest
> releases automatically.

Ah, yes! That is the sort of linked-data interoperability I was thinking would be possible. Thanks for the pointers to those efforts.

On Aug 9, 2011, at 8:23 PM, Matt Jones wrote:

> On Tue, Aug 9, 2011 at 3:50 PM, stuart yeates <stuart.yea...@vuw.ac.nz> wrote:
>> ...
>> Ohloh is great. However, it relies almost completely on metrics which are
>> easily gamed by the technically competent. Use of these kinds of metrics in
>> ways which encourage gaming will only be productive in the short term,
>> perhaps the very short term.
>>
>> For example: it's easy to set up dummy version control accounts, and there
>> can be good technical reasons for doing so. It's easy to set up a build/test
>> suite to update a file in version control after its daily run, and there
>> can be good technical reasons for doing so. But doing these things can also
>> transform a very-low-activity single-user project into a high-activity
>> dual-user project, in the eyes of Ohloh.
>>
>> Turning on template-derived comments in the next big migration handles the
>> "is the code commented?" metric.
>> The more metrics are used, the more motivation there is to use tools (which
>> admittedly have other motivations) which make a project look good.
>
> I agree the Ohloh metrics are easily gamed. What metrics do you recommend
> that can't be gamed but still provide a synopsis of the project for
> evaluation, comparison, and selection? I think there is some utility even
> though they can be gamed. The metrics are not a substitute for critical
> evaluation, but provide a nice synopsis as a jumping-off point. For
> example, if I am interested in projects that have a demonstrable lifespan
> of more than 5 years, and that have had more than 10 developers contribute,
> I can find that via these metrics. I can then assess for myself whether any
> of the resulting projects are false positives (e.g., the commit log will
> give some idea of the types of commits made by each person).
>
> If you're concerned about the system being gamed via metrics, then you
> should also be concerned about user-submitted project descriptions.
> Projects have a tendency to over-generalize what their software does,
> under-report defects, and generally paint a rosy picture. Will there be
> some sort of quality control/editing/verification of the claims made by
> submitters? Will it matter if some of the projects are described more
> generously than in reality? Won't the system still be useful even if they
> are?

I'm interested to hear more about what others think would be good metrics. I agree with Matt that they serve as a useful rough sorting mechanism (perhaps as a way to cull projects which clearly have no active community, or at least not one that is actively gaming the metrics -- but even gaming shows some activity, doesn't it?).

Peter

--
Peter Murray peter.mur...@lyrasis.org tel:+1-678-235-2955
Ass't Director, Technology Services Development http://dltj.org/about/
LYRASIS -- Great Libraries. Strong Communities. Innovative Answers.
The Disruptive Library Technology Jester http://dltj.org/
Attrib-Noncomm-Share http://creativecommons.org/licenses/by-nc-sa/2.5/