Re: Tool to generate disclaimer, NOTICE, etc. files

2014-02-17 Thread sebgoa

 -- Forwarded message --
 From: Marvin Humphrey mar...@rectangular.com
 Date: Feb 10, 2014 3:49 PM
 Subject: Re: Tool to generate disclaimer, NOTICE, etc. files
 To: general@incubator.apache.org general@incubator.apache.org
 Cc: 
 
 On Sat, Feb 8, 2014 at 8:49 AM, Alan Cabrera l...@toolazydogs.com wrote:
  Do you think it would be helpful if we had a tool that generated these
  files?  It could work like a command line wizard that prompts the person for
  licensing information and then generates a valid disclaimer, notice, etc.
  files.
 
  If this is a good idea, what files should we generate?  Currently, all I can
  think of is disclaimer and notice.
 

Chiming in here (without much context, so it may not be what you are looking 
for): in CloudStack we have a CHANGES file with a list of bugs fixed.
While we can argue about whether this is needed, I created a quick script to 
pull JIRA information from a filter:

https://git-wip-us.apache.org/repos/asf?p=cloudstack-docs-rn.git;a=blob_plain;f=utils/jira.py;hb=refs/heads/master

It then automatically populates our doc:

http://apache-cloudstack-release-notes.readthedocs.org/en/latest/about.html#issues-fixed-in-4-3-0


  Maybe it could add the info into the project's DOAP file.  If we worked out
  the kinks then we could create sbt/gradle/mvn plugins to read the DOAP file
  and insert these files into the correct places in the distributions.  Apache
  RAT could also use this info as well.
 
  WDYT?
 
 I concur with sebb on the extreme challenges of auto-generating NOTICE --
 which is a shame because I'd really love it if the Incubator could benefit
 when you're inspired to write tooling.
 
 One thing I think we could use is a tool which parses mbox archives in
 people.apache.org:/home/apmail/ and generates statistics:
 
 *   Total emails sent per list (a measure of activity)
 *   Unique addresses participating (a measure of diversity)
 *   Emails sent by Mentors (a measure of Mentor engagement)
 

I have been doing this for CloudStack: I download all the mbox files at the end 
of the month, parse them, and stick them in MongoDB.
I then run a couple of queries. You can see examples at:

http://sebgoa.blogspot.ch/2013/05/update-on-apache-cloudstack-community.html

The code is a bit of a mess, so I did not put it on GitHub. However, I did put 
up asf-mail-spider:

https://github.com/runseb/asf-mail-spider

It crawls the Apache mail archive site and collects the names of all the ASF 
mailing lists, then goes into each one of them and extracts the number of 
messages each month.
It then plots a very messy graph.

You could easily focus on the Incubator projects and plot that same graph, 
which would give you the total emails sent per list.
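
A minimal sketch of the per-month counting step, assuming the Date headers
have already been pulled out of an archive. This is not the asf-mail-spider
code; the function name messages_per_month is made up for illustration, and
only the Python standard library is used.

```python
from collections import Counter
from email.utils import parsedate_to_datetime


def messages_per_month(date_headers):
    """Count messages per calendar month from raw Date header strings."""
    counts = Counter()
    for raw in date_headers:
        try:
            when = parsedate_to_datetime(raw)
        except (TypeError, ValueError):
            # Unparseable Date header: skip it rather than guess a month.
            continue
        counts[when.strftime("%Y-%m")] += 1
    # Sort by month key so the result is ready to plot in order.
    return dict(sorted(counts.items()))
```

Feeding it the Date headers of one list gives a month-to-count mapping
that can be plotted per list.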


 We expend a lot of volunteer energy on shepherding each month, and I think
 such an automated tool could help to free up some of that energy for other
 tasks.  There are two main functions for shepherding:
 
 1.  Alert the IPMC to podlings which have gone adrift.
 2.  Provide an outsider view.
 
 (I'd add a #3: being exposed to new communities benefits the shepherd, but
 that varies by individual.)
 
 I think that purpose #1 could largely be accomplished using email stats.
 
 I don't know if this is something you'd feel motivated to work on, but I
 thought it was worth mentioning because your original proposal would save time
 and energy and so would this.
 
 Marvin Humphrey
 
 -
 To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
 For additional commands, e-mail: general-h...@incubator.apache.org
 





Re: Tool to generate disclaimer, NOTICE, etc. files

2014-02-11 Thread Alan Cabrera
The goal that I’m hoping to realistically attain is to have tooling 
automatically generate these files, thus alleviating a lot of scut work done by 
podlings and mentors.  Maybe not a realistic goal but a pretty good dream.  :)

On to the details:

On Feb 8, 2014, at 7:42 PM, sebb seb...@gmail.com wrote:

 On 8 February 2014 16:49, Alan Cabrera l...@toolazydogs.com wrote:
 Do you think it would be helpful if we had a tool that generated these 
 files?  It could work like a command line wizard that prompts the person for 
 licensing information and then generates a valid disclaimer, notice, etc. 
 files.
 
 AIUI the disclaimer file is the same for every project - only the
 project name changes.
 Seems unnecessary to automate this, though it should be trivial to implement.

Agreed, though it would be handy for podlings that are starting out, imo.

 The NOTICE file is much harder to automate, as it depends on knowing
 what is actually going to be shipped and reading and interpreting all
 the relevant licenses.
 However it might be possible to create a sort of expert system that
 asked the right questions and guided the user to create the NOTICE
 file.

That’s my thinking.  We would have relevant metadata on 3rd party projects

 If this is a good idea, what files should we generate?  Currently, all I can 
 think of is disclaimer and notice.
 
 Maybe it could add the info into the project's DOAP file.  If we worked out 
 the kinks then we could create sbt/gradle/mvn plugins to read the DOAP file 
 and insert these files into the correct places in the distributions.  Apache 
 RAT could also use this info as well.
 
 The DOAP could certainly be used to create the DISCLAIMER.
 
 It seems wrong to include any dependency information in the DOAP.
 
 Dependencies must be present somewhere in the build scripts, however
 even in Maven (which has very structured info) it's not at all easy to
 determine which dependencies are actually included in the release
 artifacts (and remember that source and binary artifacts may need
 different NOTICE files)
 
 Note also that some source files may require attribution in the NOTICE file.
 These won't be documented in any build system; the info has to be
 added to the NOTICE file manually when the code is added to SCM.
 
 WDYT?
 
 Non-trivial; maintaining the meta-data needed to accurately generate
 the NOTICE file is likely to require more effort than writing the
 NOTICE file itself.
 
 
 Regards,
 Alan
 
 



Re: Tool to generate disclaimer, NOTICE, etc. files

2014-02-11 Thread Alan Cabrera
Whoops, accidentally prematurely sent my reply.

The goal that I’m hoping to realistically attain is to have tooling 
automatically generate these files, thus alleviating a lot of scut work done by 
podlings and mentors. Maybe not a realistic goal but a pretty good dream.  :)

On to the details:

On Feb 8, 2014, at 7:42 PM, sebb seb...@gmail.com wrote:

 On 8 February 2014 16:49, Alan Cabrera l...@toolazydogs.com wrote:
 Do you think it would be helpful if we had a tool that generated these 
 files?  It could work like a command line wizard that prompts the person for 
 licensing information and then generates a valid disclaimer, notice, etc. 
 files.
 
 AIUI the disclaimer file is the same for every project - only the
 project name changes.
 Seems unnecessary to automate this, though it should be trivial to implement.

Agreed, though it would be handy for podlings that are starting out, imo.

 The NOTICE file is much harder to automate, as it depends on knowing
 what is actually going to be shipped and reading and interpreting all
 the relevant licenses.
 However it might be possible to create a sort of expert system that
 asked the right questions and guided the user to create the NOTICE
 file.

That’s my thinking.  The tool would ask various questions about their project.  
Maybe even introspect maven POMs, gradle files, etc. to collect information.  
Then generate appropriate NOTICE files.  

I think that the real work would be creating and maintaining a database of 
licenses for 3rd party products.  This could be incrementally created by the 
above tool.

 If this is a good idea, what files should we generate?  Currently, all I can 
 think of is disclaimer and notice.
 
 Maybe it could add the info into the project's DOAP file.  If we worked out 
 the kinks then we could create sbt/gradle/mvn plugins to read the DOAP file 
 and insert these files into the correct places in the distributions.  Apache 
 RAT could also use this info as well.
 
 The DOAP could certainly be used to create the DISCLAIMER.
 
 It seems wrong to include any dependency information in the DOAP.
 
 Dependencies must be present somewhere in the build scripts, however
 even in Maven (which has very structured info) it's not at all easy to
 determine which dependencies are actually included in the release
 artifacts (and remember that source and binary artifacts may need
 different NOTICE files)

All the more reason to have tooling take care of this.

 Note also that some source files may require attribution in the NOTICE file.
 These won't be documented in any build system; the info has to be
 added to the NOTICE file manually when the code is added to SCM.
 
 WDYT?
 
 Non-trivial; maintaining the meta-data needed to accurately generate
 the NOTICE file is likely to require more effort than writing the
 NOTICE file itself

Yeah but that effort is amortized over many projects and release reviewers.


Regards,
Alan



Re: Tool to generate disclaimer, NOTICE, etc. files

2014-02-11 Thread Alan Cabrera

On Feb 8, 2014, at 10:47 PM, Ted Dunning ted.dunn...@gmail.com wrote:

 On Sat, Feb 8, 2014 at 7:42 PM, sebb seb...@gmail.com wrote:
 
 WDYT?
 
 Non-trivial; maintaining the meta-data needed to accurately generate
 the NOTICE file is likely to require more effort than writing the
 NOTICE file itself.
 
 
 
 Actually, this sounds like a case where the effort for any individual
 project outweighs the benefit of saving the time, but the total effort
 across many projects would make the effort somewhat worth while.  In
 particular, something that enhances the communication between projects and
 between the author of the dependency and the users of that dependency would
 be very helpful.
 
 As such, this sounds like a great thing to add to maven.  If it provided me
 with a way to look at all my dependencies and examine my notices file for
 missing acknowledgments, that would be helpful.  Moreover, if there were a
 way for the community of users to upload sample acknowledgements and for
 the owners of such packages to reject or accept these acknowledgments, this
 would crowdsource the effort of approval.

Maven and other build tools.  If we had a canonical database then people would 
be free to create any kind of tooling they wished.

 My guess is that since getting NOTIFICATIONS right is a largely one shot
 deal, it wouldn't be enough appeal to encourage somebody to actually
 implement the system, but it would certainly make one's life easier in
 small installments.

That’s the predicament of a tool smith.


Regards,
Alan





Re: Tool to generate disclaimer, NOTICE, etc. files

2014-02-11 Thread Alan Cabrera

On Feb 10, 2014, at 12:48 PM, Marvin Humphrey mar...@rectangular.com wrote:

 On Sat, Feb 8, 2014 at 8:49 AM, Alan Cabrera l...@toolazydogs.com wrote:
 Do you think it would be helpful if we had a tool that generated these
 files?  It could work like a command line wizard that prompts the person for
 licensing information and then generates a valid disclaimer, notice, etc.
 files.
 
 If this is a good idea, what files should we generate?  Currently, all I can
 think of is disclaimer and notice.
 
 Maybe it could add the info into the project's DOAP file.  If we worked out
 the kinks then we could create sbt/gradle/mvn plugins to read the DOAP file
 and insert these files into the correct places in the distributions.  Apache
 RAT could also use this info as well.
 
 WDYT?
 
 I concur with sebb on the extreme challenges of auto-generating NOTICE --
 which is a shame because I'd really love it if the Incubator could benefit
 when you're inspired to write tooling.

Yeah, when we were sorting things out in a podling the idea hit me in the head.

 One thing I think we could use is a tool which parses mbox archives in
 people.apache.org:/home/apmail/ and generates statistics:
 
 *   Total emails sent per list (a measure of activity)
 *   Unique addresses participating (a measure of diversity)
 *   Emails sent by Mentors (a measure of Mentor engagement)

Yeah, I’m working on that as we speak.  I have a project called Panopticon in 
Apache Labs.  I’m currently working on mailing list moderator tools that allow 
moderation at the command line.  Go check it out! 

 We expend a lot of volunteer energy on shepherding each month, and I think
 such an automated tool could help to free up some of that energy for other
 tasks.  There are two main functions for shepherding:
 
 1.  Alert the IPMC to podlings which have gone adrift.
 2.  Provide an outsider view.
 
 (I'd add a #3: being exposed to new communities benefits the shepherd, but
 that varies by individual.)
 
 I think that purpose #1 could largely be accomplished using email stats.
 
 I don't know if this is something you'd feel motivated to work on, but I
 thought it was worth mentioning because your original proposal would save time
 and energy and so would this.
 
 Marvin Humphrey
 



Re: Tool to generate disclaimer, NOTICE, etc. files

2014-02-11 Thread Marvin Humphrey
On Tue, Feb 11, 2014 at 7:42 AM, Alan Cabrera l...@toolazydogs.com wrote:

 One thing I think we could use is a tool which parses mbox archives in
 people.apache.org:/home/apmail/ and generates statistics:

 *   Total emails sent per list (a measure of activity)
 *   Unique addresses participating (a measure of diversity)
 *   Emails sent by Mentors (a measure of Mentor engagement)

 Yeah, I'm working on that as we speak.  I have a project called Panopticon
 in Apache Labs.  I'm currently working on mailing list moderator tools that
 allow moderation at the command line.  Go check it out!

What would be ideal is if Panopticon could generate these stats and then the
Incubator's clutch2report.py script could embed them in the report wiki
template for each month.  I see that Panopticon is envisioned as a web UI, but
for our purposes it would probably need to run on a cron on people.apache.org
and cache the stats, because reading all those mboxes on the fly each request
would be too expensive.
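
A rough sketch of the caching idea, assuming a cron job calls it per list.
The names cached_stats, cache_path and compute are hypothetical and not part
of Panopticon or clutch2report.py; the stats are recomputed only when the
mbox file's modification time has changed.

```python
import json
import os


def cached_stats(mbox_path, cache_path, compute):
    """Return stats for mbox_path, recomputing only if the file changed.

    compute is any callable taking the mbox path and returning a
    JSON-serialisable dict (an assumption of this sketch).
    """
    mtime = os.path.getmtime(mbox_path)
    try:
        with open(cache_path, encoding="utf-8") as fh:
            cached = json.load(fh)
        if cached.get("mtime") == mtime:
            return cached["stats"]
    except (OSError, ValueError, KeyError, AttributeError):
        pass  # missing or corrupt cache: fall through and recompute

    stats = compute(mbox_path)
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump({"mtime": mtime, "stats": stats}, fh)
    # Atomic rename so a reader never sees a half-written cache file.
    os.replace(tmp_path, cache_path)
    return stats
```

The report script would then read only the small JSON cache files and never
touch the mboxes directly.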

Is that feature request compatible with your work?

Marvin Humphrey




Re: Tool to generate disclaimer, NOTICE, etc. files

2014-02-11 Thread Alan Cabrera

On Feb 11, 2014, at 9:01 AM, Marvin Humphrey mar...@rectangular.com wrote:

 On Tue, Feb 11, 2014 at 7:42 AM, Alan Cabrera l...@toolazydogs.com wrote:
 
 One thing I think we could use is a tool which parses mbox archives in
 people.apache.org:/home/apmail/ and generates statistics:
 
 *   Total emails sent per list (a measure of activity)
 *   Unique addresses participating (a measure of diversity)
 *   Emails sent by Mentors (a measure of Mentor engagement)
 
 Yeah, I'm working on that as we speak.  I have a project called Panopticon
 in Apache Labs.  I'm currently working on mailing list moderator tools that
 allow moderation at the command line.  Go check it out!
 
 What would be ideal is if Panopticon could generate these stats and then the
 Incubator's clutch2report.py script could embed them in the report wiki
 template for each month.  I see that Panopticon is envisioned as a web UI, but
 for our purposes it would probably need to run on a cron on people.apache.org
 and cache the stats, because reading all those mboxes on the fly each request
 would be too expensive.
 
 Is that feature request compatible with your work?


A lot of data would have to be collected incrementally in the background.

All data that Panopticon collects will be available via a JSON REST API.


Regards,
Alan





Re: Tool to generate disclaimer, NOTICE, etc. files

2014-02-10 Thread Marvin Humphrey
On Sat, Feb 8, 2014 at 8:49 AM, Alan Cabrera l...@toolazydogs.com wrote:
 Do you think it would be helpful if we had a tool that generated these
 files?  It could work like a command line wizard that prompts the person for
 licensing information and then generates a valid disclaimer, notice, etc.
 files.

 If this is a good idea, what files should we generate?  Currently, all I can
 think of is disclaimer and notice.

 Maybe it could add the info into the project's DOAP file.  If we worked out
 the kinks then we could create sbt/gradle/mvn plugins to read the DOAP file
 and insert these files into the correct places in the distributions.  Apache
 RAT could also use this info as well.

 WDYT?

I concur with sebb on the extreme challenges of auto-generating NOTICE --
which is a shame because I'd really love it if the Incubator could benefit
when you're inspired to write tooling.

One thing I think we could use is a tool which parses mbox archives in
people.apache.org:/home/apmail/ and generates statistics:

*   Total emails sent per list (a measure of activity)
*   Unique addresses participating (a measure of diversity)
*   Emails sent by Mentors (a measure of Mentor engagement)
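
The three statistics above could be computed along these lines. This is a
sketch only, using Python's standard mailbox module; the function name
list_stats and the mentor_addresses argument are assumptions, and how the
mentor addresses would be obtained is left open.

```python
import mailbox
from email.utils import parseaddr


def list_stats(mbox_path, mentor_addresses):
    """Compute simple activity statistics for one mbox file."""
    mentors = {addr.lower() for addr in mentor_addresses}
    total = 0
    senders = set()
    mentor_emails = 0

    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            total += 1
            # parseaddr splits "Name <addr>" into (name, addr).
            _, addr = parseaddr(str(msg.get("From", "")))
            addr = addr.lower()
            if addr:
                senders.add(addr)
            if addr in mentors:
                mentor_emails += 1
    finally:
        box.close()

    return {
        "total": total,                  # activity
        "unique_senders": len(senders),  # diversity
        "mentor_emails": mentor_emails,  # Mentor engagement
    }
```

Addresses are compared case-insensitively. A person posting from two
different addresses is counted twice, so the diversity figure is an upper
bound.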

We expend a lot of volunteer energy on shepherding each month, and I think
such an automated tool could help to free up some of that energy for other
tasks.  There are two main functions for shepherding:

1.  Alert the IPMC to podlings which have gone adrift.
2.  Provide an outsider view.

(I'd add a #3: being exposed to new communities benefits the shepherd, but
that varies by individual.)

I think that purpose #1 could largely be accomplished using email stats.

I don't know if this is something you'd feel motivated to work on, but I
thought it was worth mentioning because your original proposal would save time
and energy and so would this.

Marvin Humphrey




Re: Tool to generate disclaimer, NOTICE, etc. files

2014-02-09 Thread sebb
On 9 February 2014 06:47, Ted Dunning ted.dunn...@gmail.com wrote:
 On Sat, Feb 8, 2014 at 7:42 PM, sebb seb...@gmail.com wrote:

  WDYT?

 Non-trivial; maintaining the meta-data needed to accurately generate
 the NOTICE file is likely to require more effort than writing the
 NOTICE file itself.



 Actually, this sounds like a case where the effort for any individual
 project outweighs the benefit of saving the time, but the total effort
 across many projects would make the effort somewhat worth while.  In
 particular, something that enhances the communication between projects and
 between the author of the dependency and the users of that dependency would
 be very helpful.

AFAICT, the only way to amortise the effort between projects is to
have a shared register of NOTICE requirements for each dependency.
(basically each unique license text).

However, that is not sufficient - the NOTICE file must only include
attributions for bits that are actually included in the artifact to
which the NOTICE (and LICENSE) applies. That is something that is
unique to each project, and may change between releases.
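
To make the two halves concrete, here is a sketch that assumes such a shared
register existed as a plain mapping from dependency name to required
attribution text. The register, the function name build_notice and its
arguments are all hypothetical. The list of what is actually bundled must
still be supplied per artifact, which is the hard part described above.

```python
def build_notice(project, year, bundled, register):
    """Assemble NOTICE text for one artifact.

    bundled  -- names of dependencies actually shipped in this artifact
    register -- hypothetical shared mapping: dependency name -> required
                attribution text ("" when none is required)
    Returns (notice_text, unknown), where unknown lists bundled
    dependencies missing from the register and needing human review.
    """
    lines = [
        f"Apache {project}",
        f"Copyright {year} The Apache Software Foundation",
        "",
        "This product includes software developed at",
        "The Apache Software Foundation (http://www.apache.org/).",
    ]
    unknown = []
    for dep in sorted(bundled):
        if dep not in register:
            unknown.append(dep)
            continue
        if register[dep]:
            lines += ["", register[dep]]
    return "\n".join(lines) + "\n", unknown
```

Calling it once with the source artifact's contents and once with the binary
artifact's contents yields the two different NOTICE files. Source
attributions would still have to be added by hand.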

Also as already mentioned, source attributions may sometimes be necessary.

 As such, this sounds like a great thing to add to maven.  If it provided me
 with a way to look at all my dependencies and examine my notices file for
 missing acknowledgments, that would be helpful.  Moreover, if there were a
 way for the community of users to upload sample acknowledgements and for
 the owners of such packages to reject or accept these acknowledgments, this
 would crowdsource the effort of approval.

This is an ASF-specific requirement.
I really don't see how crowd-sourcing this to people outside the ASF
is going to help reduce effort.
Also, only the ASF has a vested interest in getting this right.

 My guess is that since getting NOTIFICATIONS right is a largely one shot
 deal, it wouldn't be enough appeal to encourage somebody to actually
 implement the system, but it would certainly make one's life easier in
 small installments.




Tool to generate disclaimer, NOTICE, etc. files

2014-02-08 Thread Alan Cabrera
Do you think it would be helpful if we had a tool that generated these files?  
It could work like a command line wizard that prompts the person for licensing 
information and then generates a valid disclaimer, notice, etc. files.

If this is a good idea, what files should we generate?  Currently, all I can 
think of is disclaimer and notice.

Maybe it could add the info into the project's DOAP file.  If we worked out the 
kinks then we could create sbt/gradle/mvn plugins to read the DOAP file and 
insert these files into the correct places in the distributions.  Apache RAT 
could also use this info as well.

WDYT?


Regards,
Alan





Re: Tool to generate disclaimer, NOTICE, etc. files

2014-02-08 Thread sebb
On 8 February 2014 16:49, Alan Cabrera l...@toolazydogs.com wrote:
 Do you think it would be helpful if we had a tool that generated these files? 
  It could work like a command line wizard that prompts the person for 
 licensing information and then generates a valid disclaimer, notice, etc. 
 files.

AIUI the disclaimer file is the same for every project - only the
project name changes.
Seems unnecessary to automate this, though it should be trivial to implement.
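
For illustration, a sketch of how trivial it would be. The template wording
here is a shortened placeholder and NOT the official Incubator disclaimer
text; the names DISCLAIMER_TEMPLATE and render_disclaimer are made up.

```python
from string import Template

# Placeholder wording only: substitute the real disclaimer text here.
DISCLAIMER_TEMPLATE = Template(
    "Apache $project is an effort undergoing incubation at\n"
    "The Apache Software Foundation (ASF), sponsored by $sponsor.\n"
    "[remainder of the standard disclaimer text goes here]\n"
)


def render_disclaimer(project, sponsor="the Apache Incubator"):
    """Fill the project-specific fields into the shared template."""
    return DISCLAIMER_TEMPLATE.substitute(project=project, sponsor=sponsor)


if __name__ == "__main__":
    print(render_disclaimer("Foo"))
```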

The NOTICE file is much harder to automate, as it depends on knowing
what is actually going to be shipped and reading and interpreting all
the relevant licenses.
However it might be possible to create a sort of expert system that
asked the right questions and guided the user to create the NOTICE
file.

 If this is a good idea, what files should we generate?  Currently, all I can 
 think of is disclaimer and notice.

 Maybe it could add the info into the project's DOAP file.  If we worked out 
 the kinks then we could create sbt/gradle/mvn plugins to read the DOAP file 
 and insert these files into the correct places in the distributions.  Apache 
 RAT could also use this info as well.

The DOAP could certainly be used to create the DISCLAIMER.

It seems wrong to include any dependency information in the DOAP.

Dependencies must be present somewhere in the build scripts, however
even in Maven (which has very structured info) it's not at all easy to
determine which dependencies are actually included in the release
artifacts (and remember that source and binary artifacts may need
different NOTICE files)

Note also that some source files may require attribution in the NOTICE file.
These won't be documented in any build system; the info has to be
added to the NOTICE file manually when the code is added to SCM.

 WDYT?

Non-trivial; maintaining the meta-data needed to accurately generate
the NOTICE file is likely to require more effort than writing the
NOTICE file itself.


 Regards,
 Alan





Re: Tool to generate disclaimer, NOTICE, etc. files

2014-02-08 Thread Ted Dunning
On Sat, Feb 8, 2014 at 7:42 PM, sebb seb...@gmail.com wrote:

  WDYT?

 Non-trivial; maintaining the meta-data needed to accurately generate
 the NOTICE file is likely to require more effort than writing the
 NOTICE file itself.



Actually, this sounds like a case where the effort for any individual
project outweighs the benefit of saving the time, but the total effort
across many projects would make the effort somewhat worth while.  In
particular, something that enhances the communication between projects and
between the author of the dependency and the users of that dependency would
be very helpful.

As such, this sounds like a great thing to add to maven.  If it provided me
with a way to look at all my dependencies and examine my notices file for
missing acknowledgments, that would be helpful.  Moreover, if there were a
way for the community of users to upload sample acknowledgements and for
the owners of such packages to reject or accept these acknowledgments, this
would crowdsource the effort of approval.
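
A very crude sketch of the "examine my notices file" check, assuming the
dependency names have already been extracted from the build. The function
name missing_acknowledgements is hypothetical, and a plain substring match is
used only to keep the example short.

```python
def missing_acknowledgements(notice_text, dependencies):
    """Return dependency names that are never mentioned in the NOTICE text."""
    haystack = notice_text.lower()
    return sorted(dep for dep in dependencies if dep.lower() not in haystack)
```

A name in the result only means "worth a manual look": not every dependency
requires a NOTICE entry, so this cannot decide on its own what is missing.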

My guess is that since getting NOTIFICATIONS right is a largely one shot
deal, it wouldn't be enough appeal to encourage somebody to actually
implement the system, but it would certainly make one's life easier in
small installments.