Re: [Dspace-devel] Asynchronous release and other related matters

2011-07-17 Thread Mark Diggory
An initial barebones prototype of what's being discussed; it just needs
to be tuned up a bit more to create the appropriate source and release
trees.

http://scm.dspace.org/svn/repo/dspace/branches/dspace-async-release/dspace-release/

This is associated with the following ticket
https://jira.duraspace.org/browse/DS-818

Having a few others work on this in tandem would accelerate the process.
Mark


Re: [Dspace-devel] Asynchronous release and other related matters

2011-07-15 Thread Robin Taylor
Hi all,

The more I think about it the less frightened I am becoming. If we could
include the safety net of a source release then that would sway it for
me. 

Cheers, Robin.



[Dspace-devel] Asynchronous release and other related matters

2011-07-14 Thread Robin Taylor
Hi all,

A large portion of yesterday's meeting ended up being a discussion on
async releases. Mark (Diggory) has for some time been strongly
advocating this approach and has written extensively on the subject both
on the wiki and in Jira. I think it is incumbent on us all to consider
his arguments and comment thereon. So to give him a rest I'll chip in,
albeit from a different angle.

Some background - DSpace can currently be downloaded from Sourceforge
either as a source code or binary release.

The proposition (as I understand it) - Allow for releases of individual
DSpace Maven modules outwith the normal 'complete' release that
currently takes place roughly once a year. By a release I mean copying
the DSpace Maven artifacts (jars and wars) to the DSpace Maven
repository space to be publicly available. 

So why do async releases ? There are a number of reasons, for example...
1. To fix a bug in module dspace-xxx we could just release that module.
2. Or make available new features without waiting for the annual
release.
3. Or release entire new modules
etc etc.

So how would this work ? The important thing to note is that it is
dependent on people using the binary release. If they wanted to pick up
a newly released version of a module they would just need to change the
version number of that module in the appropriate pom and rebuild their
local version of DSpace. Even better, we could distribute the binary
release with ranges of version numbers (Maven allows for version 1.7.1,
1.7.2, etc) and then all they would have to do would be to rebuild and
it would automatically pick up the latest version. It is worth noting
that this is exactly what we already do for the language packs so this
is not new. 
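
To make the range idea concrete, here is a minimal Java sketch of how a half-open range such as [1.7.0,1.8.0) could select the newest matching module release on rebuild. It is a simplification for illustration only, not Maven's actual resolver; the class name, the method names and the list of released versions are all assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class VersionRangeSketch {

    // Compare dotted numeric versions such as "1.7.2"; missing parts count as 0.
    static int compare(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        int n = Math.max(pa.length, pb.length);
        for (int i = 0; i < n; i++) {
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    // True when lower <= version < upper, i.e. the half-open range [lower,upper).
    static boolean inRange(String version, String lower, String upper) {
        return compare(version, lower) >= 0 && compare(version, upper) < 0;
    }

    // Highest available version inside [lower,upper), if there is one.
    static Optional<String> latestInRange(List<String> available, String lower, String upper) {
        return available.stream()
                .filter(v -> inRange(v, lower, upper))
                .max(VersionRangeSketch::compare);
    }

    public static void main(String[] args) {
        // Hypothetical list of module releases found in the repository.
        List<String> released = Arrays.asList("1.7.0", "1.7.1", "1.7.2", "1.8.0");
        System.out.println(latestInRange(released, "1.7.0", "1.8.0").orElse("none"));
    }
}
```

Under these assumptions a rebuild would select 1.7.2 and leave 1.8.0 alone, because the upper bound of the range is exclusive.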

The problem (as I see it) - this all falls down if people have the
source code release. If people have the source code for a module in
their local installation then that is what they should be deploying.
Whilst it would be possible for them to change the dependencies in their
poms to pick up newly released artifacts from the Maven central repo I
don't think anyone would argue that it would be a good idea to have one
version of the source code in your installation, but actually deploy a
different one, too messy. 

So, if you are a fan of the XMLUI and Maven overlays, a binary release
might well be sufficient for you. You can of course still check out the
code for any module that interests you from the DSpace SVN site.

However, if you are a JSPUI user and/or want or need to see the source
code in general, then you probably want a source code release. That way
you don't have to familiarise yourself with the DSpace SVN site.

Could we continue to do both a source and binary release as we do at
present ? Yes. Many of the modules that currently reside in trunk in SVN
would be moved into the modules directory so that they could be released
independently of one another, and the assembly of the source code
release would pick up the source code from there rather than from trunk.
The downside of continuing with a source code release is that it
prevents us from unifying behind one approach and being able to make
announcements relating to new releases (async or otherwise) that apply
to everyone.

Personally, I am still in favour of source code releases. I don't see
DSpace as a framework on which people build their local customisations,
I see it as a complete implementation which people take and do what they
want with it. 

Phew! I'm knackered.

Robin.  

--
AppSumo Presents a FREE Video for the SourceForge Community by Eric 
Ries, the creator of the Lean Startup Methodology on Lean Startup 
Secrets Revealed. This video shows you how to validate your ideas, 
optimize your ideas and identify your business strategy.
http://p.sf.net/sfu/appsumosfdev2dev
___
Dspace-devel mailing list
Dspace-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-devel


Re: [Dspace-devel] Asynchronous release and other related matters

2011-07-14 Thread Mark Diggory
Robin,

Thank you for a great introduction. I'll inject just a few comments inline.

On Thu, Jul 14, 2011 at 2:22 AM, Robin Taylor robin.tay...@ed.ac.uk wrote:
 Hi all,

 A large portion of yesterday's meeting ended up being a discussion on
 async releases. Mark (Diggory) has for some time been strongly
 advocating this approach and has written extensively on the subject both
 on the wiki and in Jira. I think it is incumbent on us all to consider
 his arguments and comment thereon. So to give him a rest I'll chip in,
 albeit from a different angle.

Thanks, I thought my vacation would leave me well rested. But alas,
traipsing across the US with a 6 yr old and an 11 month old turned out
to be as exhausting as it sounds.

 Some background - DSpace can currently be downloaded from Sourceforge
 either as a source code or binary release.

 The proposition (as I understand it) - Allow for releases of individual
 DSpace Maven modules outwith the normal 'complete' release that
 currently takes place roughly once a year. By a release I mean copying
 the DSpace Maven artifacts (jars and wars) to the DSpace Maven
 repository space to be publicly available.

This is the Maven portion of the release. There are two
distributions that are released as well:

(a) dspace-release-x.x.x : just the dspace assembly project that
brings together all jars/wars to build one's distro

(b) dspace-src-release-x.x.x : the above assembly + all maven projects
in trunk in source form, requires building all jars and wars needed
for DSpace.

Of course, (a) requires more bandwidth to download officially released
binary jars of dspace and (b) packages local versions of those
jars/wars that I would describe as unofficial local binary variants.

At this time there are two approaches to customize your DSpace build.

(1) use the source distro and change all the files you want and
rebuild the entire sourcetree

Pro: you can navigate to exactly the source that is being used and
override it directly.

Cons:

Maintenance and tracking of customizations across DSpace versions
requires a developer with extensive knowledge of version control
systems; changes will often conflict with work ongoing in DSpace
releases.

Local institutional developers altering DSpace internals introduce a
conservative viewpoint into the community limiting the advancement of
the codebase because they actually prefer these customized parts of
DSpace to not change rather than see benefit of new development come
into the Community.

DSpace source release build process is too long for iterative development


(2) use an overlay (works with all webapp projects: xmlui, jspui, oai,
sword, lni, solr) and copy the source files you want to alter/override
into separate dspace/modules/xxx projects, building only those changes
you want.

Pros:

Only the code that you've changed needs to be rebuilt

Build Process is shorter

Models how to extend DSpace via Addons, resulting in less
customization of DSpace internals and greater ease in upgrading a
DSpace instance (of course, this depends on the local developer
not adopting the practice of overriding existing DSpace classes and
instead implementing their own Service Based approach to a
solution).

Works well with IDEs that use Maven source artifacts to resolve the
source for a specific Java class
(m2eclipse and IntelliJ IDEA).

You can check out separate modules and, via the Maven profiles in the
dspace/pom.xml, build only the source components you want.

Cons:

It's not my opinion, but some in the community seem to think that it is
a hindrance if you either need to use an IDE or separately check out
the code for the module you want to get source from (when doing the
bad practice of overriding classes directly, which we suggest
avoiding to ease maintenance of your customizations).
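
The overlay approach in (2) can be pictured with a small Java sketch: files from a local module are layered over the files of the released webapp, and the local copy wins wherever the paths match. This only models the precedence rule; the file paths and names are made up for illustration, and it is not how the Maven war plugin is actually implemented.

```java
import java.util.Map;
import java.util.TreeMap;

public class OverlaySketch {

    // Merge released webapp files with local files; on a path clash the local file wins.
    static Map<String, String> applyOverlay(Map<String, String> released, Map<String, String> local) {
        Map<String, String> result = new TreeMap<>(released);
        result.putAll(local);
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical contents of a released webapp (path -> origin of the file).
        Map<String, String> released = new TreeMap<>();
        released.put("WEB-INF/web.xml", "stock");
        released.put("themes/default/style.css", "stock");

        // Hypothetical local module holding only the file being customized.
        Map<String, String> local = new TreeMap<>();
        local.put("themes/default/style.css", "local");

        System.out.println(applyOverlay(released, local));
    }
}
```

Only the customized file has to live in the local module, which is why the rebuild stays small; everything else still comes from the released artifact.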


 So why do async releases ? There are a number of reasons, for example...
 1. To fix a bug in module dspace-xxx we could just release that module.
 2. Or make available new features without waiting for the annual
 release.
 3. Or release entire new modules
 etc etc.

 So how would this work ? The important thing to note is that it is
 dependent on people using the binary release. If they wanted to pick up
 a newly released version of a module they would just need to change the
 version number of that module in the appropriate pom and rebuild their
 local version of DSpace.

Not explicitly; as I've said above, you can check out or export the
source for a module into your build and have the dspace-release-x.x.x
build only that portion for your build.

 Even better, we could distribute the binary
 release with ranges of version numbers (Maven allows for version 1.7.1,
 1.7.2, etc) and then all they would have to do would be to rebuild and
 it would automatically pick up the latest version. It is worth noting
 that this is exactly what we already do for the language packs so this
 is not new.

I don't really recommend this, it 

Re: [Dspace-devel] Asynchronous release and other related matters

2011-07-14 Thread Tim Donohue
Thanks for doing this Robin!

I'll add a few more thoughts in here..

On 7/14/2011 10:41 AM, Mark Diggory wrote:
 Robin,

 Thank you for a great introduction. I'll inject just a few comments inline.

 On Thu, Jul 14, 2011 at 2:22 AM, Robin Taylor robin.tay...@ed.ac.uk wrote:
 Hi all,

 A large portion of yesterday's meeting ended up being a discussion on
 async releases. Mark (Diggory) has for some time been strongly
 advocating this approach and has written extensively on the subject both
 on the wiki and in Jira. I think it is incumbent on us all to consider
 his arguments and comment thereon. So to give him a rest I'll chip in,
 albeit from a different angle.

To be honest, the more I think through this in my mind, the more I 
really do feel there are some major bonuses to Mark Diggory's ideas.

Looking back at it now, it may have seemed from yesterday's meeting that 
I was arguing against async releases. The reality here is that I 
actually agree with MarkD that async releases is a good idea overall. 
However, I think there is some disagreement on implementation or 
processes to get there. My concerns are more around the processes of 
Async Releases.

I'll admit, some of these concerns were allayed a bit from yesterday's 
discussions -- more below.

 So why do async releases ? There are a number of reasons, for example...
 1. To fix a bug in module dspace-xxx we could just release that module.
 2. Or make available new features without waiting for the annual
 release.
 3. Or release entire new modules
 etc etc.

Your #1 & #2 reasons are both extremely powerful and also potentially
worrisome if not handled properly. This might sound controversial, but
let me explain...

First off, I fully agree that #1 and #2 are great things (I'm not
arguing against that at all). I would love to be able to fix bugs more
quickly (via small module releases) and make new features available more
quickly. I fully agree that we should think about moving in this direction.

I think the one concern here is around *management* of all these 
individual modules (which unfortunately comes back to the idea of 
defining support).

To retain our Community trust, we do need to be careful about how we
define support and how we are releasing new bug fixes or features via
async module releases.  For instance, we want to make sure that core
modules (where core just means the main/primary code behind
out-of-the-box DSpace) are still being well vetted & tested before being
released.  So, we would need to establish some basic policies/best
practices around how to also vet, test & release new bug fixes
asynchronously. (Obviously, this may not require a full testathon, or a
full Release Coordinator role. But we need some basic policies/best
practices on how a bug fix is vetted, tested & asynchronously released.)

We'd also want to make sure DSpace users fully understand who is
supporting each individual module. As we move this route, I still feel
that we'll start to enter a situation where some modules are centrally
supported (by the Committers as a whole), while others are only
supported by smaller sub-teams or even external third parties.  I think
this is all a great thing (to have many modules built & supported by
many different groups)!  But, users need to be able to know where they
can go for support & documentation, and *which* modules have been
stamped as Committer Approved, Vetted & Documented, or similar.

NOTE: I should mention, I'm not saying we need some sort of detailed
approval process for everything (I don't want to bog us down with
red-tape, etc). I'm just saying users need to know which modules are
actually controlled & released by the Committers, and which may be just
a smaller project or experiment that was put up by a sub-team or one
individual or an external group.

So, to summarize this point:  In my opinion, Modularization &
Asynchronous Releases seem like a good thing overall. However, we may
need to rethink some of our vetting, testing & release processes to
ensure that we are still continually putting out well-tested, vetted &
stable code, not only during the main DSpace packaged releases, but
also during smaller async module releases.  We also need to think about
how we communicate the level of support provided for each module (is
support equal for everything, or are some modules better supported than
others?)

 Even better, we could distribute the binary
 release with ranges of version numbers (Maven allows for version 1.7.1,
 1.7.2, etc) and then all they would have to do would be to rebuild and
 it would automatically pick up the latest version. It is worth noting
 that this is exactly what we already do for the language packs so this
 is not new.

 I don't really recommend this, it works for language pack because they
 are optional.

I wouldn't recommend an automatic pickup via Maven either.  That is 
potentially hazardous, as some small change (even a bug fix) could 
affect your local 

Re: [Dspace-devel] Asynchronous release and other related matters

2011-07-14 Thread Mark H. Wood
On Thu, Jul 14, 2011 at 08:41:12AM -0700, Mark Diggory wrote:
[snip]
 At this time there are two approaches to customize your DSpace build.
 
 (1) use the source distro and change all the files you want and
 rebuild the entire sourcetree
 
 Pro: you can navigate to exactly the source that is being used and
 override it directly.

Pro #2:  there are many well-understood tools to help you keep in step
with new releases.

 Cons:
 
 Maintenance and tracking of customizations across DSpace versions
 requires a developer with extensive knowledge of version control
 systems; changes will often conflict with work ongoing in DSpace
 releases.

Applies equally to approaches 1 and 2.

 Local institutional developers altering DSpace internals introduce a
 conservative viewpoint into the community limiting the advancement of
 the codebase because they actually prefer these customized parts of
 DSpace to not change rather than see benefit of new development come
 into the Community.

Applies equally to approaches 1 and 2.

 DSpace source release build process is too long for iterative development
 
 
 (2) use an overlay (works with all webapp projects: xmlui, jspui, oai,
 sword, lni, solr) and copy the source files you want to alter/override
 into separate dspace/modules/xxx projects, building only those changes
 you want.
 
 Pros:
 
 Only the code that you've changed needs to be rebuilt
 
 Build Process is shorter
 
 Models how to extend DSpace via Addons, resulting in less
 customization of DSpace internals and greater ease in upgrading a
 DSpace instance (of course, this depends on the local developer
 not adopting the practice of overriding existing DSpace classes and
 instead implementing their own Service Based approach to a
 solution).

Assuming that you're lucky enough to want something for which a
Service is already defined.  If you had to invent a new kind of
Service, nobody is ever going to request it until you hack the stock
source.  See the statistics add-on for an example.

Instead of trying to teach or constrain people to never never never
touch the source, we should be teaching them how, when necessary, to
do it right:

o  find the right spot;
o  generalize what you're doing, so that those who follow you won't
   have to do what you're doing again and again;
o  discuss with the community;
o  separate your local concerns into local classes;
o  contribute your enabling changes back to the community.
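
The last few points can be sketched in Java as follows. The interface plays the role of the generalized "enabling change" that would be contributed upstream, and the local behaviour lives in a separate local class. All names here (ItemListener, StockArchiver, LocalNotifier) are hypothetical and are not part of the DSpace API.

```java
import java.util.ArrayList;
import java.util.List;

public class ExtensionPointSketch {

    // Hypothetical extension point: the generalized change contributed back.
    interface ItemListener {
        String onItemArchived(String handle);
    }

    // Stands in for stock code: it knows only the interface, never a local class.
    static class StockArchiver {
        private final List<ItemListener> listeners = new ArrayList<>();

        void register(ItemListener listener) {
            listeners.add(listener);
        }

        // Archive an item and return what each registered listener reported.
        List<String> archive(String handle) {
            List<String> results = new ArrayList<>();
            for (ItemListener listener : listeners) {
                results.add(listener.onItemArchived(handle));
            }
            return results;
        }
    }

    // Local concern kept in a local class; the stock source is not edited again.
    static class LocalNotifier implements ItemListener {
        @Override
        public String onItemArchived(String handle) {
            return "notify-library-staff:" + handle;
        }
    }

    public static void main(String[] args) {
        StockArchiver archiver = new StockArchiver();
        archiver.register(new LocalNotifier());
        System.out.println(archiver.archive("123456789/1"));
    }
}
```

Once the extension point exists upstream, the next site with a similar need only writes its own listener and does not have to touch the stock source at all.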

 Works well with IDEs that use Maven source artifacts to resolve the
 source for a specific Java class
 (m2eclipse and IntelliJ IDEA).
 
 You can check out separate modules and, via the Maven profiles in the
 dspace/pom.xml, build only the source components you want.
 
 Cons:
 
 It's not my opinion, but some in the community seem to think that it is
 a hindrance if you either need to use an IDE or separately check out
 the code for the module you want to get source from (when doing the
 bad practice of overriding classes directly, which we suggest
 avoiding to ease maintenance of your customizations).

Or when doing the good practice of trying to understand the code
you're interacting with.

  Me:  Hmmm, what does that method actually do?  [hover]  Rats, the
  doc comments are worthless.  IDE, take me to the source!

  IDE:  That class is not in the workspace.

  Me:  [unprintable]!  Where is it?  WHAT is it?  [examine
  dependencies, rummage around in scm.dspace.org for a few minutes]
  There it is.  Check out Yet Another Module...now, what was I doing?

Cons #2:

You just forked DSpace, same as in (1) but covering your tracks.  Good
luck keeping your local modified sources in sync with the distributed
ones.  Your SCM tools cannot help you, because they have no way of
knowing that dspace-foo/x and modules/foo/x are related.  Nor does
your IDE.



And none of that has anything to do with async. releases.  It does
touch on modularity, which enables async. release but also enables
other practices which are entirely separate.

-- 
Mark H. Wood, Lead System Programmer   mw...@iupui.edu
Asking whether markets are efficient is like asking whether people are smart.

