Re: [Dspace-devel] Asynchronous release and other related matters
An initial barebones prototype of what's being discussed; it just needs to be tuned up a bit more to create the appropriate source and release trees.

http://scm.dspace.org/svn/repo/dspace/branches/dspace-async-release/dspace-release/

This is associated with the following ticket:

https://jira.duraspace.org/browse/DS-818

Having a few others work on this in tandem would accelerate the process.

Mark

On Fri, Jul 15, 2011 at 3:10 AM, Robin Taylor robin.tay...@ed.ac.uk wrote:
> Hi all,
>
> The more I think about it the less frightened I am becoming. If we could include the safety net of a source release then that would sway it for me.
>
> Cheers, Robin.
Re: [Dspace-devel] Asynchronous release and other related matters
Hi all,

The more I think about it the less frightened I am becoming. If we could include the safety net of a source release then that would sway it for me.

Cheers, Robin.
[Dspace-devel] Asynchronous release and other related matters
Hi all,

A large portion of yesterday's meeting ended up being a discussion on async releases. Mark (Diggory) has for some time been strongly advocating this approach and has written extensively on the subject both on the wiki and in Jira. I think it is incumbent on us all to consider his arguments and comment thereon. So to give him a rest I'll chip in, albeit from a different angle.

Some background - DSpace can currently be downloaded from Sourceforge either as a source code or binary release.

The proposition (as I understand it) - Allow for releases of individual DSpace Maven modules outwith the normal 'complete' release that currently takes place roughly once a year. By a release I mean copying the DSpace Maven artifacts (jars and wars) to the DSpace Maven repository space to be publicly available.

So why do async releases? There are a number of reasons, for example...

1. To fix a bug in module dspace-xxx we could just release that module.
2. Or make available new features without waiting for the annual release.
3. Or release entire new modules etc etc.

So how would this work? The important thing to note is that it is dependent on people using the binary release. If they wanted to pick up a newly released version of a module they would just need to change the version number of that module in the appropriate pom and rebuild their local version of DSpace. Even better, we could distribute the binary release with ranges of version numbers (Maven allows for version 1.7.1, 1.7.2, etc) and then all they would have to do would be to rebuild and it would automatically pick up the latest version. It is worth noting that this is exactly what we already do for the language packs so this is not new.

The problem (as I see it) - this all falls down if people have the source code release. If people have the source code for a module in their local installation then that is what they should be deploying. Whilst it would be possible for them to change the dependencies in their poms to pick up newly released artifacts from the Maven central repo, I don't think anyone would argue that it would be a good idea to have one version of the source code in your installation but actually deploy a different one; too messy.

So, if you are a fan of the XMLUI and Maven overlays, a binary release might well be sufficient for you. You can of course still check out the code for any module that interests you from the DSpace SVN site. However, if you are a JSPUI user and/or want or need to see the source code in general, then you probably want a source code release. That way you don't have to familiarise yourself with the DSpace SVN site.

Could we continue to do both a source and binary release as we do at present? Yes. Many of the modules that currently reside in trunk in SVN would be moved into the modules directory so that they could be released independently of one another, and the assembly of the source code release would pick up the source code from there rather than from trunk.

The downside of continuing with a source code release is that it prevents us from unifying behind one approach and being able to make announcements relating to new releases (async or otherwise) that apply to everyone.

Personally, I am still in favour of source code releases. I don't see DSpace as a framework on which people build their local customisations, I see it as a complete implementation which people take and do what they want with it.

Phew! I'm knackered.

Robin.

___
Dspace-devel mailing list
Dspace-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-devel
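For illustration only, the two options Robin describes (bumping one module's version by hand, or declaring a version range) might look roughly like this in a local pom. This is a sketch under stated assumptions: the artifactIds and version numbers are placeholders, not copied from a real DSpace pom.

```xml
<!-- Hypothetical fragment of a local DSpace pom.
     artifactIds and versions are illustrative assumptions. -->
<dependencies>
  <!-- Option 1: point one module at a newly released version, then rebuild. -->
  <dependency>
    <groupId>org.dspace</groupId>
    <artifactId>dspace-xxx</artifactId>
    <version>1.7.2</version>
  </dependency>

  <!-- Option 2: a Maven version range, here any release >= 1.7.0 and < 1.8.0.
       A rebuild resolves to the highest matching version Maven can find. -->
  <dependency>
    <groupId>org.dspace</groupId>
    <artifactId>dspace-yyy</artifactId>
    <version>[1.7.0,1.8.0)</version>
  </dependency>
</dependencies>
```

With the range, a rebuild can pick up a new release without any edit to the pom, which is both the convenience and the risk raised in the replies to this message.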
Re: [Dspace-devel] Asynchronous release and other related matters
Robin,

Thank you for a great introduction. I'll inject just a few comments inline.

On Thu, Jul 14, 2011 at 2:22 AM, Robin Taylor robin.tay...@ed.ac.uk wrote:
> Hi all,
>
> A large portion of yesterday's meeting ended up being a discussion on async releases. Mark (Diggory) has for some time been strongly advocating this approach and has written extensively on the subject both on the wiki and in Jira. I think it is incumbent on us all to consider his arguments and comment thereon. So to give him a rest I'll chip in, albeit from a different angle.

Thanks, I thought my vacation would leave me well rested. But alas, traipsing across the US with a 6 yr old and an 11 month old turned out to be as exhausting as it sounds.

> Some background - DSpace can currently be downloaded from Sourceforge either as a source code or binary release.
>
> The proposition (as I understand it) - Allow for releases of individual DSpace Maven modules outwith the normal 'complete' release that currently takes place roughly once a year. By a release I mean copying the DSpace Maven artifacts (jars and wars) to the DSpace Maven repository space to be publicly available.

This is the maven portion of the release; there are two distributions that are released as well:

(a) dspace-release-x.x.x : just the dspace assembly project that brings together all jars/wars to build one's distro
(b) dspace-src-release-x.x.x : the above assembly + all maven projects in trunk in source form, requires building all jars and wars needed for DSpace.

Of course, (a) requires more bandwidth to download officially released binary jars of dspace and (b) packages local versions of those jars/wars that I would describe as unofficial local binary variants.

At this time there are two approaches to customize your DSpace build.

(1) use the source distro and change all the files you want and rebuild the entire source tree

Pro:
o you can navigate to exactly the source that is being used and override it directly.

Cons:
o Maintenance and Tracking of customizations across DSpace versions requires a developer with extensive knowledge of version control systems, changes will often conflict with work ongoing in DSpace releases.
o Local institutional developers altering DSpace internals introduce a conservative viewpoint into the community, limiting the advancement of the codebase because they actually prefer these customized parts of DSpace to not change rather than see the benefit of new development come into the Community.
o DSpace source release build process is too long for iterative development

(2) use an overlay (works with all webapp projects: xmlui, jspui, oai, sword, lni, solr) and copy the source files you want to alter/override into separate dspace/modules/xxx projects, building only those changes you want.

Pros:
o Only the code that you've changed needs to be rebuilt
o Build Process is shorter
o Models how to extend DSpace via Addons, resulting in less customization of DSpace internals and greater ease in upgrading a DSpace instance (of course, this depends on the local developer not adopting the practice of overriding existing DSpace classes and instead applying a process of implementing their own Service Based approach to implement a solution).
o Works well with IDEs that use maven source artifacts to resolve the source for a specific java class (m2eclipse and IntelliJ IDEA).
o You can check out separate modules and via the maven profiles in the dspace/pom.xml you can build only the source components you want.

Cons:
o It's not my opinion, but some in the community seem to think that it is a hindrance if you either need to use an IDE or separately check out the code for the module you want to get source from (when doing the bad practice of overriding classes directly, which we suggest trying to avoid doing to ease maintenance of your customizations).

> So why do async releases? There are a number of reasons, for example...
>
> 1. To fix a bug in module dspace-xxx we could just release that module.
> 2. Or make available new features without waiting for the annual release.
> 3. Or release entire new modules etc etc.
>
> So how would this work? The important thing to note is that it is dependent on people using the binary release. If they wanted to pick up a newly released version of a module they would just need to change the version number of that module in the appropriate pom and rebuild their local version of DSpace.

Not explicitly, as I've said above, you can checkout or export the source for a module into your build and have the dspace-release-x.x.x build only that portion for your build.

> Even better, we could distribute the binary release with ranges of version numbers (Maven allows for version 1.7.1, 1.7.2, etc) and then all they would have to do would be to rebuild and it would automatically pick up the latest version. It is worth noting that this is exactly what we already do for the language packs so this is not new.

I don't really recommend this, it
Re: [Dspace-devel] Asynchronous release and other related matters
Thanks for doing this Robin! I'll add a few more thoughts in here..

On 7/14/2011 10:41 AM, Mark Diggory wrote:
> Robin,
>
> Thank you for a great introduction. I'll inject just a few comments inline.
>
> On Thu, Jul 14, 2011 at 2:22 AM, Robin Taylor robin.tay...@ed.ac.uk wrote:
> > Hi all,
> >
> > A large portion of yesterday's meeting ended up being a discussion on async releases. Mark (Diggory) has for some time been strongly advocating this approach and has written extensively on the subject both on the wiki and in Jira. I think it is incumbent on us all to consider his arguments and comment thereon. So to give him a rest I'll chip in, albeit from a different angle.

To be honest, the more I think through this in my mind, the more I really do feel there are some major bonuses to Mark Diggory's ideas.

Looking back at it now, it may have seemed from yesterday's meeting that I was arguing against async releases. The reality here is that I actually agree with MarkD that async releases are a good idea overall. However, I think there is some disagreement on implementation or processes to get there. My concerns are more around the processes of Async Releases. I'll admit, some of these concerns were allayed a bit from yesterday's discussions -- more below.

> > So why do async releases? There are a number of reasons, for example...
> >
> > 1. To fix a bug in module dspace-xxx we could just release that module.
> > 2. Or make available new features without waiting for the annual release.
> > 3. Or release entire new modules etc etc.

Your #1 & #2 reasons are both extremely powerful and also potentially worrisome if not handled properly. This might sound controversial, but let me explain...

First off, I fully agree that #1 and #2 are great things (I'm not arguing against that at all). I would love to be able to fix bugs more quickly (via small module releases) and make new features available more quickly. I fully agree that we should think about moving in this direction.

I think the one concern here is around *management* of all these individual modules (which unfortunately comes back to the idea of defining support). To retain our Community trust, we do need to be careful about how we define support and how we are releasing new bug fixes or features via async module releases.

For instance, we want to make sure that core modules (where core just means the main/primary code behind out-of-the-box DSpace) are still being well vetted & tested before being released. So, we would need to establish some basic policies/best practices around how to also vet, test & release new bug fixes asynchronously (Obviously, this may not require a full testathon, or a full Release Coordinator role. But we need some basic policies/best practices on how a bug fix is vetted, tested & asynchronously released).

We'd also want to make sure DSpace users fully understand who is supporting each individual module. As we move this route, I still feel that we'll start to enter a situation where some modules are centrally supported (by the Committers as a whole), while others are only supported by smaller sub-teams or even external third parties. I think this is all a great thing (to have many modules built & supported by many different groups)! But, users need to be able to know where they can go for support & documentation, and *which* modules have been stamped as Committer Approved, Vetted & Documented, or similar.

NOTE: I should mention, I'm not saying we need some sort of detailed approval process for everything (I don't want to bog us down with red-tape, etc). I'm just saying users need to know which modules are actually controlled & released by the Committers, and which may be just a smaller project or experiment that was put up by a sub-team or one individual or an external group.

So, to summarize this point: In my opinion, Modularization & Asynchronous Releases seem like a good thing overall. However, we may need to rethink some of our vetting, testing & release processes to ensure that we are still continually putting out well-tested, vetted & stable code, not only during the main DSpace packaged releases, but also during smaller async module releases. We also need to think about how we communicate the level of support provided for each module (is support equal for everything, are some modules better supported than others?)

> > Even better, we could distribute the binary release with ranges of version numbers (Maven allows for version 1.7.1, 1.7.2, etc) and then all they would have to do would be to rebuild and it would automatically pick up the latest version. It is worth noting that this is exactly what we already do for the language packs so this is not new.
>
> I don't really recommend this, it works for language packs because they are optional.

I wouldn't recommend an automatic pickup via Maven either. That is potentially hazardous, as some small change (even a bug fix) could affect your local
Re: [Dspace-devel] Asynchronous release and other related matters
On Thu, Jul 14, 2011 at 08:41:12AM -0700, Mark Diggory wrote:
[snip]
> At this time there are two approaches to customize your DSpace build.
>
> (1) use the source distro and change all the files you want and rebuild the entire source tree
>
> Pro:
> o you can navigate to exactly the source that is being used and override it directly.

Pro #2: there are many well-understood tools to help you keep in step with new releases.

> Cons:
> o Maintenance and Tracking of customizations across DSpace versions requires a developer with extensive knowledge of version control systems, changes will often conflict with work ongoing in DSpace releases.

Applies equally to approaches 1 and 2.

> o Local institutional developers altering DSpace internals introduce a conservative viewpoint into the community, limiting the advancement of the codebase because they actually prefer these customized parts of DSpace to not change rather than see the benefit of new development come into the Community.

Applies equally to approaches 1 and 2.

> o DSpace source release build process is too long for iterative development
>
> (2) use an overlay (works with all webapp projects: xmlui, jspui, oai, sword, lni, solr) and copy the source files you want to alter/override into separate dspace/modules/xxx projects, building only those changes you want.
>
> Pros:
> o Only the code that you've changed needs to be rebuilt
> o Build Process is shorter
> o Models how to extend DSpace via Addons, resulting in less customization of DSpace internals and greater ease in upgrading a DSpace instance (of course, this depends on the local developer not adopting the practice of overriding existing DSpace classes and instead applying a process of implementing their own Service Based approach to implement a solution).

Assuming that you're lucky enough to want something for which a Service is already defined. If you had to invent a new kind of Service, nobody is ever going to request it until you hack the stock source. See the statistics add-on for an example.

Instead of trying to teach or constrain people to never never never touch the source, we should be teaching them how, when necessary, to do it right:

o find the right spot;
o generalize what you're doing, so that those who follow you won't have to do what you're doing again and again;
o discuss with the community;
o separate your local concerns into local classes;
o contribute your enabling changes back to the community.

> o Works well with IDEs that use maven source artifacts to resolve the source for a specific java class (m2eclipse and IntelliJ IDEA).
> o You can check out separate modules and via the maven profiles in the dspace/pom.xml you can build only the source components you want.
>
> Cons:
> o It's not my opinion, but some in the community seem to think that it is a hindrance if you either need to use an IDE or separately check out the code for the module you want to get source from (when doing the bad practice of overriding classes directly, which we suggest trying to avoid doing to ease maintenance of your customizations).

Or when doing the good practice of trying to understand the code you're interacting with.

Me: Hmmm, what does that method actually do? [hover] Rats, the doc comments are worthless. IDE, take me to the source!
IDE: That class is not in the workspace.
Me: [unprintable]! Where is it? WHAT is it? [examine dependencies, rummage around in scm.dspace.org for a few minutes] There it is. Check out Yet Another Module...now, what was I doing?

Cons #2: You just forked DSpace, same as in (1) but covering your tracks. Good luck keeping your local modified sources in sync with the distributed ones. Your SCM tools cannot help you, because they have no way of knowing that dspace-foo/x and modules/foo/x are related. Nor does your IDE.

And none of that has anything to do with async. releases. It does touch on modularity, which enables async. release but also enables other practices which are entirely separate.

--
Mark H. Wood, Lead System Programmer mw...@iupui.edu
Asking whether markets are efficient is like asking whether people are smart.
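For readers unfamiliar with approach (2): in outline, an overlay module is a small Maven war project that depends on the released webapp war. The maven-war-plugin unpacks that war into the build, and files the module supplies itself take precedence over the same-named files from the dependency. A rough sketch follows; the coordinates, version and directory name are assumptions for illustration, not a copy of a real DSpace pom.

```xml
<!-- Hypothetical dspace/modules/xmlui/pom.xml.
     groupId, artifactId and version values are illustrative assumptions. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.dspace.modules</groupId>
  <artifactId>xmlui</artifactId>
  <version>1.7.2</version>
  <packaging>war</packaging>

  <dependencies>
    <!-- The released webapp. Because its type is war, maven-war-plugin
         treats it as an overlay and merges it into this project's war. -->
    <dependency>
      <groupId>org.dspace</groupId>
      <artifactId>dspace-xmlui-webapp</artifactId>
      <version>1.7.2</version>
      <type>war</type>
    </dependency>
  </dependencies>
</project>
```

Only the files copied into this module are local; everything else comes from the released artifact. That is also why, as noted above, the SCM has no record that a local copy derives from the stock file.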