-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

All of this will be highly subjective (what isn't?), and please bear with me if I have made some wrong assumptions... I'm here to learn, not to lead. So, my 2 €-cents.

Thomas Lord wrote:
>
> 1. Who should be in the target market for Arch 2.0?

Some scenarios come to mind:

i. As a young and inexperienced developer (still a student), I make more mistakes than should be allowed by law :-). Mostly, I work on small prototypes with the aim of teaching myself how to work with some specific technology, or to experiment with newly learned algorithms. It makes little or no sense for me to publish my work on a public server like Savannah, Gna! or Sourceforge. Like me, there are a lot of other students and professionals out there who could find it useful to store revisions of their work and back them up regularly.

ii. A lot of people have access to an FTP server with a registered domain and unlimited storage (it's cheap enough). However, there's no way to get a generic ISP to set up, say, a listening SVN server. So being able to commit to a ``plain'' filesystem, without a daemon listening on the other side of the connection, is perfect for them.

iii. A distributed system is also perfect for people other than developers: if it has a friendly user interface, you can teach them to commit the changes they make to documents to an RCS, so they can revert their work and start again from a save-point if something goes wrong. For example, students may keep track of different versions of the reports they write for school, and merge changes independently. Another example is system administrators who prepare quick-and-dirty layouts of configuration directories, like /etc.

iv. In critical environments, the integrity of saved data is essential. If your hard disk gets damaged, for example, and the whole repository is stored in a single binary file, chances are you end up losing the whole history of changes.

v.
Developers in small or medium-sized projects need a simple way to merge changes among themselves while minimizing conflicts. If everyone commits to the same repository, they will have to deal with conflicts very often, which makes it difficult to find and revert a particular commit if it introduced a bug.

vi. There are some projects in which only one or a small number of ``gatekeepers'' can commit to the repository. ``Wine'' comes to mind. In this scenario, people may want to start their own branch and make a lot of small atomic commits to it. Then they may want to replay them, grouping them by the functionality added, and build a `set of changesets'. The next step would be to send it to the maintainer of the ``official'' repository for merging. So what they need is a simple way to update their working copy with changes from the mainline, and to leave those patches out of the changesets they're going to send to the lead maintainer, in order to avoid duplication.

vii. Huge projects work by doing a lot of branching. Keeping people from getting in one another's way by _forcing_ them to branch isn't bad. Moreover, only _good_ changes would be merged into the ``official'' archive, thus (hopefully) leaving out badly written code or incomplete features. For example, have a look at http://cvs.gnome.org/viewcvs/. Half of those modules have never seen the light. There are modules like ``anjuta2'', which has a deceptive name since the real development for anjuta2 is being done on anjuta-HEAD. Keeping incomplete or bad contributions around just makes it more difficult for other people to understand what's happening, and contributes to a great waste of space.

viii. Developers of large projects may choose Arch 2.0 because it is a tool that helps them in their work, instead of being something that constrains the way they program.
Unfortunately, such constraints are common with a lot of RCSs, especially centralized ones: you cannot branch without the admin's permission, you cannot commit to certain modules, sometimes the file you need to commit to is locked by someone else, and so on...

Thus, I think there are at least three categories of people that may be interested in a DRCS like Arch 2.0:

1. Non-developers who need a way to easily store revisions of documents and other (sensitive) data

2. Developers and non-developers who work on small projects that are still at an embryonic stage and not ready to be made public

3. Developers of medium-to-large projects that could benefit from a distributed development model in which only a handful of the branches that get started are then merged back to mainline

>
> 2. What are the needs of that target market and how will Arch 2.0 win
> for them?
>

Some random remarks:

i. What to keep from GNU Arch 1.x?

a) being able to publish archives without a daemon listening on the other side, and accessing resources in a ``VFS-fashion'', maybe extending it further

b) a good merging system. After having tried branching and merging in svn (one of the worst systems I've had occasion to work with), I really started appreciating arch's naming scheme and its way of resolving conflicts. SVN makes merging difficult mostly because of an ugly user interface, but its standard way of resolving conflicts also makes it easy to make a mistake.

c) signing of archives

ii. What to improve starting from GNU Arch 1.x?

b) there are a lot of tla commands, and they are long and quite difficult to remember (for example: "make-archive" and "archive-mirror" are very descriptive, but I think they would be easier to remember if they were made uniform: like "make-archive" and "mirror-archive", or "archive-make" and "archive-mirror"); a complex UI raises the entry barrier for non-developers or people in a hurry -- and if you aim for the corporate world this is not a secondary matter

c) external module configuration could be worked on to be more user friendly

d) probably a minor issue, but I have always wondered why, when publishing to an FTP site, the username and password couldn't be asked for interactively when not provided on the command line, instead of always having to be passed as part of the URI -- with obvious problems for whoever has a username containing a character like "@".

iii. Things new / things particularly important

a) low bandwidth usage when updating / getting a snapshot of the repository. You may think that almost everybody has ADSL nowadays, and it may even be true in the U.S.; but without even mentioning third-world countries, you can simply come to Italy -- a fairly rich nation -- and see how many people still surf the web with a 56 kbps modem.

b) a plugin system; maybe Arch 2 should become little more than a specification and a framework, in the sense that gstreamer is; there could be plugins for a plethora of things, ranging from merging algorithms to supported media / transport protocols; and maybe even a plugin for special treatment of certain MIME types -- for example, imagine an .odt OpenOffice.org document: it's mostly a bunch of XML files compressed into a single archive.
Instead of doing a binary diff of it, knowing its ``semantics'' we could diff just its plain-text contents

c) using XML for manifest files and for storing other Arch metadata could make it easier to extend the archive format over time without breaking compatibility, and would probably also help when ``web viewers'' of the repository are developed (which is a must-have by now)

d) I know that C gives higher portability, but personally I think that an object-oriented approach could help in engineering and maintaining Arch 2 over time. IMHO, nowadays C++ is quite portable, if you don't go ``too exotic'' with templates and such :-) -- and it's fast.

e) it should be possible (not compulsory) to include the full history of a branch in a repository when merging; otherwise it would be difficult to revert a particular changeset at some time in the future, once the merge has taken place. This can be important if you fully exploit Arch's decentralization idea and have a ``tree'' structure of contributors. So, say that instead of:

+-------------------------+
|       FOOBAR(tm)        |
| ``Official'' repository |
+-------------------------+
  ^      ^      ^        ^
  |      |      |        |
 Adam   Bob   Clara   Demetrio

you've got for example:

+-------------------------+
|       FOOBAR(tm)        |
| ``Official'' repository |
+-------------------------+
  ^             ^
  |             |
 LNHC <--.     HKL <-----.
  ^      |      ^        |
  |      |      |        |
 Adam   Bob   Clara   Demetrio

Now say that Adam added a feature to the product FooBar(tm) that HongKongLuna doesn't want. So HKL tells Demetrio to revert it. However, LuNoHoCo has implemented twenty new features at once, thanks to the combined work of Adam and Bob, and these were merged into the official repository just before a new FooBar(tm) version release. So, how can Demetrio revert the commit that merged the unwanted feature, without asking LNHC?

iv. Minor things / maybe-not-so-useful items:

a) say you have a repository that is a ``library'' of code: you and your mates have committed a lot of prototypes to it.
You may want to find a particular function or keyword inside it. Although grepping is always possible, giving people a way to associate keywords or semantics with files could be interesting: e.g. you may want to search the whole KDE repository for ``functions written in C++ that have to do with network sockets''. This kind of feature seems to be much hyped these days, with a lot of talk going on about things like Google Desktop Search, Apple's Spotlight and Beagle.

b) a way to automatically advertise changes. Probably this could be achieved by a third-party plugin. Think about Arch 2 offering an RSS feed of commit logs when asked, or feeding ``dot'' a graph showing how known branches relate. Although the actual changes wouldn't be automatically committed to a central archive, it could be interesting for, e.g., a company to keep an eye on what employees are working on at the moment. This lowers the chance of duplicated work. I know there's nothing better than ``human communication'' for this, for example on a mailing list, but a way to automate this, if so desired, could be a useful extension.

c) a GUI isn't necessarily a bad thing. I know quite a lot of users who prefer Subversion over the alternatives because it has TortoiseSVN, or who love Cervisia for CVS, and so on. I'm not saying it should be the core of the application, but Arch 2 should be designed in a way that makes it reasonably easy to write a GUI on top of it.

d) no problem with filenames beginning with ",," or "++", but filenames beginning with "=" screw up my bash-completion thingie

e) is there a way to reconcile p2p technologies with an RCS? Does it make sense? Mmmh... probably not :-). Although if you have a large number of people updating their snapshots at the same time, as may happen with the Linux kernel sources, something bittorrent-like wouldn't be totally unjustified.

Sorry if this was difficult to read... English isn't my mother tongue and I only studied it from schoolbooks.
:-|

Cheers,
- -- 
Matteo Settenvini
FSF Associated Member
Email : [EMAIL PROTECTED]

- -----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCS d--(-) s+:- a-- C++ UL+++ P?>++ L+++>$ E+>+++ W+++ N++ o? w---
O- M++ PS++ PE- Y+>++ PGP+++ t+ 5 X- R tv-- b+++ DI+ D++ G++ e h+ r-- y?
- ------END GEEK CODE BLOCK------
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFESUYLUDehq0srSdYRApt1AJwLL2VY3LJ23oIUatQ8ezVWbxWTLACgjLKt
Pwf+RM93odN07vACkn4Vcso=
=JQVh
-----END PGP SIGNATURE-----

_______________________________________________
Gnu-arch-users mailing list
Gnu-arch-users@gnu.org
http://lists.gnu.org/mailman/listinfo/gnu-arch-users

GNU arch home page:
http://savannah.gnu.org/projects/gnu-arch/
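
To make remark iii. b) of the message a bit more concrete (diffing the plain-text contents of an .odt document instead of its compressed bytes), here is a minimal Python sketch. It is only an illustration, not anything that exists in Arch or tla: the function names (make_fake_odt, text_lines, semantic_diff) are hypothetical, and the test document is a simplified stand-in that only mimics the fact that an .odt file is a zip archive whose text lives in a member called content.xml. A real plugin would have to deal with styles, embedded objects and so on.

```python
import difflib
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def make_fake_odt(paragraphs):
    """Build a tiny in-memory zip that mimics an .odt (assumption: simplified XML)."""
    body = "".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("content.xml", f"<doc>{body}</doc>")
    return buf.getvalue()


def text_lines(odt_bytes):
    """Return the text of content.xml, one entry per non-empty text node."""
    with zipfile.ZipFile(io.BytesIO(odt_bytes)) as zf:
        root = ET.fromstring(zf.read("content.xml"))
    return [t.strip() for t in root.itertext() if t.strip()]


def semantic_diff(old_bytes, new_bytes):
    """Unified diff of the extracted text rather than of the compressed bytes."""
    return list(difflib.unified_diff(
        text_lines(old_bytes), text_lines(new_bytes),
        fromfile="old", tofile="new", lineterm=""))


old = make_fake_odt(["Introduction", "First draft of the report."])
new = make_fake_odt(["Introduction", "Second draft of the report."])
for line in semantic_diff(old, new):
    print(line)
```

The point of the design is that a one-sentence edit shows up as a one-line textual change, whereas the two compressed archives may differ in most of their bytes; a changeset built this way should also be smaller, which ties in with the low-bandwidth remark in iii. a).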