Re: Apache sub-projects

2009-08-20 Thread Chris Anderson
On Tue, Aug 18, 2009 at 2:55 AM, Jan Lehnardtj...@apache.org wrote:

 On 18 Aug 2009, at 09:10, Bernd Fondermann wrote:

 On Tue, Aug 18, 2009 at 02:38, Jan Lehnardtj...@apache.org wrote:

 Hi Paul,

 <snip/>

 Related:
  - Do we want to foster plugins, extensions and other infrastructure
 software or do we want to rely on the non-CouchDB open source world to
 come up with them?

 I think that's the real question: what software does the CouchDB
 project want to provide to its users as (a) product(s)?
 Once you have the answer to that, you can still decide how to organize
 it and what to call it.

 For example, Apache Lucene provides Lucene (a programming framework)
 and Solr (a ready-to-go server). Both are complementary.
 More importantly, they feed each other new features. This ain't no
 one-way street.

 Thanks, that's what I was trying to say.


 If there is software which is really important in the CouchDB
 ecosystem, I'd try to take them (code + people) onboard.

 +1

Thanks for all the feedback. I'm still digesting it and waiting to
hear what a few more of the committers think. I think this discussion
hasn't swayed my opinion about bringing on the sub-projects, but it
does raise interesting questions about the CouchDB-Lounge -> Erlang
Lounge transformation (which are independent of the sub-project
question, but still interesting...)

Chris


 Cheers
 Jan
 --





-- 
Chris Anderson
http://jchrisa.net
http://couch.io


Re: Apache sub-projects

2009-08-17 Thread Paul Davis
Sub-projects are a bad idea.

Been following this thread for a while without being able to put my
concerns into small sentences. Each time I think about it I think
about how rapidly CouchDB is growing and how much that would hurt
sub-projects that are trying to keep up. And as others have said, we
should make sure that CouchDB doesn't turn into a namespace for
sub-projects.

Personally, I think CouchApp should be very frightened of becoming an
ASF project of any sort. My guess is that the stability/agility
trade-off is just too serious. I think adding a CouchApp page on
http://couchdb.apache.org would be good, but adding CouchApp traffic
to the bug tracker or mailing lists would make me want to throw twice
as much stuff.

couchdb-lounge should never be a sub-project. Implementing it in
Erlang is going to touch more bits than most people consider. It'll
end up being unavoidable to have it as part of the default
distribution. Trying to pull in the entire project as it is and then
replace it piece by piece is not going to work. We should keep it like
it is, a reference implementation that we hope to achieve in the
default CouchDB distribution.

The only project I could even consider being a sub-project is
couchdb-lucene. Though for roughly the same reasons as CouchApp I'd
probably rather see it as a separate project and just include it on
our website ecosystem. Both projects are public-api compatible and as
others have stated, they'll either stay compatible or die.

And with all that, we're an Alpha (or Beta), pre-1.0 software project.
Now is not the time for adding bureaucracy to release procedures. We
should be focused on removing obstacles to making good software and
adding sub-projects seems like a good way to cause us a crap load of
pain in the future.

Faster not slower.

Paul Davis

On Fri, Aug 14, 2009 at 2:52 AM, Chris Andersonjch...@apache.org wrote:
 Many Apache projects have sub-projects, for two good examples see:

 http://hadoop.apache.org/ which has 9 sub-projects

 http://lucene.apache.org/ which has 10

 I think one benefit of having sub-projects is broadening the
 community. I think it also helps to give people looking at CouchDB for
 the first time an easier way to see some of the really cool tools and
 libraries it offers.

 Also, I think it sounds relaxing. Being able to keep an eye on more
 of the Apache-licensed CouchDB ecosystem in one repository I think
 will result in stronger code.

 I'd like to see a few projects out there become sub-projects, and
 maybe there are others we should include as well. Here's a list of 3:

 The CouchDB-Lounge project provides CouchDB clustering via a smart
 HTTP proxy. I can see bringing that code in, and using it as a
 scaffold for our Erlang clustering infrastructure. If we do it right,
 deployments will have wide flexibility over which tools to use to
 scale CouchDB, being able to mix, say, CouchDB-Lounge's consistent
 hashing nginx-proxy for document CRUD with an Erlang view merger or
 another cluster-optimized view engine. If someone is already a heavy
 nginx shop, but doesn't want to merge views in Twisted Python, they
 could see benefits to a mix-and-match architecture.

 Informally I asked Kevin Ferguson of CouchDB-Lounge if they'd be
 interested and he said it sounds great.

 CouchApp is a set of scripts to make deploying CouchDB design
 documents easy. I've been involved in it for a while, and Benoit has
 put a lot of time into it. The tool and the JavaScript framework it
 goes with are starting to have a community, and should gain more
 interest when the O'Reilly book goes to press. Benoit Chesneau is
 excited about bringing CouchApp into the CouchDB project.

 CouchDB-Lucene is another good candidate. I haven't asked Robert
 Newson yet what he thinks about it, but I think the project would be a
 good fit.

 There may be more candidates I'm missing, or maybe people will think
 I'm batty for having the idea in the first place... comments welcome.

 Cheers,
 Chris


 --
 Chris Anderson
 http://jchrisa.net
 http://couch.io



Re: Apache sub-projects

2009-08-17 Thread Jan Lehnardt

Hi Paul,

good points!

The questions boil down to:
 - Is there a notion of core CouchDB that doesn't have cluster features?
 - Do we want to ship whatever release (cluster or not or both) of
   CouchDB with a small ecosystem (Futon, CouchApp)?


Related:
 - Do we want to foster plugins, extensions and other infrastructure
   software or do we want to rely on the non-CouchDB open source world
   to come up with them?


--

My take:

 - Is there a notion of core CouchDB that doesn't have cluster features?


My understanding of CouchDB is it being a toolkit for users to build  
their flavour of a distributed database with all the necessary bits in  
place. CouchDB 0.9 and couchdb-lounge are a perfect example of that.


My understanding is also that we want an official distribution to  
include couchdb-lounge style clustering. There's two ways to go about  
it: 1) put it all into the source tree and disable and enable features  
on build time (--enable-cluster) or 2) have separate trees (e.g. a  
core and a cluster add on) that can be used to create two releases  
(packaging time): couchdb-1.0.tar.gz and couchdb-with-cluster.tar.gz


Approach 2) could be generalized for other packaging-time plugins:
couchdb-with-lucene-1.0.tar.gz (including couchdb-lucene :) or
couchdb-no-futon-1.0.tar.gz. We'd need to decide which and how many of
these distributions the PMC wants to release.



 - Do we want to ship whatever release (cluster or not or both) of
   CouchDB with a small ecosystem (Futon, CouchApp)?


Futon is already in core, sub-projecting it would be merely done to  
attract more Futon developers. Maybe that can be achieved in other  
ways, too.


I think CouchApp should be a packaging-time part of CouchDB and  
installed by default. This would add Python as a build dependency.  
Maybe we can make ./configure smart enough to not install CouchApp and  
give a warning when the necessary dependencies are not met. Maybe we  
just add to the final `make install` output Now go and install  
CouchApp from 
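
The "warn instead of install when dependencies are missing" idea could be sketched roughly like this. This is only an illustration in Python, not actual CouchDB build code: the function names and the assumption that the only required executable is `python` are hypothetical.

```python
import shutil


def missing_dependencies(required, which=shutil.which):
    """Return the required executables that cannot be found on PATH."""
    return [tool for tool in required if which(tool) is None]


def plan_couchapp_install(required=("python",), which=shutil.which):
    """Decide whether to install the optional component or only warn.

    `required` is a hypothetical dependency list chosen for illustration.
    """
    missing = missing_dependencies(required, which)
    if missing:
        return False, "warning: skipping CouchApp, missing: " + ", ".join(missing)
    return True, "CouchApp will be installed"


install, message = plan_couchapp_install()
print(message)
```

The lookup function is passed in as a parameter so the decision can be exercised without touching the real PATH.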



 - Do we want to foster plugins, extensions and other infrastructure
   software or do we want to rely on the non-CouchDB open source world
   to come up with them?


I'd like to see a place like the Firefox extension development center  
but for CouchDB plugins hosted on http://couchdb.apache.org.


--

Cheers
Jan
--



On 17 Aug 2009, at 08:12, Paul Davis wrote:


Sub-projects are a bad idea.

Been following this thread for a while without being able to put my
concerns into small sentences. Each time I think about it I think
about how rapidly CouchDB is growing and how much that would hurt
sub-projects that are trying to keep up. And as others have said, we
should make sure that CouchDB doesn't turn into a namespace for
sub-projects.

Personally, I think CouchApp should be very frightened of becoming an
ASF project of any sort. My guess is that the stability/agility
trade-off is just too serious. I think adding a CouchApp page on
http://couchdb.apache.org would be good, but adding CouchApp traffic
to the bug tracker or mailing lists would make me want to throw twice
as much stuff.

couchdb-lounge should never be a sub-project. Implementing it in
Erlang is going to touch more bits than most people consider. It'll
end up being unavoidable to have it as part of the default
distribution. Trying to pull in the entire project as it is and then
replace it piece by piece is not going to work. We should keep it like
it is, a reference implementation that we hope to achieve in the
default CouchDB distribution.

The only project I could even consider being a sub-project is
couchdb-lucene. Though for roughly the same reasons as CouchApp I'd
probably rather see it as a separate project and just include it on
our website ecosystem. Both projects are public-api compatible and as
others have stated, they'll either stay compatible or die.

And with all that, we're an Alpha (or Beta), pre-1.0 software project.
Now is not the time for adding bureaucracy to release procedures. We
should be focused on removing obstacles to making good software and
adding sub-projects seems like a good way to cause us a crap load of
pain in the future.

Faster not slower.

Paul Davis

On Fri, Aug 14, 2009 at 2:52 AM, Chris Andersonjch...@apache.org wrote:

Many Apache projects have sub-projects, for two good examples see:

http://hadoop.apache.org/ which has 9 sub-projects

http://lucene.apache.org/ which has 10

I think one benefit of having sub-projects is broadening the
community. I think it also helps to give people looking at CouchDB for
the first time an easier way to see some of the really cool tools and
libraries it offers.

Also, I think it sounds relaxing. Being able to keep an eye on more
of the Apache-licensed CouchDB ecosystem in one repository I think
will result in stronger code.

I'd like to see a few projects out there become sub-projects, and
maybe there are others we should include as well. Here's a list of 

Re: Apache sub-projects

2009-08-17 Thread Paul Davis
 My understanding is also that we want an official distribution to include
 couchdb-lounge style clustering. There's two ways to go about it: 1) put it
 all into the source tree and disable and enable features on build time
 (--enable-cluster) or 2) have separate trees (e.g. a core and a cluster add
 on) that can be used to create two releases (packaging time):
 couchdb-1.0.tar.gz and couchdb-with-cluster.tar.gz


Erlang should allow us the flexibility to make clustering a runtime
configuration. As you state, it can be thought of as a toolbox for
creating db environments. If people want to shed bits of the toolbox
to target a phone, then I think that's more of an end user concern.

  - Do we want to ship whatever release (cluster or not or both) of CouchDB
 with a small ecosystem (Futon, CouchApp)?

 Futon is already in core, sub-projecting it would be merely done to
 attract more Futon developers. Maybe that can be achieved in other ways,
 too.

I think making it easier for developers to start hacking on Futon
would be pretty awesome. Though I'd put it more in the realm of us
being creative instead of creating a full on sub project for it.

 I think CouchApp should be a packaging-time part of CouchDB and installed by
 default. This would add Python as a build dependency. Maybe we can make
 ./configure smart enough to not install CouchApp and give a warning when the
 necessary dependencies are not met. Maybe we just add to the final `make
 install` output Now go and install CouchApp from 

It'd take a lot of convincing for me to be supportive of having
couchapp in an apache-couchdb tarball. Especially when `sudo
easy_install couchapp` is handy.

  - Do we want to foster plugins, extensions and other infrastructure
 software or do we want to rely on the non CouchDB open source world to come
 up with them?

 I'd like to see a place like the Firefox extension development center but
 for CouchDB plugins hosted on http://couchdb.apache.org.

I think having a place for plugins would be great. Though I tend to
wonder what type of requirements there would be if we hosted plugins
at the ASF. Somehow I'd think that requiring a signed legal document
on file would stifle widespread growth of the community.

Paul Davis


Re: Apache sub-projects

2009-08-17 Thread Paul Davis
On Mon, Aug 17, 2009 at 9:54 PM, Jan Lehnardtj...@apache.org wrote:

 On 18 Aug 2009, at 03:13, Paul Davis wrote:

 My understanding is also that we want an official distribution to include
 couchdb-lounge style clustering. There's two ways to go about it: 1) put
 it
 all into the source tree and disable and enable features on build time
 (--enable-cluster) or 2) have separate trees (e.g. a core and a cluster
 add
 on) that can be used to create two releases (packaging time):
 couchdb-1.0.tar.gz and couchdb-with-cluster.tar.gz


 Erlang should allow us the flexibility to make clustering a runtime
 configuration. As you state, it can be thought of as a toolbox for
 creating db environments. If people want to shed bits of the toolbox
 to target a phone, then I think that's more of an end user concern.

 Or does the PMC want to publish couchdb-mobile.tar.gz?


  - Do we want to ship whatever release (cluster or not or both) of
 CouchDB
 with a small ecosystem (Futon, CouchApp)?

 Futon is already in core, sub-projecting it would be merely done to
 attract more Futon developers. Maybe that can be achieved in other ways,
 too.

 I think making it easier for developers to start hacking on Futon
 would be pretty awesome. Though I'd put it more in the realm of us
 being creative instead of creating a full on sub project for it.

 That might include confusion about what it means to be a full-on sub-project.

 IMHO that is just moving source files to couchdb/futon/trunk/... and setting
 up svn externals (guesswork) or symlinks for couchdb proper accordingly.


Well, I'm still operating under the "It's not nearly time for full-on
subprojects" mental model. Making Futon more hackable by people
unfamiliar with the code base shouldn't require sub-project status. It
should just be easier. Granted I haven't the slightest on what that
might entail.


 I think CouchApp should be a packaging-time part of CouchDB and installed
 by
 default. This would add Python as a build dependency. Maybe we can make
 ./configure smart enough to not install CouchApp and give a warning when
 the
 necessary dependencies are not met. Maybe we just add to the final `make
 install` output Now go and install CouchApp from 

 It'd take a lot of convincing for me to be supportive of having
 couchapp in an apache-couchdb tarball. Especially when `sudo
 easy_install couchapp` is handy.

 Imagine a CouchDB distribution where a user has everything ready to go
 instead of having to figure out what bits and pieces are needed next.
 There's a long way to go with documentation and notices in the right
 places (like after make install), but e.g. package manager users don't
 get to see these.


I suppose this depends on your view of whether CouchApp is 'required'
for using CouchDB. I place it in the tool category and as such don't
really see it as appropriate for a release tarball. If it were a
single file that ran on versions of Python back to say 2.4 and was
conditionally installed as part of the build process, then I'd be more
willing.



  - Do we want to foster plugins, extensions and other infrastructure
 software or do we want to rely on the non CouchDB open source world to
 come
 up with them?

 I'd like to see a place like the Firefox extension development center but
 for CouchDB plugins hosted on http://couchdb.apache.org.

 I think having a place for plugins would be great. Though I tend to
 wonder what type of requirements there would be if we hosted plugins
 at the ASF. Somehow I'd think that requiring a signed legal document
 on file would stifle widespread growth of the community.

 I agree. The actual code could be distributed from some other place,
 I'm not suggesting right away that all this has to be under the ASF/CLA
 umbrella. Although clear steps on how to get in should be
 documented.


Paul Davis


Re: Apache sub-projects

2009-08-16 Thread J Aaron Farr
On Fri 14 Aug 2009 21:50, Simon Metson simonmet...@googlemail.com wrote:

 For example, a clear warning sign is when you start giving out
 commit rights to only certain subprojects.

 I don't understand this point.

 Why would any sub-project NOT have commit rights?

 I read this as committer A is given commit for sub-project 1, but not
 the core project or sub-project 2 e.g. you want someone committing to
 a sub-project to be involved in the whole project not just their
 corner, and have one community not fragmented communities around the
 sub-projects...

This is exactly what I meant.

Now, it doesn't mean you _can't_ do this.  In fact, some projects have
different commit groups for documentation than for code.  But in
general, it's almost always better to make life simpler and have one
commit group.

Again, I don't want to tell you how to run CouchDB.  I just wanted to
give a little friendly insight.

-- 
   J. Aaron Farr
   馮傑仁
   www.cubiclemuses.com


Re: Apache sub-projects

2009-08-16 Thread Jan Lehnardt

Hi dev@,

Chris, thanks for bobsledding this :)

--

I'd welcome all three projects as sub-projects. I'd also throw in Futon
to become a sub project, too. My understanding of a sub-project
is this:

 - The development community between the main project and the
sub project should be overlapping significantly.

 - Everybody in the main project's committer list gets to commit
to all projects.

 - The main project's PMC oversees releases of all projects.

 - Technically, a new release of the main project should be followed
by compatible releases of sub projects.

I think we're at a scale where we can keep this all in the CouchDB
project.

My experience with branching out sub-projects into separate groups
comes from the PHP community (no flames please) where extensions
to PHP were divided into core extensions and PECL extensions.
PECL being an extension development and distribution project
independent of the PHP core. Core extensions are maintained and
kept up to date with the core project, PECL extensions have their
own schedule and the individual authors are responsible for updates
in time for new PHP releases. This structure was introduced with
over 100 extensions being in core PHP and way over 1000
committers to the project (one of the cases where documenters got
a separate VCS module to work with). This proved to be good for
the timely release of PHP releases as core developers have
significantly less extensions to take care of.

In a project as big as PHP, this makes sense and was needed
badly. CouchDB is not that big yet. I'd say let's figure out how
to incorporate sub projects into our subversion tree (e.g.
couchdb/trunk is couchdb-trunk, where do sub projects go?
couchdb/projects/...?) and how to make releases (separate or
bundled with CouchDB, Futon e.g. could be bundled and
couchdb-lucene a separate package) and worry about splitting
out independent sub projects when the need arises.

Rehashing the above points for the proposed projects:

Futon:

 - The developers are currently the CouchDB committers and we
   got contributions small and big from users through patches.

 - A future Futon-only developer doesn't necessarily have to know
   any Erlang and I'd trust her to not touch that part of the code
   until everybody feels comfortable with this. (Besides, Erlang is
   so simple on the couch_http_*-end that small fixes in the HTTP
   API could be done without much hassle). And we're using SVN,
   if any unwanted changes, accidental or deliberate are introduced,
   we'll revert them.

 - Futon always has to be current with CouchDB, even through
   trunk-only development phases. Splitting this out to a separate
   schedule doesn't make much sense to me.

 - Working on Futon is a great way for new developers to get into
   CouchDB via simpler, more familiar means (web development).


couchdb-lucene:

 - CouchDB greatly benefits from an officially endorsed Lucene
   integration plugin. This is prone to bikeshedding and
   unnecessary for users to implement all over again and again.

 - A future couchdb-lucene developer doesn't necessarily have to
   know any Erlang and I'd trust her to not touch that part of the
   code until everybody feels comfortable with this.

 - The CouchDB PMC doesn't have much love for Java. We'd
   need to overcome our differences to be able to support
   releases. I'm confident we can do this.

 - The plugin API is relatively stable, but improvements on the
   CouchDB end are ongoing. Having stable releases alongside
   CouchDB releases are desirable. However, technically, we
   are not bound to hold a CouchDB release if couchdb-lucene
   is not ready yet, we can always make a later release. Our
   users will apply the necessary pressure if they need synched
   releases. couchdb-lucene wouldn't have to be part of the standard
   CouchDB release tarball. It could however (`./configure
   --with-couchdb-lucene`) in which case a CouchDB release
   could be blocked by a couchdb-lucene release. The PMC
   has to decide on the route. To get going, I propose to start
   with separated release tarballs.

 - Working on couchdb-lucene is a great way for new developers
   to get into CouchDB via more familiar means (Java).


couchdb-lounge:

 (see couchdb-lucene =~ s/ucene/ounge/g)

 - Porting couchdb-lounge to Erlang and obsoleting the plugin
   step-by-step sounds like a good solution.


CouchApp:

  (see Futon =~ s/Futon/CouchApp/g).


In addition: The CouchApp and couchdb-lucene developers
(Benoit and Robert) have been contributing to CouchDB via
patches in the past and I'd be confident to have them as
CouchDB committers anyway.

--

Next steps:

 - Figure out how to host sub-projects both bundled (Futon)
   and unbundled (couchdb-lucene) and how to accommodate
   them in SVN and the build system (if applicable).

 - Prepare PMC votes to integrate the sources.

 - On success, prepare IP-clearance through the incubator.

 - PMC vote on accepting the existing committers as CouchDB
 

Re: Apache sub-projects

2009-08-16 Thread Dirkjan Ochtman
On Sun, Aug 16, 2009 at 18:21, Jan Lehnardtj...@apache.org wrote:
 In a project as big as PHP, this makes sense and was needed
 badly. CouchDB is not that big yet. I'd say let's figure out how
 to incorporate sub projects into our subversion tree (e.g.
 couchdb/trunk is couchdb-trunk, where do sub projects go?
 couchdb/projects/...?) and how to make releases (separate or
 bundled with CouchDB, Futon e.g. could be bundled and
 couchdb-lucene a separate package) and worry about splitting
 out independent sub projects when the need arises.

As someone who is involved in a lot of repository migrations (to
DVCS), please don't do any weird layouts. If you're going to do
trunk/tags/branches, put every project on the same level and t/b/t in
each of them. For source code, facilitating your way out of a specific
system is a good idea.
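
To make the suggested layout concrete, here is a small Python sketch that lists the paths it implies. The project names are taken from this thread purely as examples; the helper name and the idea of listing paths this way are assumptions for illustration, not an agreed repository structure.

```python
def svn_layout(projects, dirs=("trunk", "tags", "branches")):
    """List repository paths with every project at the same level,
    each carrying its own trunk/tags/branches."""
    return [f"{project}/{d}" for project in projects for d in dirs]


# Example project names only; the real set is still under discussion.
for path in svn_layout(["couchdb", "futon", "couchdb-lucene"]):
    print(path)
```

Because no project nests inside another, each one can later be converted or moved on its own.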

 Futon:

  - The developers are currently the CouchDB committers and we
   got contributions small and big from users through patches.

  - A future Futon-only developer doesn't necessarily have to know
   any Erlang and I'd trust her to not touch that part of the code
   until everybody feels comfortable with this. (Besides, Erlang is
   so simple on the couch_http_*-end that small fixes in the HTTP
   API could be done without much hassle). And we're using SVN,
   if any unwanted changes, accidental or deliberate are introduced,
   we'll revert them.

  - Futon always has to be current with CouchDB, even through
   trunk-only development phases. Splitting this out to a separate
   schedule doesn't make much sense to me.

  - Working on Futon is a great way for new developers to get into
   CouchDB via simpler, more familiar means (web development).

Having Futon as a somewhat separate project sounds interesting. I've
been porting another site to CouchDB over the past week, and I've been
having some ideas (but putting up a CouchDB snapshot has prevented me
from hacking on it so far -- making that easier would be
interesting...). A major detriment to my last Futon patch was also the
fact that it sat untouched in JIRA for months, which wasn't very
motivating. (It eventually got applied, but without mentioning so on
the bug, so I didn't even know about that... Kind of a pity, that.)

Relatedly, I wonder if couchdb-python would also be a good fit for
being a subproject. Note that both Futon and couchdb-python have
suffered a bit in the past from the fact that the primary author has
had little time to deal with patches, bugs, etc. That happens to
everyone, but it would be nice if we could prevent the projects from
stalling as much as possible.

Cheers,

Dirkjan


Re: Apache sub-projects

2009-08-16 Thread Jan Lehnardt


On 16 Aug 2009, at 22:11, Dirkjan Ochtman wrote:


On Sun, Aug 16, 2009 at 18:21, Jan Lehnardtj...@apache.org wrote:

In a project as big as PHP, this makes sense and was needed
badly. CouchDB is not that big yet. I'd say let's figure out how
to incorporate sub projects into our subversion tree (e.g.
couchdb/trunk is couchdb-trunk, where do sub projects go?
couchdb/projects/...?) and how to make releases (separate or
bundled with CouchDB, Futon e.g. could be bundled and
couchdb-lucene a separate package) and worry about splitting
out independent sub projects when the need arises.


As someone who is involved in a lot of repository migrations (to
DVCS), please don't do any weird layouts. If you're going to do
trunk/tags/branches, put every project on the same level and t/b/t in
each of them. For source code, facilitating your way out of a specific
system is a good idea.


That'd mean reorganising SVN, but I'd be okay with that unless
somebody can point out significant drawbacks.



Futon:

 - The developers are currently the CouchDB committers and we
  got contributions small and big from users through patches.

 - A future Futon-only developer doesn't necessarily have to know
  any Erlang and I'd trust her to not touch that part of the code
  until everybody feels comfortable with this. (Besides, Erlang is
  so simple on the couch_http_*-end that small fixes in the HTTP
  API could be done without much hassle). And we're using SVN,
  if any unwanted changes, accidental or deliberate are introduced,
  we'll revert them.

 - Futon always has to be current with CouchDB, even through
  trunk-only development phases. Splitting this out to a separate
  schedule doesn't make much sense to me.

 - Working on Futon is a great way for new developers to get into
  CouchDB via simpler, more familiar means (web development).


Having Futon as a somewhat separate project sounds interesting. I've
been porting another site to CouchDB over the past week, and I've been
having some ideas (but putting up a CouchDB snapshot has prevented me
from hacking on it so far -- making that easier would be
interesting...). A major detriment to my last Futon patch was also the
fact that it sat untouched in JIRA for months, which wasn't very
motivating. (It eventually got applied, but without mentioning so on
the bug, so I didn't even know about that... Kind of a pity, that.)


I'm sorry to hear, we need to be more careful. Sometimes something
gets patched that corresponds to a ticket that the committer is not
aware of.



Relatedly, I wonder if couchdb-python would also be a good fit for
being a subproject. Note that both Futon and couchdb-python have
suffered a bit in the past from the fact that the primary author has
had little time to deal with patches, bugs, etc. That happens to
everyone, but it would be nice if we could prevent the projects from
stalling as much as possible.


I'd be -1 on client libraries becoming sub-projects. At least for now
where there's still significant styles being worked out. Down the road,
I can see that being useful.

That said, couchdb-python could do with a larger, more active
community. E.g. Benoit's couchdbkit* kicks couchdb-python's
online presence's butt :) Not blaming anyone, just trying to
encourage :)

* http://couchdbkit.org/

Cheers
Jan
--



Re: Apache sub-projects

2009-08-16 Thread Noah Slater
On Sun, Aug 16, 2009 at 10:23:03PM +0200, Jan Lehnardt wrote:
 I'd be -1 on client libraries becoming sub-projects. At least for now
 where there's still significant styles being worked out. Down the road,
 I can see that being useful.

I am similarly -1 on this issue for now.

-- 
Noah Slater, http://tumbolia.org/nslater


Re: Apache sub-projects

2009-08-16 Thread Adam Kocoloski

On Aug 16, 2009, at 12:21 PM, Jan Lehnardt wrote:


Chris, thanks for bobsledding this :)


Jan, great summary.  Actually, this has been a well thought out  
discussion by all involved. It's great to see such a vibrant community!


I have some small reservations about making couchdb-lounge a  
subproject.  What happens if it becomes completely obsolete?  The  
native Erlang version doesn't make too much sense as a subproject in  
my opinion.  Also, I don't know how the PMC would handle  
committership, since Shaun and Kevin are not quite so plugged into the  
community as Benoit and Robert.


Best, Adam


Apache sub-projects

2009-08-14 Thread Chris Anderson
Many Apache projects have sub-projects, for two good examples see:

http://hadoop.apache.org/ which has 9 sub-projects

http://lucene.apache.org/ which has 10

I think one benefit of having sub-projects is broadening the
community. I think it also helps to give people looking at CouchDB for
the first time an easier way to see some of the really cool tools and
libraries it offers.

Also, I think it sounds relaxing. Being able to keep an eye on more
of the Apache-licensed CouchDB ecosystem in one repository I think
will result in stronger code.

I'd like to see a few projects out there become sub-projects, and
maybe there are others we should include as well. Here's a list of 3:

The CouchDB-Lounge project provides CouchDB clustering via a smart
HTTP proxy. I can see bringing that code in, and using it as a
scaffold for our Erlang clustering infrastructure. If we do it right,
deployments will have wide flexibility over which tools to use to
scale CouchDB, being able to mix, say, CouchDB-Lounge's consistent
hashing nginx-proxy for document CRUD with an Erlang view merger or
another cluster-optimized view engine. If someone is already a heavy
nginx shop, but doesn't want to merge views in Twisted Python, they
could see benefits to a mix-and-match architecture.
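
For readers unfamiliar with consistent hashing, a toy Python sketch of how a proxy might pick a backend for a document ID follows. This is not CouchDB-Lounge's actual code: the class name, the node URLs, the use of SHA-1, and the number of virtual points per node are all assumptions made for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring mapping document IDs to backend nodes."""

    def __init__(self, nodes, replicas=64):
        # Each node gets `replicas` virtual points to even out the spread.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._keys = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.sha1(value.encode("utf-8")).hexdigest(), 16)

    def node_for(self, doc_id):
        # First ring point clockwise from the document's hash, wrapping around.
        index = bisect.bisect(self._keys, self._hash(doc_id)) % len(self._keys)
        return self._ring[index][1]


# Hypothetical backend URLs, for illustration only.
NODES = ["http://couch-a:5984", "http://couch-b:5984", "http://couch-c:5984"]
ring = HashRing(NODES)
print(ring.node_for("some-doc-id"))
```

The useful property is that removing one node only remaps the documents that lived on that node; everything else keeps its placement.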

Informally I asked Kevin Ferguson of CouchDB-Lounge if they'd be
interested and he said it sounds great.

CouchApp is a set of scripts to make deploying CouchDB design
documents easy. I've been involved in it for a while, and Benoit has
put a lot of time into it. The tool and the JavaScript framework it
goes with are starting to have a community, and should gain more
interest when the O'Reilly book goes to press. Benoit Chesneau is
excited about bringing CouchApp into the CouchDB project.

CouchDB-Lucene is another good candidate. I haven't asked Robert
Newson yet what he thinks about it, but I think the project would be a
good fit.

There may be more candidates I'm missing, or maybe people will think
I'm batty for having the idea in the first place... comments welcome.

Cheers,
Chris


-- 
Chris Anderson
http://jchrisa.net
http://couch.io


Re: Apache sub-projects

2009-08-14 Thread J Aaron Farr

On Fri 14 Aug 2009 14:52, Chris Anderson jch...@apache.org wrote:

 I think one benefit of having sub-projects is broadening the
 community. I think it also helps to give people looking at CouchDB for
 the first time an easier way to see some of the really cool tools and
 libraries it offers.

Just to give you a heads up, the ASF has had rather mixed results with
subprojects and we've specifically tried to avoid what's been termed as
umbrella projects.  Jakarta is/was the poster child example.

The issue always comes down to oversight: does the PMC know what's going
on when you have a dozen subprojects?  Is there one community or lots of
little ones?

The idea is that each community deserves to have its own PMC which can
give the code the proper sort of care and when a project gets too big,
then there will be pressure to split.  Now, some of that split can be
rather internal to the ASF.  In theory we could have a single web site
that is the public face to multiple PMCs.

I'm not saying couchdb can't organize itself into subprojects, but I do
want to give you a heads up about the sort of subproject politics you
can bump into down the road.  For example, a clear warning sign is when
you start giving out commit rights to only certain subprojects.


Thanks for your attention, now back to your regularly scheduled hacking...

-- 
   J. Aaron Farr
   馮傑仁
   www.cubiclemuses.com


Re: Apache sub-projects

2009-08-14 Thread Benoit Chesneau
2009/8/14 Chris Anderson jch...@apache.org:
 Many Apache projects have sub-projects; for two good examples, see:

 http://hadoop.apache.org/ which has 9 sub-projects

 http://lucene.apache.org/ which has 10

 I think one benefit of having sub-projects is broadening the
 community. I think it also helps to give people looking at CouchDB for
 the first time an easier way to see some of the really cool tools and
 libraries it offers.

 Also, I think it sounds relaxing. Being able to keep an eye on more
 of the Apache-licensed CouchDB ecosystem in one repository will, I
 think, result in stronger code.

 I'd like to see a few projects out there become sub-projects, and
 maybe there are others we should include as well. Here's a list of 3:

 The CouchDB-Lounge project provides CouchDB clustering via a smart
 HTTP proxy. I can see bringing that code in, and using it as a
 scaffold for our Erlang clustering infrastructure. If we do it right,
 deployments will have wide flexibility over which tools to use to
 scale CouchDB, being able to mix, say, CouchDB-Lounge's consistent
 hashing nginx-proxy for document CRUD, but use an Erlang view merger or
 another cluster-optimized view engine. If someone is already a heavy
 nginx shop, but doesn't want to merge views in twisted python, they
 could see benefits to a mix and match architecture.

 Informally I asked Kevin Ferguson of CouchDB-Lounge if they'd be
 interested and he said it sounds great.

 CouchApp is a set of scripts to make deploying CouchDB design
 documents easy. I've been involved in it for a while, and Benoit has
 put a lot of time into it. The tool and the JavaScript framework it
 goes with are starting to have a community, and should gain more
 interest when the O'Reilly book goes to press. Benoit Chesneau is
 excited about bringing CouchApp into the CouchDB project.

 CouchDB-Lucene is another good candidate. I haven't asked Robert
 Newson yet what he thinks about it, but I think the project would be a
 good fit.

 There may be more candidates I'm missing, or maybe people will think
 I'm batty for having the idea in the first place... comments welcome.

 Cheers,
 Chris


 --
 Chris Anderson
 http://jchrisa.net
 http://couch.io


Having Couchapp as a CouchDB subproject would be great. IMHO, using
Apache infrastructure for bugs and the release process is good for users
and developers. Also, since couchapp and the other projects mentioned
here depend on couchdb and sometimes require changes/evolution
from CouchDB, I think it's good to have all of them together.

So yes, that should be good! :)

-benoit


Re: Apache sub-projects

2009-08-14 Thread Robert Newson
No objections to couchdb-lucene being a subproject at all. I'll let
others chime in on the merit of subprojects in general.

B.

On Fri, Aug 14, 2009 at 12:23 PM, Benoit Chesneaubchesn...@gmail.com wrote:
 2009/8/14 Chris Anderson jch...@apache.org:
 Many Apache projects have sub-projects; for two good examples, see:

 http://hadoop.apache.org/ which has 9 sub-projects

 http://lucene.apache.org/ which has 10

 I think one benefit of having sub-projects is broadening the
 community. I think it also helps to give people looking at CouchDB for
 the first time an easier way to see some of the really cool tools and
 libraries it offers.

 Also, I think it sounds relaxing. Being able to keep an eye on more
 of the Apache-licensed CouchDB ecosystem in one repository will, I
 think, result in stronger code.

 I'd like to see a few projects out there become sub-projects, and
 maybe there are others we should include as well. Here's a list of 3:

 The CouchDB-Lounge project provides CouchDB clustering via a smart
 HTTP proxy. I can see bringing that code in, and using it as a
 scaffold for our Erlang clustering infrastructure. If we do it right,
 deployments will have wide flexibility over which tools to use to
 scale CouchDB, being able to mix, say, CouchDB-Lounge's consistent
 hashing nginx-proxy for document CRUD, but use an Erlang view merger or
 another cluster-optimized view engine. If someone is already a heavy
 nginx shop, but doesn't want to merge views in twisted python, they
 could see benefits to a mix and match architecture.

 Informally I asked Kevin Ferguson of CouchDB-Lounge if they'd be
 interested and he said it sounds great.

 CouchApp is a set of scripts to make deploying CouchDB design
 documents easy. I've been involved in it for a while, and Benoit has
 put a lot of time into it. The tool and the JavaScript framework it
 goes with are starting to have a community, and should gain more
 interest when the O'Reilly book goes to press. Benoit Chesneau is
 excited about bringing CouchApp into the CouchDB project.

 CouchDB-Lucene is another good candidate. I haven't asked Robert
 Newson yet what he thinks about it, but I think the project would be a
 good fit.

 There may be more candidates I'm missing, or maybe people will think
 I'm batty for having the idea in the first place... comments welcome.

 Cheers,
 Chris


 --
 Chris Anderson
 http://jchrisa.net
 http://couch.io


 Having Couchapp as a CouchDB subproject would be great. IMHO, using
 Apache infrastructure for bugs and the release process is good for users
 and developers. Also, since couchapp and the other projects mentioned
 here depend on couchdb and sometimes require changes/evolution
 from CouchDB, I think it's good to have all of them together.

 So yes, that should be good! :)

 -benoit



Re: Apache sub-projects

2009-08-14 Thread Noah Slater
On Thu, Aug 13, 2009 at 11:52:42PM -0700, Chris Anderson wrote:
 The CouchDB-Lounge project provides CouchDB clustering via a smart
 HTTP proxy. I can see bringing that code in, and using it as a
 scaffold for our Erlang clustering infrastructure. If we do it right,
 deployments will have wide flexibility over which tools to use to
 scale CouchDB, being able to mix, say, CouchDB-Lounge's consistent
 hashing nginx-proxy for document CRUD, but use an Erlang view merger or
 another cluster-optimized view engine. If someone is already a heavy
 nginx shop, but doesn't want to merge views in twisted python, they
 could see benefits to a mix and match architecture.

+1

 CouchApp is a set of scripts to make deploying CouchDB design
 documents easy. I've been involved in it for a while, and Benoit has
 put a lot of time into it. The tool and the JavaScript framework it
 goes with are starting to have a community, and should gain more
 interest when the O'Reilly book goes to press. Benoit Chesneau is
 excited about bringing CouchApp into the CouchDB project.

+1

 CouchDB-Lucene is another good candidate. I haven't asked Robert
 Newson yet what he thinks about it, but I think the project would be a
 good fit.

+1

-- 
Noah Slater, http://tumbolia.org/nslater


Re: Apache sub-projects

2009-08-14 Thread Noah Slater
On Fri, Aug 14, 2009 at 03:37:01PM +0800, J Aaron Farr wrote:
 Just to give you a heads up, the ASF has had rather mixed results with
 subprojects and we've specifically tried to avoid what's been termed as
 umbrella projects.  Jakarta is/was the poster child example.

Thanks for the feedback.

I think that we have a lot to learn from Django about selection here:

  http://jacobian.org/writing/what-is-django-contrib/

Just to summarise in quotes:

  contrib packages should be removable.

  anything in contrib needs to be generally accepted as the Right Way to do
  something for a large majority of users.

  contrib packages should solve problems encountered frequently by real-world
  web developers.

  Good contrib packages tackle issues that are not trivial, are bikeshed
  prone, and are difficult to get right for the common case. We want to prevent
  folks from needing to decide among seventeen different session frameworks, for
  example.

  there’s a danger in bringing something into the core: it stifles future
  innovation. As soon as we “bless” a contrib package, we drastically reduce
  impetus to write competing libraries. So, a good contrib package should have
  general consensus, and should be fairly mature.

Would we need to adopt the same thinking?

Would we throw the doors wide open and let any project be a sub-project?

What about competing sub-projects, doing the same thing?

I think we need to discuss this, and get the policy nailed first.

 I'm not saying couchdb can't organize itself into subprojects, but I do
 want to give you a heads up about the sort of subproject politics you
 can bump into down the road.  For example, a clear warning sign is when
 you start giving out commit rights to only certain subprojects.

I don't understand this point.

Why would any sub-project NOT have commit rights?

Or did you mean, only letting certain projects be official sub-projects?

I would have thought the latter is a requirement!

Thanks,

-- 
Noah Slater, http://tumbolia.org/nslater


Re: Apache sub-projects

2009-08-14 Thread Simon Metson

Hi,

For example, a clear warning sign is when you start giving out  
commit rights to only certain subprojects.


I don't understand this point.

Why would any sub-project NOT have commit rights?


I read this as: committer A is given commit for sub-project 1, but not
the core project or sub-project 2. That is, you want someone committing to
a sub-project to be involved in the whole project, not just their
corner, and to have one community, not fragmented communities around the
sub-projects...

Cheers
Simon


Re: Apache sub-projects

2009-08-14 Thread Curt Arnold
Are you seeing these as having substantially different development
communities?  If not, it would be cleaner to have them as distinct
products of the CouchDB project instead of distinct projects.
A lot of the umbrella projects had little dependency between
sub-projects, and a sub-project could take or leave any relationship with
its siblings.  Couchdb-lucene, on the other hand, would have a much
closer dependency on CouchDB than is common in the umbrella projects.


Re: Apache sub-projects

2009-08-14 Thread Noah Slater
On Fri, Aug 14, 2009 at 08:55:27AM -0500, Curt Arnold wrote:
 Are you seeing these as having substantially different development
 communities?  If not, it would be cleaner to have them as distinct
 products of the CouchDB project instead of distinct projects.  A
 lot of the umbrella projects had little dependency between sub-projects,
 and a sub-project could take or leave any relationship with its siblings.
  Couchdb-lucene, on the other hand, would have a much closer dependency on
 CouchDB than is common in the umbrella projects.

All of the projects mentioned have a dependency on CouchDB. They were all
designed with the explicit intention of augmenting CouchDB in some way, and
would make no sense in any other context. So what's the difference between a
project and a product in ASF terms?

Thanks,

-- 
Noah Slater, http://tumbolia.org/nslater


Re: Apache sub-projects

2009-08-14 Thread Robert Newson
It's worth considering that most of the proposed contrib packages use
languages other than Erlang (Python, Java, etc) and that will reduce
the number of developers that are prepared to hack on them.

On a purely selfish note, I would welcome a second person to work on
couchdb-lucene with a view to making it rock solid (I fixed a decent
sized bug today, for example), so anything that helps form a user- and
developer- community around the contrib projects would be welcome.

B.

On Fri, Aug 14, 2009 at 2:59 PM, Noah Slaternsla...@apache.org wrote:
 On Fri, Aug 14, 2009 at 08:55:27AM -0500, Curt Arnold wrote:
 Are you seeing these as having substantially different development
 communities?  If not, it would be cleaner to have them as distinct
 products of the CouchDB project instead of distinct projects.  A
 lot of the umbrella projects had little dependency between sub-projects,
 and a sub-project could take or leave any relationship with its siblings.
  Couchdb-lucene, on the other hand, would have a much closer dependency on
 CouchDB than is common in the umbrella projects.

 All of the projects mentioned have a dependency on CouchDB. They were all
 designed with the explicit intention of augmenting CouchDB in some way, and
 would make no sense in any other context. So what's the difference between a
 project and a product in ASF terms?

 Thanks,

 --
 Noah Slater, http://tumbolia.org/nslater



Re: Apache sub-projects

2009-08-14 Thread Chris Anderson
On Fri, Aug 14, 2009 at 6:43 AM, Noah Slaternsla...@apache.org wrote:
 On Fri, Aug 14, 2009 at 03:37:01PM +0800, J Aaron Farr wrote:
 Just to give you a heads up, the ASF has had rather mixed results with
 subprojects and we've specifically tried to avoid what's been termed as
 umbrella projects.  Jakarta is/was the poster child example.

 Thanks for the feedback.

 I think that we have a lot to learn from Django about selection here:

  http://jacobian.org/writing/what-is-django-contrib/

 Just to summarise in quotes:

  contrib packages should be removable.

  anything in contrib needs to be generally accepted as the Right Way to do
  something for a large majority of users.

  contrib packages should solve problems encountered frequently by real-world
  web developers.

  Good contrib packages tackle issues that are not trivial, are bikeshed
  prone, and are difficult to get right for the common case. We want to prevent
  folks from needing to decide among seventeen different session frameworks, for
  example.

  there’s a danger in bringing something into the core: it stifles future
  innovation. As soon as we “bless” a contrib package, we drastically reduce
  impetus to write competing libraries. So, a good contrib package should have
  general consensus, and should be fairly mature.

 Would we need to adopt the same thinking?

I generally agree with those principles, although I think we can get
away with being informal about them at least at first.

For instance, I don't think it would serve well to make CouchRest or
other client-library code a subproject. If we find that subproject
organization does help a lot, then maybe Futon would make sense as a
subproject, but I'm not in a rush.

I'd be happy to give subproject developers full commit access, and
trust them to know their limits around things like the storage engine,
etc.

As far as calling them projects vs some other word, I'm not really
concerned about labeling, but other Apache projects use the word
sub-project and that seems to make it clear what they mean. Generally
I think getting both the community and the code for a few of these
core projects under one roof seems like the win to me.


 Would we throw the doors wide open and let any project be a sub-project?

 What about competing sub-projects, doing the same thing?

 I think we need to discuss this, and get the policy nailed first.

 I'm not saying couchdb can't organize itself into subprojects, but I do
 want to give you a heads up about the sort of subproject politics you
 can bump into down the road.  For example, a clear warning sign is when
 you start giving out commit rights to only certain subprojects.

 I don't understand this point.

 Why would any sub-project NOT have commit rights?

 Or did you mean, only letting certain projects be official sub-projects?

 I would have thought the latter is a requirement!

 Thanks,

 --
 Noah Slater, http://tumbolia.org/nslater




-- 
Chris Anderson
http://jchrisa.net
http://couch.io


Re: Apache sub-projects

2009-08-14 Thread Noah Slater
On Fri, Aug 14, 2009 at 09:55:34AM -0700, Chris Anderson wrote:
 I generally agree with those principles, although I think we can get
 away with being informal about them at least at first.

Creating a project as a sub-project of Apache CouchDB is a necessarily formal
affair, and one that we can't easily go back on without a lot of pain. Blessing
something as a sub-project will have massive implications for the community
around it, and around CouchDB proper.

While I think there is some obvious value to be found here, like with Django, I
think that we need to think about this seriously before we accept any projects.
If we decide on criteria for inclusion at some future point that exclude
some existing sub-projects because we were in a rush to add them, then we are
going to have a bit of an awkward problem on our hands.

Does the ASF have any guidelines for this?

Is there any ASF lore that we can fall back on for guidance here?

Thanks,

-- 
Noah Slater, http://tumbolia.org/nslater


Re: Apache sub-projects

2009-08-14 Thread Chris Anderson
On Fri, Aug 14, 2009 at 10:19 AM, Noah Slaternsla...@apache.org wrote:
 On Fri, Aug 14, 2009 at 09:55:34AM -0700, Chris Anderson wrote:
 I generally agree with those principles, although I think we can get
 away with being informal about them at least at first.

 Creating a project as a sub-project of Apache CouchDB is a necessarily formal
 affair, and one that we can't easily go back on without a lot of pain. Blessing
 something as a sub-project will have massive implications for the community
 around it, and around CouchDB proper.

 While I think there is some obvious value to be found here, like with Django, I
 think that we need to think about this seriously before we accept any projects.
 If we decide on criteria for inclusion at some future point that exclude
 some existing sub-projects because we were in a rush to add them, then we are
 going to have a bit of an awkward problem on our hands.

 Does the ASF have any guidelines for this?

 Is there any ASF lore that we can fall back on for guidance here?

I think a shared understanding of our values around subprojects (and
more generally) is important. But I'd be quite happy to bring in
sub-projects on an ad-hoc basis.

So often it is a mix of the people, the code, and circumstance that
makes one project stand out for inclusion. Trying to apply the same
reasoning when bringing in all/any sub-projects doesn't seem
necessarily productive. I *do* agree with you that we should talk
about how to make the decision to bring on a sub-project. I just think
that we should treat them on a project by project basis.

I liked the Django quotes you posted. I'm pasting them again here:

==

I think that we have a lot to learn from Django about selection here:

 http://jacobian.org/writing/what-is-django-contrib/

Just to summarise in quotes:

 contrib packages should be removable.

 anything in contrib needs to be generally accepted as the Right Way to do
 something for a large majority of users.

 contrib packages should solve problems encountered frequently by real-world
 web developers.

 Good contrib packages tackle issues that are not trivial, are bikeshed
 prone, and are difficult to get right for the common case. We want to prevent
 folks from needing to decide among seventeen different session frameworks, for
 example.

 there’s a danger in bringing something into the core: it stifles future
 innovation. As soon as we “bless” a contrib package, we drastically reduce
 impetus to write competing libraries. So, a good contrib package should have
 general consensus, and should be fairly mature.

Would we need to adopt the same thinking?

 --
 Noah Slater, http://tumbolia.org/nslater



One thing all the projects we've mentioned have in common is a
mix of languages. I think it will be healthy for CouchDB to have
a few languages represented in the distribution.

Now that I think about it, QueryServers might make a good sub-project.
It would be a good way to get people from the Erlang, Ruby,
Python, etc. communities contributing to CouchDB. Now that we have a
test suite for query servers, we can afford to keep contrib versions of
them in more than one or two languages. I'm not proposing we refactor
this out immediately, but it's the direction I can see sub-projects
going.

I'd love to hear what other people think about sub-projects in
general, or if there are other projects we haven't mentioned that
would make good sub-projects. It would be helpful to have some obvious
"that's not a subproject" candidates as well.

Chris



-- 
Chris Anderson
http://jchrisa.net
http://couch.io


Re: Apache sub-projects

2009-08-14 Thread Will Hartung
On Fri, Aug 14, 2009 at 1:59 PM, Shaun Lindsaysh...@meebo.com wrote:
 I'd be excited to see couchdb-lounge make it in as a subproject, especially
 if the overall goal is to incorporate features from the lounge directly in
 to CouchDB.  Making it a subproject seems like a good intermediate step in
 that direction.

That's a key consideration, I think.

Taking lounge as an example, if Couch intends to offer lounge
capability as a first class service, and, in the end, obsolete
lounge, then the decision needs to be made whether to incorporate and
absorb lounge, or, effectively, compete against lounge.

Making it a 'sub project' leads more towards the former option,
incorporation. If it intends to compete, then, clearly, lounge has no formal
place within the project.

Mind, even if lounge in its current incarnation goes against the way
most folks would want to eventually see lounge-like capability
implemented in couch (i.e. native erlang rather than a bolted on,
external service, or whatever), incorporating the existing project
with the intent towards incorporation helps funnel the energy and more
formally direct the effort.

Now, on a related note, I think there is a down side to sub projects
and incorporation.

Using lounge as an example, I would expect, as a consumer, that when
Couch reaches the next version, whatever or whenever that may be, the
sub projects should follow along. If I download the latest CouchDB, I
should be able to download the matching, tested, approved version of
CouchDB Lounge at the same time.

Specifically, I think once under the umbrella, the project as a whole
is responsible for maintaining and testing these sub projects.

If lounge were maintained outside of the Couch project, for example,
then the fact that Couch .10 or .11 broke lounge isn't a CouchDB
project issue, it's a couch-lounge project maintainers' issue. It is a
CouchDB project issue in terms of customer service, friendliness
to outside developers, etc. But, in the end, the fact that an
external project isn't keeping up isn't an issue for the project as a whole.

Whereas if lounge were a sub project of Couch, I can see that the fact
that lounge didn't work would be a release stopping issue that must be
resolved (minimally considered) before a new Couch release comes out.

Now, I'm not saying any of this is or isn't happening with the
couch-lounge project specifically, I merely use it as a potential
example.

If sub projects don't fall under this kind of formal attention, then I
don't see what value they really have over simply having a list of
links to good projects that work with couch on the wiki or web site.

Regards,

Will Hartung


Re: Apache sub-projects

2009-08-14 Thread Noah Slater
On Fri, Aug 14, 2009 at 03:11:45PM -0700, Will Hartung wrote:
 Taking lounge as an example, if Couch intends to offer lounge
 capability as a first class service, and, in the end, obsolete
 lounge, then the decision needs to be made whether to incorporate and
 absorb lounge, or, effectively, compete against lounge.

 Making it a 'sub project', leads more towards the former incorporate
 option. If it intends to compete, then, clearly, lounge has no formal
 place within the project.

 Mind, even if lounge in its current incarnation goes against the way
 most folks would want to eventually see lounge-like capability
 implemented in couch (i.e. native erlang rather than a bolted on,
 external service, or whatever), incorporating the existing project
 with the intent towards incorporation helps funnel the energy and more
 formally direct the effort.

I think that this might form a nice guiding principle:

  All sub-projects will eventually be merged into CouchDB, or abandoned.

Are there any flaws to this approach?

Thanks,

-- 
Noah Slater, http://tumbolia.org/nslater


Re: Apache sub-projects

2009-08-14 Thread Chris Anderson
On Fri, Aug 14, 2009 at 3:17 PM, Noah Slaternsla...@apache.org wrote:
 On Fri, Aug 14, 2009 at 03:11:45PM -0700, Will Hartung wrote:
 Taking lounge as an example, if Couch intends to offer lounge
 capability as a first class service, and, in the end, obsolete
 lounge, then the decision needs to be made whether to incorporate and
 absorb lounge, or, effectively, compete against lounge.

 Making it a 'sub project', leads more towards the former incorporate
 option. If it intends to compete, then, clearly, lounge has no formal
 place within the project.

 Mind, even if lounge in its current incarnation goes against the way
 most folks would want to eventually see lounge-like capability
 implemented in couch (i.e. native erlang rather than a bolted on,
 external service, or whatever), incorporating the existing project
 with the intent towards incorporation helps funnel the energy and more
 formally direct the effort.

 I think that this might form a nice guiding principle:

  All sub-projects will eventually be merged into CouchDB, or abandoned.

 Are there any flaws to this approach?

I was actually thinking of it from the opposite direction.

First of all let me say that the 3 projects I nominated follow the
Django tests, and will be kept up to date with CouchDB as new versions
are released. That is part of what made picking those 3 so easy.

Focussing on Lounge: I see a future where Lounge is ported piece by
piece to Erlang. The Lounge architecture is sound, and there are
advantages to porting, e.g., the message passing from HTTP / JSON to
Erlang terms. There is also the advantage of interoperability with
existing stacks (which the current Lounge implementation is awesome
at.) I also think remixes like an Erlang smart proxy behind an nginx
running dumbproxy, might turn out to be useful.

I'm merely speculating, but I think if we build a pure-Erlang CouchDB
cluster, getting there by working from the current Lounge
implementation, and porting parts of the service to Erlang, will give
us a cleaner, stronger code base. Both for CouchDB core as well as in
the Lounge.

  All sub-projects will eventually be merged into CouchDB, or abandoned.

I'd like to see Lounge as a sub-project precisely because I don't want
to see it merged into CouchDB. CouchDB is not just targeting
multi-node clusters, but also smartphones and in-browser operation.
This makes me think of Django's principle:

 contrib packages should be removable.

CouchDB core should remain focused on reliability and performance on a
single node. If CouchDB Lounge is maintained as both HTTP / Python
tools, as well as a set of Erlang applications you can run inside a
CouchDB beam machine, then we'll have all the benefits of an Erlang
Lounge, with the discipline and code clarity that will come from
refactoring CouchDB core to better support Erlang Lounge clients.

There is a lot of code that can go into monitoring a large cluster,
and our Erlang Lounge will eventually want to provide cluster health
services as well. There are also optimizations (like view row
reshuffling) which are only appropriate on very large clusters, so
they should not be part of the core project, but we want to encourage
their development.

The modularity we'll get from being able to deploy the most
appropriate combination of nginx, twisted python, erlang, etc to
monitor a cluster will serve us well in the long run, I hope.

I guess part of what it means to bring in a sub-project is an
acknowledgment that the project we're bringing in reflects CouchDB's
goals and architecture, but might not be appropriate to deploy for all
the use cases we want to support.

There is a similar story to tell about CouchApp and Lucene, but I'll
spare the details.

Chris




 Thanks,

 --
 Noah Slater, http://tumbolia.org/nslater




-- 
Chris Anderson
http://jchrisa.net
http://couch.io


Re: Apache sub-projects

2009-08-14 Thread Noah Slater
On Fri, Aug 14, 2009 at 03:49:33PM -0700, Chris Anderson wrote:
 CouchDB core should remain focused on reliability and performance on a
 single node.

I disagree pretty strongly with this. Focusing on single node performance is a
good short term goal, but saying that multi-node environments are less important
is antithetical to the core goals of CouchDB.

Without sounding like a dick, the clue is in the name:

  Cluster of Unreliable Commodity Hardware.

I'm not aware that this vision of a distributed database has been abandoned.

 There is a lot of code that can go into monitoring a large cluster,
 and our Erlang Lounge will eventually want to provide cluster health
 services as well. There are also optimizations (like view row
 reshuffling) which are only appropriate on very large clusters, so
 they should not be part of the core project, but we want to encourage
 their development.

 The modularity we'll get from being able to deploy the most
 appropriate combination of nginx, twisted python, erlang, etc to
 monitor a cluster will serve us well in the long run, I hope.

I think it's a bit more nuanced than this. Like the essay I referenced in my
first post, we don't want people to be overwhelmed by too many choices. We
should figure out what works best for most people, and roll that in.

 I guess part of what it means to bring in a sub-project is an
 acknowledgment that the project we're bringing in reflects CouchDB's
 goals and architecture, but might not be appropriate to deploy for all
 the use cases we want to support.

It worries me that we understand the project goals so differently.

To the best of my understanding, CouchDB was always intended to be distributed
at some point, and I would be interested in hearing what other people think
about this, and when, if so, this was abandoned.

Best,

-- 
Noah Slater, http://tumbolia.org/nslater


Re: Apache sub-projects

2009-08-14 Thread Chris Anderson
On Fri, Aug 14, 2009 at 4:01 PM, Noah Slaternsla...@apache.org wrote:
 On Fri, Aug 14, 2009 at 03:49:33PM -0700, Chris Anderson wrote:
 CouchDB core should remain focused on reliability and performance on a
 single node.

 I disagree pretty strongly with this. Focusing on single node performance is a
 good short term goal, but saying that multi-node environments are less important
 is antithetical to the core goals of CouchDB.

 Without sounding like a dick, the clue is in the name:

  Cluster of Unreliable Commodity Hardware.

 I'm not aware that this vision of a distributed database has been abandoned.

 There is a lot of code that can go into monitoring a large cluster,
 and our Erlang Lounge will eventually want to provide cluster health
 services as well. There are also optimizations (like view row
 reshuffling) which are only appropriate on very large clusters, so
 they should not be part of the core project, but we want to encourage
 their development.

 The modularity we'll get from being able to deploy the most
 appropriate combination of nginx, twisted python, erlang, etc to
 monitor a cluster will serve us well in the long run, I hope.

 I think it's a bit more nuanced than this. Like the essay I referenced in my
 first post, we don't want people to be overwhelmed by too many choices. We
 should figure out what works best for most people, and roll that in.

 I guess part of what it means to bring in a sub-project is an
 acknowledgment that the project we're bringing in reflects CouchDB's
 goals and architecture, but might not be appropriate to deploy for all
 the use cases we want to support.

 It worries me that we understand the project goals so differently.

 To the best of my understanding, CouchDB was always intended to be distributed
 at some point, and I would be interested in hearing what other people think
 about this, and when, if so, this was abandoned.


abandoned ;)

Wow, that's a lot of word. I'd be the last to say anything of the
sort. I consider multi-node CouchDB to be as important as it gets, as
far as goals of the project go.

It's encouraging that even without explicit code to deal with
clusters, people have been able to run large reliable clusters.
CouchDB can only get better at clustering from here. I think it's
important that in the near-term, the practice of running a 100-node
CouchDB is well known and easy for people to boot up and run and get
comfortable with.

On the other hand, we're trying to compress the entire application
into a few MB so it can run in browsers and smartphones. It's
legitimate to want to deploy CouchDB in these environments as well. A
mobile phone has no need to run a hundred-node partitioned cluster of
CouchDBs. However, there's some awesome stuff it could do if it ran
the CouchDB we have now. I'm just trying to preserve that option.

I'm trying to organize the code and the effort so we can reach both
goals; pursuing them both makes us a stronger project with cleaner
code.

Chris




 Best,

 --
 Noah Slater, http://tumbolia.org/nslater




-- 
Chris Anderson
http://jchrisa.net
http://couch.io


Re: Apache sub-projects

2009-08-14 Thread Noah Slater
On Fri, Aug 14, 2009 at 04:11:59PM -0700, Chris Anderson wrote:
 It's encouraging that even without explicit code to deal with
 clusters, people have been able to run large reliable clusters.
 CouchDB can only get better at clustering from here. I think it's
 important that in the near-term, the practice of running a 100-node
 CouchDB is well known and easy for people to boot up and run and get
 comfortable with.

 On the other hand, we're trying to compress the entire application
 into a few MB so it can run in browsers and smartphones. It's
 legitimate to want to deploy CouchDB in these environments as well. A
 mobile phone has no need to run a hundred-node partitioned cluster of
 CouchDBs. However, there's some awesome stuff it could do if it ran
 the CouchDB we have now. I'm just trying to preserve that option.

Sure, but I think your original wording polarised the goals of the project.

Perhaps it would be sufficient to say that we want to build a system that is
flexible enough to be deployed on anything from embedded devices and mobile
phones to workstations and massively distributed server clusters.

And on that note, why would we want to keep lounge out of the core? I'm sure we
have enough brains between us to figure out how to package our software so that
it can shrink or expand based on the needs of the local sysadmin.

Best,

-- 
Noah Slater, http://tumbolia.org/nslater


Re: Apache sub-projects

2009-08-14 Thread Chris Anderson
On Fri, Aug 14, 2009 at 4:32 PM, Noah Slater <nsla...@apache.org> wrote:
 On Fri, Aug 14, 2009 at 04:11:59PM -0700, Chris Anderson wrote:
 It's encouraging that even without explicit code to deal with
 clusters, people have been able to run large reliable clusters.
 CouchDB can only get better at clustering from here. I think it's
 important that in the near-term, the practice of running a 100-node
 CouchDB is well known and easy for people to boot up and run and get
 comfortable with.

 On the other hand, we're trying to compress the entire application
 into a few MB so it can run in browsers and smartphones. It's
 legitimate to want to deploy CouchDB in these environments as well. A
 mobile phone has no need to run a hundred-node partitioned cluster of
 CouchDBs. However, there's some awesome stuff it could do if it ran
 the CouchDB we have now. I'm just trying to preserve that option.

 Sure, but I think your original wording polarised the goals of the project.

 Perhaps it would be sufficient to say that we want to build a system that is
 flexible enough to be deployed on anything from embedded devices, mobile phones,
 workstations, or massively distributed server clusters.

 And on that note, why would we want to keep lounge out of the core? I'm sure we
 have enough brains between us to figure out how to package our software so that
 it can shrink or expand based on the needs of the local sysadmin.

Well, I guess I shouldn't have started calling it core. I think what we
have now is CouchDB, and it could stand to be modularized. I see the
project of creating an Erlang Lounge to be a good way to modularize
CouchDB. Lounge is already a project with a community and users, and
is architecturally compatible. So the idea of bringing it in as a
sub-project makes sense, especially in light of what you're saying
about shrink and expand, and the Django idea that contrib modules
should be removable.

I'm probably just being polarizing in my wording because I'm exhausted
from working up to the move to Berkeley.

Anyway, my goals with the whole sub-project thing are at least as much
about community as technology. It looks like we have the opportunity
to bring in some more people who really understand CouchDB, and at the
same time a parallel opportunity to structure our development in a way
that increases our flexibility and code quality, so I'm suggesting we
take it. It may not be 100% perfect from every conceivable angle, but
it seems like a general win-win.

Also, I'm totally open to calling Lounge / CouchApp / etc. something
other than sub-projects, but the "sub-project" seems like a known
entity in the Apache world, so perhaps it's the most relaxing option.

Cheers,
Chris


 Best,

 --
 Noah Slater, http://tumbolia.org/nslater




-- 
Chris Anderson
http://jchrisa.net
http://couch.io