Bob,

thanks, that is interesting. I will check out your code and see if I can get it working. I wrote couchdb-clucene and am interested in a lightweight text search for couchdb. I also liked your work with ontylog, but I can't mix GPL with anything I am doing.
Norman

On Mon, Sep 20, 2010 at 7:22 PM, Robert Dionne <[email protected]> wrote:
> Norman,
>
> Actually ontylog is GPL, and I wouldn't wish that code on anyone just yet.
> Think of it as the contents of my /etc directory.
>
> The indexer I'm chipping away at is just a proof of concept hacked up from
> Joe Armstrong's Erlang book (with his permission). Anyone is welcome to use
> it as they see fit, though it does have restrictions from Armstrong
> press. It's been great for me to learn erlang and explore the couch
> internals. It's also nice to have something nice and light running in couch.
>
> My thoughts about plugins have nothing to do with licenses. I like the
> fact that couchdb is simple and lean and more rock solid. I'm not sure
> multiview, geocouch, fti, or any other indexers belong in the core. With
> multiview I think there's perhaps something more general that might be part
> of core but I haven't given it a lot of thought yet.
>
> Cheers,
>
> Bob
>
> On Sep 20, 2010, at 7:02 PM, Norman Barker wrote:
>
>> Bob,
>>
>> I can see why plugins might work for you since your ontology /
>> indexing code is GPL, however I am more than happy for the multiview
>> to be apache licensed and would like to see it in trunk.
>>
>> I like the concept of plugins as it creates a stable API for third
>> parties, but I think a multiview is a core feature of CouchDB.
>>
>> Norman
>>
>> On Mon, Sep 20, 2010 at 4:19 AM, Robert Dionne <[email protected]> wrote:
>>> I see, neat.
>>>
>>> I ask because you might treat disjunction and conjunction differently in
>>> terms of whether you run around the ring or broadcast to all the nodes. For
>>> conjunctions you need all to succeed so broadcast might fare better whereas
>>> for disjunctions only one need succeed. I suppose it would depend largely
>>> on the number of views and the amount of each computation.
>>>
>>> Anyway I guess I have mixed feelings about seeing this in core.
>>> I see a lot of folks already struggling to get their arms around working with
>>> map/reduce. It would make a good plugin for advanced users. Actually the
>>> ability to have plugins is almost there now. I have an indexer that only
>>> requires some ini file mods and getting the code on the classpath. I think
>>> all that's needed at this point is:
>>>
>>> 1. conventions for a plugins directory
>>>
>>> 2. way of specing gen_servers in order to supervise them
>>>
>>> 3. some apis around some of the internals.
>>>
>>> I'm oversimplifying it for sure, the devil's in the details and it's the
>>> kind of thing programmers love to argue about ad nauseam but no one wants
>>> to do it (myself included :)
>>>
>>> Best,
>>>
>>> Bob
>>>
>>> On Sep 19, 2010, at 10:22 AM, Norman Barker wrote:
>>>
>>>> Bob,
>>>>
>>>> it is just checking that a given id participates in a view, if it
>>>> makes it around the ring then it wins and gets streamed to the client,
>>>> adding disjoints would be fairly simple. Currently the only way I can
>>>> check if an id is in a view is to loop over the results of each view,
>>>> hence each node in the ring is in its own process to keep things
>>>> moving.
>>>>
>>>> A use case is two views, one that emits datetime (numeric) and another
>>>> view that emits values, e.g. A, B, C ..., the query would then be to
>>>> find all the documents with value A between start time and end time.
>>>>
>>>> Norman
>>>>
>>>> On Sun, Sep 19, 2010 at 5:21 AM, Robert Dionne <[email protected]> wrote:
>>>>> I took another peek at this and I'm curious as to what it's doing. Is it
>>>>> just checking that a given id participates in a view? So if it makes it
>>>>> around the ring it wins? Or is it actually computing the result of
>>>>> passing the doc thru all the views?
>>>>>
>>>>> If the answer is the former then would disjunction also be something one
>>>>> might want?
>>>>> I'm just curious, I don't have a use case and I forget the
>>>>> original discussion around this. I sort of think of views as a functional
>>>>> mapping from the database to some subset. That's not entirely accurate
>>>>> given there's this reduce phase also. So I could imagine composing views
>>>>> in a functional way, but the same thing can be had with just a different
>>>>> map function that is the composition.
>>>>>
>>>>> Anyway if you have a brief description of this, with a use case, it
>>>>> would help.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Bob
>>>>>
>>>>> On Sep 17, 2010, at 11:32 PM, Norman Barker wrote:
>>>>>
>>>>>> Chris, James
>>>>>>
>>>>>> thanks for bumping this, we are using this internally at 'scale'
>>>>>> (million+ keys). I want this to work for couchdb as we want to give
>>>>>> back for such a great product and support this going forward, so any
>>>>>> suggestions are welcome and we will test and add them to the local github
>>>>>> account with the aim of getting this into trunk.
>>>>>>
>>>>>> Norman
>>>>>>
>>>>>> On Fri, Sep 17, 2010 at 7:00 PM, James Hayton <[email protected]> wrote:
>>>>>>> I want to use it! I just haven't gotten around to it. I was going to try
>>>>>>> and test it out this weekend and if I am able, I will certainly report back
>>>>>>> what I find.
>>>>>>>
>>>>>>> James
>>>>>>>
>>>>>>> On Fri, Sep 17, 2010 at 5:55 PM, Chris Anderson <[email protected]> wrote:
>>>>>>>
>>>>>>>> On Mon, Aug 30, 2010 at 10:58 AM, Norman Barker <[email protected]> wrote:
>>>>>>>>> Bob,
>>>>>>>>>
>>>>>>>>> I can and have been testing the multiview at this scale, it is ok
>>>>>>>>> (fast enough), but I think being able to test inclusion of a document
>>>>>>>>> id in a view without having to loop would be a considerable speed
>>>>>>>>> improvement. If you have any ideas let me know.
>>>>>>>>>
>>>>>>>>
>>>>>>>> I just want to bump this thread, as I think this is a useful feature.
>>>>>>>> I don't expect to be able to test it in the coming weeks, but if I did
>>>>>>>> I would. Is anyone besides Norman using this? Has anyone used it at
>>>>>>>> scale?
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Chris
>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Norman
>>>>>>>>>
>>>>>>>>> On Mon, Aug 30, 2010 at 10:49 AM, Robert Newson <[email protected]> wrote:
>>>>>>>>>> I'm sorry, I've had no time to play with this at scale.
>>>>>>>>>>
>>>>>>>>>> On Mon, Aug 30, 2010 at 5:35 PM, Norman Barker <[email protected]> wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> are there any more comments on this, if not can you describe the
>>>>>>>>>>> process (in particular how to obtain a wiki and jira account for
>>>>>>>>>>> couchdb which I have been unable to do) and I will start documenting
>>>>>>>>>>> this so we can put this into the trunk.
>>>>>>>>>>>
>>>>>>>>>>> Bob, were you able to do any more testing with large views, are there
>>>>>>>>>>> any suggestions on how to speed up the document id inclusion test as
>>>>>>>>>>> described below?
>>>>>>>>>>>
>>>>>>>>>>> thanks,
>>>>>>>>>>>
>>>>>>>>>>> Norman
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Aug 23, 2010 at 9:22 AM, Norman Barker <[email protected]> wrote:
>>>>>>>>>>>> Bob,
>>>>>>>>>>>>
>>>>>>>>>>>> thanks for the feedback and for taking a look at the code. Guidelines
>>>>>>>>>>>> on when to use a supervisor within couchdb with a gen_server would be
>>>>>>>>>>>> appreciated, currently I have a supervisor and a gen_server, but if
>>>>>>>>>>>> couchdb has a supervision process I could remove that layer.
>>>>>>>>>>>>
>>>>>>>>>>>> I think plugins is a great idea, however intersection of views is such
>>>>>>>>>>>> a common request, perhaps there needs to be a plugin system and if a
>>>>>>>>>>>> plugin is rated enough it goes into trunk as a core feature.
>>>>>>>>>>>>
>>>>>>>>>>>> the four (or slightly more) line summary is here
>>>>>>>>>>>>
>>>>>>>>>>>> http://github.com/normanb/couchdb/raw/trunk/src/couchdb/couch_query_ring.erl
>>>>>>>>>>>>
>>>>>>>>>>>> %
>>>>>>>>>>>> % send an id from the start list to the next node in the ring, if the
>>>>>>>>>>>> % id is in the adjacent node then this node sends to the next ring node ....
>>>>>>>>>>>> % if the id gets all round the ring and back to the start node then it
>>>>>>>>>>>> % has intersected all queries and should be included. The nodes in the ring
>>>>>>>>>>>> % should be sorted in size from small to large for this to be effective
>>>>>>>>>>>> %
>>>>>>>>>>>> % In addition send the initial id list round in parallel
>>>>>>>>>>>>
>>>>>>>>>>>> it really needs some eyes from the core couchdb coders to see how to
>>>>>>>>>>>> speed up the inclusion testing, looping is bad even if it is done in
>>>>>>>>>>>> parallel.
>>>>>>>>>>>>
>>>>>>>>>>>> Multiview is usable, I am using it with some pretty big mega-views (as
>>>>>>>>>>>> per the raindrop model), I am also available to add features to this
>>>>>>>>>>>> as this is a core part of our work and we want to give it to couch as a
>>>>>>>>>>>> contribution.
>>>>>>>>>>>>
>>>>>>>>>>>> thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Norman
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Aug 23, 2010 at 5:05 AM, Robert Dionne <[email protected]> wrote:
>>>>>>>>>>>>> Hi Norman,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I took a peek at multiview. I haven't followed this too closely on
>>>>>>>>>>>>> the mailing list but this is *view intersection*? Is there a 5 line
>>>>>>>>>>>>> summary of what this does somewhere?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm curious as to why the daemon needs to be a supervisor, most if
>>>>>>>>>>>>> not all of the other daemons are gen_servers.
>>>>>>>>>>>>> OTP allows this but I think this is a good area where some CouchDB
>>>>>>>>>>>>> guidelines on plugins would apply.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It strikes me that views, the use of map/reduce, etc. are one of the
>>>>>>>>>>>>> trickier aspects of using CouchDB, particularly for new users coming
>>>>>>>>>>>>> from the SQL world. People are also reporting issues with performance
>>>>>>>>>>>>> of views, I guess often because reduce functions go out of control.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think the project would be better served if features like this
>>>>>>>>>>>>> were available as plugins. I would put GeoCouch in the same category.
>>>>>>>>>>>>> It's very neat and timely (given everyone wants to know where everyone
>>>>>>>>>>>>> else is using their telephone but without talking other than
>>>>>>>>>>>>> asynchronously), but a server plugin architecture that would allow
>>>>>>>>>>>>> this to be done cleanly should come first.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is just my opinion. I'd love to see some of the project
>>>>>>>>>>>>> founders and committers weigh in on this and set some direction.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Bob
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Aug 22, 2010, at 5:45 PM, Norman Barker wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would like to take this multiview code and have it added to trunk if
>>>>>>>>>>>>>> possible, what are the next steps?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Norman
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Aug 18, 2010 at 11:44 AM, Norman Barker <[email protected]> wrote:
>>>>>>>>>>>>>>> I have made
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://github.com/normanb/couchdb
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> which is a fork of the latest couchdb trunk with the multiview code
>>>>>>>>>>>>>>> and tests added.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If geocouch is available then it can still be used.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> There are a couple of questions about the multiview on the user /dev
>>>>>>>>>>>>>>> list so I will be adding some more test cases during today.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Norman
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Aug 17, 2010 at 9:23 PM, Norman Barker <[email protected]> wrote:
>>>>>>>>>>>>>>>> this is possible, I forked geocouch since I use it, but I have already
>>>>>>>>>>>>>>>> separated the geocouch dependencies from the trunk.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I can do this tomorrow, certainly be interested in any feedback.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Norman
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Aug 17, 2010 at 7:49 PM, Volker Mische <[email protected]> wrote:
>>>>>>>>>>>>>>>>> On 08/18/2010 03:26 AM, J Chris Anderson wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Aug 16, 2010, at 4:38 PM, Norman Barker wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have made the changes as recommended, adding a test case
>>>>>>>>>>>>>>>>>>> multiview.js and also adding the userCtx to open the db.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have also forked geocouch and this is available here
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> this patch seems important (especially as people are already asking for
>>>>>>>>>>>>>>>>>> help using it on user@)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> to get it committed, it either must remove the dependency on GeoCouch, or
>>>>>>>>>>>>>>>>>> become part of CouchDB when (and if) GeoCouch becomes part of CouchDB.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Is it possible / useful to make a version that doesn't use GeoCouch?
>>>>>>>>>>>>>>>>>> And then to make the GeoCouch capabilities part of GeoCouch for now?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Norman,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> if the patch is ready for trunk, I'd be happy to move the GeoCouch bits to
>>>>>>>>>>>>>>>>> GeoCouch itself (as GeoCouch isn't ready for trunk yet).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Lately I haven't been that responsive when it comes to GeoCouch, but that
>>>>>>>>>>>>>>>>> will change (in about a month) after holidays and FOSS4G.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>> Volker
>>>>>>>>>>>>>>>>>
>>>>>>>> --
>>>>>>>> Chris Anderson
>>>>>>>> http://jchrisa.net
>>>>>>>> http://couch.io
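
For anyone skimming the thread, the ring check summarised in the couch_query_ring.erl comment above can be sketched roughly as follows. This is not the Erlang code from the fork. It is a minimal single-process Python sketch, and the names ring_intersect, by_time and by_value, as well as the sample rows, are made up for illustration. It also assumes each view's ids fit in memory as a set, so the per-view loop discussed in the thread is swapped for a hash lookup.

```python
from typing import Iterable, Iterator, List, Sequence, Set


def ring_intersect(view_ids: Sequence[Iterable[str]]) -> Iterator[str]:
    """Yield, in sorted order, the ids that are present in every view."""
    if not view_ids:
        return
    # Each "node" in the ring holds the ids of one view. Sorting the nodes
    # from small to large lets most ids drop out as early as possible.
    nodes: List[Set[str]] = sorted((set(ids) for ids in view_ids), key=len)
    start, rest = nodes[0], nodes[1:]
    for doc_id in sorted(start):
        # Pass the id around the ring: it stops at the first node that does
        # not contain it and is only yielded if it gets all the way round.
        if all(doc_id in node for node in rest):
            yield doc_id


# Hypothetical rows for the use case in the thread: one view keyed on a
# numeric datetime, one keyed on a value, each row carrying a document id.
by_time = [(5, "doc1"), (12, "doc2"), (20, "doc3"), (31, "doc4")]
by_value = [("A", "doc2"), ("B", "doc3"), ("A", "doc4"), ("A", "doc5")]

# "Find all documents with value A between start time 10 and end time 30."
in_window = [doc_id for t, doc_id in by_time if 10 <= t <= 30]
has_a = [doc_id for value, doc_id in by_value if value == "A"]

result = list(ring_intersect([in_window, has_a]))
print(result)
```

A disjunction would only need one node to accept an id, so it amounts to a union of the id sets rather than a trip around the ring.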
