On Aug 16, 2011, at 4:00 PM, Robert Newson wrote:

> Ok, let's see Paul's code concerns addressed first; it needs that
> cleanup before it can hit trunk.
>
> I'd still prefer to see an event-driven rather than polling approach,
> e.g., hook into update_notifier and build a queue of databases that are
> actively being written to (and therefore growing). A much lazier
> background thing could compact databases that are inactive.

Yup, my discussion assumed that all of that gets sorted out as an
"implementation detail". Back to JIRA.

Cheers
Jan
--

>
> B.
>
> On 16 August 2011 14:48, Jan Lehnardt <[email protected]> wrote:
>>
>> On Aug 16, 2011, at 3:44 PM, Robert Newson wrote:
>>
>>> All good points Jan, thanks.
>>>
>>> Having large numbers of databases is one thing, but I'm focused on the
>>> impact on ongoing operations with this running in the background. What
>>> does it do to the user's experience to have all dbs scanned
>>> periodically, etc?
>>>
>>> The reason I suggest doing it after the move, and in its own app, is
>>> to reduce the work needed to not use this code in some circumstances
>>> (Cloudant hosting, for example). Yes, it's a separate module and
>>> disabled by default, but putting it in its own application will make
>>> the separation much more explicit and preclude unintended
>>> entanglements with core over time.
>>
>> I think this is a valid concern, but I don't think it outweighs the
>> disadvantage. I'm happy to spend time to make sure this is properly
>> modular after srcmv.
>>
>> Cheers
>> Jan
>> --
>>
>>>
>>> B.
>>>
>>> On 16 August 2011 14:31, Jan Lehnardt <[email protected]> wrote:
>>>>
>>>> On Aug 16, 2011, at 2:59 PM, Robert Newson wrote:
>>>>
>>>>> I'm -1 on the approach (as I understand it) taken by the scheduler as
>>>>> it will be problematic in precisely the circumstance when you'd most
>>>>> want auto compaction (large numbers of databases and views).
>>>>
>>>> As Filipe mentions in the ticket, this was tested with large numbers of
>>>> databases.
>>>>
>>>> In addition, your "most want" assumption doesn't hold for the average
>>>> user, I'd wager (no numbers, alas). I'd say it's a basic user-experience
>>>> plus that software doesn't start wasting a system resource without
>>>> cleaning up after itself. But this isn't even a suggestion to enable it
>>>> by default. We have plenty of other features that need proper
>>>> documentation to be used correctly and that we are improving over time
>>>> to make them more obvious by removing common errors or odd behaviour.
>>>>
>>>>> To this point "Just curious, would it make a big difference to commit
>>>>> the patch before srcmv and migrate it with the rest of the code base
>>>>> rather than letting it rot in JIRA and leave it all to Filipe to keep
>>>>> it updated?" -- I'm -∞ on any suggestion that code should be put in
>>>>> trunk to stop it from rotting. Code should land when it's ready. I
>>>>> hope we're all agreed on that and that this paragraph was redundant.
>>>>
>>>> I was suggesting that the patch is ready enough for trunk and that
>>>> the level of readiness should not be "solves all possible cases",
>>>> especially for something that is disabled by default. If we take this
>>>> to the extreme, we'd never add any new features.
>>>>
>>>> I'm not suggesting "it compiles for me, let's throw it into trunk".
>>>>
>>>>> After srcmv, and then some work to OTP-ify each of the resultant
>>>>> subdirs, we should add this as a separate application. We might also
>>>>> mark it as beta in the first release to gather feedback from the
>>>>> community.
>>>>
>>>> I don't see how that is any different from adding it before srcmv and
>>>> avoiding leaving the front-porting effort to a single person.
>>>>
>>>> Ideally we'd already have srcmv done, but we don't and I don't want
>>>> to hold off progress for an architecture change.
>>>>
>>>>> I'll be accused of 'stop energy' within nanoseconds of this post so I
>>>>> should end by saying I'm +1 on couchdb gaining the ability to
>>>>> automatically compact its databases and views in principle.
>>>>
>>>> :)
>>>>
>>>> Cheers
>>>> Jan
>>>> --
>>>>
>>>>>
>>>>> B.
>>>>>
>>>>> On 16 August 2011 13:19, Jan Lehnardt <[email protected]> wrote:
>>>>>> Good points Robert,
>>>>>>
>>>>>> I replied inline and then hijacked the thread for a more general
>>>>>> discussion, sorry about that :)
>>>>>>
>>>>>> On Aug 16, 2011, at 2:08 PM, Robert Dionne wrote:
>>>>>>
>>>>>>> Filipe,
>>>>>>>
>>>>>>> This is neat, I can definitely see the utility of the approach. I do
>>>>>>> share the concerns expressed in other comments with respect to the use
>>>>>>> of the config file for per db compaction specs and the use of a
>>>>>>> compact_loop that waits on config change messages when the ets table is
>>>>>>> empty. I don't think it fully takes into account the use case of large
>>>>>>> numbers of small dbs and/or some very large dbs interspersed with a lot
>>>>>>> of mid-size dbs.
>>>>>>
>>>>>> As I said in the ticket, per-db config is desirable, but I think it is
>>>>>> outside of the scope of the ticket.
>>>>>>
>>>>>>> Anyway I like it a lot though I've only read the code for half an hour
>>>>>>> or so. I also agree with others that the code base is reaching a point
>>>>>>> of being a bit crufty and it might be a good time with the git
>>>>>>> migration, etc. to take a breath and commit to making some of these
>>>>>>> OTP compliant changes and design changes we've talked about.
>>>>>>
>>>>>> Just curious, would it make a big difference to commit the patch before
>>>>>> srcmv and migrate it with the rest of the code base rather than letting
>>>>>> it rot in JIRA and leave it all to Filipe to keep it updated?
>>>>>>
>>>>>> I also fear that a srcmv'd release is still out a bit and I'd really
>>>>>> like to see this one (and a few others) go into 1.2 (as per my previous
>>>>>> mail to this list in another thread). While it isn't the absolute
>>>>>> perfect solution in all cases, it is disabled by default and manual
>>>>>> compaction strategies work as they did before. In the meantime, we can
>>>>>> refine the rest of the system to make it more fully fledged and maybe
>>>>>> even enable it by default a few versions down when we are all
>>>>>> comfortable with it. I'm not very comfortable keeping good patches in
>>>>>> JIRA and not trunk until they solve every little edge case. We haven't
>>>>>> worked like this in the past and I don't think it is worth doing.
>>>>>>
>>>>>> Cheers
>>>>>> Jan
>>>>>> --
>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Bob
>>>>>>>
>>>>>>> On Aug 15, 2011, at 9:29 PM, Filipe David Manana wrote:
>>>>>>>
>>>>>>>> Developers, users,
>>>>>>>>
>>>>>>>> It's been a while now since I opened a Jira ticket for it
>>>>>>>> (https://issues.apache.org/jira/browse/COUCHDB-1153).
>>>>>>>> I won't describe it here in detail since it's already done in the
>>>>>>>> Jira ticket.
>>>>>>>>
>>>>>>>> Unless there are objections, I would like to get this moving soon.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>> --
>>>>>>>> Filipe David Manana,
>>>>>>>> [email protected], [email protected]
>>>>>>>>
>>>>>>>> "Reasonable men adapt themselves to the world.
>>>>>>>> Unreasonable men adapt the world to themselves.
>>>>>>>> That's why all progress depends on unreasonable men."
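
For illustration only, the event-driven idea raised at the top of this thread (track databases as update notifications arrive, and leave the inactive ones to a lazier background pass) could be sketched roughly as below. This is a minimal Python sketch, not the COUCHDB-1153 patch and not CouchDB code: `CompactionQueue`, `on_db_updated`, `idle_after`, and the injected `clock` are hypothetical names, and the real update_notifier interface is not modelled.

```python
import time
from collections import OrderedDict


class CompactionQueue:
    """Tracks databases that receive writes, most recently updated last."""

    def __init__(self, idle_after=300.0, clock=time.monotonic):
        self._last_write = OrderedDict()  # db name -> time of last update event
        self._idle_after = idle_after     # seconds without writes before a db counts as inactive
        self._clock = clock               # injectable so the sketch can be tested without sleeping

    def on_db_updated(self, db_name):
        """Hypothetical callback, assumed to be invoked once per update notification."""
        self._last_write.pop(db_name, None)       # re-insert so the db moves to the end
        self._last_write[db_name] = self._clock()

    def active(self):
        """Databases written to within the idle window, oldest write first."""
        cutoff = self._clock() - self._idle_after
        return [db for db, ts in self._last_write.items() if ts >= cutoff]

    def inactive(self, all_dbs):
        """Databases with no recent writes: candidates for the lazy background pass."""
        recent = set(self.active())
        return [db for db in all_dbs if db not in recent]


# Usage with a fake clock so the behaviour is deterministic.
fake_now = [0.0]
queue = CompactionQueue(idle_after=60.0, clock=lambda: fake_now[0])
queue.on_db_updated("orders")
fake_now[0] = 50.0
queue.on_db_updated("users")
fake_now[0] = 100.0
print(queue.active())                              # ['users']
print(queue.inactive(["orders", "users", "logs"]))  # ['orders', 'logs']
```

The point of the design is that nothing is scanned periodically: the cost is one dictionary update per write notification, and the full database list is only consulted when the lazy pass runs.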
