Hi Robert,

I've been thinking about this one for a week or so, and I have a simple
suggestion:
Add the query parameter slow=true to enable this behaviour.

This meets all the original requirements:

1. It is not default behaviour
2. You can grep the log files for the word 'slow' and find evidence
3. There is a shorthand, simple way to enable the behaviour
4. Any self-respecting developer will try to remove slow=true, find a
   break, and be forced to learn about indexes
5. It's a bit cheeky, which I think is kind of fun :D

All the best,
Joan

----- Original Message -----
> From: "William Edney" <[email protected]>
> To: [email protected]
> Sent: Friday, January 8, 2016 10:27:29 AM
> Subject: Re: [POC] Mango Catch All Selector
>
> Hi Robert -
>
> As a builder of UI, API and library code who has also done developer
> training on a variety of technologies, one simple fix might be to go
> ahead and not require indexes to be built, but then to put a big NOTE
> at the beginning of the "Mango Getting Started" guide (I would assume
> there is such a piece of documentation) that states: "Note that the
> examples in this document do not require you to build an index, but
> for performance reasons we HIGHLY RECOMMEND that you do so. *Click
> here* for more information about how to do that" (or some such
> verbiage).
>
> My 2 cents.
>
> Cheers,
>
> - Bill
>
> On Fri, Jan 8, 2016 at 9:04 AM, Robert Kowalski <[email protected]>
> wrote:
>
> > Hi list,
> >
> > At the end of the mail I would like to invite the other folks from
> > the mailing list that build interfaces for humans (APIs, CLIs or
> > even UIs) to chime in again with their opinions. So all people on
> > the ML, the mail is not just a response to Paul, feedback is
> > welcome :)
> >
> > Hi Paul, I agree with you on the timeout. It could lead to very
> > unpleasant errors which are hard to debug and support.
> >
> > I added some thoughts to the other points you made:
> >
> > > a) know that the slow queries logs exist,
> >
> > Hmm... If I take a look at the 1.x logging it was very
> > straightforward.
> > As a developer you would spin up a CouchDB and you get all the log
> > messages into your terminal. It was quite handy in general for all
> > kinds of debugging. That the logs are not displayed directly on
> > stdout/stderr is in my opinion a general 2.x problem. The problem
> > occurs with all kinds of log messages we produce in CouchDB for 2.x
> > and is not specific to the slow-query-logging.
> >
> > > Ie, "You can try queries with testing:true, when you're ready to
> > > move to production you can POST your selector to _index to create
> > > the index which allows you to remove testing:true".
> >
> > I really like the migration path you mentioned here with the API to
> > create indexes. I am worried about having too high an entry barrier
> > for absolute newcomers, people who want to play around before they
> > are ready to think about indexes, e.g. by coupling the index topic
> > to querying from the beginning.
> >
> > When I throw too many things to learn at people (who may not have
> > used a database before), most people get discouraged and do not
> > take a look. The usual things they feel or say are: "too
> > complicated", "I have not enough time", "product XY is easier to
> > use".
> >
> > I would argue that newcomers to a database will not launch a high
> > traffic, multi-gigabyte product with the database from day one. Day
> > one is the day where they learn how to query the data and put data
> > into the database. Even for scenarios where people have a running
> > high traffic system, and have used other databases at a medium to
> > large scale, I would expect, given they migrate to Couch, that they
> > run both systems in parallel at first in order to fix the issues
> > that occur during a migration.
> >
> > I think we share the same goal (getting beginners started quickly)
> > and the cool thing about your suggestion is that everyone gets the
> > required knowledge to run a production system right from the very
> > start. My suggestion leaves some parts out, but reduces the
> > cognitive load required to get the very first basic results, e.g.
> > in a university class setting - or junior developers on their
> > "casual friday 20% time". My big hope is, once those folks build
> > high traffic systems, they remember how easy the usage of CouchDB
> > was and that they start to learn more about CouchDB in order to run
> > it in a system with more than a few thousand documents.
> >
> > For us both I think the "what" is clear, but the "how" is a bit
> > different. I also think this discussion still makes progress, but I
> > am afraid it could stall. I see that we both have very good
> > rudiments and I would like to invite the other folks from the
> > mailing list that build interfaces for humans (APIs, CLIs or even
> > UIs) to chime in again with their opinions - of course I'm also
> > looking forward to your answer :)
> >
> > Best,
> > Robert :)
> >
> > On Wed, Jan 6, 2016 at 6:21 PM, Paul Davis <[email protected]>
> > wrote:
> > >>> - is a timeout solving the root cause or the symptoms? Could it
> > >>> be a temporary or additional step in conjunction with query
> > >>> optimisation tooling?
> > >>
> > >> It really depends. From my CouchDB admin and user perspective,
> > >> this doesn't seem so important to me right now. However, I
> > >> recognize that there are different usage scenarios with
> > >> different requirements (e.g. the ones at Cloudant).
> > >
> > > I don't think there's anything special about Cloudant in this
> > > discussion.
> > > It's just a question of how we give new users the ability to
> > > easily test and learn the selector/query API while also
> > > preventing them from going too far without creating indexes for
> > > their queries. The slow queries messages are fine, but just as in
> > > any other database they don't really prompt the developer to make
> > > the correct change. Ie, the developer has to be savvy enough to
> > > a) know that the slow queries logs exist, b) understand that
> > > creating an index would speed things up, and then c) know which
> > > index to create based on the logged query.
> > >
> > > In my experience, the group of users that we're concerned about
> > > in this discussion most likely don't know about any of those
> > > three things, hence why the current API is designed to force them
> > > to learn about and understand indexes as part of learning the
> > > API. Granted the `_id > null` trick muddies that learning
> > > process. I would think that replacing the _id trick with
> > > `"testing": true` or similar would be an obvious indication to
> > > users that this is a dev/debug type feature and when they went to
> > > production they would still be pushed to using an index. If we
> > > add the "create index from selector" API then I think this would
> > > be a relatively straightforward on-ramp to both the query and
> > > index sides of the API. Ie, "You can try queries with
> > > testing:true, when you're ready to move to production you can
> > > POST your selector to _index to create the index which allows you
> > > to remove testing:true".
> > >
> > > That's also why I don't particularly care for the timeout
> > > approach. It's a binary threshold that a user would (maybe) meet
> > > after some unknown amount of time after they falsely believe
> > > their app is working correctly.
> > > The feedback is "Everything is fine until it isn't". Consider an
> > > app that's been working for a week or a month or more that
> > > suddenly starts throwing timeouts for a query. From the user's
> > > perspective the database broke because the query that used to
> > > work fine no longer does. And then there's the follow on question
> > > on how that timeout might instruct the user that they need an
> > > index, and that the fix may be as easy as POSTing their selector
> > > to the _index endpoint. Sure Google would most likely have the
> > > answer if our docs are good enough, but by that point the
> > > developer is probably already experiencing downtime if their app
> > > is live which means they're frantically trying to fix the thing.
> > > From my point of view, a few road blocks that guide developers
> > > towards the correct usage early on would be better than letting
> > > them get to the adrenaline fueled expletive fountain of downtime.
> > >
