Re: [POC] Mango Catch All Selector

Robert Kowalski Wed, 13 Jan 2016 11:48:19 -0800

Hi Garren,

What would selector: null do? Return all docs?


Where in the response from CouchDB would the warning go? Next to the
result set, like

[{"_id": "foo", "_rev": "535"}, {"_warning": "slow query, use an index for
better performance"}] ?
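
To make the question concrete, here is a rough sketch in Python of two places the warning could live. It is purely illustrative: the `_warning` pseudo-document, the top-level `warning` key, the `docs` wrapper and the helper names are all assumptions for discussion, not existing CouchDB behaviour.

```python
import json

WARNING = "slow query, use an index for better performance"
docs = [{"_id": "foo", "_rev": "535"}]


def inline_warning(docs, warning):
    """Shape A (hypothetical): warning appended as a pseudo-document."""
    return docs + [{"_warning": warning}]


def top_level_warning(docs, warning):
    """Shape B (hypothetical): documents and warning under separate keys."""
    return {"docs": docs, "warning": warning}


def ids_from_inline(result):
    # Clients have to skip the pseudo-document, otherwise r["_id"]
    # raises KeyError on the warning entry.
    return [r["_id"] for r in result if "_warning" not in r]


def ids_from_top_level(result):
    # No filtering needed: every entry under "docs" is a real document.
    return [r["_id"] for r in result["docs"]]


shape_a = inline_warning(docs, WARNING)
shape_b = top_level_warning(docs, WARNING)
print(json.dumps(shape_a))
print(json.dumps(shape_b))
```

With shape A every client that iterates over the results has to know about the extra entry. With shape B the result list stays clean, at the cost of one more key next to it.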

On Wednesday, 13 January 2016, Garren Smith wrote:

> Hi Robert,
>
> I think you misunderstood me, I don’t want it to be a different endpoint.
> I just don’t want a user to have to do queries like this find({slow:
> true}). I want them to be able to do a query e.g. find({}) or
> find({selector: null}) and then get back the results along with a warning
> message telling them that this query would be slow in production.
> The lower the barrier for entry here the better. I know we want to protect
> our users for when they go to production, but forcing them to add a slow:
> true flag won’t help. It will still require them to read the docs a lot
> more than most people are willing to on a first attempt of something new.
>
> Cheers
> Garren
> > On 12 Jan 2016, at 9:16 PM, Robert Kowalski <[email protected]> wrote:
> >
> > thank you all for your feedback!
> >
> > i like the idea of the error message with a new url.
> >
> > i agree with garren that it should be a separate endpoint. it takes
> > some complexity off when explaining each endpoint.
> >
> > maybe: `/_find_slow`?
> >
> > On Tue, Jan 12, 2016 at 10:36 AM, Jan Lehnardt <[email protected]> wrote:
> >>
> >>> On 11 Jan 2016, at 19:55, Tony Sun <[email protected]> wrote:
> >>>
> >>> Hi Robert,
> >>>
> >>> Building upon what others have stated above, what do you think about
> >>> the following:
> >>>
> >>> 1) Let the user query without creating an index
> >>> 2) Return an error message with a new url that has
> >>> "slow/no_index/developer":true appended at the end. The message clearly
> >>> explains that this query will be slow, and that creating an index will be
> >>> more efficient. However, he or she can continue. The error message will
> >>> then have a link to point to our documentation.
> >>> 3) In Fauxton, there is a checkbox or button that also appends the
> >>> "slow/no_index/developer":true to the _find url. If the user clicks it,
> >>> then the same message pops up to notify the user.
> >>
> >>
> >> I like this!
> >>
> >>
> >> Jan
> >> --
> >>
> >>>
> >>>
> >>>
> >>> Tony
> >>>
> >>>
> >>>
> >>> On Mon, Jan 11, 2016 at 9:45 AM, Eli Stevens (Gmail) <[email protected]> wrote:
> >>>
> >>>> Just wanted to chime in here as a user - I've run into similar
> >>>> behavior from CouchDB with the reduce-not-reducing-enough heuristic,
> >>>> where stuff I was working on went smoothly in dev, but stopped once
> >>>> real load was pushed through it (thankfully for me, that was in
> >>>> testing, rather than released to customers).
> >>>>
> >>>> It's a frustrating experience, and I don't think that a reputation for
> >>>> "works until you cross a threshold, and then it doesn't, but only in
> >>>> production" is a good thing to move towards.
> >>>>
> >>>> Perhaps something like adding a key to the returned data along the
> >>>> lines of "_slow_warning": "This query is going to be slow on large
> >>>> data sets. See http://..." in addition to the ?slow_warning=true query
> >>>> param (note that I'm calling it "slow_warning" in both places only to
> >>>> increase discoverability; without the url param, the no-index query
> >>>> wouldn't work at all). Bikeshed the name as needed.
> >>>>
> >>>> I'd like to see a lot more URLs in CouchDB error messages in general,
> >>>> actually - I would find it very useful when trying to determine what's
> >>>> going wrong to have a URL right there in the logs that I can get more
> >>>> information from.
> >>>>
> >>>> On Sun, Jan 10, 2016 at 11:54 AM, Joan Touzet <[email protected]> wrote:
> >>>>> Hi Robert,
> >>>>>
> >>>>> I've been thinking about this one for the past week or so, and I have a
> >>>>> simple suggestion:
> >>>>>
> >>>>> Add the query parameter slow=true to enable this behaviour.
> >>>>>
> >>>>> This meets all the original requirements:
> >>>>>
> >>>>> 1. It is not default behaviour
> >>>>> 2. You can grep the log files for the word 'slow' and find evidence
> >>>>> 3. There is a shorthand, simple way to enable the behaviour
> >>>>> 4. Any self-respecting developer will try to remove slow=true, find
> >>>>> a break, and be forced to learn about indexes
> >>>>> 5. It's a bit cheeky, which I think is kind of fun :D
> >>>>>
> >>>>> All the best,
> >>>>> Joan
> >>>>>
> >>>>> ----- Original Message -----
> >>>>>> From: "William Edney" <[email protected]>
> >>>>>> To: [email protected]
> >>>>>> Sent: Friday, January 8, 2016 10:27:29 AM
> >>>>>> Subject: Re: [POC] Mango Catch All Selector
> >>>>>>
> >>>>>> Hi Robert -
> >>>>>>
> >>>>>> As a builder of UI, API and library code who has also done developer
> >>>>>> training on a variety of technologies, one simple fix might be to go
> >>>>>> ahead and not require indexes to be built, but then to put a big NOTE
> >>>>>> at the beginning of the "Mango Getting Started" guide (I would assume
> >>>>>> there is such a piece of documentation) that states: "Note that the
> >>>>>> examples in this document do not require you to build an index, but
> >>>>>> for performance reasons we HIGHLY RECOMMEND that you do so. *Click
> >>>>>> here* for more information about how to do that" (or some such
> >>>>>> verbiage).
> >>>>>>
> >>>>>> My 2 cents.
> >>>>>>
> >>>>>> Cheers,
> >>>>>>
> >>>>>> - Bill
> >>>>>>
> >>>>>> On Fri, Jan 8, 2016 at 9:04 AM, Robert Kowalski <[email protected]> wrote:
> >>>>>>
> >>>>>>> Hi list,
> >>>>>>>
> >>>>>>> At the end of the mail I would like to invite the other folks from
> >>>>>>> the mailing list that build interfaces for humans (APIs, CLIs or even
> >>>>>>> UIs) to chime in again with their opinions. So, all people on the ML:
> >>>>>>> the mail is not just a response to Paul, feedback is welcome :)
> >>>>>>>
> >>>>>>> Hi Paul, I agree with the timeout. It could lead to very unpleasant
> >>>>>>> errors which are hard to debug and support.
> >>>>>>>
> >>>>>>> I added some thoughts to the other points you made:
> >>>>>>>
> >>>>>>>> a) know that the slow queries logs exist,
> >>>>>>>
> >>>>>>> Hmm... If I take a look at the 1.x logging it was very
> >>>>>>> straightforward. As a developer you would spin up a CouchDB and you
> >>>>>>> get all the log messages into your terminal. It was quite handy in
> >>>>>>> general for all kinds of debugging. That the logs are not displayed
> >>>>>>> directly on stdout/stderr is in my opinion a general 2.x problem. The
> >>>>>>> problem occurs with all kinds of log messages we produce in CouchDB
> >>>>>>> for 2.x and is not specific to the slow-query logging.
> >>>>>>>
> >>>>>>>
> >>>>>>>> Ie, "You can try queries with testing:true, when you're ready to
> >>>>>>>> move to production you can POST your selector to _index to create
> >>>>>>>> the index which allows you to remove testing:true".
> >>>>>>>
> >>>>>>> I really like the migration path you mentioned here with the API to
> >>>>>>> create indexes. I am worried about setting too high an entry barrier
> >>>>>>> for absolute newcomers, people who want to play around before they
> >>>>>>> are ready to think about indexes, e.g. by coupling the index topic to
> >>>>>>> querying from the very beginning.
> >>>>>>>
> >>>>>>> When I throw too many things to learn at people (who may not have
> >>>>>>> used a database before), most of them get discouraged and do not take
> >>>>>>> a look. The usual things they feel or say are: "too complicated", "I
> >>>>>>> do not have enough time", "product XY is easier to use".
> >>>>>>>
> >>>>>>> I would argue that newcomers to a database will not launch a high
> >>>>>>> traffic, multi-gigabyte product with the database from day one. Day
> >>>>>>> one is the day where they learn how to query the data and put data
> >>>>>>> into the database. Even for scenarios where people have a running
> >>>>>>> high traffic system, and have used other databases at a medium to
> >>>>>>> large scale, I would expect, given they migrate to Couch, that they
> >>>>>>> run both systems in parallel at first in order to fix the issues that
> >>>>>>> occur during a migration.
> >>>>>>>
> >>>>>>> I think we share the same goal (getting beginners started quickly)
> >>>>>>> and the cool thing about your suggestion is that everyone gets the
> >>>>>>> required knowledge to run a production system right from the very
> >>>>>>> start. My suggestion leaves some parts out, but reduces the cognitive
> >>>>>>> load required to get the very first basic results, e.g. in a
> >>>>>>> university class setting - or junior developers on their "casual
> >>>>>>> friday 20% time". My big hope is, once those folks build high traffic
> >>>>>>> systems, they remember how easy the usage of CouchDB was and that
> >>>>>>> they start to learn more about CouchDB in order to run it in a system
> >>>>>>> with more than a few thousand documents.
> >>>>>>>
> >>>>>>>
> >>>>>>> For us both I think the "what" is clear, but the "how" is a bit
> >>>>>>> different. I also think this discussion still makes progress, but I
> >>>>>>> am afraid it could stall. I see that we both have very good rudiments
> >>>>>>> and I would like to invite the other folks from the mailing list that
> >>>>>>> build interfaces for humans (APIs, CLIs or even UIs) to chime in
> >>>>>>> again with their opinions - of course I'm also looking forward to
> >>>>>>> your answer :)
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Robert :)
> >>>>>>>
> >>>>>>> On Wed, Jan 6, 2016 at 6:21 PM, Paul Davis <[email protected]> wrote:
> >>>>>>>>>> - is a timeout solving the root cause or the symptoms? Could it be
> >>>>>>>>>> a temporary or additional step in conjunction with query
> >>>>>>>>>> optimisation tooling?
> >>>>>>>>>
> >>>>>>>>> It really depends. From my CouchDB admin and user perspective, this
> >>>>>>>>> doesn't seem so important to me right now. However, I recognize
> >>>>>>>>> that there are different usage scenarios with different
> >>>>>>>>> requirements (e.g. the ones at Cloudant).
> >>>>>>>>
> >>>>>>>> I don't think there's anything special about Cloudant in this
> >>>>>>>> discussion. It's just a question of how we allow new users to
> >>>>>>>> easily test and learn the selector/query API while also preventing
> >>>>>>>> them from going too far without creating indexes for their queries.
> >>>>>>>> The slow queries messages are fine, but just as in any other
> >>>>>>>> database they don't really prompt the developer to make the correct
> >>>>>>>> change. Ie, the developer has to be savvy enough to a) know that the
> >>>>>>>> slow queries logs exist, b) understand that creating an index would
> >>>>>>>> speed things up, and then c) know which index to create based on the
> >>>>>>>> logged query.
> >>>>>>>>
> >>>>>>>> In my experience, the group of users that we're concerned about in
> >>>>>>>> this discussion most likely don't know about any of those three
> >>>>>>>> things, which is why the current API is designed to force them to
> >>>>>>>> learn about and understand indexes as part of learning the API.
> >>>>>>>> Granted, the `_id > null` trick muddies that learning process. I
> >>>>>>>> would think that replacing the _id trick with `"testing": true` or
> >>>>>>>> similar would be an obvious indication to users that this is a
> >>>>>>>> dev/debug type feature and when they went to production they would
> >>>>>>>> still be pushed to using an index. If we add the "create index from
> >>>>>>>> selector" API then I think this would be a relatively
> >>>>>>>> straightforward method for on-ramping to both the query and index
> >>>>>>>> sides of the API. Ie, "You can try queries with testing:true, when
> >>>>>>>> you're ready to move to production you can POST your selector to
> >>>>>>>> _index to create the index which allows you to remove
> >>>>>>>> testing:true".
> >>>>>>>>
> >>>>>>>> That's also why I don't particularly care for the timeout approach.
> >>>>>>>> It's a binary threshold that a user would (maybe) meet after some
> >>>>>>>> unknown amount of time after they falsely believe their app is
> >>>>>>>> working correctly. The feedback is "Everything is fine until it
> >>>>>>>> isn't". Consider an app that's been working for a week or a month or
> >>>>>>>> more that suddenly starts throwing timeouts for a query. From the
> >>>>>>>> user's perspective the database broke because the query that used to
> >>>>>>>> work fine no longer does. And then there's the follow-on question of
> >>>>>>>> how that timeout might instruct the user that they need an index,
> >>>>>>>> and that the fix may be as easy as POSTing their selector to the
> >>>>>>>> _index endpoint. Sure, Google would most likely have the answer if
> >>>>>>>> our docs are good enough, but by that point the developer is
> >>>>>>>> probably already experiencing downtime if their app is live, which
> >>>>>>>> means they're frantically trying to fix the thing. From my point of
> >>>>>>>> view, a few road blocks that guide developers towards the correct
> >>>>>>>> usage early on would be better than letting them get to the
> >>>>>>>> adrenaline fueled expletive fountain of downtime.
> >>>>>>>
> >>>>>>
> >>>>
> >>
>
>
