Hi,

I've also seen this behaviour before at YPlan, we sometimes got the "mysql
can't join more than 61 tables" error or had the database slow to a crawl.
Some of the admin classes required custom get_search_results functions to
fix this.

I have only skimmed the old tickets and PR's but I think that a breaking
change to subtly changing the search behaviour is worth it to prevent this
footgun of doom is a good compromise.

Adam


On Tue, 5 Mar 2019 at 19:35, Joel M <cuthalion...@gmail.com> wrote:

> I don't know whether anyone will see this, as it's my first time posting
> here, but I have to second this.
>
> This problem brought down our site yesterday when an unsuspecting admin
> user tried a 3-word search in an admin that had several search fields
> involving joins. The result was that the number of joins was multiplied by
> the number of search terms. The database machine literally ran out of space
> making the temporary table (it had been about 10% full -- this query
> occupied many times the size of the entire database), and crashed
> everything. It works fine with just 1 word. As a workaround, we blocked
> nonessential users from being able to make that search and will be removing
> less-important fields from the search_fields in a future code update. But
> it took awhile to figure out what the problem was, and this could happen in
> any admin with enough related search fields or search terms. Eventually, I
> came up with the same fix petros suggests in their PR.
>
> I know there may be users who expect the previous query behavior, but this
> is a gotcha that can hide for awhile before somebody accidentally breaks
> everything.
>
> More related tickets: https://code.djangoproject.com/ticket/16063 (with a
> similar solution), https://code.djangoproject.com/ticket/27864 (mitigate
> by limiting number of search terms)
>
> On Friday, September 23, 2016 at 1:49:14 PM UTC-4, petros.m...@gmail.com
> wrote:
>>
>> Hello,
>>
>> I would like to open a discussion on the change I have proposed with pull
>> request #7277 <https://github.com/django/django/pull/7277>.
>>
>> To bring the discussion here, the problem with current implementation of
>> get_search_results() is that, when search fields include fields through
>> reverse relationships, it produces queries that the more words are used in
>> the search term the more inefficient they are. This inefficiency comes from
>> duplicate left joins with same tables which are in that case introduced by
>> Django ORM. There have been relevant reports by others before, which you
>> can see in tickets #16603 <https://code.djangoproject.com/ticket/16603>
>>  and #25789 <https://code.djangoproject.com/ticket/25789>. However,
>> judging by the fact that for five years there has not been a solution for
>> this, it seems that it is not an easy fix.
>>
>> That said, this inefficiency can easily get the system down, as users in
>> the admin can use a few words in the search term, either deliberately or by
>> mistake, e.g. by accidentally copying & pasting a whole sentence or
>> paragraph. In my case, with just 3 words in the search term and 4 tables
>> involved in the search (with their size being in the range of 1500 to 4000
>> rows), the query had been running for 25 minutes before I actually killed
>> it on the database server. Evenmore, the impact of that inefficiency can be
>> easily multiplied as users are not that patient and tend to repeat the
>> search again and again when they see that they are not getting results fast
>> enough.
>>
>> So, I wrote a small patch which changes the way the query is built and,
>> happily, the problem is gone. However, as Simon Charette has pointed out in
>> the pull request's discussion, there is the corner case scenario in which
>> some results may not be returned.
>> I am quoting his exact words:
>>
>> Keep in mind that filter(or_queries[0] & or_queries[1] & ... &
>>> or_queries[n]) can generate different results from
>>> filter(or_queries[0]).filter(or_queries[1]).filter(...).filter(or_queries[n])
>>> if any of filters spans multi-valued relationships.
>>> For example, given models Book and Author and a AuthorAmin searching
>>> both books__title and books__description your solution could filter out
>>> results that use to be displayed as search terms will have to be present in
>>> both Book.title and Book.description to match.
>>
>>
>> So, the question is whether that behavior is absolutely necessary for
>> searches in the admin. I personally believe that the compromise is small
>> enough and the benefit is really great. We are eliminating a possibility
>> for our normal users to cause a denial of service through a simple search
>> in the admin, with just a little compromise on the powerfulness of the
>> search function. What are your thoughts?
>>
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-developers+unsubscr...@googlegroups.com.
> To post to this group, send email to django-developers@googlegroups.com.
> Visit this group at https://groups.google.com/group/django-developers.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-developers/9c504356-1efa-4a9b-84e7-a613940f4255%40googlegroups.com
> <https://groups.google.com/d/msgid/django-developers/9c504356-1efa-4a9b-84e7-a613940f4255%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>


-- 
Adam

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/CAMyDDM2vEjwSbzbsuSxboPe0vC%2BentWL1%3D8Qya-DCHNH7PiPuw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to