Re: Using EXISTS instead of IN for subqueries

2013-03-25 Thread Michael Manfre


On Monday, March 25, 2013 6:58:12 AM UTC-4, Tim Chase wrote:
>
> I can only speak for testing IN-vs-EXISTS speed on MSSQLServer at 
> $OLD_JOB, but there it's usually about the same, occasionally with IN 
> winning out. 


In SQL Server 2008 R2, the optimizer is usually smart enough to end up with 
the same execution plan for IN and EXISTS queries. Historically, EXISTS was 
usually the faster operation on SQL Server and, if memory serves, that had 
to do with its ability to bail out of the EXISTS query sooner than the IN 
query.

> MSSQL is a 2nd-class citizen in the Django world, so I'm +1

Reasoning like that helps to keep it in its place.

Anssi,

Any chance of adding a new database feature to flip the behavior of __in to 
either IN or EXISTS? It sounds like this change to logical and documented 
behavior is being made specifically because of failings in PostgreSQL. 
The feature would also help satisfy the deprecation cycle normally used for 
changes to documented behaviors. Subqueries are more likely to expose 
database-specific issues with the SQL provided by Django (normally when 
used with aggregates or slicing). Adding the database feature might save 
every other backend from potentially having to jump through unnecessary 
hoops (mangling more SQL).
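
To make the idea concrete, here is a rough sketch in plain Python. The flag name in_subquery_uses_exists, the two feature classes, and render_in_lookup are all hypothetical and are not part of Django's backend API; they only illustrate how a per-backend feature could select the SQL form.

```python
class BaseFeatures:
    # Hypothetical flag: render __in subqueries as EXISTS by default.
    in_subquery_uses_exists = True


class LegacyInFeatures(BaseFeatures):
    # A backend that prefers to keep the documented IN form.
    in_subquery_uses_exists = False


def render_in_lookup(features, outer_col, inner_table, inner_col):
    """Return the SQL fragment for an __in subquery lookup."""
    if features.in_subquery_uses_exists:
        # Correlated EXISTS form.
        return "EXISTS (SELECT 1 FROM {t} U0 WHERE U0.{c} = {o})".format(
            t=inner_table, c=inner_col, o=outer_col)
    # Plain IN form, as currently documented.
    return "{o} IN (SELECT U0.{c} FROM {t} U0)".format(
        t=inner_table, c=inner_col, o=outer_col)


print(render_in_lookup(BaseFeatures, "table_d.id", "table_e", "d_id"))
print(render_in_lookup(LegacyInFeatures, "table_d.id", "table_e", "d_id"))
```

A backend would then only override one attribute instead of rewriting the generated SQL.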

Regards,
Michael Manfre

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To post to this group, send email to django-developers@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.




Re: URL dispatcher fallthrough?

2013-03-25 Thread meric
Previous discussion (which I've read before): 
https://code.djangoproject.com/ticket/16774 





Re: What can I do to get feedback on my pull request?

2013-03-25 Thread meric
What action, if any, do you suggest I take now?

On Tuesday, March 26, 2013 2:42:11 PM UTC+11, Ramiro Morales wrote:
>
> On Tue, Mar 26, 2013 at 12:36 AM, meric  
> wrote: 
> > Thanks, I'll reply to that thread. I posted my proposal 6 months before 
> that 
> > post, would've been nice if they posted in my thread... 
>
> Jacob voted in favor of the feature (on ticket #16774 [1]) and Adrian 
> voted -1 on that thread. But they were unaware of the existence of the 
> ticket. 
>
>
> 1. https://code.djangoproject.com/ticket/16774 
>
> -- 
> Ramiro Morales 
> @ramiromorales 
>





Re: URL dispatcher fallthrough?

2013-03-25 Thread meric
I've previously raised this idea and created a pull request.

https://github.com/django/django/pull/378

The problem with creating a catch-all view:

You have the following models:

Country, Industry, Company, School.

You want to have the following kinds of urls:

//
// /
///

///

/ /
//
///
//
///

It can get extremely cumbersome to use catch-all views to manage all of 
these URLs, rather than a single view with optional keyword-arguments.

In my example I have only listed 4 models. It wouldn't be implausible to 
suggest there are situations when there could be more.

I suggest it is better for the framework to handle this complex URL routing 
at the urls.py level, because it really isn't the view's responsibility to 
think about URLs. Since URLs already introspect a view's arguments and, 
in some cases, even instantiate class-based views (especially generic 
views), it would be better to keep the urlresolver's relationship with views 
as it is, instead of introducing a new relationship where views must now 
take care of URLs too.

I think the Inversion of Control principle would be good to 
follow: http://en.wikipedia.org/wiki/Inversion_of_control








On Tuesday, March 19, 2013 2:23:41 AM UTC+11, julianb wrote:
>
> Hi,
>
> imagine the following use case:
>
> You build an online store where you have sorted products into several 
> categories and maybe associated an occasion. Now you want to build URLs. So 
> the URL schema that all of the store's owners agree on is:
>
> //
> //
> //
>
> Looks simple.
>
> Because product slugs should generally be different from category slugs and 
> occasions, you expect no clashes here. And you don't want to bloat the URLs 
> by putting product/ and category/ in there. And you also do not want to 
> resort to making it look cryptic by having just /p// and so on.
> Then it comes to Django. How do you do that?
>
> Well, at the moment, as far as I am aware, you can't. The first URL will 
> match everything all the time, not giving the other views a chance to kick 
> in.
>
> So I propose some kind of URL fallthrough. The view could do
>
> raise UrlNotMatched
>
> and the dispatcher goes to the next possible matching view.
>
> Would that be good? Any thoughts?
>
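
The fallthrough idea quoted above can be sketched in plain Python without Django. Everything here is an assumption made for illustration: the toy resolve() function, the two views, and their hard-coded slug sets are hypothetical, and only the UrlNotMatched name comes from the quoted proposal.

```python
import re


class UrlNotMatched(Exception):
    """Raised by a view to ask the dispatcher to try the next pattern."""


def product_view(slug):
    products = {"red-shoes"}  # hypothetical data
    if slug not in products:
        raise UrlNotMatched
    return "product:" + slug


def category_view(slug):
    categories = {"shoes"}  # hypothetical data
    if slug not in categories:
        raise UrlNotMatched
    return "category:" + slug


# Both patterns match the same URLs; only the views can tell them apart.
URLPATTERNS = [
    (re.compile(r"^(?P<slug>[\w-]+)/$"), product_view),
    (re.compile(r"^(?P<slug>[\w-]+)/$"), category_view),
]


def resolve(path, patterns=URLPATTERNS):
    """Call the first view whose pattern matches and which accepts the URL."""
    for regex, view in patterns:
        match = regex.match(path)
        if match is None:
            continue
        try:
            return view(**match.groupdict())
        except UrlNotMatched:
            continue  # fall through to the next pattern
    raise LookupError("no view accepted %r" % path)


print(resolve("red-shoes/"))
print(resolve("shoes/"))
```

Note that in this sketch each rejected view has already done a lookup before falling through, which is the cost of letting views decide.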





Re: What can I do to get feedback on my pull request?

2013-03-25 Thread Ramiro Morales
On Tue, Mar 26, 2013 at 12:36 AM, meric  wrote:
> Thanks, I'll reply to that thread. I posted my proposal 6 months before that
> post, would've been nice if they posted in my thread...

Jacob voted in favor of the feature (on ticket #16774 [1]) and Adrian
voted -1 on that thread. But they were unaware of the existence of the ticket.


1. https://code.djangoproject.com/ticket/16774

-- 
Ramiro Morales
@ramiromorales





Re: What can I do to get feedback on my pull request?

2013-03-25 Thread meric
Thanks, I'll reply to that thread. I posted my proposal 6 months before 
that post, would've been nice if they posted in my thread...

On Tuesday, March 26, 2013 2:33:14 PM UTC+11, Ramiro Morales wrote:
>
> On Tue, Mar 26, 2013 at 12:28 AM, meric  
> wrote: 
> > I have made a pull request for django 6 months ago, but it doesn't seem 
> to 
> > be getting much response so far. 
> > 
> > What can I do to get more feedback as to what's wrong with it, and try 
> to 
> > get it accepted? 
> > 
> > Here is the pull request: 
> > 
> > https://github.com/django/django/pull/378 
>
> Isn't it the same idea proposed (and rejected) in this thread?: 
>
>
> https://groups.google.com/forum/?fromgroups=#!topic/django-developers/64I3Qy4OH-A
>  
>
> -- 
> Ramiro Morales 
> @ramiromorales 
>





Re: What can I do to get feedback on my pull request?

2013-03-25 Thread Ramiro Morales
On Tue, Mar 26, 2013 at 12:28 AM, meric  wrote:
> I have made a pull request for django 6 months ago, but it doesn't seem to
> be getting much response so far.
>
> What can I do to get more feedback as to what's wrong with it, and try to
> get it accepted?
>
> Here is the pull request:
>
> https://github.com/django/django/pull/378

Isn't it the same idea proposed (and rejected) in this thread?:

https://groups.google.com/forum/?fromgroups=#!topic/django-developers/64I3Qy4OH-A

-- 
Ramiro Morales
@ramiromorales





What can I do to get feedback on my pull request?

2013-03-25 Thread meric
I made a pull request for Django 6 months ago, but it doesn't seem to 
be getting much response so far.

What can I do to get more feedback as to what's wrong with it, and try to 
get it accepted?

Here is the pull request:

https://github.com/django/django/pull/378





Proposal for allowing dynamic site based on domain name WITHOUT changing settings.SITE_ID.

2013-03-25 Thread meric
Previous discussion: 
https://groups.google.com/forum/?fromgroups=#!searchin/django-developers/dynamic$20sites/django-developers/QSXLGSxy7Vk/TxgiJzz5nd8J
https://code.djangoproject.com/ticket/16983
https://code.djangoproject.com/ticket/4438

My proposal allows getting the current site based on request.get_host(), 
without changing settings.SITE_ID, and maintaining backwards compatibility.

Required code changes: http://dpaste.com/1035045/

I propose adding an optional `request` argument to 
Site.objects.get_current(), so that:

`Site.objects.get_current()` returns current `Site` based on 
settings.SITE_ID
`Site.objects.get_current(request)` returns current `Site` based on 
request.get_host()

Both conditional branches will be cached in SITE_CACHE.

The changes I've proposed mean that if there is no site with domain == 
request.get_host(), the current site cannot be determined. An 
alternative would be to fall back to settings.SITE_ID.
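
A minimal, Django-free sketch of the proposed behaviour follows. It is not the code from the dpaste link; the Site tuple, the SITES list, FakeRequest, and the fall_back_to_site_id parameter are hypothetical stand-ins for the sites table, a real request, and the fallback alternative.

```python
from collections import namedtuple

Site = namedtuple("Site", "id domain")

# Hypothetical stand-ins for the sites table and settings.SITE_ID.
SITES = [Site(1, "example.com"), Site(2, "shop.example.com")]
SITE_ID = 1
SITE_CACHE = {}


class FakeRequest:
    """Tiny stand-in exposing only get_host()."""

    def __init__(self, host):
        self._host = host

    def get_host(self):
        return self._host


def _lookup(**criteria):
    for site in SITES:
        if all(getattr(site, k) == v for k, v in criteria.items()):
            return site
    raise LookupError("no matching site: %r" % criteria)


def get_current(request=None, fall_back_to_site_id=False):
    """Return the current site by host if a request is given, else by SITE_ID."""
    if request is not None:
        host = request.get_host()
        if host in SITE_CACHE:
            return SITE_CACHE[host]
        try:
            site = _lookup(domain=host)
        except LookupError:
            if not fall_back_to_site_id:
                raise
        else:
            SITE_CACHE[host] = site  # cache the per-host branch
            return site
    if SITE_ID not in SITE_CACHE:
        SITE_CACHE[SITE_ID] = _lookup(id=SITE_ID)  # cache the SITE_ID branch
    return SITE_CACHE[SITE_ID]


print(get_current().domain)
print(get_current(FakeRequest("shop.example.com")).id)
```

Calling get_current() without arguments keeps today's behaviour, which is what preserves backwards compatibility in this sketch.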

What do you think?





Re: ticket #6103 - modeltests/model_forms/models.py could do with some rewriting

2013-03-25 Thread Russell Keith-Magee
On Tue, Mar 26, 2013 at 1:49 AM, Bharadwaj Desikan
wrote:

> Hi
>
> I am new to contributing to Django but have good experience with Python.
>
> I have assigned ticket https://code.djangoproject.com/ticket/6103 to
> myself. For which Django release are the current tests stable?
>
> Between which commits should I take a diff? This will help me summarize
> the changes on the API side, so that I can rewrite the tests.
>
> Also, please highlight anything specific I have to address in this patch.
>

Hi Bharadwaj,

Patches should always be made against the current tip of the master git
branch.

As for what needs to be addressed? What the ticket says -- the examples
need to be clarified to make more sense.

As an aside, you've possibly selected a bad ticket as your first
contribution to Django. A refactoring of the model_forms tests is a large
task, and one that will require either a lot of trust in the person doing
the work, or a lot of intensive review. I would suggest starting with
something much smaller, so that you can get to know our coding
expectations, and so we can build trust in your abilities.

Yours,
Russ Magee %-)





Re: Jenkins missing all Django jobs

2013-03-25 Thread Florian Apolloner
And fixed

On Monday, March 25, 2013 10:06:21 PM UTC+1, Florian Apolloner wrote:
>
> Hi,
>
> I updated jenkins today and ran into a major issue (
> https://issues.jenkins-ci.org/browse/JENKINS-17337). This will be fixed 
> in a few hours and I'll update jenkins tomorrow.
>
> Sorry for the inconvenience.
>
> Regards,
> Florian
>





Jenkins missing all Django jobs

2013-03-25 Thread Florian Apolloner
Hi,

I updated jenkins today and ran into a major issue 
(https://issues.jenkins-ci.org/browse/JENKINS-17337). This will be fixed in 
a few hours and I'll update jenkins tomorrow.

Sorry for the inconvenience.

Regards,
Florian





Re: Is casting Field.help_text to string in Field.__init__ necessary?

2013-03-25 Thread Claude Paroz
On Saturday, March 23, 2013 10:28:31 PM UTC+1, Claude Paroz wrote:
>
> On Saturday, March 23, 2013 12:16:15 PM UTC+1, Evgeny wrote:
>>
>> Hi.
>>
>> Is it necessary to cast help_text to a string in Field.__init__ here: 
>> https://github.com/django/django/blob/master/django/forms/fields.py#L92 ? 
>> It will eventually be displayed as a string in the template, and will be 
>> cast there. I think it would be better design to cast to a string only at 
>> the last moment, in the presentation template.
>>  
>> I am asking because I am trying to display two help texts in the template, 
>> one to the right of the field and one below it. To do that, I tried to pass 
>> a tuple of two strings as help_text, but this failed because of that cast.
>>
>
> It seems to me that it is a "historic" leftover. I suggest you remove 
> the offending lines, run the entire test suite, and if you don't get any 
> errors, open a ticket suggesting the removal.
>

In the end, I addressed this issue in 
https://github.com/django/django/commit/066bf42675040abd7b1a42e5559890e5f9881058
Hopefully it will solve your problem.

Cheers,

Claude 





Re: Tests of contrib apps

2013-03-25 Thread Aymeric Augustin
On 25 March 2013, at 20:20, Stephen Burrows wrote:

> django-nose is pretty useful for handling test discovery issues, if you're 
> looking for a quick fix.


I don't suffer from this problem because I use a custom test runner to avoid 
it. My goal here is to improve the framework for others.

These days, about 5% of the new valid tickets are a variant of this. Here's one 
from three days ago: https://code.djangoproject.com/ticket/20114

I've committed a few patches of this kind, and now I regret it. Mechanically 
adding an override_settings for each case reported to us doesn't improve the 
code base. On the contrary, it adds noise.

I realize that when we started fixing such bugs with override_settings, we 
embarked on a sisyphean task that degrades the readability of the tests. It 
seems unrealistic to guarantee that all contrib apps tests will pass with any 
combination of settings, and this will be a constant source of bugs when we add 
new tests or new settings.

So, from now on, I'll dutifully watch these tickets pile up, and enjoy the 
little nudge about the deficiency in Django's testing tools, which I don't feel 
qualified to address.

I'm disappointed but let's move on, this isn't a big deal :)

-- 
Aymeric.








Re: Tests of contrib apps

2013-03-25 Thread Stephen Burrows
The tests are nested whether they're in the core tests or in contrib. 
Philosophically this is more a question of coding at a distance. Putting 
the tests in /tests makes it less obvious which app they actually belong 
to. Additionally, I'd see the fact that it would be impossible to use 
manage.py test as a detriment rather than an advantage. If people want to 
not run the tests for an app, they can always just list the apps they do 
want to test. Also, django-nose is pretty useful for handling test 
discovery issues, if you're looking for a quick fix.

--Stephen

On Wednesday, March 20, 2013 2:26:33 AM UTC-7, Aymeric Augustin wrote:
>
> Hello, 
>
> Currently there are three locations for the tests of contrib apps: 
> - under tests/ — eg. admin 
> - inside the app — eg. auth 
> - both — eg. contenttypes 
>
> Following the modeltests / regressiontests merge, I propose to move all 
> contrib app tests under tests/. This has the following advantages: 
> - it makes them easier to discover and prevents accidental duplication 
> - they won't be run by './manage.py test' 
> - it'll dam up the stream of "if I change setting X then test Y in contrib 
> app Z fails" 
>
> I'm aware of the idea that contrib apps could include integration tests to 
> validate that they're properly used within projects, but I don't believe we 
> have any such tests currently. 
>
> What do you think? 
>
> -- 
> Aymeric. 
>
>
>
>





Re: Using EXISTS instead of IN for subqueries

2013-03-25 Thread Simon Riggs
On 25 March 2013 12:37, Anssi Kääriäinen  wrote:

> I feel pretty strongly that NOT EXISTS semantics are wanted. The NOT
> IN semantics are likely there just because that is how the
> implementation was originally done, not because there was any decision
> to choose those semantics.

Most likely, yes, so it looks like a bug fix now not an optimization.

> Also, multicolumn NOT IN lookups aren't
> supported on all databases (SQLite at least), so for that case NOT
> EXISTS semantics is going to happen anyways.

Yes, I think that's the clincher.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: unittest.TestCase vs. django.test.TestCase in overview example

2013-03-25 Thread Tim Graham
It seems like it could be a dangerous precedent to cater to people who 
don't take the time to fully read the docs, but in this case I'm a bit 
sympathetic. On the other hand, this example will probably be a bit more 
obvious when we drop support for Python 2.6 and no longer have 
django.utils.unittest. At the least, we could probably move the warning 
above the example so it's a bit more visible.

On Saturday, March 16, 2013 8:27:01 PM UTC-4, Lorin Hochstein wrote:
>
> Hi there:
>
> On the Django testing overview doc page <
> https://docs.djangoproject.com/en/dev/topics/testing/overview/>, the 
> initial example uses unittest.TestCase. A Django developer who was looking 
> for a quick reminder on how to write unit tests is likely to hit this page 
> first. If that developer doesn't read the "warning" section below, they 
> could mistakenly use unittest.TestCase when their unit tests change the 
> database. This very scenario happened to a colleague of mine.
>
> I proposed changing this to django.test.TestCase <
> https://github.com/django/django/pull/903>, but that pull request was 
> closed by Aymeric Augustin, with reference to <
> https://code.djangoproject.com/ticket/15896>. I don't think ticket #15896 
> covers quite the same issue, despite its title. Django devs, can you 
> reconsider this doc patch?
>
> Take care,
>
> Lorin
>
>





ticket #6103 - modeltests/model_forms/models.py could do with some rewriting

2013-03-25 Thread Bharadwaj Desikan
Hi 

I am new to contributing to Django but have good experience with Python.

I have assigned ticket https://code.djangoproject.com/ticket/6103 to 
myself. For which Django release are the current tests stable?

Between which commits should I take a diff? This will help me summarize 
the changes on the API side, so that I can rewrite the tests.

Also, please highlight anything specific I have to address in this patch.

Thanks.

Regards

Bharadwaj D





Django 1.5 using a cached HttpResponse with WSGI has an empty body

2013-03-25 Thread SteveB
With the change to HttpResponse made in Django 1.5, I'm finding that my 
code, which caches a generated response, returns an empty body when that 
page is requested a second time. The first time the page is requested, it 
is not in the cache, and the page is generated normally and added to the 
cache. A subsequent request for the same page finds the response in the 
cache and that response is returned, but with a content length of zero.

The reason is that the HttpResponse in Django 1.5 *does not reset the 
content iterator* when the content is requested to be iterated over again 
(the next time the response content is required).

I note the comments made about the way an iterator should behave when 
requested to iterate again:
https://code.djangoproject.com/ticket/13222
and the code which was added to explicitly prevent a reiteration from 
resetting the iterator. However, that causes a problem when using cached 
responses.

The HttpResponse in my case was not created by passing an iterator to 
HttpResponse. It is just using a string.

The problem is that the __iter__ method of HttpResponse contains the lines:

> if not hasattr(self, '_iterator'):
>     self._iterator = iter(self._container)

This prevents the iterator from being reset the next time it is required to 
iterate over the content.
_container still has the original content, but __iter__ does not reset the 
iterator as _iterator exists as an attribute since the first time that 
response was returned. The cached response is returning a used iterator, 
which returns no content.

I suspect this is a bug. Any thoughts?
What about a work-around in the meantime?
When I retrieve the response from the cache, I could do:

> response._iterator = iter(response._container)


or:

> del response._iterator


This works, but makes my code dependent on the internals of the 
HttpResponse class, which isn't great. Any better ideas?
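
To show the effect in isolation, here is a minimal sketch using a toy class, not Django's HttpResponse. ToyResponse copies only the two __iter__ lines quoted above, and fresh_body() is a hypothetical helper wrapping the second work-around.

```python
class ToyResponse:
    """Toy stand-in that mimics the quoted __iter__ logic."""

    def __init__(self, content):
        self._container = [content]

    def __iter__(self):
        # Same guard as quoted: the iterator is only created once.
        if not hasattr(self, '_iterator'):
            self._iterator = iter(self._container)
        return self

    def __next__(self):
        return next(self._iterator)


def fresh_body(response):
    """Work-around: drop the spent iterator before reading the body again."""
    if hasattr(response, '_iterator'):
        del response._iterator
    return b"".join(response)


cached = ToyResponse(b"hello")
first = b"".join(cached)    # body is read normally
second = b"".join(cached)   # spent iterator is reused, so the body is empty
third = fresh_body(cached)  # iterator is rebuilt from _container
print(first, second, third)
```

The helper still reaches into a private attribute, so it has the same dependency on internals as the work-arounds above.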

Kind regards,
Steve
P.S. I posted a message about this on Django users group about a week ago, 
but got no reply, hence posting here to get the views of the Django 
developers.





Re: Using EXISTS instead of IN for subqueries

2013-03-25 Thread Alex Gaynor
I have no idea how EXISTS performs on MySQL, however I can say that IN +
subqueries on MySQL are so atrocious that we outright banned that where I
work, so I don't see how it could be worse :)

Alex


On Mon, Mar 25, 2013 at 8:37 AM, Anssi Kääriäinen
wrote:

> On 25 March, 13:23, Simon Riggs wrote:
> > On 25 March 2013 10:58, Tim Chase 
> wrote:
> >
> > > On 2013-03-25 03:40, Anssi Kääriäinen wrote:
> > >> I am very likely going to change the ORM to use EXISTS subqueries
> > >> instead of IN subqueries. I know this is a good idea on PostgreSQL
> > >> but I don't have enough experience of other databases to know if
> > >> this is a good idea or not.
> >
> > > I can only speak for testing IN-vs-EXISTS speed on MSSQLServer at
> > > $OLD_JOB, but there it's usually about the same, occasionally with IN
> > > winning out. However, the wins were marginal, and MSSQL is a 2nd-class
> > > citizen in the Django world, so I'm +1 on using EXISTS instead of IN,
> > > if the results are assured to be the same.
> >
> > The results are definitely different because NOT IN has some quite
> > strange characteristics: if the subquery returns a NULL then the whole
> > result is "unknown". It is that weirdness that makes it hard to
> > optimize for, or at least, not-yet-optimized for in PostgreSQL.
> >
> > In most cases it is the NOT EXISTS behaviour that people find natural
> > and normal anyway and that is the best mechanism to use.
>
> When doing an .exclude() that requires a subquery, Django automatically
> generates the queries so that the inner query's select clause can't
> contain nulls. For example:
> >>> print D.objects.exclude(e__id__gte=0).query
> SELECT `table_d`.`id`, `table_d`.`a`, `table_d`.`b` FROM `table_d`
> WHERE NOT (`table_d`.`id` IN (SELECT U1.`d_id` FROM `table_e` U1 WHERE
> (U1.`id` >= 0  AND U1.`d_id` IS NOT NULL)))
>
> However, it is possible to generate a NOT IN query where the SQL
> semantics are in effect when using the __in lookup:
> >>> print D.objects.exclude(id__in=E.objects.filter(id__gte=0).values_list('d_id')).query
> SELECT `table_d`.`id`, `table_d`.`a`, `table_d`.`b` FROM `table_d`
> WHERE NOT (`table_d`.`id` IN (SELECT U0.`d_id` FROM `table_e` U0 WHERE
> U0.`id` >= 0 ))
>
> The results of the latter case could change (assuming d_id can contain
> null values).
>
> I think that this could be considered a bug fix. Django's ORM doesn't
> try to mimic SQL semantics, it tries to have Python semantics for the
> query. So an exclude(__in) lookup should behave like Python's "value
> not in list", not like SQL's NOT IN.
>
> On the other hand having __in lookups that do EXISTS in SQL might be a
> bit surprising. The way __in works is documented as generating SQL IN
> lookup: https://docs.djangoproject.com/en/dev/ref/models/querysets/#in.
>
> I feel pretty strongly that NOT EXISTS semantics are wanted. The NOT
> IN semantics are likely there just because that is how the
> implementation was originally done, not because there was any decision
> to choose those semantics. Also, multicolumn NOT IN lookups aren't
> supported on all databases (SQLite at least), so for that case NOT
> EXISTS semantics is going to happen anyways.
>
>  - Anssi
>


-- 
"I disapprove of what you say, but I will defend to the death your right to
say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero





Re: Using EXISTS instead of IN for subqueries

2013-03-25 Thread Anssi Kääriäinen
On 25 March, 13:23, Simon Riggs wrote:
> On 25 March 2013 10:58, Tim Chase  wrote:
>
> > On 2013-03-25 03:40, Anssi Kääriäinen wrote:
> >> I am very likely going to change the ORM to use EXISTS subqueries
> >> instead of IN subqueries. I know this is a good idea on PostgreSQL
> >> but I don't have enough experience of other databases to know if
> >> this is a good idea or not.
>
> > I can only speak for testing IN-vs-EXISTS speed on MSSQLServer at
> > $OLD_JOB, but there it's usually about the same, occasionally with IN
> > winning out. However, the wins were marginal, and MSSQL is a 2nd-class
> > citizen in the Django world, so I'm +1 on using EXISTS instead of IN,
> > if the results are assured to be the same.
>
> The results are definitely different because NOT IN has some quite
> strange characteristics: if the subquery returns a NULL then the whole
> result is "unknown". It is that weirdness that makes it hard to
> optimize for, or at least, not-yet-optimized for in PostgreSQL.
>
> In most cases it is the NOT EXISTS behaviour that people find natural
> and normal anyway and that is the best mechanism to use.

When doing an .exclude() that requires a subquery, Django automatically
generates the queries so that the inner query's select clause can't
contain nulls. For example:
>>> print D.objects.exclude(e__id__gte=0).query
SELECT `table_d`.`id`, `table_d`.`a`, `table_d`.`b` FROM `table_d`
WHERE NOT (`table_d`.`id` IN (SELECT U1.`d_id` FROM `table_e` U1 WHERE
(U1.`id` >= 0  AND U1.`d_id` IS NOT NULL)))

However, it is possible to generate a NOT IN query where the SQL
semantics are in effect by using an __in lookup:
>>> print D.objects.exclude(id__in=E.objects.filter(id__gte=0).values_list('d_id')).query
SELECT `table_d`.`id`, `table_d`.`a`, `table_d`.`b` FROM `table_d`
WHERE NOT (`table_d`.`id` IN (SELECT U0.`d_id` FROM `table_e` U0 WHERE
U0.`id` >= 0 ))

The results of the latter case could change (assuming d_id can contain
null values).
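
To make the difference concrete, here is a minimal sketch using Python's
sqlite3 module rather than Django itself. The table names follow the
queries above; the rows and the helper function name are made up for
illustration, and d_id is assumed to be nullable.

```python
import sqlite3


def compare_not_in_and_not_exists():
    """Run the NOT IN and NOT EXISTS variants against the same sample data."""
    conn = sqlite3.connect(":memory:")
    # Illustrative data only: one table_e row has a NULL d_id.
    conn.executescript("""
        CREATE TABLE table_d (id INTEGER PRIMARY KEY);
        CREATE TABLE table_e (id INTEGER PRIMARY KEY, d_id INTEGER NULL);
        INSERT INTO table_d (id) VALUES (1), (2), (3);
        INSERT INTO table_e (id, d_id) VALUES (1, 1), (2, NULL);
    """)
    # Same shape as the __in query above: the NULL makes the IN test
    # "unknown" for every non-matching row, so NOT (...) filters it out.
    not_in = conn.execute("""
        SELECT id FROM table_d
        WHERE NOT (id IN (SELECT U0.d_id FROM table_e U0 WHERE U0.id >= 0))
        ORDER BY id
    """).fetchall()
    # Correlated NOT EXISTS: a NULL d_id never equals table_d.id, so it
    # simply does not count as a match.
    not_exists = conn.execute("""
        SELECT id FROM table_d
        WHERE NOT EXISTS (SELECT 1 FROM table_e U0
                          WHERE U0.id >= 0 AND U0.d_id = table_d.id)
        ORDER BY id
    """).fetchall()
    conn.close()
    return not_in, not_exists


not_in_rows, not_exists_rows = compare_not_in_and_not_exists()
print(not_in_rows)      # []
print(not_exists_rows)  # [(2,), (3,)]
```

With a single NULL d_id in table_e, the NOT IN variant returns no rows
at all, while NOT EXISTS returns the rows that have no matching d_id.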

I think that this could be considered a bug fix. Django's ORM doesn't
try to mimic SQL semantics, it tries to have Python semantics for the
query. So an exclude(__in) lookup should behave like Python's "value
not in list", not like SQL's NOT IN.

On the other hand having __in lookups that do EXISTS in SQL might be a
bit surprising. The way __in works is documented as generating SQL IN
lookup: https://docs.djangoproject.com/en/dev/ref/models/querysets/#in.

I feel pretty strongly that NOT EXISTS semantics are wanted. The NOT
IN semantics are likely there just because that is how the
implementation was originally done, not because there was any decision
to choose those semantics. Also, multicolumn NOT IN lookups aren't
supported on all databases (SQLite at least), so for that case NOT
EXISTS semantics is going to happen anyways.

 - Anssi





Re: Using EXISTS instead of IN for subqueries

2013-03-25 Thread Simon Riggs
On 25 March 2013 10:58, Tim Chase  wrote:
> On 2013-03-25 03:40, Anssi Kääriäinen wrote:
>> I am very likely going to change the ORM to use EXISTS subqueries
>> instead of IN subqueries. I know this is a good idea on PostgreSQL
>> but I don't have enough experience of other databases to know if
>> this is a good idea or not.
>
> I can only speak for testing IN-vs-EXISTS speed on MSSQLServer at
> $OLD_JOB, but there it's usually about the same, occasionally with IN
> winning out. However, the wins were marginal, and MSSQL is a 2nd-class
> citizen in the Django world, so I'm +1 on using EXISTS instead of IN,
> if the results are assured to be the same.

The results are definitely different because NOT IN has some quite
strange characteristics: if the subquery returns a NULL then the whole
result is "unknown". It is that weirdness that makes it hard to
optimize for, or at least, not-yet-optimized for in PostgreSQL.

In most cases it is the NOT EXISTS behaviour that people find natural
and normal anyway and that is the best mechanism to use.
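
As a rough illustration of that three-valued logic, here is a small
pure-Python model. The function names are invented for this sketch, and
None stands in for both SQL NULL and the "unknown" result; it is not how
any database implements the operator.

```python
def python_not_in(value, items):
    """Python semantics: a plain membership test, always True or False."""
    return value not in items


def sql_not_in(value, items):
    """Model of SQL's `value NOT IN (items)`.

    Returns True, False, or None, where None stands for "unknown".
    """
    items = list(items)
    if not items:
        return True       # NOT IN over an empty set is always true
    if value is None:
        return None       # NULL compared with anything is unknown
    if any(item is not None and item == value for item in items):
        return False      # a definite match was found
    if any(item is None for item in items):
        return None       # no match, but a NULL might have been one
    return True


print(python_not_in(2, [1, None]))  # True
print(sql_not_in(2, [1, None]))     # None
print(sql_not_in(1, [1, None]))     # False
print(sql_not_in(2, [1, 3]))        # True
```

A WHERE clause keeps a row only when the condition is true, so the
"unknown" case drops the row just as a false result would.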

> However, the query construction to move the condition into the EXISTS
> subclause might be a bit more complex.


-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: Using EXISTS instead of IN for subqueries

2013-03-25 Thread Tim Chase
On 2013-03-25 03:40, Anssi Kääriäinen wrote:
> I am very likely going to change the ORM to use EXISTS subqueries
> instead of IN subqueries. I know this is a good idea on PostgreSQL
> but I don't have enough experience of other databases to know if
> this is a good idea or not.

I can only speak for testing IN-vs-EXISTS speed on MSSQLServer at
$OLD_JOB, but there it's usually about the same, occasionally with IN
winning out. However, the wins were marginal, and MSSQL is a 2nd-class
citizen in the Django world, so I'm +1 on using EXISTS instead of IN,
if the results are assured to be the same.

However, the query construction to move the condition into the EXISTS
subclause might be a bit more complex.

-tkc






Using EXISTS instead of IN for subqueries

2013-03-25 Thread Anssi Kääriäinen
I am very likely going to change the ORM to use EXISTS subqueries
instead of IN subqueries. I know this is a good idea on PostgreSQL but
I don't have enough experience of other databases to know if this is a
good idea or not.

There are two main reasons for doing this. First, EXISTS should
perform better on some databases. Second, EXISTS allows for filter
conditions other than single-column equality on all databases. So
EXISTS subqueries are needed in the ORM in any case; the question is
whether they should be the only option.

The semantics of NOT IN are harder to optimize for the DB than NOT
EXISTS, and this can result in large performance differences. See for
example this post from the pgsql-hackers mailing list:
http://www.postgresql.org/message-id/19913.1359149...@sss.pgh.pa.us

It is easy to construct cases where NOT IN results in a runtime of days
and NOT EXISTS in a runtime of seconds. Just have a large enough table
in the subquery and PostgreSQL will choke.

Quick testing indicates that Oracle and MySQL seem to perform about
the same for IN and EXISTS variants, and SQLite seems to be a bit
faster when using EXISTS over IN. The MySQL docs suggest using
EXISTS:
https://dev.mysql.com/doc/refman/5.5/en/subquery-optimization-with-exists.html
(see the part about "very useful optimization"). My experience of
using these databases is very limited, so I might be missing some
known problematic cases.

So, the question is: are there situations where the performance of
EXISTS is a lot worse than that of IN?

It will be possible to have a
connection.features.prefers_exists_subqueries flag and use that to
decide if the query should be generated as IN or EXISTS subquery.
However, always using EXISTS is a lot simpler.
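
A rough sketch of what such a flag could look like follows. The flag
name is the one proposed above; the DatabaseFeatures class and the
subquery_condition helper are hypothetical stand-ins written for this
example, not existing Django code.

```python
from dataclasses import dataclass


@dataclass
class DatabaseFeatures:
    # Hypothetical per-backend flag, named after the proposal above.
    prefers_exists_subqueries: bool = True


def subquery_condition(features, outer_column, inner_table, inner_column):
    """Return the SQL condition for a single-column subquery match."""
    if features.prefers_exists_subqueries:
        # Correlated EXISTS: the join condition moves inside the subquery.
        return "EXISTS (SELECT 1 FROM {t} U0 WHERE U0.{c} = {o})".format(
            t=inner_table, c=inner_column, o=outer_column)
    # Plain IN: the subquery selects the column to compare against.
    return "{o} IN (SELECT U0.{c} FROM {t} U0)".format(
        t=inner_table, c=inner_column, o=outer_column)


print(subquery_condition(DatabaseFeatures(True), "table_d.id", "table_e", "d_id"))
print(subquery_condition(DatabaseFeatures(False), "table_d.id", "table_e", "d_id"))
```

Even in this toy form the compiler has to carry two code paths, which is
the extra complexity that always using EXISTS would avoid.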

 - Anssi
