Re: Increase default integer keys to 64 bits

2021-01-29 Thread Christophe Pettus



> On Jan 29, 2021, at 07:40, charettes  wrote:
> 
> As Tom said Django 3.2 supports swapping the default primary key of models so 
> the integer exhaustion part of your suggestion should be addressed

That's not particularly related.  The issue isn't that there isn't any way to 
get a 64-bit key; there is, of course, using BigAutoField.  It's that the 
default, based on all of the documentation and code samples available, is a 
32-bit key using AutoField, and that's a foot-gun with a long-delayed firing 
time.

The essence of the proposal is to make the public default for new projects 
64-bit keys; we're doing developers a disservice by making the default path 
32-bit keys.
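[For reference, the Django 3.2 mechanism mentioned above lets a project opt into 64-bit keys, either globally or per app; a minimal sketch (the setting and field path are real in Django 3.2+, the app name "shop" is hypothetical):]

```python
# settings.py -- make every auto-created primary key a 64-bit BigAutoField
# (project-wide default, available since Django 3.2).
DEFAULT_AUTO_FIELD = "django.db.models.BigAutoField"

# Or per application, in that app's apps.py:
from django.apps import AppConfig


class ShopConfig(AppConfig):
    default_auto_field = "django.db.models.BigAutoField"
    name = "shop"
```

[The point above stands, though: this only helps developers who already know to reach for it.]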

--
-- Christophe Pettus
   x...@thebuild.com

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/C0092F9E-71ED-463D-A0A8-4F00D8F00889%40thebuild.com.


Increase default integer keys to 64 bits

2021-01-28 Thread Christophe Pettus
tl;dr: Introduce new field types to handle auto-incremented ID fields, and 
change the PostgreSQL backend to use the preferred identity-column syntax.

--

One of the most common issues my company runs into on Django sites is that 
models.AutoField defaults to a 32-bit integer (int32).  2^31-1 possible entries 
is just not that many anymore, and by the time developers realize this and need 
to move to a 64-bit integer key, it's too late to do so conveniently, because 
expanding the field is very painful (in PostgreSQL, at least).

While models.BigAutoField exists, it's barely mentioned in examples, and is 
often overlooked.

Changing AutoField to 64 bits would result in all kinds of breakage; at best, a 
lot of very unplanned and expensive migrations.

My proposal is:

1. Create two new field types to represent auto-incrementing primary keys.  I'd 
suggest IdentityField and SmallIdentityField for int64 and int32, respectively.

2. Change all examples to use IdentityField instead of AutoField.

3. As a side note, switch the PostgreSQL backend to use the standard "GENERATED 
BY DEFAULT AS IDENTITY" syntax.  This became available in PostgreSQL version 
10, but the previous version (9.6) reaches EOL in November 2021.

4. At some point in the future, deprecate AutoField and BigAutoField.

This would result in new projects getting 64-bit primary keys by default.  I 
think that's a positive.  For small tables, the size difference hardly matters; 
for big tables, we have removed a major foot-gun: either integer exhaustion or 
a very expensive data migration.
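[For point 3, the two column definitions compare as follows; a sketch in PostgreSQL 10+ syntax, with illustrative table names:]

```sql
-- Legacy serial pseudo-type: creates an implicitly owned sequence.
CREATE TABLE orders_old (id bigserial PRIMARY KEY);

-- SQL-standard identity column (PostgreSQL 10+), the proposed replacement.
CREATE TABLE orders_new (
    id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY
);
```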

--

Comments?

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/F4A50AB6-8CC1-4B6E-BD1B-B865F53A645A%40thebuild.com.


Re: Partial indexes, PR help

2018-07-23 Thread Christophe Pettus


> On Jul 23, 2018, at 13:05, Christophe Pettus  wrote:
> 
> 
>> On Jul 23, 2018, at 12:20, Mads Jensen  wrote:
>> 
>> Q(published__gt=datetime.date(2017, 10, 1))
>> =>
>> "table"."published" > '2017-10-01'::timestamp;
>> 
>> is unfortunate because it turns the function mutable, and the index can 
>> therefore not be created (exact error message can be seen in the Jenkins 
>> output). 
> 
> I think the issue is that you are mixing TIMESTAMP and TIMESTAMPTZ here [...]

To be a bit clearer, this is in effect an aware-vs-naive timestamp mismatch, 
just pushed down into PostgreSQL.

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/E38CA9A0-47E5-48B6-B08A-6759A6429FAA%40thebuild.com.


Re: Partial indexes, PR help

2018-07-23 Thread Christophe Pettus


> On Jul 23, 2018, at 12:20, Mads Jensen  wrote:
> 
> Q(published__gt=datetime.date(2017, 10, 1))
> =>
> "table"."published" > '2017-10-01'::timestamp;
> 
> is unfortunate because it turns the function mutable, and the index can 
> therefore not be created (exact error message can be seen in the Jenkins 
> output). 

I think the issue is that you are mixing TIMESTAMP and TIMESTAMPTZ here

xof=# create table i (p timestamptz);
CREATE TABLE
xof=# create index on i(p) where p > '2011-01-01'::timestamp;
ERROR:  functions in index predicate must be marked IMMUTABLE
xof=# create index on i(p) where p > '2011-01-01'::timestamptz;
CREATE INDEX

xof=# create table i (p timestamp);
CREATE TABLE
xof=# create index on i(p) where p > '2011-01-01'::timestamptz;
ERROR:  functions in index predicate must be marked IMMUTABLE
xof=# create index on i(p) where p > '2011-01-01'::timestamp;
CREATE INDEX
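[On the Django side, the mismatch goes away if the index condition uses an aware datetime that matches the column type; a hedged sketch using the `Index(condition=...)` API that eventually shipped in Django 2.2 (model and index names are illustrative):]

```python
import datetime

from django.db import models
from django.db.models import Q


class Entry(models.Model):
    published = models.DateTimeField()  # timestamptz when USE_TZ=True

    class Meta:
        indexes = [
            # An aware datetime is cast to ::timestamptz, an immutable
            # comparison against a timestamptz column, so the partial
            # index can be created.
            models.Index(
                fields=["published"],
                name="entry_recent_idx",
                condition=Q(
                    published__gt=datetime.datetime(
                        2017, 10, 1, tzinfo=datetime.timezone.utc
                    )
                ),
            )
        ]
```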

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/C136730A-1C63-47FB-8B24-C1800EC8C778%40thebuild.com.


Re: Methodology for increasing the number of PBKDF2 iterations

2015-09-22 Thread Christophe Pettus

On Sep 22, 2015, at 10:27 AM, Tim Graham <timogra...@gmail.com> wrote:

> We have access to the plain text password when the user logs in.

Right, so we could *in theory* upgrade the user's password then if we wished 
(not clear if we want to).  Even so, I don't think that would be a DDoS-attack 
level problem, since it's no worse than a user resetting their password.
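[The upgrade-at-login idea can be sketched with the standard library alone; the encoded format and iteration counts below are illustrative, not Django's actual hasher:]

```python
import hashlib
import hmac
import os

TARGET_ITERATIONS = 50_000  # illustrative current default


def make_hash(password: str, iterations: int = TARGET_ITERATIONS) -> str:
    """Encode the iteration count alongside salt and digest, PBKDF2-style."""
    salt = os.urandom(16).hex()
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), salt.encode(), iterations
    ).hex()
    return f"pbkdf2_sha256${iterations}${salt}${digest}"


def check_and_upgrade(password: str, stored: str) -> tuple[bool, str]:
    """Verify a password; rehash at the new iteration count if it is stale."""
    _, iters, salt, digest = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), salt.encode(), int(iters)
    ).hex()
    if not hmac.compare_digest(candidate, digest):
        return False, stored
    if int(iters) < TARGET_ITERATIONS:
        # We have the plain text only here, at login, so this is the
        # one moment the hash can be upgraded.
        stored = make_hash(password)
    return True, stored
```

[Because only users who actually log in get rehashed, the cost is spread out rather than arriving all at once after an upgrade.]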

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/1D76A603-9715-422C-8E2D-42281721E229%40thebuild.com.


Re: Methodology for increasing the number of PBKDF2 iterations

2015-09-22 Thread Christophe Pettus

On Sep 22, 2015, at 10:18 AM, Tim Graham <timogra...@gmail.com> wrote:

> As I understand it, the problem with increasing the number of iterations on 
> the slower hasher is that upgrading Django could effectively result in a DDoS 
> attack after you upgrade Django as users passwords are upgraded.

Is that correct?  My understanding was that the passwords were only modified 
when changed.  Given that it is a unidirectional hash, I'm not sure how they 
*would* be rehashed.

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/07FDF35A-5495-4BFE-A4CE-BD976537B5B0%40thebuild.com.


Re: Making max_length argument optional

2015-09-22 Thread Christophe Pettus

On Sep 22, 2015, at 1:01 AM, Remco Gerlich <re...@gerlich.nl> wrote:

> Maybe django.contrib.postgres could have a ArbitraryLengthCharField?

Just a note that, on PostgreSQL, that's exactly what TextField is.  There might 
be a use for a field that creates a VARCHAR without length on PostgreSQL, but I 
can't think of it.

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/D6E9D391-883D-4CB7-B9E2-78D6ACB8F7A4%40thebuild.com.


Re: Making max_length argument optional

2015-09-21 Thread Christophe Pettus

On Sep 21, 2015, at 7:22 PM, Shai Berger <s...@platonix.com> wrote:

> I'd solve the "need to specify" issue by setting a default that is 
> intentionally smaller than the smallest (core) backend limitation, say 128. 

I'd be OK with that.  Not wild, because I think that having to specify 
max_length is good discipline, but not everyone likes oatmeal, either. :)

> I'd make an "unlimited length text field" a new type of field, explicitly not 
> supported on MySQL and Oracle; and I'd suggest that it can live outside core 
> for a while, so we may get an impression of how popular it really is.

We kind of have that: TextField.  The problem is that TextField has very 
different performance characteristics and implementation details on PostgreSQL 
vs MySQL and Oracle.  I don't think we need another: If you know you are 
running on PostgreSQL, you just use TextField, and if you are either targeting 
a different database, or writing one that runs on multiple ones, you probably 
want CharField with a specific length.

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/3542EAF7-C2EB-4A4B-94F4-8C7A9EC4AC4E%40thebuild.com.


Re: Making max_length argument optional

2015-09-21 Thread Christophe Pettus

On Sep 21, 2015, at 7:10 PM, "Podrigal, Aron" <ar...@guaranteedplus.com> wrote:

> There is actually another reason to not have to specify a max_length which 
> was mentioned earlier, is because most of the time you don't care about that 
> and is just tedious to have to specify that when you can get it to work 
> without it. Default values has always been here for that reason.

I'm afraid I must disagree that "most of the time you don't care about it."  I 
certainly do.  I'm always reasonably careful to specify a max_length that 
corresponds to the underlying data.

There's no "sensible" default for max_length.  It entirely depends on the data 
you are storing.  Picking a backend-specific max_length means that application 
writers now have no idea how much data the CharField can store: 
1GB-ish-depending-on-encoding?  255?  4000 or something less depending on 
Oracle's encoding?

Requiring a max_length is enforcing a good practice.
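[The discipline argued for here, as a sketch; the model and the specific lengths are illustrative choices derived from the data, not recommendations:]

```python
from django.db import models


class Contact(models.Model):
    # Lengths chosen from the data being stored, not a backend default:
    # E.164 phone numbers are at most 15 digits, plus formatting slack.
    phone = models.CharField(max_length=32)
    # RFC 5321 caps an email address at 254 octets.
    email = models.EmailField(max_length=254)
```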
--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/46F57CA5-459B-4B2D-B610-EF3FF1190256%40thebuild.com.


Re: Making max_length argument optional

2015-09-21 Thread Christophe Pettus

On Sep 21, 2015, at 6:12 PM, "Podrigal, Aron" <ar...@guaranteedplus.com> wrote:

> The reason for having a max_length set to None, is because that's what I want 
> for my database columns to be in Postgres, and for MySQL I don't care about 
> the length too, I always choose varchar(255) just for because it is required 
> for the database backend.

Well, that's not a practice I think we need to go to great lengths to support.  
If you *really* *must* have a VARCHAR field without a length, you can always 
use a migration to strip it off.
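[Stripping the length with a migration can be sketched as follows; the table, column, and migration names are hypothetical, and the SQL is PostgreSQL-specific:]

```python
from django.db import migrations


class Migration(migrations.Migration):
    dependencies = [("app", "0002_previous")]  # hypothetical predecessor

    operations = [
        migrations.RunSQL(
            # PostgreSQL: drop the varchar length limit in place.
            sql='ALTER TABLE "app_thing" ALTER COLUMN "name" TYPE varchar;',
            reverse_sql=(
                'ALTER TABLE "app_thing" '
                'ALTER COLUMN "name" TYPE varchar(255);'
            ),
        )
    ]
```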

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/95E76678-2FAB-46B4-B830-0AC877886EE4%40thebuild.com.


Re: Making max_length argument optional

2015-09-21 Thread Christophe Pettus

On Sep 21, 2015, at 5:49 PM, "Podrigal, Aron" <ar...@guaranteedplus.com> wrote:

> Different schemas?? Schema will always be different for each database backend 
> according to its datatypes.

It means if you specify a CharField without a length, you don't know how many 
characters it can accept without error.  That doesn't seem like something we 
should make a change to accept.

> I really don't understand what your concern is.

The current behavior seems entirely reasonable, and I'm not sure I understand 
what problems it is causing.  Specifying a maximum length on a CharField is not 
just a random habit; it should be done as a way of sanity checking the value to 
a reasonable length.  Sometimes, that's natural to the data (there are no 50 
character telephone numbers or 5000 character email addresses), sometimes it's 
just a way of making sure that something bad doesn't get into the database.

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/15D5715A-3EF5-4CC0-830C-8EB714424335%40thebuild.com.


Re: Making max_length argument optional

2015-09-21 Thread Christophe Pettus

On Sep 21, 2015, at 2:49 PM, "Podrigal, Aron" <ar...@guaranteedplus.com> wrote:

> We're not talking about representing all CharFields as TEXT, it is about 
> choosing a sane length as the default for the varchar datatype.

But that means notably different schemas on different backends, for not an 
obvious gain.  What's the benefit there?

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/FE33D41D-30F7-497F-9E2D-4ABC396E4BE6%40thebuild.com.


Re: Making max_length argument optional

2015-09-21 Thread Christophe Pettus

On Sep 21, 2015, at 9:54 AM, 'Tom Evans' via Django developers (Contributions 
to Django itself) <django-developers@googlegroups.com> wrote:
> I'm slightly worried from a DB point of view.

I have to agree, even speaking as PostgreSQL geek.  While VARCHAR and TEXT are 
implemented the same way in PostgreSQL, conceptually they're different things.  
I don't think the relatively modest benefit of having no default justifies the 
problems that result on other platforms.

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/DA725AF8-A7CA-449F-B92A-0BCCDB124AD6%40thebuild.com.


Re: future of QuerySet.extra()?

2015-07-31 Thread Christophe Pettus
+1.

On Jul 31, 2015, at 2:12 PM, Marc Tamlyn <marc.tam...@gmail.com> wrote:

> Sounds good to me.
> 
> On 31 July 2015 at 21:00, Tim Graham <timogra...@gmail.com> wrote:
> I had in mind a documentation note like this:
> 
> Use this method as a last resort
> 
> 
> 
> This is an old API that we aim to deprecate at some point in the future. Use 
> it only if you cannot express your query using other queryset methods. If you 
> do need to use it, please file a ticket with your use case so that we can 
> enhance the QuerySet API to allow removing extra(). We are no longer 
> improving or fixing bugs for this method.
> 
> 
> On Friday, July 31, 2015 at 2:07:34 PM UTC-4, Collin Anderson wrote:
> I wonder if there's a way in the docs we can deprecate it as in "we don't 
> recommend you use it", but not actually schedule it for removal.
> 
> On Friday, July 31, 2015 at 2:01:20 PM UTC-4, Marc Tamlyn wrote:
> I don't know about unmaintained, but I think there's a consensus that 
> .extra() has a horrible API and we should do away with it eventually. That 
> said I think there are still enough things that can't be done without it at 
> present. A lot fewer now we have expressions, but still some.
> 
> I'd be happy to put a moratorium on improving it, but we can't deprecate it 
> yet.
> 
> On 31 July 2015 at 18:58, Tim Graham <timog...@gmail.com> wrote:
> In light of the new expressions API, the idea of deprecating QuerySet.extra() 
> has been informally discussed in IRC and elsewhere. I wonder if there is 
> consensus to mark extra() as "unmaintained" and to suggest filing feature 
> requests for functionality that can be performed through extra() but not 
> through other existing QuerySet methods? There are at least several tickets 
> (examples below) of edge cases that don't work with extra(). It seems like a 
> waste of time to leave these tickets as accepted and to triage new issues 
> with extra() if they won't be fixed.
> 
> https://code.djangoproject.com/ticket/24142
> https://code.djangoproject.com/ticket/19434
> https://code.djangoproject.com/ticket/12890
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to django-develop...@googlegroups.com.
> To post to this group, send email to django-d...@googlegroups.com.
> Visit this group at http://groups.google.com/group/django-developers.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/django-developers/6e1be326-3b17-49ca-accf-03eec5ad41ef%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
> 
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to django-developers+unsubscr...@googlegroups.com.
> To post to this group, send email to django-developers@googlegroups.com.
> Visit this group at http://groups.google.com/group/django-developers.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/django-developers/7c1568b6-f7f1-4aab-9263-af447e45af45%40googlegroups.com.
> 
> For more options, visit https://groups.google.com/d/optout.
> 
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to django-developers+unsubscr...@googlegroups.com.
> To post to this group, send email to django-developers@googlegroups.com.
> Visit this group at http://groups.google.com/group/django-developers.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/django-developers/CAMwjO1FQAJqrXu3HcSpP3xDF%2BA%3DsyG%3DHP90V%3DzrKBHJ%3Dg%3DcfDg%40mail.gmail.com.
> For more options, visit https://groups.google.com/d/optout.

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/F2C975E9-D7A1-44BC-BD2A-C05F61917B3C%40thebuild.com.


Re: Support for UNLOGGED tables in PostgreSQL

2015-07-20 Thread Christophe Pettus
I made a small comment about the URL form.  To my ear, the text could use a bit 
of wordsmithing, but I think it's the right content.

On Jul 20, 2015, at 3:10 AM, Federico Capoano <federico.capo...@gmail.com> 
wrote:

> Errata, PR link is https://github.com/django/django/pull/5021 - sorry
> 
> On Mon, Jul 20, 2015 at 11:42 AM, Federico Capoano
> <federico.capo...@gmail.com> wrote:
>> Thank you for all the feedback.
>> 
>> I opened this PR with the proposed addition to the docs:
>> https://github.com/django/django/compare/master...nemesisdesign:patch-1
>> 
>> Let me know if there's anything I can improve.
>> 
>> Just a final note: I think this solution is not the optimal one; let
>> me explain why. The main reason I came to this list asking for
>> information about UNLOGGED tables in PostgreSQL was exactly because of
>> that page, which suggests a few solutions. I spent quite a few hours
>> trying all of them (including running the entire DB in RAM), and
>> measuring test execution time with each one of those settings. In the
>> end I figured out that the only settings worth touching in my setup
>> are those 3 I mentioned before.
>> 
>> In order to share what I learned, and hoping to spare some pain for
>> people who come down the same path, I quickly wrote a blog post
>> with these suggestions:
>> http://nemesisdesign.net/blog/coding/how-to-speed-up-tests-django-postgresql/
>> 
>> Nothing new really; I found a few other blog posts with similar
>> suggestions, but they were scattered across different pages and
>> didn't mention which changes were most effective. That's
>> why I felt the need to write this.
>> 
>> And by the way, I'm really happy of the outcome!
>> 
>> Federico
>> 
>> On Mon, Jul 20, 2015 at 2:47 AM, Curtis Maloney
>> <cur...@acommoncreative.com> wrote:
>>> I second what Aymeric says: rather than take on the burden of
>>> maintaining correct warnings, let's point at the people whose
>>> responsibility it really is :)
>>> 
>>> --
>>> Curtis
>>> 
>>> On 20 July 2015 at 06:44, Aymeric Augustin
>>> <aymeric.augus...@polytechnique.org> wrote:
>>>> I agree with pointing to the relevant section of the PostgreSQL 
>>>> documentation. It will always be more complete, accurate and up-to-date 
>>>> than what we could write.
>>>> 
>>>> --
>>>> Aymeric.
>>>> 
>>>> 
>>>> 
>>>>> On 19 juil. 2015, at 19:43, Christophe Pettus <x...@thebuild.com> wrote:
>>>>> 
>>>>> This can be achieved by pointing to the relevant section in the 
>>>>> PostgreSQL documentation with a general "Test execution may be sped up by 
>>>>> adjusting the data integrity parameters in PostgreSQL; be sure to read 
>>>>> the appropriate warnings before making any changes" warning.
>>>>> 
>>>>> Putting actual recommended settings in the Django documentation seems, at 
>>>>> a minimum, pointlessly duplicative, and ties the Django documentation to 
>>>>> the current state of the world in PostgreSQL gratuitously.
>>>>> 
>>>>> 
>>>>> On Jul 19, 2015, at 10:32 AM, Luke Plant <l.plant...@cantab.net> wrote:
>>>>> 
>>>>>> I agree with Federico on this - as long as we slap a big warning on it — 
>>>>>> "This is dangerous - it could make your database more likely to lose 
>>>>>> data or become corrupted, only use on a development machine where you 
>>>>>> can restore the entire contents of all databases in the cluster easily" 
>>>>>> — I don't see a problem in this being in our docs.
>>>>>> 
>>>>>> If people refuse to read a clear warning, they shouldn't be doing web 
>>>>>> development. They are just as likely to find similar instructions on the 
>>>>>> internet, but without warnings, and having it in our docs with the 
>>>>>> warning will be helpful.
>>>>>> 
>>>>>> Having a fast test suite is such an important part of development that 
>>>>>> it shouldn't be held back by  attempting to protect the world from 
>>>>>> people who cannot be helped.
>>>>>> 
>>>>>> Luke
>>>>>> 
>>>>>> On 16/07/15 16:49, Christophe Pettus wrote:

Re: Support for UNLOGGED tables in PostgreSQL

2015-07-19 Thread Christophe Pettus
This can be achieved by pointing to the relevant section in the PostgreSQL 
documentation with a general "Test execution may be sped up by adjusting the 
data integrity parameters in PostgreSQL; be sure to read the appropriate 
warnings before making any changes" warning.

Putting actual recommended settings in the Django documentation seems, at a 
minimum, pointlessly duplicative, and ties the Django documentation to the 
current state of the world in PostgreSQL gratuitously.


On Jul 19, 2015, at 10:32 AM, Luke Plant <l.plant...@cantab.net> wrote:

> I agree with Federico on this - as long as we slap a big warning on it — 
> "This is dangerous - it could make your database more likely to lose data or 
> become corrupted, only use on a development machine where you can restore the 
> entire contents of all databases in the cluster easily" — I don't see a 
> problem in this being in our docs.
> 
> If people refuse to read a clear warning, they shouldn't be doing web 
> development. They are just as likely to find similar instructions on the 
> internet, but without warnings, and having it in our docs with the warning 
> will be helpful.
> 
> Having a fast test suite is such an important part of development that it 
> shouldn't be held back by  attempting to protect the world from people who 
> cannot be helped.
> 
> Luke
> 
> On 16/07/15 16:49, Christophe Pettus wrote:
>> On Jul 16, 2015, at 1:16 AM, Federico Capoano <federico.capo...@gmail.com>
>>  wrote:
>> 
>> 
>>> I also don't like the idea of believing django users are too stupid to
>>> understand that this advice is valid for development only. Generally
>>> python and django users are intelligent enough to properly read the
>>> docs and understand what's written on it.
>>> 
>> It's not a matter of being "intelligent" or not.  Developers are busy and 
>> can simply google things, see a particular line, and drop it in without 
>> fully understanding exactly what is going on.  (Simply read this group for a 
>> while if you don't believe this to be the case!)  People already turn off 
>> fsync, in production, after having read the PostgreSQL documentation, 
>> without actually realizing that they've put their database in danger.
>> 
>> Among other things, developers often have local data in their PostgreSQL 
>> instance that is valuable, and advising them to do a setting that runs the 
>> risk of them losing that data seems like a bad idea.
>> 
>> The Django documentation is not the place to go into the ramifications of 
>> fsync (or even synchronous_commit, although that's significantly less risky).
>> 
>> --
>> -- Christophe Pettus
>>
>> x...@thebuild.com
>> 
>> 
>> 
> 
> -- 
> "I was sad because I had no shoes, until I met a man who had no 
> feet. So I said, "Got any shoes you're not using?"  (Steven Wright)
> 
> Luke Plant || 
> http://lukeplant.me.uk/
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to django-developers+unsubscr...@googlegroups.com.
> To post to this group, send email to django-developers@googlegroups.com.
> Visit this group at http://groups.google.com/group/django-developers.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/django-developers/55ABDF21.9060106%40cantab.net.
> For more options, visit https://groups.google.com/d/optout.

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/648BE2F3-E869-4E9D-BCB9-248E425D5A1C%40thebuild.com.


Re: Support for UNLOGGED tables in PostgreSQL

2015-07-16 Thread Christophe Pettus

On Jul 16, 2015, at 1:16 AM, Federico Capoano <federico.capo...@gmail.com> 
wrote:

> I also don't like the idea of believing django users are too stupid to
> understand that this advice is valid for development only. Generally
> python and django users are intelligent enough to properly read the
> docs and understand what's written on it.

It's not a matter of being "intelligent" or not.  Developers are busy and can 
simply google things, see a particular line, and drop it in without fully 
understanding exactly what is going on.  (Simply read this group for a while if 
you don't believe this to be the case!)  People already turn off fsync, in 
production, after having read the PostgreSQL documentation, without actually 
realizing that they've put their database in danger.

Among other things, developers often have local data in their PostgreSQL 
instance that is valuable, and advising them to do a setting that runs the risk 
of them losing that data seems like a bad idea.

The Django documentation is not the place to go into the ramifications of fsync 
(or even synchronous_commit, although that's significantly less risky).

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/44C44B64-1FD1-48A3-8DC8-ADD5BCCCA27C%40thebuild.com.


Re: Support for UNLOGGED tables in PostgreSQL

2015-07-16 Thread Christophe Pettus

On Jul 15, 2015, at 8:35 PM, Curtis Maloney <cur...@acommoncreative.com> wrote:

> On 16 July 2015 at 05:01, Shai Berger <s...@platonix.com> wrote:
>> This is a shot in the dark: Could it be that rolling back transactions
>> involving unlogged tables is harder? The idea does make sense, and running 
>> the
>> test suite does an extremely untypical amount of rollbacks.
> 
> I thought at some point I read that unlogged tables didn't support
> transactions... however, the docs don't agree.

Transactions behave the same in PostgreSQL for both logged and unlogged tables 
(except for, of course, the lack of a commit / rollback entry in the WAL), and 
there's no appreciable performance benefit on COMMIT and ROLLBACK time for 
logged vs unlogged.

My guess is that the Django tests are not generating enough data to make the 
WAL activity be a significant time sink.
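As an aside, on PostgreSQL 9.5+ existing tables can be flipped to UNLOGGED after creation, which is one way a test harness could apply this without touching table-creation code. A minimal sketch, with illustrative table names; the crash-safety caveat in the comment is the important part:

```python
def set_tables_unlogged(cursor, tables):
    """Flip existing tables to UNLOGGED (PostgreSQL 9.5+) so their writes
    skip WAL.  Crash safety is lost: an unlogged table is truncated during
    crash recovery, which is acceptable only for throwaway data such as a
    test database."""
    for table in tables:
        cursor.execute('ALTER TABLE "%s" SET UNLOGGED' % table)
```

The quoting here assumes trusted table names (as in a test runner); anything user-supplied would need proper identifier escaping.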

By the way, I would strongly advise *against* *ever* even mentioning fsync = 
off anywhere in the Django documentation; that is such a horribly bad idea in 
99.95% of real-life situations that steering people towards it as a "go faster" 
button is very unwise.

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/9B2A4A22-09FF-4634-AF4A-536C022FC57E%40thebuild.com.


Re: Unfortunate /login behavior in 1.8 (and 1.7)?

2015-04-06 Thread Christophe Pettus
Ignore my silliness... operator error that the 1.7 change flushed out!

On Apr 6, 2015, at 11:36 PM, Christophe Pettus <x...@thebuild.com> wrote:

> I have a site with a /login URL.  This URL is for customer logins to an 
> ecommerce site, and is distinct from the /admin/login to the Django admin.
> 
> However, having upgraded from 1.6 to 1.8, it appears that /admin/login is 
> getting confused, and running the view associated with the /login URL.  This 
> effectively prevents me from logging into the admin.  I assume this has 
> something to do with the redirection change in 1.7 for the admin login.
> 
> If I change the name of /login to /customer_login, that confusion goes away.
> 
> --
> -- Christophe Pettus
>   x...@thebuild.com
> 
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/django-developers/F262F7CB-69ED-4BE3-B1FD-F1FDECA8591F%40thebuild.com.

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/744EBC05-CA1F-47FF-96A9-767F2441EDFB%40thebuild.com.


Unfortunate /login behavior in 1.8 (and 1.7)?

2015-04-06 Thread Christophe Pettus
I have a site with a /login URL.  This URL is for customer logins to an 
ecommerce site, and is distinct from the /admin/login to the Django admin.

However, having upgraded from 1.6 to 1.8, it appears that /admin/login is 
getting confused, and running the view associated with the /login URL.  This 
effectively prevents me from logging into the admin.  I assume this has 
something to do with the redirection change in 1.7 for the admin login.

If I change the name of /login to /customer_login, that confusion goes away.

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/F262F7CB-69ED-4BE3-B1FD-F1FDECA8591F%40thebuild.com.


Re: Psycopg2 version support

2015-02-15 Thread Christophe Pettus

On Feb 15, 2015, at 4:14 PM, Tim Graham <timogra...@gmail.com> wrote:

> Is there a scenario where you could pip install Django but not pip install 
> psycopg2?

Installing psycopg2 does require development tools, while Django does not.  I'm 
not offering this as a compelling argument for anything, just an observation.

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/F45FF0E9-D2ED-4D3F-A108-29B33027BECD%40thebuild.com.


Re: Configurable safety options for high performance Django systems

2014-11-24 Thread Christophe Pettus

On Nov 24, 2014, at 11:16 AM, Rick van Hattem <wo...@wol.ph> wrote:

> It seems you are misunderstanding what I am trying to do here. The 10,000 (or 
> whatever, that should be configurable) is a number large enough not to bother 
> anyone but small enough not to trigger the OOM system.

There are really only four options that could be implemented at the framework 
level:

1. You add a LIMIT to every query.  As noted, this can change the performance 
of a query, sometimes quite radically, often in a bad way.  An OOM error is 
clear and explicit; a query suddenly running 100x slower is a lot stranger to 
track down.  (Also, unless you decide that returning the entire LIMIT amount is 
an error, you can't really know if the query would have returned *more* than 
that.)

2. You don't add the LIMIT.  At that point, you have no guard against excessive 
memory consumption.

3. You do two queries, one a count(*), one to get the result.  This is a huge 
protocol change, and quite inefficient (and without wrapping it a transaction 
at an elevated isolation mode, not reliable).

4. You use a named cursor.  This is also a huge protocol change.

Those are your choices.
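A minimal sketch of option 1, with illustrative names (this is not a Django API): the only way to know whether the query would have returned more than N rows is to ask for N+1 and treat the extra row as an error.

```python
def fetch_limited(queryset, limit):
    """Return at most `limit` rows, raising if the query has more.

    Slicing a Django queryset adds LIMIT limit+1 to the generated SQL, so a
    single extra row is enough to detect overflow without a separate
    count(*) query."""
    rows = list(queryset[:limit + 1])
    if len(rows) > limit:
        raise ValueError("query returned more than %d rows" % limit)
    return rows
```

A plain list slices the same way a queryset does, which is convenient for exercising the overflow logic without a database; it does nothing about the LIMIT-changes-the-plan problem described above.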

It's not that the *problem* is unusual; it's that being unable to fix the 
problem in the application is unusual.  I've encountered this problem, sure, 
and what I did was fix the code; it took like two minutes.  Indeed, I would 
*strongly* advise that if you are issuing queries that you expect to get 100 
results back and are getting a memory-crushing result back, you fundamentally 
don't understand something about your data, and need to address that promptly.  
Running properly on large datasets is a big job; just a patch like this isn't 
going to solve all the issues.

In your particular case, where you have the relatively unusual situation that:

1. You have this problem, and,
2. You can't fix the code to solve this problem.

... you probably have the right answer is having a local patch for Django.

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/5224637B-A37E-4B9F-A6F9-B95B405A82BF%40thebuild.com.


Re: Configurable safety options for high performance Django systems

2014-11-24 Thread Christophe Pettus

On Nov 24, 2014, at 3:36 AM, Rick van Hattem <wo...@wol.ph> wrote:
> If you fetch N+1 items you know if there are over N items in your list.

Let's stop there.  Unfortunately, because of the way libpq works, just sending 
the query and checking the result set size won't solve your problem, except for 
an even smaller edge case.

Using the standard (non-named-cursor) protocol, when you get the first result 
back from libpq, *every result* is sent over the wire, and takes up space on 
the client.  Thus, no limitation on the client side (number of Django objects 
created, number of Python objects created by psycopg2) will prevent an 
out-of-memory error.  In my experience (and I've seen this problem a lot), the 
OOM occurs on the libpq results, not on the other parts.

Thus, the proposal only solves the relatively narrow case where the libpq 
result does *not* create an OOM, but creating the Django and psycopg2 objects 
does.

I'll note also that it's not the server that dies in this case; the particular 
thread doing the request gets an exception.

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/81A38D7E-E355-46D1-9C21-1C7D43BCC648%40thebuild.com.


Re: Configurable safety options for high performance Django systems

2014-11-24 Thread Christophe Pettus

On Nov 24, 2014, at 1:08 AM, Rick van Hattem <wo...@wol.ph> wrote:

> Indeed, except it's not an "except: pass" but an "except: raise" which I'm 
> proposing. Which makes a world of difference.

Well, as previously noted, this option would introduce another round-trip into 
every database query if it's actually going to check (and, in most databases, would 
have to run inside a transaction at, at least, REPEATABLE READ isolation mode 
in order to provide a strong guarantee, so you can add three more statements to 
a lot of interactions right there).  That seems to reduce the "high 
performance" part of the system.

What I'm getting at is: This is a bug in the application.  It's not a 
misfeature in Django.  If you have a bug in an application whose source code 
cannot be changed, that's a shame, but I don't think that Django can be 
expected to introduce configuration options to cover every possible scenario in 
which a previously-written Django application interacts badly with the database 
in a way in which, in theory, it could do extra work (slowing down non-buggy 
applications and introducing more code to QA) to patch the bad application.  To 
me, this is about as logical as putting in DIVIDE_BY_ZERO_RETURNS_NONE with a 
default of False because someone wrote some legacy code that would work if we 
did that.

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/C28E966C-8DB6-4E9D-BDBA-F30C7708EBBE%40thebuild.com.


Re: Configurable safety options for high performance Django systems

2014-11-23 Thread Christophe Pettus

On Nov 23, 2014, at 1:53 PM, Rick van Hattem <wo...@wol.ph> wrote:

> Very true, that's a fair point. That's why I'm opting for a configurable 
> option. Patching this within Django has saved me in quite a few cases but it 
> can have drawbacks.

As a DB guy, I have to say that if an application is sending a query that 
expects to get 100 results back but gets 1,000,000 back, you have a bug that 
needs to be tracked down and fixed.  Patching it by limiting the results is 
kind of a database version of "except: pass" to get rid of an inconvenient but 
mysterious exception.

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/09B28BB5-13DE-43B3-830F-F10E8AA7B5EC%40thebuild.com.


Re: Configurable safety options for high performance Django systems

2014-11-23 Thread Christophe Pettus

On Nov 23, 2014, at 1:07 PM, Rick van Hattem <wo...@wol.ph> wrote:

> > Not really, cause psycopg already fetched everything.
> 
> Not if Django limits it by default :)

Unfortunately, that's not how it works.  There are three things that take up 
memory as the result of a query result:

1. The Django objects.  These are created in units of 100 at a time.
2. The psycopg2 Python objects from the result.  These are already limited to a 
certain number (I believe to 100) at a time.
3. The results from libpq.  These are not limited, and there is no way of 
limiting them without creating a named cursor, which is a significant change to 
how Django interacts with the database.

In short, without substantial, application-breaking changes, you can't limit 
the amount of memory a query returns unless you add a LIMIT clause to it.  
However, adding a LIMIT clause can often cause performance issues all by itself:

http://thebuild.com/blog/2014/11/18/when-limit-attacks/

There's no clean fix that wouldn't have significant effects on unsuspecting 
applications.
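For reference, this is roughly what the named-cursor route looks like in psycopg2; passing name= to conn.cursor() is what makes the cursor server-side, so libpq pulls rows in batches instead of buffering the entire result. The function, cursor name, and batch size are illustrative:

```python
def stream_query(conn, sql, params=None, batch_size=1000):
    """Yield rows one at a time using a server-side (named) cursor.

    With a named cursor the client only ever holds batch_size rows in
    memory; an anonymous cursor receives the whole result set from libpq
    at once, which is where the OOM usually happens."""
    cursor = conn.cursor(name="stream_cursor")  # name= makes it server-side
    try:
        cursor.execute(sql, params)
        while True:
            rows = cursor.fetchmany(batch_size)
            if not rows:
                break
            for row in rows:
                yield row
    finally:
        cursor.close()
```

Note that in psycopg2 a named cursor must be used inside a transaction, and each open named cursor on a connection needs a distinct name; that protocol difference is exactly why this isn't a drop-in change for the ORM.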
--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/CD10DACF-4D0A-458D-BB85-0F7BB8BFF4C0%40thebuild.com.


Re: RFC: "UPSERT" in PostgreSQL

2014-09-28 Thread Christophe Pettus

On Sep 28, 2014, at 12:44 PM, Petite Abeille <petite.abei...@gmail.com> wrote:

> Postgres has convinced itself that it somehow cannot support MERGE. Therefore 
> it will not. 


There's no question that PostgreSQL could support SQL MERGE.  But SQL MERGE is 
not what people are asking for when they ask for UPSERT.  PostgreSQL could 
implement UPSERT and *call it* MERGE (with somewhat different syntax, most 
likely)... but how that would be better than implementing UPSERT and calling it 
something that doesn't conflict with existing specification language escapes me.

In short: "Clean MERGE," where I assume "clean" means "with reasonable behavior 
in the presence of concurrent activity" and MERGE means "the MERGE statement 
defined in the standard" is a contradiction in terms, and expecting PostgreSQL 
to square that circle isn't a reasonable request.
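For the record, PostgreSQL did eventually ship exactly this, in 9.5: an UPSERT spelled INSERT ... ON CONFLICT rather than MERGE, with well-defined behavior under concurrency. A minimal sketch against an illustrative kv(key, value) table with a unique constraint on key:

```python
# Requires PostgreSQL 9.5+; table and column names are illustrative.
UPSERT_SQL = (
    "INSERT INTO kv (key, value) VALUES (%s, %s) "
    "ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value"
)

def upsert(cursor, key, value):
    # A single atomic statement, unlike a racy SELECT-then-INSERT/UPDATE.
    cursor.execute(UPSERT_SQL, (key, value))
```

EXCLUDED refers to the row that was proposed for insertion, so the update clause can reuse the incoming values directly.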

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/C361C1D9-A035-48BC-A5C2-0EDB88C0E6F2%40thebuild.com.


Re: changing the on_delete=CASCADE default

2014-02-05 Thread Christophe Pettus
After far too long, this ticket has been created:

https://code.djangoproject.com/ticket/21961

If there's general consensus that this feature is worth working on, I'll see 
about a 1.7-targeted patch for it.

On Sep 28, 2013, at 10:16 AM, Anssi Kääriäinen <anssi.kaariai...@thl.fi> wrote:

> 
> 
> On Saturday, September 28, 2013 4:31:18 AM UTC+3, Xof wrote:
> 
> On Sep 27, 2013, at 2:56 PM, Anssi Kääriäinen <anssi.ka...@thl.fi> wrote: 
> 
> >   1. What to do if given DB doesn't support cascades in DB (sqlite at 
> > least, no idea of MySQL)? Initial feeling is that Django should do the 
> > cascades in Python code in these cases. 
> 
> It would behave like the standard version, then, yes. 
> 
> >   2. What to do if you have delete signals + db cascades set for given 
> > model? Options are to do nothing at all, give a warning (manage.py check 
> > might be able to do so) or raise an error in model validation. 
> 
> If we document that the _DB variation doesn't fire signals, I believe that's 
> sufficient. 
> 
> >   3. A model definition like A -- db cascade -> B -- cascade in python -> C 
> > is another problematic case. a_obj.delete() will cascade to B, but then 
> > that deletion will fail because of C constraint not cascading. Again 
> > possibilities are do nothing/warn/error 
> 
> Interesting question.  I believe we can just document that it won't work 
> properly, because in those DBs that support proper cascading behavior, what 
> you get in the B -> C cascade will be an error. 
> 
> >   4. A slight variation of above - generic foreign key cascades - here it 
> > will be impossible to handle the cascades in DB (unless we want to write 
> > custom triggers for this). And, the inconsistent state left behind will not 
> > be spotted by the DB either as there aren't any constraints in the DB for 
> > generic foreign keys. So, this is slightly worse than #3. 
> 
> We can, of course, just disallow using the _DB variations for generic foreign 
> keys. 
> 
> >   5. Parent cascades: If you have model Child(Parent), then there will be 
> > foreign key from child to parent, but not from parent to child. This means 
> > that DB can't cascade child model deletion to the parent model. So, there 
> > is again possibility for inconsistent state. So, if you have Child -- db 
> > cascade -> SomeModel, and you delete somemodel instance then what to do to 
> > get the Child's parent table data deleted? 
> 
> Either: 
> 
> (a) You disallow that. 
> (b) You allow it, but warn that if you delete the child, the parent is not 
> cleaned up. 
> 
> I lean towards (a). 
> 
> Yes, I think we need to disallow  #4 and #5. It will be too easy to miss 
> these edge cases, as things will seem to work correctly.
> 
> The data model in #4 is this:
> 
> class SomeModel(models.Model):
> fk = models.ForeignKey(SomeOtherModel, on_delete=DB_CASCADE)
> gen_rel = GenericRelation(GFKModel)
>  
> This is quite an edge case, but it would be nice to detect & prevent this. I 
> am not sure if GenericRelation actually respects to_delete currently at all.
> 
> For multitable inheritance it will be easiest to prevent db-cascades in all 
> foreign keys, both from parent models and child models. That is likely overly 
> restrictive, the only really problematic case seems to be db cascade foreign 
> key in child models. But it will be possible to improve multitable cascades 
> later on, so lets just get something working implemented first.
> 
> Probably time to move this into Trac... You can open a ticket there and 
> assign it to yourself.
> 
>  - Anssi
> 

--
-- Christophe Pettus
   x...@thebuild.com

To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/B85349C0-056B-451B-9A7F-039F6FF886AC%40thebuild.com.


Re: changing the on_delete=CASCADE default

2013-09-27 Thread Christophe Pettus

On Sep 27, 2013, at 2:56 PM, Anssi Kääriäinen <anssi.kaariai...@thl.fi> wrote:

>   1. What to do if given DB doesn't support cascades in DB (sqlite at least, 
> no idea of MySQL)? Initial feeling is that Django should do the cascades in 
> Python code in these cases.

It would behave like the standard version, then, yes.

>   2. What to do if you have delete signals + db cascades set for given model? 
> Options are to do nothing at all, give a warning (manage.py check might be 
> able to do so) or raise an error in model validation.

If we document that the _DB variation doesn't fire signals, I believe that's 
sufficient.

>   3. A model definition like A -- db cascade -> B -- cascade in python -> C 
> is another problematic case. a_obj.delete() will cascade to B, but then that 
> deletion will fail because of C constraint not cascading. Again possibilities 
> are do nothing/warn/error

Interesting question.  I believe we can just document that it won't work 
properly, because in those DBs that support proper cascading behavior, what you 
get in the B -> C cascade will be an error.

>   4. A slight variation of above - generic foreign key cascades - here it 
> will be impossible to handle the cascades in DB (unless we want to write 
> custom triggers for this). And, the inconsistent state left behind will not 
> be spotted by the DB either as there aren't any constraints in the DB for 
> generic foreign keys. So, this is slightly worse than #3.

We can, of course, just disallow using the _DB variations for generic foreign 
keys.

>   5. Parent cascades: If you have model Child(Parent), then there will be 
> foreign key from child to parent, but not from parent to child. This means 
> that DB can't cascade child model deletion to the parent model. So, there is 
> again possibility for inconsistent state. So, if you have Child -- db cascade 
> -> SomeModel, and you delete somemodel instance then what to do to get the 
> Child's parent table data deleted?

Either:

(a) You disallow that.
(b) You allow it, but warn that if you delete the child, the parent is not 
cleaned up.

I lean towards (a).

--

The _DB variations should be considered something like .update and .raw; 
they're for performance benefits where you know what you are doing.  They don't need 
to solve every edge case.

--
-- Christophe Pettus
   x...@thebuild.com



Re: changing the on_delete=CASCADE default

2013-09-26 Thread Christophe Pettus

On Sep 26, 2013, at 3:28 PM, Christophe Pettus <x...@thebuild.com> wrote:
> Perhaps a CASCADE_DB and SET_NULL_DB options on on_delete?

And, to be clear, I *am* volunteering to take a go at this code, not just 
whine. :)

--
-- Christophe Pettus
   x...@thebuild.com



Re: changing the on_delete=CASCADE default

2013-09-26 Thread Christophe Pettus

On Sep 26, 2013, at 2:32 PM, Carl Meyer <c...@oddbird.net> wrote:
> We already provide the on_delete=DO_NOTHING option for people who want
> to push cascade handling to the database.

It's better than the previous situation, but the steps required to make this 
work make it a non-starter for any but the most trivial of projects.  I do, 
however, accept that we're painted into a corner with the current API.

I would strongly advocate for a way of doing this push, however: It's much more 
efficient for cascading without exotic additions such as signals.  The current 
way one has to do it has several problems:

1. You are, in essence, lying in your model about what is going to happen, by 
saying on_delete=DO_NOTHING and then doing something in the database itself.
2. Since Django creates the foreign key constraints and gives them 
unpredictable names, you have to write a very tedious, error-prone South 
migration to install the appropriate foreign key constraints, something that 
Django could very easily do.
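To illustrate point 2: because the autogenerated constraint name is unpredictable, a hand-written migration first has to discover it (e.g. by querying pg_constraint) and then replace it. A sketch of the DDL such a migration would run; all names here are illustrative:

```python
def cascade_fk_sql(table, column, ref_table, constraint_name):
    """Build DDL that swaps an existing foreign key constraint for one
    declared ON DELETE CASCADE.  The existing constraint name is not
    predictable and must be looked up first (e.g. in pg_constraint)."""
    return (
        'ALTER TABLE "{t}" '
        'DROP CONSTRAINT "{c}", '
        'ADD CONSTRAINT "{c}" FOREIGN KEY ("{col}") '
        'REFERENCES "{ref}" ("id") '
        'ON DELETE CASCADE'
    ).format(t=table, c=constraint_name, col=column, ref=ref_table)
```

In a South migration this string would be passed to db.execute() in forwards(), with a matching reversal; it is exactly the bookkeeping that Django could generate itself.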

Perhaps CASCADE_DB and SET_NULL_DB options on on_delete?

--
-- Christophe Pettus
   x...@thebuild.com



Re: changing the on_delete=CASCADE default

2013-09-26 Thread Christophe Pettus

On Sep 26, 2013, at 11:16 AM, Carl Meyer <c...@oddbird.net> wrote:
> I think either of these changes, but particularly the latter, is
> significant enough that it deserves a mention here before a decision is
> made.

It's a breaking change, so that's going to be a significant amount of upgrade 
work for existing applications.

I also think we *really* need to push execution of this functionality into the 
database rather than having the Django core do it, if we're going to be making 
more use of on_delete.

--
-- Christophe Pettus
   x...@thebuild.com



Re: Good practices for production settings

2013-03-17 Thread Christophe Pettus

On Mar 17, 2013, at 10:33 AM, Aymeric Augustin wrote:
> Would anyone like to review it?

I only had a chance for a once-over-lightly, but +1 to committing it from my 
pass; it looks very valuable.

--
-- Christophe Pettus
   x...@thebuild.com





Re: Switch to database-level autocommit

2013-03-04 Thread Christophe Pettus

On Mar 4, 2013, at 7:24 AM, Aymeric Augustin wrote:

> PostgreSQL and Oracle use the "read committed" ...

Sorry, replied too soon!

> The reasoning and the conclusion still stand.

Agreed.
--
-- Christophe Pettus
   x...@thebuild.com





Re: Switch to database-level autocommit

2013-03-04 Thread Christophe Pettus

On Mar 4, 2013, at 5:00 AM, Aymeric Augustin wrote:

> PostgreSQL and Oracle use the "repeatable read" isolation level by default.

Without explicitly changing it, PostgreSQL's default is READ COMMITTED.  Or are 
we setting it explicitly to REPEATABLE READ in the new model?

--
-- Christophe Pettus
   x...@thebuild.com





Re: Switch to database-level autocommit

2013-03-03 Thread Christophe Pettus

On Mar 3, 2013, at 10:13 AM, Aymeric Augustin wrote:

> In practice, the solution is probably called @xact. Applying it to each 
> public ORM function should do the trick. Therefore, I'd like to ask your 
> permission to copy it in Django. Technically speaking, this means relicensing 
> it from PostgreSQL to BSD.

Absolutely; it would be my honor.  Just contact me off-list and we can sort out 
the details.

--
-- Christophe Pettus
   x...@thebuild.com





Re: Switch to database-level autocommit

2013-03-03 Thread Christophe Pettus

On Mar 2, 2013, at 3:49 PM, Jacob Kaplan-Moss wrote:
> I'm with Aymeric: the current behavior is bad enough, and this is a
> big enough improvement, and the backwards-incompatibility is minor
> enough.

Right now, the only real example I've heard (there might be more) is:
1. The ORM generates multiple updating operations for a single API-level 
operation.
2. The developer did nothing to manage their transaction model (no decorator, 
no middleware), but,
3. Is relying on Django to provide a transaction in this case.

That situation does exist, but it does seem pretty edge-case-y.  Does it exist 
in any case besides model inheritance?  If not, could we have the ORM wrap 
those operations in a transaction in that particular case?
--
-- Christophe Pettus
   x...@thebuild.com





Re: Switch to database-level autocommit

2013-03-01 Thread Christophe Pettus

On Mar 1, 2013, at 4:48 AM, Aymeric Augustin wrote:
> Yay or nay?

+1.
--
-- Christophe Pettus
   x...@thebuild.com





Re: Database pooling vs. persistent connections

2013-02-28 Thread Christophe Pettus

On Feb 28, 2013, at 1:43 PM, David Cramer wrote:

> It is most definitely not an "error" to have less connections available than 
> workers, considering workers may serve different types of requests, and will 
> now persist the database connection even after that request has finished.

If you have more workers than database connections, you have either (a) 
over-configured the number of workers, which is generally a bad thing to do, or 
(b) you are accepting that you will at high-load points get refused 
connections.  I don't see either one as being correct.

--
-- Christophe Pettus
   x...@thebuild.com





Re: Database pooling vs. persistent connections

2013-02-28 Thread Christophe Pettus

On Feb 28, 2013, at 11:09 AM, David Cramer wrote:

> Immediately for anyone who has configured more workers than they have 
> Postgres connections (which I can only imagine is common among people who 
> havent setup infrastructure like pgbouncer) things will start blowing up.

If they have this configuration, it's an error.  The fact that the error is now 
surfacing doesn't make it a correct configuration.

--
-- Christophe Pettus
   x...@thebuild.com





Re: Database pooling vs. persistent connections

2013-02-28 Thread Christophe Pettus
One comment on the patch (which I generally approve of entirely):

It would be helpful to have a backend method that performs the "restore 
connection between uses" function, rather than just use connection.abort() (of 
course, the default implementation can use that).  For example, on PostgreSQL, 
ABORT; DISCARD ALL is the recommended way of resetting a connection, so being 
able to implement that would be great.

--
-- Christophe Pettus
   x...@thebuild.com





commit_on_success leaves incorrect PostgreSQL isolation mode?

2012-03-19 Thread Christophe Pettus
While exploring the Django transaction stuff (in 1.4rc1), I ran across the 
following behavior.  I use commit_on_success as the example here, but the other 
transaction decorators/context managers have the same issue.

It seems to me to be a bug, but I wanted to confirm this before I opened an 
issue.

The configuration is running Django using the psycopg2 backend, with 'OPTIONS': 
{ 'autocommit': True, }
Consider the following code:

from django.db import transaction, DEFAULT_DB_ALIAS, connections
from myapp.mymodels import X

x = X.objects.get(id=1)

print connections[DEFAULT_DB_ALIAS].isolation_level  # As expected, it's 0 here.

x.myfield = 'Foo'

with transaction.commit_on_success():
   x.save()
    print connections[DEFAULT_DB_ALIAS].isolation_level  # As expected, it's 1 here.

print connections[DEFAULT_DB_ALIAS].isolation_level  # It's now 1 here, but shouldn't it be back to 0?


The bug seems to be that the isolation level does not get reset back to 0, even 
when leaving connection management.  This means that any further operations on 
the database will open a new transaction (since psycopg2 will automatically 
open one), but this transaction won't be managed in any way.

The bug appears to be in 
django.db.backends.BaseDatabaseWrapper.leave_transaction_management; it calls 
the _leave_transaction_management hook first thing, but this means that 
is_managed() will return true (since the decorators call managed(True)), which 
means that _leave_transaction_management in the psycopg2 backend will not reset 
the transaction isolation level; the code in the psycopg2 backend seems to 
assume that it will be run in the new transaction context, not the previous one.

Or am I missing something?
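
A toy state machine (hypothetical names, not Django's actual code) makes the
ordering problem described above concrete: if the backend hook runs before the
managed flag is popped, the hook still observes the inner (managed) context and
skips the isolation-level reset.

```python
class ToyWrapper:
    """Sketch of the enter/leave ordering; 0 = autocommit, 1 = in-transaction."""
    def __init__(self):
        self.managed_stack = [False]
        self.isolation_level = 0

    def enter_transaction_management(self):
        self.managed_stack.append(True)
        self.isolation_level = 1

    def leave_buggy(self):
        self._backend_hook()      # hook still sees the managed flag set...
        self.managed_stack.pop()  # ...because the pop happens afterwards

    def leave_fixed(self):
        self.managed_stack.pop()  # pop first, so the hook observes the
        self._backend_hook()      # outer (unmanaged) context

    def _backend_hook(self):
        # stand-in for the backend's _leave_transaction_management
        if not self.managed_stack[-1]:
            self.isolation_level = 0

w = ToyWrapper()
w.enter_transaction_management()
w.leave_buggy()
print(w.isolation_level)  # 1, never reset, matching the behavior above

w = ToyWrapper()
w.enter_transaction_management()
w.leave_fixed()
print(w.isolation_level)  # 0, back to autocommit
```

Swapping the two statements in the leave path is all it takes to change the
observable outcome.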

--
-- Christophe Pettus
   x...@thebuild.com




Re: DoS using POST via hash algorithm collision

2011-12-29 Thread Christophe Pettus

On Dec 29, 2011, at 8:12 AM, Daniel Sokolowski wrote:

> So this would effect django because of the CSRF token check --- which 
> requires the hash to be regenerated before comparing it yes?

No, the problem is somewhat different.  The attacker constructs a POST request 
in which the field names are constructed to be a degenerate case of a hash 
table.  Since pretty much every web framework in existence (including Django) 
automatically takes the incoming POST fields and inserts them into a hash table 
(a Python dict being implemented as a hash table), the framework will grind 
through this degenerate case very, very slowly.
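
The effect is easy to reproduce without knowing Python's real string hash: the
sketch below uses a hypothetical key class whose instances all hash to the same
bucket, mimicking attacker-chosen field names, and shows dict insertion
degrading from roughly linear to quadratic.

```python
import time

class CollidingKey:
    """Every instance lands in the same hash bucket."""
    def __init__(self, n):
        self.n = n
    def __hash__(self):
        return 0  # all keys collide
    def __eq__(self, other):
        return self.n == other.n

def time_inserts(make_key, count):
    """Time inserting `count` keys into a fresh dict."""
    start = time.perf_counter()
    d = {}
    for i in range(count):
        d[make_key(i)] = i
    return time.perf_counter() - start

normal = time_inserts(lambda i: i, 1000)
degenerate = time_inserts(CollidingKey, 1000)
print(degenerate > normal)  # the colliding keys are dramatically slower
```

Each colliding insert has to probe the entire existing chain via __eq__, which
is where the quadratic blowup comes from.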

If I'm reading the paper correctly, it only applies to 32-bit Python 
implementations, as the 64-bit ones are not practically vulnerable to this 
attack.

It's an interesting result, but I'm not sure how much to be worried about it in 
the field.  A SlowLoris or similar attack would seem to be far more effective 
and less implementation-dependent.
--
-- Christophe Pettus
   x...@thebuild.com




Re: DecimalField model validation

2011-10-06 Thread Christophe Pettus

On Oct 6, 2011, at 9:29 PM, Tai Lee wrote:

> Why is ROUND_HALF_EVEN superior?

ROUND_HALF_EVEN is the standard when doing financial calculations, an extremely 
common use of DecimalField.
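
The difference is visible directly in the stdlib decimal module: banker's
rounding sends ties to the nearest even digit, so a long run of .5 ties doesn't
drift systematically upward the way ROUND_HALF_UP does.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

# Ties go to the nearest even digit under ROUND_HALF_EVEN.
even_25 = Decimal("2.5").quantize(Decimal("1"), rounding=ROUND_HALF_EVEN)
even_35 = Decimal("3.5").quantize(Decimal("1"), rounding=ROUND_HALF_EVEN)
# ROUND_HALF_UP always rounds the tie away from zero.
up_25 = Decimal("2.5").quantize(Decimal("1"), rounding=ROUND_HALF_UP)
print(even_25, even_35, up_25)  # 2 4 3
```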

--
-- Christophe Pettus
   x...@thebuild.com




Re: Curious (bug?) with db filter __in

2011-10-03 Thread Christophe Pettus

On Oct 3, 2011, at 11:31 AM, Cal Leeming [Simplicity Media Ltd] wrote:

> I can provide exact further info, but this was just a preliminary email to 
> see if this was expected behavior - or actually a bug??

You might try capturing the generated SQL and running it on a command line 
against the DB to see if the problem is in MySQL, or in the Django backend.

--
-- Christophe Pettus
   x...@thebuild.com




Extending templates dynamically

2011-05-06 Thread Christophe Pettus
Having had to do some PHP programming for the first time in a long time, I 
discovered that the Smarty template language has taken Django's template 
inheritance mechanism and adopted it wholesale in version 3:

http://www.smarty.net/docs/en/advanced.features.template.inheritance.tpl

Steal from the best! :)  One additional feature that they added was a dynamic 
way of doing {{ extends }}.  Rather than specifying the tag in the template 
source, the inheritance path can be specified directly in the render-equivalent 
call.  This has proven to be quite useful for those times that an inner 
template is used in multiple wrapper contexts.  Is this something that might be 
worth investigating in Django?  Looking at the Django source, the 
implementation seems quite straight-forward.

--
-- Christophe Pettus
   x...@thebuild.com




Re: Composite primary keys

2011-03-31 Thread Christophe Pettus

On Mar 21, 2011, at 12:20 PM, Jacob Kaplan-Moss wrote:

> I think we're talking slightly different concerns here: I'm mostly
> interested in the Python-side API, and to my eyes a composite field
> matches more closely what's happening on the Python side of things.

I agree 100%!  I think I'm just drawing a different conclusion from that point, 
which is that indexes are more metadata on the database rather than a critical 
part of the Python API: In an imaginary perfect database (like, say, the SQL 
spec envisions), we wouldn't need to talk about indexes at all.

The more I think about it, the less I like including this directly in the field 
declaration part of the model, including my Index type proposal.  It just 
doesn't seem to belong there.

What concerns me about composite fields is that they seem to be a lot of Python 
machinery just to accomplish the goal of allowing this annotation.  If they 
were super-useful in their own right, that would be one thing, but I'm not sure 
that I see the utility of them absent indexes and foreign keys.  I'm also 
bothered, perhaps excessively, about having two different ways of getting at 
the same field in the model just to support this.

So, another proposal:

In the foreign key case, just extending the ForeignKey syntax to allow for 
multiple related fields makes the most sense:

overThere = models.ForeignKey(OtherModel, to_field=('first_name', 'last_name', ))

For indexes on the table for the model, include the declaration in the Meta 
class, since that's the obvious place to stick indexing:

class SomeModel:

    class Meta:
        primary_key = 'some_field'
        indexes = ['some_field', 'some_other_field', ('field1', '-field2', ), ]
        raw_indexes = [ 'some_invariant_function(some_field)' ]

(This was proposed by someone else, and isn't original to me; apologies that I 
can't find the email to give credit.)

Of course, the existing syntax would still work as a shortcut for primary_key 
and indexes.

Thoughts?

--
-- Christophe Pettus
   x...@thebuild.com




Re: State of X-Sendfile support?

2011-03-28 Thread Christophe Pettus

On Mar 28, 2011, at 9:40 AM, Jacob Kaplan-Moss wrote:

> If I've got that wrong, you need to explain to me (and
> anyone else) why uploads and downloads belong together in the same
> patch and why a simple "just support X-Sendfile and friends" patch
> can't possibly work.

+1.  It's entirely possible my brain is three sizes too small, but I don't see 
the obvious correlation between X-Sendfile (speaking generically) and the other 
features in the patch.  This is a very, very useful feature, but not one that 
has an obvious home in the Django core, especially given the varying 
implementations for specific environments.

--
-- Christophe Pettus
   x...@thebuild.com




Re: Composite primary keys

2011-03-21 Thread Christophe Pettus
I'd like to make one more pitch for a slightly different implementation here.  
My concern with CompositeField isn't based on the fact that it doesn't map 
one-to-one with a field in the table; it's that it doesn't have any of the 
semantics that are associated with a field.  In particular, it can't be:

- Assigned to.
- Iterated over.
- Or even have a value.

My suggestion is to create an Index type that can be included in a class just 
like a field can.  The example we've been using would then look like:

class Foo(Model):
   x = models.FloatField()
   y = models.FloatField()
   a = models.ForeignKey(A)
   b = models.ForeignKey(B)

   coords = models.CompositeIndex((x, y))
   pair = models.CompositeIndex((a, b), primary_key=True)

We could have FieldIndex (the equivalent of the current db_index=True), 
CompositeIndex, and RawIndex, for things like expression indexes and other 
things that can be specified just as a raw SQL string.

I think this is a much better contract to offer in the API than one based on 
field which would have to throw exceptions left and right for most of the 
common field operations.




Re: Composite primary keys

2011-03-16 Thread Christophe Pettus

On Mar 16, 2011, at 9:13 AM, Carl Meyer wrote:

> I'm not expressing an opinion one way or another on composite primary
> key syntax, but I don't agree here that a Django model "field" must
> map one-to-one to a database column.

That's fair, but a composite index lacks some of the characteristics of a field 
(assignability, for example).  Most DBs don't have functions that explicitly 
iterate over indexes, so such a thing isn't really readable, either.

It might be appealing to have a models.Index base class that represents an 
index on a table, and have db_index=True be a shortcut to creating one.  That 
might be more machinery than we want just for composite primary keys though.

--
-- Christophe Pettus
   x...@thebuild.com




Re: Composite primary keys

2011-03-16 Thread Christophe Pettus

On Mar 16, 2011, at 2:24 AM, Johannes Dollinger wrote:

> I would be nice if support for composite primary keys would be implemented as 
> a special case of general composite fields.

It's appealing, but the reality is that no existing back-end actually has such 
an animal as a composite field.  In all of these cases, what we're really 
creating is a composite index on a set of standard fields.  Introducing a more 
powerful index-creation syntax into Django isn't a bad idea, but we shouldn't 
call it a "field" if it is not.

--
-- Christophe Pettus
   x...@thebuild.com




Re: Composite primary keys

2011-03-15 Thread Christophe Pettus

On Mar 15, 2011, at 5:06 PM, Russell Keith-Magee wrote:

> And if you mark
> multiple fields, then you have a composite primary key composed of
> those fields.

A concern here is that composite indexes, like unique, are sensitive to the 
ordering of the fields, which means that the ordering of the fields in the 
class declaration becomes important.  That could, potentially, be surprising.

--
-- Christophe Pettus
   x...@thebuild.com




Re: #14733: A vote in favor of no validation of .raw() queries

2011-03-14 Thread Christophe Pettus

On Mar 12, 2011, at 12:56 PM, Jacob Kaplan-Moss wrote:

> Christophe, can you write a patch including a new warning to put in the docs?

All set: http://code.djangoproject.com/ticket/14733

--
-- Christophe Pettus
   x...@thebuild.com




Re: #14733: A vote in favor of no validation of .raw() queries

2011-03-12 Thread Christophe Pettus

On Mar 11, 2011, at 8:20 PM, Jacob Kaplan-Moss wrote:
> I'd be interested in your thoughts on that: is
> there a way we can prevent folks from shooting themselves in the foot
> this way, or do you think trying itself is futile?

There's no practical way of doing it without doing some kind of 
backend-specific SQL parsing, and that is a low-margin, high-expense business 
to be in.  I'm in favor of not creating any more foot-gun scenarios than we 
need to, but I'm with Russ: The .raw() interface is, by design, one designed 
for people who claim to know what they are doing, so let's just get out of 
their way and let them get on with it.

--
-- Christophe Pettus
   x...@thebuild.com




#14733: A vote in favor of no validation of .raw() queries

2011-03-09 Thread Christophe Pettus
Hi,

I'd like to offer a vote in favor of accepting the original patch to #14733, 
which removes the validation of the query done in a .raw() operation on a 
QuerySet.

The current situation is that Django requires that any query passed in begin 
with the literal string "SELECT", under the theory that only things beginning 
with SELECT return results ("set-returning operations").  This isn't correct.

In PostgreSQL, as it stands right now, operations which return sets can begin 
with:

SELECT
FETCH
INSERT
WITH
TABLE

(I may have missed some.)

This list isn't static, either; DO might well return sets in the future, 
although it doesn't right now.  And, of course, the exact list of the things 
that can return sets is backend-specific; that's just PG's list.

Given that .raw() is very much a "You must know what you are doing" feature in 
the first place, I don't see the need to be strict about the input, at the cost 
of some very useful functionality.
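
A plain-Python sketch of the check at issue (the real validation lives inside
the ORM; these function names are illustrative only): a SELECT-only prefix test
rejects PostgreSQL statements that legitimately return rows, and even a
PG-aware prefix list stays backend-specific and incomplete, which is the
argument for dropping the check entirely.

```python
# Prefixes of set-returning statements in PostgreSQL, per the list above.
SET_RETURNING_PREFIXES = ("SELECT", "FETCH", "INSERT", "WITH", "TABLE")

def select_only_check(sql):
    """The overly strict rule: only queries starting with SELECT pass."""
    return sql.lstrip().upper().startswith("SELECT")

def pg_aware_check(sql):
    """A looser rule; still a moving target, as the post argues."""
    return sql.lstrip().upper().startswith(SET_RETURNING_PREFIXES)

query = "WITH t AS (SELECT 1 AS x) SELECT x FROM t"
print(select_only_check(query), pg_aware_check(query))  # False True
```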

--
-- Christophe Pettus
   x...@thebuild.com




Re: Transaction Documentation Clarification

2011-02-17 Thread Christophe Pettus

On Feb 17, 2011, at 8:33 AM, Silvio wrote:
> When using @transaction.commit_manually, one needs to ROLLBACK or
> COMMIT, otherwise the transaction handler will raise the
> TransactionManagementError error. That much is clear. But does this
> mean the *entire* view needs to be wrapped in a massive "try/except/
> else" block?

Essentially, yes.  You've diagnosed it exactly: If an exception escapes a view 
function with manual transaction management, and a transaction is left open, 
the exception that escaped will be discarded, and a TransactionManagementError 
exception thrown instead.

--
-- Christophe Pettus
   x...@thebuild.com




Re: r13363 change to use pg_get_serial_sequence

2010-12-23 Thread Christophe Pettus

On Dec 23, 2010, at 11:35 AM, Eric wrote:
> a) To fix this, one must identify the sequences that are not correct.
> I scoured pg_catalog and friends and cannot identify where PostgreSQL
> exposes the link between the "id" and sequence columns.

Just FYI, it's stored in pg_depend.

--
-- Christophe Pettus
   x...@thebuild.com




Re: RFC #9964 - fix "missing" db commits by forcing managed transactions to close

2010-12-21 Thread Christophe Pettus

On Dec 21, 2010, at 11:39 AM, Jacob Kaplan-Moss wrote:
> Unless there are objections, I'm going to accept this approach and
> check in a change based on Shai's latest -bugfix patch.

FWIW, +1.

--
-- Christophe Pettus
   x...@thebuild.com




Re: ForeignKey with null=True

2010-12-16 Thread Christophe Pettus

On Dec 16, 2010, at 2:31 PM, Luke Plant wrote:
> That being so, there is a case for arguing that
> ForeignRelatedObjectsDescriptor should not retrieve objects where the
> field pointed to is NULL - for consistency with the inverse operation.

I agree with this.  If the FK field is NULL, it should never return related 
objects; ditto for the reverse situation.

--
-- Christophe Pettus
   x...@thebuild.com




Re: ForeignKey with null=True

2010-12-16 Thread Christophe Pettus

On Dec 16, 2010, at 11:14 AM, Luke Plant wrote:
> This isn't true if the field pointed to (i.e. primary key by default)
> allows NULL values - in that case a ForeignKey field with a NULL value
> can and should return a non-empty set of values when the related objects
> lookup is done.

If I'm understanding your point, this isn't true; NULL does not match NULL on a 
join.
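
This is standard SQL comparison semantics, and it can be demonstrated with any
SQL engine; sqlite3 stands in here for the real backend.

```python
import sqlite3

# NULL = NULL evaluates to unknown in SQL, so NULL keys never join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (pk INTEGER);
    CREATE TABLE child  (fk INTEGER);
    INSERT INTO parent VALUES (NULL), (1);
    INSERT INTO child  VALUES (NULL), (1);
""")
rows = conn.execute(
    "SELECT parent.pk, child.fk FROM child JOIN parent ON child.fk = parent.pk"
).fetchall()
print(rows)  # [(1, 1)] -- the NULL rows on either side never pair up
```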

--
-- Christophe Pettus
   x...@thebuild.com




Fetching results of a query set

2010-12-11 Thread Christophe Pettus
Hi,

I've been spelunking through the 1.2.3 Model code, and wanted to see if someone 
more familiar with that code than I could answer a question.

In the case of returning the results of a query set, it appears that for most 
back ends Django reads the results from the cursor in units of 
GET_ITERATOR_CHUNK_SIZE (which is hard coded right now to be 100).  So, in the 
case of using .iterator() (no caching of results), it shouldn't have more than 
100 result objects in memory at once, unless the client of the query set is 
saving them.  Am I reading it correctly?
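
The pattern being described can be sketched against any DB-API cursor; sqlite3
stands in for the real backend below, and GET_ITERATOR_CHUNK_SIZE is the value
cited above.

```python
import sqlite3

GET_ITERATOR_CHUNK_SIZE = 100  # the hard-coded value cited above

def iter_chunked(cursor, chunk_size=GET_ITERATOR_CHUNK_SIZE):
    """Yield rows while holding at most chunk_size of them in memory at once."""
    while True:
        rows = cursor.fetchmany(chunk_size)
        if not rows:
            return
        yield from rows

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(250)])
cur = conn.execute("SELECT id FROM t ORDER BY id")
total = sum(1 for _ in iter_chunked(cur))
print(total)  # 250
```

Unless the caller accumulates the yielded rows, only one fetchmany() batch is
alive at a time, which matches the reading of the Model code above.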

--
-- Christophe Pettus
   x...@thebuild.com




Re: Purpose of constant_time_compare?

2010-12-08 Thread Christophe Pettus

On Dec 8, 2010, at 12:08 PM, Jonas H. wrote:
> Can the time spent in *one single string comparison* really make such a huge 
> difference?

Yes.

http://codahale.com/a-lesson-in-timing-attacks/
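
The idea behind such a function is to fold every character pair into the
result, so the comparison takes the same time regardless of where the first
mismatch occurs. A minimal sketch of that idiom (not necessarily Django's exact
implementation):

```python
import hmac

def constant_time_compare(val1, val2):
    """Compare two strings without short-circuiting on the first mismatch."""
    if len(val1) != len(val2):
        return False
    result = 0
    for x, y in zip(val1, val2):
        result |= ord(x) ^ ord(y)  # accumulate differences; no early exit
    return result == 0

assert constant_time_compare("s3cret", "s3cret")
assert not constant_time_compare("s3cret", "s3creT")
# Modern Python ships a vetted version of the same idea:
assert hmac.compare_digest(b"s3cret", b"s3cret")
```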

--
-- Christophe Pettus
   x...@thebuild.com




Re: .limit() on a QuerySet

2010-11-29 Thread Christophe Pettus

On Nov 29, 2010, at 12:50 PM, Ivan Sagalaev wrote:
> Looks like you're indeed missing queryset slicing[1]. It is lazy.
> 
> [1]: 
> http://docs.djangoproject.com/en/dev/topics/db/queries/#limiting-querysets

Bah, I was confusing indexing (not lazy) and slicing (lazy).  Never mind, and 
thanks. :)

--
-- Christophe Pettus
   x...@thebuild.com




.limit() on a QuerySet

2010-11-29 Thread Christophe Pettus
Hi,

Before I put any work into this, I want to know if (a) I'm missing something 
super-obvious in the QuerySet functionality, or (b) this idea has already been 
explored and rejected.

Sometimes, it would be nice to get a slice of a QuerySet but *not* actually 
evaluate the QuerySet; instead, leave it unevaluated.  An example of this would 
be an implementation of blog navigation:

context['previous_entry'] = Entry.objects.filter(entry_date__lt=current.entry_date).order_by('-entry_date')[0]
context['next_entry'] = Entry.objects.filter(entry_date__gt=current.entry_date).order_by('entry_date')[0]

This works fine, but it grabs the relevant object immediately.  It would be 
handy to have a syntax that continued to defer the execution of the query, in 
case (for example) the navigation is cached by a template fragment {% cache %} 
tag.  Something like:

context['previous_entry'] = Entry.objects.filter(entry_date__lt=current.entry_date).order_by('-entry_date').limit(limit=1, offset=0)
context['next_entry'] = Entry.objects.filter(entry_date__gt=current.entry_date).order_by('entry_date').limit(limit=1, offset=0)

Then, in the template, {{ previous_entry.get }} could be used to fetch the 
result.

Thoughts?
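For the record, QuerySet slicing already provides the deferred behavior asked for here. The toy class below is a stand-in for a QuerySet (not Django code) that mimics the relevant semantics: a slice merely narrows the query without executing it, while integer indexing evaluates immediately:

```python
class LazySeq:
    """Toy QuerySet stand-in: slicing stays lazy, indexing evaluates."""
    def __init__(self, fetch, lo=0, hi=None):
        self._fetch = fetch            # callable that "hits the database"
        self._lo, self._hi = lo, hi

    def __getitem__(self, key):
        if isinstance(key, slice):     # narrow the window; no query yet
            lo = self._lo + (key.start or 0)
            hi = self._lo + key.stop if key.stop is not None else self._hi
            return LazySeq(self._fetch, lo, hi)
        return self._fetch()[self._lo:self._hi][key]  # evaluate now

calls = []
def fetch():
    calls.append(1)
    return list(range(10))

previous = LazySeq(fetch)[:1]   # like qs.filter(...)[:1] -- still lazy
assert calls == []              # nothing has been executed yet
assert previous[0] == 0         # template access triggers the "query"
assert calls == [1]
```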

--
-- Christophe Pettus
   x...@thebuild.com




#12180: Test case advice

2010-11-28 Thread Christophe Pettus
Hi,

I'm updating the patch for #12180 to work with the dev version of 1.3, and 
preparing a test case for it.  Being new to the Django test suite, it's not 
clear to me how to introduce a backend-specific and settings-file-specific test 
(the test case requires PostgreSQL 8.2+ and AUTOCOMMIT: True in the database 
options).  Is there some quick guidance from those more experienced than me?

--
-- Christophe Pettus
   x...@thebuild.com




Re: Pluggable encryption for django auth (design proposal)

2010-11-28 Thread Christophe Pettus

On Nov 28, 2010, at 10:26 AM, Tom X. Tobin wrote:
> No, I'm not thinking of rainbow tables.  The key word here is
> *single*.  As I said before, a salt *does* help against an attacker
> trying to brute-force multiple passwords from your database, since he
> can't simply test each brute-force result against all your passwords
> at once; he has to start all over from scratch for every single
> password that has a different salt.  If he only cares about one
> *particular* account, the salt doesn't help, no.

Even in your scenario, it only helps as much as the entropy in the password 
selection.  If everyone has a unique password, it doesn't help at all 
(admittedly unlikely).  Again, it's a linear benefit, but not an exponential 
one.

Right.  So, about that proposal... :)

--
-- Christophe Pettus
   x...@thebuild.com




Re: Pluggable encryption for django auth (design proposal)

2010-11-28 Thread Christophe Pettus

On Nov 27, 2010, at 10:29 PM, Tom X. Tobin wrote:
> The point is that I'm *not* assuming hardware of equivalent speed.
> I'm assuming that a worst-case attacker has hardware significantly
> faster than your webserver at their disposal, so I was curious if the
> purported benefit still held in that case.  Maybe it does; I don't
> know.

Well, yes, it does, for exactly the reason described: The application has to 
encode exactly one password; the attacker has to try billions in order to 
brute-force one.  If you assume, say, one password per week is the slowest 
practical attack, and if it takes 10ms to hash one password, the attacker's 
hardware has to be about 46,654 times more powerful than your web server.
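That speedup figure checks out, up to rounding, if the password space is assumed to be 36^8 (eight lowercase alphanumeric characters, matching the ~2.8e12 figure used elsewhere in this thread):

```python
space = 36 ** 8                  # 8-char lowercase-alphanumeric passwords
per_hash = 0.010                 # 10 ms per candidate hash
week = 7 * 24 * 3600             # seconds in one week

# To exhaust the space in one week, the attacker needs this many times
# the throughput of a server that spends 10 ms per login check.
speedup = space * per_hash / week
assert 46_000 < speedup < 47_000   # roughly 46,600x, within rounding
```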

> I'm not arguing that a salt helps against brute-forcing a *single*
> password (it doesn't), but it does in fact help against someone trying
> to brute-force your entire password database (or any subset of more
> than one password), since each password with a different salt lies
> within an entirely different space that must be brute-forced
> separately from the rest.

I'm not sure what you mean by the "space"; I think you are thinking of a 
rainbow dictionary attack, where the hashes are precomputed; a salt does indeed 
help (and probably blocks) that kind of attack.  In the case of a straight 
brute-force attack or a standard dictionary attack without precomputing, the 
only benefit of the salt is that it makes computing the candidate hash a bit 
longer, based on the length of the salt.  It's a trivial amount of time.

Remember, it's extremely inexpensive to brute-force a single MD5 or SHA1 hash, 
and the salt does not make it appreciably more expensive.  If a CUDA 
application can brute force 700 million MD5s per second, doubling the length is 
not really going to make it any more secure.

--
-- Christophe Pettus
   x...@thebuild.com




Re: Pluggable encryption for django auth (design proposal)

2010-11-27 Thread Christophe Pettus
I wrote:
> A dictionary attack works by consulting a precomputed set of passwords and 
> their hashes, (pwd, hash(pwd)).  The attacker then runs down the dictionary, 
> comparing hashes; if they get a hit, they know the password.  The salt 
> defeats this by making the pwd -> hash(pwd) mapping incorrect.

I'm being slightly inaccurate here; what I'm describing above is a rainbow 
dictionary attack, rather than just a plain dictionary attack (which is a brute 
force attempt on the password over a limited range of input values).  Anyway, a 
salt isn't helpful for a plain dictionary attack, either, for the same reason 
as a brute force attack.

Anyway, back to the discussion of the actual proposal. :)
--
-- Christophe Pettus
  x...@thebuild.com




Re: Pluggable encryption for django auth (design proposal)

2010-11-27 Thread Christophe Pettus

On Nov 27, 2010, at 9:01 PM, Tom X. Tobin wrote:
> But how far are you willing to go in your assumption of the worst-case
> computational ability of your attacker?  Would tuning the hash to
> (say) a 10ms delay for your web server's modest hardware translate
> into a significant delay for an attacker with far more resources?
> (This isn't a rhetorical question; I honestly don't know.)

Let's do the math.  The space of eight alphanumeric character passwords is 
2.8e12.  Even assuming you can cut two orders of magnitude off of that with 
good assumptions about the kind of passwords that people are picking, this 
means that the attacker has to run about 28 billion times more computations 
than you do.  At 10ms per password, it would take them about 447.8 years to 
crack a single password, assuming hardware of equivalent speed.
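Working those numbers explicitly (2.8e12 is 36^8, i.e. lowercase alphanumerics): the 447.8-year figure corresponds to searching half the full space with no reduction; applying the two-order-of-magnitude reduction brings it down to roughly nine years, still far beyond practicality:

```python
space = 36 ** 8                 # 8-char lowercase-alphanumeric passwords
reduced = space // 100          # two orders of magnitude of good guessing
per_hash = 0.010                # 10 ms per candidate
year = 365.25 * 24 * 3600       # seconds per year

full_avg_years = (space / 2) * per_hash / year   # average case: half the space
reduced_years = reduced * per_hash / year

assert 440 < full_avg_years < 450   # ~447 years, the figure in the email
assert 8 < reduced_years < 10       # ~9 years even with the reduction
```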

> It does in fact slow down brute force attacks against multiple
> encrypted passwords; each password with a different salt is within an
> entirely different space that needs to be brute forced separately from
> the other passwords.

Remember how a brute force attack works.  Given a hash x, the attacker does:

hash('0000' + salt) = x? No, then,
hash('0001' + salt) = x? No, then,
...

The only benefit of the salt here is that it makes the string to be hashed a 
bit longer, but the benefit is linear, not exponential.

A dictionary attack works by consulting a precomputed set of passwords and 
their hashes, (pwd, hash(pwd)).  The attacker then runs down the dictionary, 
comparing hashes; if they get a hit, they know the password.  The salt defeats 
this by making the pwd -> hash(pwd) mapping incorrect.
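The loop described above, as a sketch (toy parameters for illustration): the salt is stored next to the hash and therefore known to the attacker, so it is simply concatenated into each candidate; it adds a constant amount of hashing work per guess rather than enlarging the space the attacker must search:

```python
import hashlib
from itertools import product

def brute_force(target_hash, salt, alphabet="01", length=4):
    # Try every candidate in the space.  Including the (known) salt only
    # makes each individual hash call slightly longer -- a linear cost,
    # not an exponential one.
    for chars in product(alphabet, repeat=length):
        pwd = "".join(chars)
        if hashlib.sha1((pwd + salt).encode()).hexdigest() == target_hash:
            return pwd
    return None

salt = "s8x1"
target = hashlib.sha1(("0110" + salt).encode()).hexdigest()
assert brute_force(target, salt) == "0110"
```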
--
-- Christophe Pettus
   x...@thebuild.com




Re: Pluggable encryption for django auth (design proposal)

2010-11-27 Thread Christophe Pettus

On Nov 27, 2010, at 8:05 PM, Tom X. Tobin wrote:

> Your application ends up just
> as hobbled by such an algorithm as a potential attacker.

Actually, no, the situations are really quite asymmetrical.  In order to 
brute-force a password, an attacker has to be able to try many, many thousands 
of combinations per second.  To log in a user, an application has to do it 
exactly once.  A hash computation time of, say, 10ms is probably unnoticeable 
in a login situation, unless you have tens of thousands of users logging in per 
minute (and if this is the case, then you probably have other problems than the 
speed of your password hash algorithm).  But that would pretty much slam the 
door down on any brute force attempt at a password recovery.

> Django already salts the hashes, which is
> asymmetrical in a good way: it helps complicate brute force attacks
> without slowing down Django's ability to test a given password.

A salt is of no benefit against a brute force attack; its function is to 
prevent dictionary attacks, which are a different animal.

And if you are willing to assume that no attacker can ever get access to your 
database, then you don't have to hash the password at all.

But, as you point out, that's a separate discussion from the value of pluggable 
encryption algorithms.  There was a time that MD5 was the perfect answer; now, 
it's SHA-1.  Different applications will have different needs as far as how 
they write the passwords to disk, and having an architecture to handle this 
seems like a good idea.

--
-- Christophe Pettus
   x...@thebuild.com




Pluggable encryption for django auth (design proposal)

2010-11-27 Thread Christophe Pettus
Hi, all,

Right now, Django's auth system pretty much uses sha1 hardwired in (literally, 
in the case of User.set_password) for the hash.  For a discussion of why a 
general-purpose hash function is not the best idea in the world for password 
encryption, see:

http://codahale.com/how-to-safely-store-a-password/

I'd like to propose a backwards-compatible method of allowing different hash 
algorithms to be used, while not adding new dependencies on external libraries 
to the core.

1. Add a setting DEFAULT_PASSWORD_HASH.  This contains the code for the 
algorithm to use; if it is absent, 'sha1' is assumed.

2. Add a setting PASSWORD_HASH_FUNCTIONS.  This is a map of algorithm codes to 
callables; the callable has the same parameters as auth.models.get_hexdigest, 
and returns the hex digest of its parameters (to allow a single function to 
handle multiple algorithms, the algorithm parameter to get_hexdigest is 
retained).  For example:

PASSWORD_HASH_FUNCTIONS = { 'bcrypt': 
'myproject.myapp.bcrypt_hex_digest' }

3. auth.models.get_hexdigest is modified such that if the algorithm isn't one 
of the ones it knows about, it consults PASSWORD_HASH_FUNCTIONS and uses the 
matching function, if present.  If there's no match, it fails as it does 
currently.

4. User.set_password() is modified to check the value of DEFAULT_PASSWORD_HASH, 
and uses that algorithm if specified; otherwise, it uses 'sha1' as it does now. 
 (Optional: Adding the algorithm as a default parameter to User.set_password().)

Comments?
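A minimal sketch of the dispatch in steps 1-3, using the setting names from the proposal; the salt default and the simplified hashing here are illustrative assumptions, not Django's actual code:

```python
import hashlib

# Steps 1 and 2: settings as the proposal describes them.
DEFAULT_PASSWORD_HASH = "sha1"
PASSWORD_HASH_FUNCTIONS = {}   # e.g. {'bcrypt': my_bcrypt_hexdigest}

def get_hexdigest(algorithm, salt, raw_password):
    # Step 3: known algorithms first, then the pluggable registry,
    # then fail as get_hexdigest does today.
    if algorithm in ("md5", "sha1"):
        h = hashlib.new(algorithm)
        h.update((salt + raw_password).encode())
        return h.hexdigest()
    if algorithm in PASSWORD_HASH_FUNCTIONS:
        return PASSWORD_HASH_FUNCTIONS[algorithm](algorithm, salt, raw_password)
    raise ValueError("Got unknown password algorithm type in password.")

def set_password(raw_password, salt="abc123"):
    # Step 4: honor DEFAULT_PASSWORD_HASH instead of hardwired 'sha1'.
    algo = DEFAULT_PASSWORD_HASH
    return "%s$%s$%s" % (algo, salt, get_hexdigest(algo, salt, raw_password))

assert set_password("hunter2").startswith("sha1$abc123$")
```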

--
-- Christophe Pettus
   x...@thebuild.com




Re: RFC #9964 - fix "missing" db commits by forcing managed transactions to close

2010-11-25 Thread Christophe Pettus

On Nov 25, 2010, at 7:46 AM, Russell Keith-Magee wrote:

> We need to declare that the current behavior to be a bug. We can
> break backwards compatibility to correct behavior that is clearly
> wrong. I haven't fully thought through the consequences here, but I
> think the combination of the footprint of affected cases, combined
> with the side effects of having dangling connections and transactions
> might almost be enough to convince me that can invoke the bug clause
> to break backwards compatibility. If you (or anyone else) has any
> opinions on this, I'd be interested in hearing them.

I'd definitely argue that the current behavior is a bug.  In the case (not the 
least bit unusual) of Django applications connecting to PostgreSQL through a 
connection pooler (usually pg_bouncer), it's pretty common to see Idle in 
Transaction connections start piling up because of this problem.  Thus, 
real-world consequences arise from this, but I'd also argue it is a bug even 
just considering the behavior within Django.

More below...

> To be clear -- as I understand it, we're talking about any code that:
> 
> * is in a transaction managed block (i.e., between manually invoked
> enter/leave_transaction_management() calls, or within the scope of a
> commit_manually decorator/context manager), and
> 
> * has a select or other manual cursor activity *after* the last
> commit/rollback, but *before* the end of the transaction management
> block, when there hasn't been a model save or other 'dirtying'
> behavior invoked after the last commit/rollback
> 
> At present, such code is allowed to pass, and the transaction dangles.
> The proposed change would declare this situation a bug, requiring a
> manual commit/rollback at the end of any database activity.

That's correct, with one caveat below.

My argument for this being a bug is that the behavior is indeterminate, and the 
most likely behavior is (in my view) surprising.

Right now, the transaction will stay open until the connection closes; this 
will probably cause a rollback at the database, but it's not a promise, simply 
the way the database happens to work.  This is both relying on a pretty gritty 
level of implementation, and doing a rollback rather than a commit is, I'd 
argue, surprising for the most typical cases.

The caveat is that there is also a behavior change to code which uses 
@commit_on_success (or creates similar behavior), does not do anything to cause 
is_dirty to be set, modifies the database in some other way, and then *relies* 
on the typical rollback behavior.  In the proposed fix, such transactions will 
commit instead.  This is, however, pretty much the same as relying on 
uninitialized values in memory to be consistent, and I don't see any reason not 
to declare such code buggy.

I've gotten around this by having my own version of commit_on_success that 
always commits rather than checking the is_dirty flag, but it would be great to 
have this in the core.
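The workaround mentioned above can be sketched as follows, with a toy transaction object standing in for django.db.transaction (names and behavior simplified; this is not the actual Django API):

```python
# Toy stand-in for the transaction machinery, to show the shape of an
# always-committing commit_on_success -- not Django's real module.
class FakeTransaction:
    def __init__(self):
        self.log = []
    def commit(self):
        self.log.append("commit")
    def rollback(self):
        self.log.append("rollback")

transaction = FakeTransaction()

def commit_on_success_always(func):
    # Unlike the is_dirty-checking original, this commits even for
    # read-only work, closing the transaction that the read opened.
    def wrapper(*args, **kwargs):
        try:
            result = func(*args, **kwargs)
        except Exception:
            transaction.rollback()
            raise
        transaction.commit()     # no is_dirty check: always commit
        return result
    return wrapper

@commit_on_success_always
def read_only_view():
    return "rows"               # a plain SELECT would not set is_dirty

assert read_only_view() == "rows"
assert transaction.log == ["commit"]
```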
--
-- Christophe Pettus
   x...@thebuild.com




Ticket 9964 (was Re: Why does transaction management only commit on is_dirty?)

2010-10-23 Thread Christophe Pettus

On Oct 22, 2010, at 4:01 PM, Jacob Kaplan-Moss wrote:
> It's a bug: http://code.djangoproject.com/ticket/9964.
> 
> Looks like the patch there is OK, but still needs some work (there's a
> couple of TODOs still).

Looking at the history of the ticket, it looks like there is some concern about 
keeping the "current behavior" to maintain backwards compatibility.

Which raises the question: Just what is the current behavior that we'd like to 
preserve?  The current situation seems to be quite indeterminate; the 
transaction just stays open until it is closed... somehow, by some means 
(probably a rollback when the connection closes).  Is this really 
explicit-enough behavior that maintaining it is important?  Are applications 
really relying on it?

--
-- Christophe Pettus
   x...@thebuild.com




Re: Why does transaction management only commit on is_dirty?

2010-10-22 Thread Christophe Pettus

On Oct 22, 2010, at 4:01 PM, Jacob Kaplan-Moss wrote:
> It's a bug: http://code.djangoproject.com/ticket/9964.
> 
> Looks like the patch there is OK, but still needs some work (there's a
> couple of TODOs still).

On it! :)

--
-- Christophe Pettus
   x...@thebuild.com




Why does transaction management only commit on is_dirty?

2010-10-22 Thread Christophe Pettus
Why does transaction management only commit on is_dirty?

I realize that the answer to this question is, "Why commit if there are no 
changes to the database?", but it's a bit more complicated than that.

Let's assume a reasonably common case:

1. psycopg2 + PostgreSQL.
2. Transaction middleware enabled, or @commit_on_success decorator.
3. Read-only view function.

In this case, psycopg2 will begin a transaction on the first database read, but 
no COMMIT will ever be sent to the database.  Until the connection actually 
closes, this means that the connection will be in the idle-in-transaction state 
on the PostgreSQL server, an expensive state to be in.  (A pile-up of 
idle-in-transaction connections is a pretty common occurrence in Django 
applications, in my experience.)

It seems that this problem is trivially fixed by always committing in the 
places that a commit depends on is_dirty().  The commit should be near-free on 
any back-end, since there are no changes, and it eliminates the IIT issue on 
PostgreSQL (and perhaps other problems on MySQL that I'm not as familiar with).

What am I missing?
--
-- Christophe Pettus
   x...@thebuild.com




Re: Patch uploaded for ticket 12180

2010-01-31 Thread Christophe Pettus


On Jan 31, 2010, at 7:10 PM, Russell Keith-Magee wrote:

> However, as noted in [1], you don't need to post to django-dev just to
> tell us you uploaded a patch.


OK, sorry for the noise.  Since I had uploaded the patch on 11/6/09, I  
wanted to make sure I hadn't missed a step in the review process.


Thanks!
--
-- Christophe Pettus
   x...@thebuild.com




Patch uploaded for ticket 12180

2010-01-31 Thread Christophe Pettus
I've uploaded a patch for this ticket, which fixes a somewhat obscure  
problem:


There is a bug in the handling of  
InsertQuery.connection.features.can_return_id_from_insert, which causes  
Django 1.1.1 to throw a ProgrammingError exception when inserting a new  
object/record into the database, using PostgreSQL 8.4.1 via psycopg2, if  
that INSERT is the first thing done by a view on a particular connection  
to the database and DATABASE_OPTIONS autocommit: True is set.


Comments welcome, thanks!
--
-- Christophe Pettus
   x...@thebuild.com




Re: autocommit, INSERT... RETURNING and PostgreSQL 8.2+

2009-11-08 Thread Christophe Pettus


On Nov 8, 2009, at 8:39 AM, Seb Potter wrote:
> transaction pooling

Ah, of course.  Thank you!
--
-- Christophe Pettus
x...@thebuild.com





autocommit, INSERT... RETURNING and PostgreSQL 8.2+

2009-11-07 Thread Christophe Pettus

Greetings,

In looking around the code for the psycopg2 backend, it looks like  
autocommit is used, in part, as a checked assertion that the database  
being used is PG 8.2 or greater.  Comments lead me to believe that the  
reason that autocommit is limited to 8.2+ is that INSERT ... RETURNING  
was introduced into 8.2, and that syntax is required for correct  
operation while autocommit is True.  But I'm not sure I understand the  
reasoning; does anyone know why INSERT ... RETURNING is required in  
that case?

Thanks!
--
-- Christophe Pettus
x...@thebuild.com





#10509 (Better handling of database-specific information)

2009-11-07 Thread Christophe Pettus

Greetings,

As part of proposing a patch for a bug I filed (#12180), I ran across  
this ticket, and took the liberty of claiming it.  Since I'm  
relatively new to working on Django code proper, I wanted to start a  
discussion about possible approaches to solving these issues.

I see two basic philosophical approaches:

1. The "safety razor" approach:  Django checks the version of the  
database server software, and either adapts its functionality to it or  
(at least) provides an early and explicit error, even at the cost of  
some performance.

2. The "straight razor" approach:  Django accepts any statements the  
user code makes about the database software at face value, and  
maximizes performance, accepting that the user being wrong will result  
in an obscure error at some random point.

Some specific things I'd like to accomplish:

-- Allow Django to use the INSERT...RETURNING functionality of PG 8.2+  
even if autocommit isn't being used.
-- Get rid of the need to repeatedly call SELECT version().

Some other functionality that's not directly relevant to this ticket,  
but related and useful:

-- Allow for Serializable transactions.

Thoughts?
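For the "get rid of the need to repeatedly call SELECT version()" item, one obvious shape is to query the server once per connection and cache the parsed result. Sketched below with fake connection/cursor classes; the function name and caching scheme are assumptions for illustration, not Django's API:

```python
def get_server_version(connection, _cache={}):
    # Ask the server once per connection, then reuse the parsed tuple.
    key = id(connection)
    if key not in _cache:
        cur = connection.cursor()
        cur.execute("SELECT version()")
        text = cur.fetchone()[0]          # e.g. "PostgreSQL 8.4.1 on ..."
        _cache[key] = tuple(int(p) for p in text.split()[1].split("."))
    return _cache[key]

# Minimal fakes so the sketch runs without a real database.
class FakeCursor:
    def __init__(self, conn): self.conn = conn
    def execute(self, sql): self.conn.queries.append(sql)
    def fetchone(self): return ("PostgreSQL 8.4.1 on x86_64",)

class FakeConn:
    def __init__(self): self.queries = []
    def cursor(self): return FakeCursor(self)

conn = FakeConn()
assert get_server_version(conn) == (8, 4, 1)
assert get_server_version(conn) == (8, 4, 1)   # served from the cache
assert conn.queries == ["SELECT version()"]    # only one round-trip
```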
--
-- Christophe Pettus
x...@thebuild.com

