Re: Transaction APIs do not consult the DB router to choose DB connection

2021-06-01 Thread N Aditya
Hey Augustin,

Just to clarify, from before, you quoted:
> you can store the current HTTP request as a thread-local variable in your 
application and access it from anywhere, for example from a database router.

Also, from your previous message, you quoted
> As a consequence, the only way to make it return a different value at 
different times would be to depend on global variables. This is bad 
software engineering. It doesn't meet Django's standards.

I believe the final suggestion you made was as follows:
> I would recommend running without persistent database connections 
(CONN_MAX_AGE = 0) and switching settings.DATABASE["default"] in a 
middleware at the beginning of every request. Modifying settings at runtime 
isn't officially supported, but this looks like the closest option to the 
intent you expressed.

Below are a few things I'd like to clarify:

   1. Are you referring to thread-locals as `global state/variables` in 
   your previous message ? If so, why suggest something which you consider bad 
   practise ?
   2. As a consequence, if using thread-locals is considered bad software 
   engg, could you please give me a way by which I could pass on request 
   metadata to the router layer cleanly ?
   3. The final recommendation you gave is something that isn't officially 
   supported and *would fail when a threaded server or a fully async 
   request path* is used.
   (Even your favourite search engine might not be of much help here)


Based on your suggestions, I haven't arrived at a soln which can make this 
work.

> Just because something is easy to implement doesn't mean it gets added to 
Django

I missed to type in a couple of words which has created quite the spur 
around this. My apologies for that. What I really wanted to key in was this:

> I don't see any reason for why providing a hook seems so difficult *to 
implement*. 

I guess we all agreed on the fact that this would be fairly easy to 
implement. But I never implied/wanted to imply anything about adding this 
to Django just because the implementation was fairly easy.

Also, you quoted in your previous message:
> It also needs to be generally useful and well designed. Your proposal 
doesn't meet the second criterion and I'm skeptical about the first one.

I believe I've fairly stated my use-case. If you're stating that the 
proposal is not `generally useful` then I believe you're implying that 
routing of DB queries based on request metadata like domain, IP, location 
etc., may not be useful to most folks. I'm skeptical about this. 
Also, regd the `well designed` part, I believe we haven't arrived at a 
clean solution based on your recommendations either. So I believe we'll 
have to get this sorted. 

Also, you quoted:
> Four committers rejected your proposal and suggested alternatives

I don't think that apart from your suggestion, the other folks mentioned 
anything concrete along the lines of a work around. Let me know if I'm 
missing something here. 

To summarise, I've been trying to help from my end as well to try and make 
ends meet. I love using Django and I respect a lot of its design 
philosophies. I would like to be a contributor in the long run and do some 
exciting work. Let me knw how we can take this forward in the best possible 
way. 

Regards, 
Aditya N
On Wednesday, June 2, 2021 at 2:02:35 AM UTC+5:30 Aymeric Augustin wrote:

> Hello,
>
> On 1 Jun 2021, at 14:35, N Aditya  wrote:
>
> All I'm looking for is a hook that the transaction APIs can call before 
> deciding on which database to use. I don't see any reason for why providing 
> a hook seems so difficult. 
>
>
> Just because something is easy to implement doesn't mean it gets added to 
> Django. It also needs to be generally useful and well designed. Your 
> proposal doesn't meet the second criterion and I'm skeptical about the 
> first one.
>
> Indeed, the get_connection() API you're proposing doesn't take any 
> argument. As a consequence, the only way to make it return a different 
> value at different times would be to depend on global variables. This is 
> bad software engineering. It doesn't meet Django's standards.
>
> (If you don't know why having functions that return different values based 
> on global variables is a bad idea, you can search for "global state" in 
> your favorite search engine. I'm pretty sure you can find a good blog post 
> explaining why. And yes Django suffers from some legacy in this area. We 
> are trying to avoid taking more debt.)
>
> Four committers rejected your proposal and suggested alternatives. We're 
> trying to help. Pushing the same idea again won't work any better. Please 
> take into account the objections and try to find a design that addresses 
> them, or implement one of the alternatives we offered.
>
> Thank you,
>
> -- 
> Aymeric.
>
>
>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this 

Re: `Model.validate_unique` excluding partial unique constraint

2021-06-01 Thread charettes
Hello there,

Partial unique constraints are currently not supported during validation 
for reasons described in this ticket[0].

For example (inspired by this Github comment[1]), if you define the 
following model

class Article(models.Model):
slug = models.CharField(max_length=100)
deleted_at = models.DateTimeField(null=True)

class Meta:
constraints = [
UniqueConstraint('slug', condition=Q(deleted_at=None), 
name='unique_slug'),
]

Then validate_unique must perform the following query to determine if the 
constraint is violated

SELECT NOT (%(deleted_at)s IS NULL) OR NOT EXISTS(SELECT 1 FROM article 
WHERE NOT id = %(id)s AND slug = %(slug)s AND deleted_at IS NULL)

In other words, the validation of a partial unique constraint must check 
that either of these conditions are true
1. The provided instance doesn't match the condition
2. There's no existing rows matching the unique constraint (excluding the 
current instance if it already exists)

This is not something Django supports right now.

In order to add proper support for this feature I believe (personal opinion 
here feedback is welcome) we should follow these steps:

1. Add support for Expression.check(using: str) -> bool that would 
translate IsNull(deleted_at, True).check('alias') into a backend compatible 
'SELECT %(deleted_at)s IS NULL' query and return whether or not it passed. 
That would also allow the constructions of forms like

(~Q(IsNull(deleted_at, True)) | 
~Exists(Article.objects.exclude(pk=pk).filter(slug=slug, 
deleted_at=None)).check(using)

2. Add support for Constraint.validate(instance, excluded_fields) as 
described in [0] that would build on top of Expression.check to implement 
proper UniqueConstraint, CheckConstraint, and ExclusionConstraint 
validation and allow for third-party app (e.g. django-rest-framework which 
doesn't use model level validation[2]) to take advantage of this feature. 
For example the unique_for_(date|month|year) feature of Date(Time)?Field 
could be deprecated in favour of Constraint subclasses that implement 
as_sql to enforce SQL level constraint if available by the current backend 
and implement .validate to replace the special case logic we have currently 
in place for these options[3].

I hope this clarify the current situation.

Cheers,
Simon

[0] https://code.djangoproject.com/ticket/30581#comment:7
[1] https://github.com/django/django/pull/10796#discussion_r244216763
[2] https://github.com/encode/django-rest-framework/issues/7173
[3] 
https://github.com/django/django/blob/e703b152c6148ddda1b072a4353e9a41dca87f90/django/db/models/base.py#L1062-L1084

Le mardi 1 juin 2021 à 11:18:23 UTC-4, gaga...@gmail.com a écrit :

> Hi,
>
> I changed several models from fields using `unique=True` to using 
> `UniqueConstraint` with a condition in the Meta.
>
> As a side-effect, the uniqueness are no longer validated during cleaning 
> of a Form and an integrity error is raised. This is because partial unique 
> indexes are excluded :
>
> https://github.com/django/django/blob/e703b152c6148ddda1b072a4353e9a41dca87f90/django/db/models/options.py#L865-L874
>
> It seems that `total_unique_constraints` is also used to check for fields 
> that should be unique (related fields and USERNAME_FIELD specifically).
>
> I tried modifying `total_unique_constraints` and the only tests which 
> failed were related to the above concern and 
> `test_total_ordering_optimization_meta_constraints` which also uses `
> total_unique_constraints`. My application works fine and the validation 
> error are correctly raised in my forms.
>
> The current behaviour of `Model.validate_unique` is also not the one I 
> expected as my conditional `UniqueConstraint` were not used (which caused 
> the integrity error).
>
> Am I missing something? Or should we use all constraints (including 
> partial) in `Model.validate_unique`?
>
> If this is indeed what should be done, adding an `all_unique_constraints` 
> next to `total_unique_constraints` and using it in `Model.validate_unique` 
> instead of `total_unique_constraints` would do the trick. I don't mind 
> opening a ticket and doing the PR if needed.
>
> Thanks.
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/4972f6d0-c590-473d-8571-063738baf2ccn%40googlegroups.com.


Re: Transaction APIs do not consult the DB router to choose DB connection

2021-06-01 Thread Aymeric Augustin
Hello,

> On 1 Jun 2021, at 14:35, N Aditya  wrote:
> 
> All I'm looking for is a hook that the transaction APIs can call before 
> deciding on which database to use. I don't see any reason for why providing a 
> hook seems so difficult. 

Just because something is easy to implement doesn't mean it gets added to 
Django. It also needs to be generally useful and well designed. Your proposal 
doesn't meet the second criterion and I'm skeptical about the first one.

Indeed, the get_connection() API you're proposing doesn't take any argument. As 
a consequence, the only way to make it return a different value at different 
times would be to depend on global variables. This is bad software engineering. 
It doesn't meet Django's standards.

(If you don't know why having functions that return different values based on 
global variables is a bad idea, you can search for "global state" in your 
favorite search engine. I'm pretty sure you can find a good blog post 
explaining why. And yes Django suffers from some legacy in this area. We are 
trying to avoid taking more debt.)

Four committers rejected your proposal and suggested alternatives. We're trying 
to help. Pushing the same idea again won't work any better. Please take into 
account the objections and try to find a design that addresses them, or 
implement one of the alternatives we offered.

Thank you,

-- 
Aymeric.




-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/940A0E63-8879-4F6E-AF27-ED5F8880611D%40polytechnique.org.


Re: Transaction APIs do not consult the DB router to choose DB connection

2021-06-01 Thread N Aditya
Hey Lokesh, 

Just out of curiosity, I'd like to clarify the expected behaviour. 

If a db_for_transaction method is implemented, I believe it would be 
consulted for transaction APIs like atomic etc., Reads and writes which 
happen within that transaction are nevertheless going to consult their 
respective router methods. Is your expectation that the router somehow 
detect that there's an open transaction block somewhere above and scope all 
reads/writes that happen within it to be routed to the DB it was tied to ? 
i.e the DB alias used by the transaction takes precedence over what the 
router methods namely db_for_read/db_for_write may return. 

If this is your expectation, I think it might be hard to bake this into 
core Django. We could receive more views on the same from the community.

Looking at it from another perspective, I believe this can be achieved 
externally if core Django could expose hooks such as transaction.on_enter / 
transaction.on_exit to which callbacks can be hooked which could set 
some state info in thread-locals/contextvars which can later be parsed by 
the db_for_read/db_for_write methods to determine whether to use this info 
to make its decision or fallback to the default behaviour.

Also, something I'd like to clarify is that my use-case for a hook such as 
*db_for_transaction* is different. You could read above to know more about 
this.

Regards,
Aditya N
On Tuesday, June 1, 2021 at 11:34:55 PM UTC+5:30 lokeshw...@gmail.com wrote:

> Hi Everyone,
>
> Our use case is that we have a writer primary database and read-only 
> replica databases. The replicas can sometimes be a bit behind the primary
> We have written a Router like this
>
> class CustomRouter:
> def db_for_read(self, model, **hints):
> return 'reader'
>
> def db_for_write(self, model, **hints):
> return 'default'
>
> def allow_relation(self, obj1, obj2, **hints):
> return None
>
> def allow_migrate(self, db, app_label, model_name=None, **hints):
> return None
>
> So this is all fine for normal queries. But we observed that even within a 
> transaction.atomic() block, the read queries go to a different database, 
> and write queries go to a different database
>
> So I was hoping *db_for_transaction* proposal may improve this behavior.
> On Tuesday, June 1, 2021 at 6:32:44 PM UTC+5:30 f.apo...@gmail.com wrote:
>
>> On Tuesday, June 1, 2021 at 2:35:17 PM UTC+2 gojeta...@gmail.com wrote:
>>
>>> I don't see any reason for why providing a hook seems so difficult. 
>>>
>>
>> It is more code to maintain, needs tests etc and increases complexity. 
>> Just because something is easy on the surface, doesn't mean it will be easy 
>> in the longrun.
>>
>> A simple implementation can be: (From message-3 of this conversation):
>>>
>>>
>>> > #settings.py
>>>
>>> TRANSACTION_DB_SELECTOR = "path_to_some_callable"
>>>
>>> #transaction.py
>>> ...
>>> transaction_db_selector = import_string(settings.TRANSACTION_DB_SELECTOR)
>>> def get_connection():
>>> if transaction_db_selector:
>>> using = transaction_db_selector()
>>> if using is None:
>>> using = DEFAULT_DB_ALIAS
>>> return connections[using]
>>>
>>
>> I do not think that completely ignoring the `using` that was passed in by 
>> the user would be a very good idea (for the case when there is also a 
>> `TRANSACTION_DB_SELECTOR` set).
>>
>> Cheers,
>> Florian
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/4375763e-7e48-4426-bc49-587fdbba12c9n%40googlegroups.com.


Re: How to do a counter in DB cache class?

2021-06-01 Thread 'Mike Lissner' via Django developers (Contributions to Django itself)
Wow, that's pretty great! Did you consider merging this functionality into
django itself? I hadn't seen anything related before now.

On Tue, Jun 1, 2021 at 11:38 AM 'Adam Johnson' via Django developers
(Contributions to Django itself)  wrote:

> Hi Mike!
>
> Probabilistic culling probably is the best we can do in the DB cache,
> aside from moving culling to a background task.
>
> I wrote an implementation of this probabilistic culling in the
> django-mysql cache backend (which is mysql only):
> https://django-mysql.readthedocs.io/en/latest/cache.html#culling . This
> also shows a way of pushing the cull into a background task by setting the
> probability to 0 and calling the cull() method.
>
> Counting in the DB is a bad idea since it will put every write through a
> single hot row, and if transactions are active that will even serialize
> requests - ouch!
>
> An in-python counter, with a random offset per process (actually process
> id *plus* thread id), could also work. But it would be trickier imo.
>
> Thanks,
>
> Adm
>
> On Tue, 1 Jun 2021 at 18:28, 'Mike Lissner' via Django developers
> (Contributions to Django itself) 
> wrote:
>
>> This might be more of a Python question than a Django one, but in this
>> ticket  I'm hoping to make
>> the DB cache a bit more performant by having it not cull stale entries
>> *every* time somebody adds, changes, or touches a cache key.
>>
>> The idea is to use a setting or or class attribute so that the cache is
>> instead culled every 50 times or 1000 times or whatever. Most of this is
>> easy, but is there a good way to maintain a counter of how many times a key
>> has been set?
>>
>> The best I've come up with is just to do a mod of a random number, and
>> let that be good enough. E.g.:
>>
>> random_number = randint(0, CULL_EVERY_X)
>> if random_number == CULL_EVERY_X:
>> cull()
>>
>> That's pretty sloppy, but it'd work (on average), and it'd handle things
>> like counting in a multi-process set up.
>>
>> The other approach, I suppose, would be to keep a counter in the DB and
>> just increment it as needed, but it'd be nice to do something in memory
>> even if it's a little less accurate, since this counter doesn't have to be
>> exact.
>>
>> Anybody have thoughts on this?
>>
>> Thanks,
>>
>>
>> Mike
>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Django developers (Contributions to Django itself)" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to django-developers+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/django-developers/7f74e5f0-3206-4956-9ede-79788eda7982n%40googlegroups.com
>> 
>> .
>>
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/django-developers/qFxdRNLDGuA/unsubscribe
> .
> To unsubscribe from this group and all its topics, send an email to
> django-developers+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-developers/CAMyDDM2jSqZNGuBAobuuh%3D6%3DgWyD7SD0TTEzgUxO9CdqRhXotg%40mail.gmail.com
> 
> .
>


-- 
Mike Lissner
Executive Director
Free Law Project
https://free.law

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/CAKs1xOEofZLoe7RMCR%2BxT6vkCbiFjGjMrJprfLGsVubXR6n9mQ%40mail.gmail.com.


Re: How to do a counter in DB cache class?

2021-06-01 Thread 'Adam Johnson' via Django developers (Contributions to Django itself)
Hi Mike!

Probabilistic culling probably is the best we can do in the DB cache, aside
from moving culling to a background task.

I wrote an implementation of this probabilistic culling in the django-mysql
cache backend (which is mysql only):
https://django-mysql.readthedocs.io/en/latest/cache.html#culling . This
also shows a way of pushing the cull into a background task by setting the
probability to 0 and calling the cull() method.

Counting in the DB is a bad idea since it will put every write through a
single hot row, and if transactions are active that will even serialize
requests - ouch!

An in-python counter, with a random offset per process (actually process id
*plus* thread id), could also work. But it would be trickier imo.

Thanks,

Adm

On Tue, 1 Jun 2021 at 18:28, 'Mike Lissner' via Django developers
(Contributions to Django itself)  wrote:

> This might be more of a Python question than a Django one, but in this
> ticket  I'm hoping to make
> the DB cache a bit more performant by having it not cull stale entries
> *every* time somebody adds, changes, or touches a cache key.
>
> The idea is to use a setting or or class attribute so that the cache is
> instead culled every 50 times or 1000 times or whatever. Most of this is
> easy, but is there a good way to maintain a counter of how many times a key
> has been set?
>
> The best I've come up with is just to do a mod of a random number, and let
> that be good enough. E.g.:
>
> random_number = randint(0, CULL_EVERY_X)
> if random_number == CULL_EVERY_X:
> cull()
>
> That's pretty sloppy, but it'd work (on average), and it'd handle things
> like counting in a multi-process set up.
>
> The other approach, I suppose, would be to keep a counter in the DB and
> just increment it as needed, but it'd be nice to do something in memory
> even if it's a little less accurate, since this counter doesn't have to be
> exact.
>
> Anybody have thoughts on this?
>
> Thanks,
>
>
> Mike
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to django-developers+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-developers/7f74e5f0-3206-4956-9ede-79788eda7982n%40googlegroups.com
> 
> .
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/CAMyDDM2jSqZNGuBAobuuh%3D6%3DgWyD7SD0TTEzgUxO9CdqRhXotg%40mail.gmail.com.


Re: Transaction APIs do not consult the DB router to choose DB connection

2021-06-01 Thread Lokesh Dokara
Hi Everyone,

Our use case is that we have a writer primary database and read-only 
replica databases. The replicas can sometimes be a bit behind the primary
We have written a Router like this

class CustomRouter:
def db_for_read(self, model, **hints):
return 'reader'

def db_for_write(self, model, **hints):
return 'default'

def allow_relation(self, obj1, obj2, **hints):
return None

def allow_migrate(self, db, app_label, model_name=None, **hints):
return None

So this is all fine for normal queries. But we observed that even within a 
transaction.atomic() block, the read queries go to a different database, 
and write queries go to a different database

So I was hoping *db_for_transaction* proposal may improve this behavior.
On Tuesday, June 1, 2021 at 6:32:44 PM UTC+5:30 f.apo...@gmail.com wrote:

> On Tuesday, June 1, 2021 at 2:35:17 PM UTC+2 gojeta...@gmail.com wrote:
>
>> I don't see any reason for why providing a hook seems so difficult. 
>>
>
> It is more code to maintain, needs tests etc and increases complexity. 
> Just because something is easy on the surface, doesn't mean it will be easy 
> in the longrun.
>
> A simple implementation can be: (From message-3 of this conversation):
>>
>>
>> > #settings.py
>>
>> TRANSACTION_DB_SELECTOR = "path_to_some_callable"
>>
>> #transaction.py
>> ...
>> transaction_db_selector = import_string(settings.TRANSACTION_DB_SELECTOR)
>> def get_connection():
>> if transaction_db_selector:
>> using = transaction_db_selector()
>> if using is None:
>> using = DEFAULT_DB_ALIAS
>> return connections[using]
>>
>
> I do not think that completely ignoring the `using` that was passed in by 
> the user would be a very good idea (for the case when there is also a 
> `TRANSACTION_DB_SELECTOR` set).
>
> Cheers,
> Florian
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/d8bd0f76-a2c7-4b66-b66d-c7e1f042e60en%40googlegroups.com.


How to do a counter in DB cache class?

2021-06-01 Thread 'Mike Lissner' via Django developers (Contributions to Django itself)
This might be more of a Python question than a Django one, but in this 
ticket  I'm hoping to make the 
DB cache a bit more performant by having it not cull stale entries *every* 
time somebody adds, changes, or touches a cache key. 

The idea is to use a setting or or class attribute so that the cache is 
instead culled every 50 times or 1000 times or whatever. Most of this is 
easy, but is there a good way to maintain a counter of how many times a key 
has been set? 

The best I've come up with is just to do a mod of a random number, and let 
that be good enough. E.g.: 

random_number = randint(0, CULL_EVERY_X)
if random_number == CULL_EVERY_X: 
cull()

That's pretty sloppy, but it'd work (on average), and it'd handle things 
like counting in a multi-process set up. 

The other approach, I suppose, would be to keep a counter in the DB and 
just increment it as needed, but it'd be nice to do something in memory 
even if it's a little less accurate, since this counter doesn't have to be 
exact.

Anybody have thoughts on this?

Thanks,


Mike


-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/7f74e5f0-3206-4956-9ede-79788eda7982n%40googlegroups.com.


Re: Transaction APIs do not consult the DB router to choose DB connection

2021-06-01 Thread N Aditya
Hey Florian, 

I'd like to point out that the code snippet which I wrote is just a scratch 
version of how it might look. I'd first like to reach consensus on the idea 
itself before writing something concrete. 
Obviously it wouldn't be as naive as overriding something that the user 
provided explicitly. 

Regards,
Aditya N

On Tuesday, June 1, 2021 at 6:32:44 PM UTC+5:30 f.apo...@gmail.com wrote:

> On Tuesday, June 1, 2021 at 2:35:17 PM UTC+2 gojeta...@gmail.com wrote:
>
>> I don't see any reason for why providing a hook seems so difficult. 
>>
>
> It is more code to maintain, needs tests etc and increases complexity. 
> Just because something is easy on the surface, doesn't mean it will be easy 
> in the longrun.
>
> A simple implementation can be: (From message-3 of this conversation):
>>
>>
>> > #settings.py
>>
>> TRANSACTION_DB_SELECTOR = "path_to_some_callable"
>>
>> #transaction.py
>> ...
>> transaction_db_selector = import_string(settings.TRANSACTION_DB_SELECTOR)
>> def get_connection():
>> if transaction_db_selector:
>> using = transaction_db_selector()
>> if using is None:
>> using = DEFAULT_DB_ALIAS
>> return connections[using]
>>
>
> I do not think that completely ignoring the `using` that was passed in by 
> the user would be a very good idea (for the case when there is also a 
> `TRANSACTION_DB_SELECTOR` set).
>
> Cheers,
> Florian
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/9e56624a-f486-4089-b065-13b77aff47d2n%40googlegroups.com.


`Model.validate_unique` excluding partial unique constraint

2021-06-01 Thread Gaga Ro
Hi,

I changed several models from fields using `unique=True` to using 
`UniqueConstraint` with a condition in the Meta.

As a side-effect, the uniqueness are no longer validated during cleaning of 
a Form and an integrity error is raised. This is because partial unique 
indexes are excluded :
https://github.com/django/django/blob/e703b152c6148ddda1b072a4353e9a41dca87f90/django/db/models/options.py#L865-L874

It seems that `total_unique_constraints` is also used to check for fields 
that should be unique (related fields and USERNAME_FIELD specifically).

I tried modifying `total_unique_constraints` and the only tests which 
failed were related to the above concern and 
`test_total_ordering_optimization_meta_constraints` which also uses `
total_unique_constraints`. My application works fine and the validation 
error are correctly raised in my forms.

The current behaviour of `Model.validate_unique` is also not the one I 
expected as my conditional `UniqueConstraint` were not used (which caused 
the integrity error).

Am I missing something? Or should we use all constraints (including 
partial) in `Model.validate_unique`?

If this is indeed what should be done, adding an `all_unique_constraints` 
next to `total_unique_constraints` and using it in `Model.validate_unique` 
instead of `total_unique_constraints` would do the trick. I don't mind 
opening a ticket and doing the PR if needed.

Thanks.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/09c08268-e094-4152-94e2-265d93481891n%40googlegroups.com.


Re: Transaction APIs do not consult the DB router to choose DB connection

2021-06-01 Thread Florian Apolloner
On Tuesday, June 1, 2021 at 2:35:17 PM UTC+2 gojeta...@gmail.com wrote:

> I don't see any reason for why providing a hook seems so difficult. 
>

It is more code to maintain, needs tests etc and increases complexity. Just 
because something is easy on the surface, doesn't mean it will be easy in 
the longrun.

A simple implementation can be: (From message-3 of this conversation):
>
>
> > #settings.py
>
> TRANSACTION_DB_SELECTOR = "path_to_some_callable"
>
> #transaction.py
> ...
> transaction_db_selector = import_string(settings.TRANSACTION_DB_SELECTOR)
> def get_connection():
> if transaction_db_selector:
> using = transaction_db_selector()
> if using is None:
> using = DEFAULT_DB_ALIAS
> return connections[using]
>

I do not think that completely ignoring the `using` that was passed in by 
the user would be a very good idea (for the case when there is also a 
`TRANSACTION_DB_SELECTOR` set).

Cheers,
Florian

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/016c2d90-302c-45d5-82de-48de66404146n%40googlegroups.com.


Re: Transaction APIs do not consult the DB router to choose DB connection

2021-06-01 Thread Florian Apolloner
On Tuesday, June 1, 2021 at 1:45:29 PM UTC+2 Aymeric Augustin wrote:

> I would recommend running without persistent database connections 
> (CONN_MAX_AGE = 0) and switching settings.DATABASE["default"] in a 
> middleware at the beginning of every request. Modifying settings at runtime 
> isn't officially supported, but this looks like the closest option to the 
> intent you expressed.
>

Funny that you should mention this. I thought about this when doing 
something similar for ARA, but settled down on a custom database backend: 
https://github.com/ansible-community/ara/blob/master/ara/server/db/backends/distributed_sqlite/base.py
 
-- it does work surprisingly well and I think it might be more resilient 
than settings shenanigans. Not that I am particularly proud of that backend 
but I thought I'd share it here for the off-chance someone finds your post 
via a search engine and wonders how to implement such a thing.

Cheers,
Florian  

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/42ea1452-8699-41f5-b6c8-2b2d319477den%40googlegroups.com.


Re: Transaction APIs do not consult the DB router to choose DB connection

2021-06-01 Thread N Aditya
Hey, 

I agree to the fact that the request object being passed to the DB layer 
(routers/transaction APIs) may not be pragmatic according to the Loose 
coupling approach. 

Quoting what you said: 
> you can store the current HTTP request as a thread-local variable in your 
application and access it from anywhere, for example from a database router.

>From *The Gap section* in the above message:
> To add further context, in the library that I built, though the routers 
framework currently doesn't receive the request object, I was able to work 
around using contextvars but the *essence is that the hooks were there*. 
I'm unable to do the same for the transaction APIs without monkey patching 
since such *a hook is unavailable*.

Both of us are saying the same thing. 

Setting something in thread-local/contextvars and accessing it via the DB 
router -> Cool. Works fine for all DB queries/migrations.
Setting something in thread-local/contextvars and accessing it via  
-> For transaction APIs (not just atomic but also the other APIs such as 
savepoint, savepoinnt_rollback etc.,) the dash is what I'm looking for. 

All I'm looking for is a hook that the transaction APIs can call before 
deciding on which database to use. I don't see any reason for why providing 
a hook seems so difficult. 
A simple implementation can be: (From message-3 of this conversation):

> #settings.py

TRANSACTION_DB_SELECTOR = "path_to_some_callable"

#transaction.py
...
transaction_db_selector = import_string(settings.TRANSACTION_DB_SELECTOR)
def get_connection():
if transaction_db_selector:
using = transaction_db_selector()
if using is None:
using = DEFAULT_DB_ALIAS
return connections[using]

Let me know your thoughts on this. 




On Tuesday, June 1, 2021 at 5:15:29 PM UTC+5:30 Aymeric Augustin wrote:

> Hello,
>
> The first item in Django's design philosophies is Loose coupling 
> .
>  
> Per this principle, the database layer shouldn't know about the HTTP layer. 
> This is a strong reason for keeping the HTTP request object away from the 
> database layer (e.g. database routers or the transactions API). This is why 
> "a hook which can be invoked with the request object as its param to make a 
> decision" is extremely unlikely to be accepted.
>
> If you disagree with the consequences of this philosophy, you can store 
> the current HTTP request as a thread-local variable in your application and 
> access it from anywhere, for example from a database router.
>
> Since I'm the original author of the transactions API, I've been thinking 
> about your suggestion for the last few days. However, I'm not seeing what 
> context could meaningfully be passed to `transaction.atomic()` besides the 
> existing `using` parameter. Passing the HTTP request object is not an 
> option.
>
> I would recommend running without persistent database connections 
> (CONN_MAX_AGE = 0) and switching settings.DATABASE["default"] in a 
> middleware at the beginning of every request. Modifying settings at runtime 
> isn't officially supported, but this looks like the closest option to the 
> intent you expressed.
>
> Best regards,
>
> -- 
> Aymeric.
>
>
>
> On 1 Jun 2021, at 12:09, N Aditya  wrote:
>
> Hey all, 
>
> I believe there's a major misunderstanding of what I've been trying to 
> convey. Let me rephrase:
>
> *Problem*: Include the request object as a decision param in the routers 
> framework/transaction APIs based on which a DB can be chosen i.e all 
> queries(reads/writes) and transactions are directed to a specific database 
> without having to explicitly mention it. This could be location based, IP 
> based, domain based or based on any other metadata available in the 
> request. 
>
> *Explanation*: 
> 1. This use-case is an extension of the examples provided in Django docs 
> for multi-database support wherein based on the *model (or) based on the 
> type of operation(read/write) *the database is chosen via the routers 
> framework. I need to achieve the same but this time it wouldn't be based on 
> the model or type of operation, rather it would be based on *some 
> metadata in the request*. 
> 2. The multi-tenancy library which I've built just adds solidity to this 
> use-case. Simply put, it achieves routing based on request headers. 
>
> *The gap*: I believe the essence of the *routers framework* is to *provide 
> a hook* by which ORM queries can be directed to a DB based on some 
> condition *without having to explicitly specify it in every query*. A 
> similar *hook is missing* for all of the transaction APIs. This is the 
> gap I'm trying to bridge. To add further context, in the library that I 
> built, though the routers framework currently doesn't receive the request 
> object, I was able to work around using contextvars but the *essence is 
> that the hooks were there*. I'm unable to do the same for the transaction 
> 

Re: Transaction APIs do not consult the DB router to choose DB connection

2021-06-01 Thread Aymeric Augustin
Hello,

The first item in Django's design philosophies is Loose coupling 
.
 Per this principle, the database layer shouldn't know about the HTTP layer. 
This is a strong reason for keeping the HTTP request object away from the 
database layer (e.g. database routers or the transactions API). This is why "a 
hook which can be invoked with the request object as its param to make a 
decision" is extremely unlikely to be accepted.

If you disagree with the consequences of this philosophy, you can store the 
current HTTP request as a thread-local variable in your application and access 
it from anywhere, for example from a database router.

Since I'm the original author of the transactions API, I've been thinking about 
your suggestion for the last few days. However, I'm not seeing what context 
could meaningfully be passed to `transaction.atomic()` besides the existing 
`using` parameter. Passing the HTTP request object is not an option.

I would recommend running without persistent database connections (CONN_MAX_AGE 
= 0) and switching settings.DATABASE["default"] in a middleware at the 
beginning of every request. Modifying settings at runtime isn't officially 
supported, but this looks like the closest option to the intent you expressed.

Best regards,

-- 
Aymeric.



> On 1 Jun 2021, at 12:09, N Aditya  wrote:
> 
> Hey all, 
> 
> I believe there's a major misunderstanding of what I've been trying to 
> convey. Let me rephrase:
> 
> Problem: Include the request object as a decision param in the routers 
> framework/transaction APIs based on which a DB can be chosen i.e all 
> queries(reads/writes) and transactions are directed to a specific database 
> without having to explicitly mention it. This could be location based, IP 
> based, domain based or based on any other metadata available in the request. 
> 
> Explanation: 
> 1. This use-case is an extension of the examples provided in Django docs for 
> multi-database support wherein based on the model (or) based on the type of 
> operation(read/write) the database is chosen via the routers framework. I 
> need to achieve the same but this time it wouldn't be based on the model or 
> type of operation, rather it would be based on some metadata in the request. 
> 2. The multi-tenancy library which I've built just adds solidity to this 
> use-case. Simply put, it achieves routing based on request headers. 
> 
> The gap: I believe the essence of the routers framework is to provide a hook 
> by which ORM queries can be directed to a DB based on some condition without 
> having to explicitly specify it in every query. A similar hook is missing for 
> all of the transaction APIs. This is the gap I'm trying to bridge. To add 
> further context, in the library that I built, though the routers framework 
> currently doesn't receive the request object, I was able to work around using 
> contextvars but the essence is that the hooks were there. I'm unable to do 
> the same for the transaction APIs without monkey patching since such a hook 
> is unavailable.
> 
> > I don't think this is a strong argument by any mean. To me this is 
> > basically saying "it might be useful in the future so lets add it now" 
> > without providing any details about why it might be useful in the first 
> > place.
> 
> I thought I made it clear but reinstating anyway. I'd like the routers 
> framework to include the request object as part of the arguments list to each 
> of its methods so that decisions made can be based on it as well. If so, 
> there could be a separate method(hook) for transaction APIs as well, like 
> db_for_transaction which can take in the request object to make a decision. 
> It could include something else as well in the future. (Mentioned in 
> message-1 of this thread. Probably I forgot to mention the request object 
> part).
> If the above is difficult to achieve, then we could at the least expose a 
> hook which can be invoked with the request object as its param to make a 
> decision. I believe this should be fairly simple to implement and maintain. 
> (Mentioned in message-3 of this thread).
> 
> If you guys feel that this use-case is still invalid, I'd like to know a 
> logical explanation apart from maintenance overhead issues and waiting for 
> more people to come up with the same problem. 
> 
> 
> 
> Simon, regd the multi-tenancy thing, I've done some extensive research on the 
> libraries available currently to achieve multi-tenancy (including yours) in 
> Django and none of them have adopted the isolated DB approach and declared it 
> production ready. There are multiple libraries using the schema per tenant 
> approach but the downsides are: 
> 
> 1. It is DB implementation specific i.e only DBs' which are using schemas 
> (like PostgreSQL) can take advantage of it. If they're using MongoDB or other 
> data 

Re: Transaction APIs do not consult the DB router to choose DB connection

2021-06-01 Thread N Aditya
Hey all,

I believe there's a major misunderstanding of what I've been trying to
convey. Let me rephrase:

*Problem*: Include the request object as a decision param in the routers
framework/transaction APIs based on which a DB can be chosen i.e all
queries(reads/writes) and transactions are directed to a specific database
without having to explicitly mention it. This could be location based, IP
based, domain based or based on any other metadata available in the
request.

*Explanation*:
1. This use-case is an extension of the examples provided in Django docs
for multi-database support wherein based on the *model (or) based on the
type of operation(read/write) *the database is chosen via the routers
framework. I need to achieve the same but this time it wouldn't be based on
the model or type of operation, rather it would be based on *some metadata
in the request*.
2. The multi-tenancy library which I've built just adds solidity to this
use-case. Simply put, it achieves routing based on request headers.

*The gap*: I believe the essence of the *routers framework* is to *provide
a hook* by which ORM queries can be directed to a DB based on some
condition *without having to explicitly specify it in every query*. A
similar *hook is missing* for all of the transaction APIs. This is the gap
I'm trying to bridge. To add further context, in the library that I built,
though the routers framework currently doesn't receive the request object,
I was able to work around using contextvars but the *essence is that the
hooks were there*. I'm unable to do the same for the transaction APIs
without monkey patching since such *a hook is unavailable*.

> I don't think this is a strong argument by any mean. To me this is
basically saying "it might be useful in the future so lets add it now"
without providing any details about why it might be useful in the first
place.


   1. I thought I made it clear but reinstating anyway. I'd like the
   routers framework to include the request object as part of the arguments
   list to each of its methods so that decisions made can be based on it as
   well. If so, there could be a separate method(hook) for transaction APIs as
   well, like db_for_transaction which can take in the request object to make
   a decision. It could include something else as well in the future.
   (Mentioned in message-1 of this thread. Probably I forgot to mention the
   request object part).
   2. If the above is difficult to achieve, then we could at the least
   expose a hook which can be invoked with the request object as its param to
   make a decision. I believe this should be fairly *simple to implement
   and maintain*. (Mentioned in message-3 of this thread).


If you guys feel that this use-case is still invalid, I'd like to know a
logical explanation apart from maintenance overhead issues and waiting for
more people to come up with the same problem.



Simon, regd the multi-tenancy thing, I've done some extensive research on
the libraries available currently to achieve multi-tenancy (including
yours) in Django and none of them have adopted the isolated DB approach and
declared it production ready. There are multiple libraries using the schema
per tenant approach but the downsides are:

1. It is DB implementation specific i.e only DBs' which are using schemas
(like PostgreSQL) can take advantage of it. If they're using MongoDB or
other data stores, they're out of luck.
2. In a microservices architecture, one may be using a combination of DBs'
for various purposes. For eg: PostgreSQL for OLTP, ElasticSearch for
real-time search etc., In such cases, they're out of luck again.
3.  A lot of B2B SAAS companies have a mandate that their data cannot be
collocated in the same physical space so as to avoid data leaks and
breaches, especially in healthcare systems (HIPAA).

Also, there are a few libraries using the shared table FK approach to
inject relationship filters dynamically but after reading code, I found
that it doesn't cover all intricate/edge cases like:
1. m2m relations between two tenant independent models.
2. Not all methods of the QuerySet have been overridden properly.

Though I agree with your point that all these libraries (including mine)
might not scale well for tenants in the range 50 - 100K, I don't see why it
wouldn't work for tenants in the order of a few 1000s. Also, I'd genuinely
like to know why the DATABASES dictionary or the CACHES dictionary would
run into connection leaks/KeyErrors with such an approach.

Regards,
Aditya N

On Tue, Jun 1, 2021 at 12:04 PM Florian Apolloner 
wrote:

>
>
> On Monday, May 31, 2021 at 12:13:58 PM UTC+2 Adam Johnson wrote:
>
>> I'm also -1 on changing anything in Django right now.
>>
>
> -1 as well, I simply see no way how something like:
>
> ```
> with transaction.atomic():
>  Model1.objects.create()
>  Model2.objects.create()
> ```
>
> will allow for any useful behavior via the 

Re: Transaction APIs do not consult the DB router to choose DB connection

2021-06-01 Thread Florian Apolloner


On Monday, May 31, 2021 at 12:13:58 PM UTC+2 Adam Johnson wrote:

> I'm also -1 on changing anything in Django right now.
>

-1 as well, I simply see no way how something like:

```
with transaction.atomic():
 Model1.objects.create()
 Model2.objects.create()
```

will allow for any useful behavior via the router. Unless 
transaction.atomic could inspect the contents and figure out the involved 
dependencies, but this is surely way more than what could go into core for 
now.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/6292ccbb-bc83-48d0-b08e-aa765a29ce32n%40googlegroups.com.