Re: Transaction APIs do not consult the DB router to choose DB connection

2021-06-03 Thread N Aditya
Hey Augustin, 

I'd like to clarify a few things:

1. If atomic(using=...) is the way to go and the same has been implemented 
for ORM queries as well, why introduce something like the routers framework 
in the first place ?

2. Also generally speaking, was the intention of building the routers 
framework only to allow routing based on models and other hints available 
as part of the args list or to expose hooks that can achieve routing of 
DB queries based on custom logic that could either depend on the args list 
or not (The use-cases are plenty. It can be from some state being pushed 
into a KV store based on some logic that gets executed in the view etc., 
not just thread-local state. Also, considering thread-local state here to 
be globals seems completely unfair since in actuality, it is tied to the 
request's scope and not global for all requests).

i. If your answer is the former, then you guys should strongly 
re-consider the design since they can easily be perceived and used as hooks 
to achieve the latter. I would end my discussion here.
ii. If your answer is the latter, then I'm looking for a similar hook 
for the transaction APIs as well. I would want to continue the discussion 
here.

Also, if possible, I'd like to hear opinions about the above from the 
authors of the routers framework as well. 

---

Also, I believe you quoted this in your previous message:
> Indeed, you'd need something slightly smarter, but since connections are 
thread-local in Django, it should be manageable.

Using some kind of thread sync primitive like a Lock/Semaphore at the 
middleware level to achieve this with stability doesn't seem manageable. 
It's just bad. Also, if for some reason within the view, the connection 
gets broken and I try to re-establish it, I have no guarantees as to what 
settings.databases["default"] might look like at that point and so I'll 
have to fetch the config that was used to initially create the connection 
from somewhere and then re-establish it again using a thread-sync 
primitive. I'm not even going to talk about the repercussions of using this 
approach with a ThreadPoolExecutor/ProcessPoolExecutor from within a view.


Regards, 
Aditya N
On Thursday, June 3, 2021 at 2:48:27 AM UTC+5:30 Aymeric Augustin wrote:

> On 2 Jun 2021, at 07:49, N Aditya  wrote:
>
> Below are a few things I'd like to clarify:
>
>1. Are you referring to thread-locals as `global state/variables` in 
>your previous message ?
>
> Yes.
>
>
>1. If so, why suggest something which you consider bad practise ?
>
> You rejected the good practice before — atomic(using=...) so I'm looking 
> for something that matches your constraint.
>
>
>1. As a consequence, if using thread-locals is considered bad software 
>engg, could you please give me a way by which I could pass on request 
>metadata to the router layer cleanly ?
>
> Pass it explicitly with `atomic(using=...)`. I'm aware that it doesn't 
> work for third-party libraries that you don't control. I don't have a 
> perfect solution.
>
>
>1. The final recommendation you gave is something that isn't 
>officially supported and *would fail when a threaded server or a fully 
>async request path* is used.
>(Even your favourite search engine might not be of much help here)
>
> Indeed, you'd need something slightly smarter, but since connections are 
> thread-local in Django, it should be manageable.
>
> I don't think you can meaningfully put "fully async request path" and 
> "database transaction managed by the Django ORM" in the same sentence. 
> You'd run the transaction in a thread pool, which brings you back to by 
> previous case.
>
> -- 
> Aymeric.
>
>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/c63dc1ca-188c-48ff-aeb7-3a6ce8fedbe6n%40googlegroups.com.


Re: Transaction APIs do not consult the DB router to choose DB connection

2021-06-01 Thread N Aditya
Hey Augustin,

Just to clarify, from before, you quoted:
> you can store the current HTTP request as a thread-local variable in your 
application and access it from anywhere, for example from a database router.

Also, from your previous message, you quoted
> As a consequence, the only way to make it return a different value at 
different times would be to depend on global variables. This is bad 
software engineering. It doesn't meet Django's standards.

I believe the final suggestion you made was as follows:
> I would recommend running without persistent database connections 
(CONN_MAX_AGE = 0) and switching settings.DATABASE["default"] in a 
middleware at the beginning of every request. Modifying settings at runtime 
isn't officially supported, but this looks like the closest option to the 
intent you expressed.

Below are a few things I'd like to clarify:

   1. Are you referring to thread-locals as `global state/variables` in 
   your previous message ? If so, why suggest something which you consider bad 
   practise ?
   2. As a consequence, if using thread-locals is considered bad software 
   engg, could you please give me a way by which I could pass on request 
   metadata to the router layer cleanly ?
   3. The final recommendation you gave is something that isn't officially 
   supported and *would fail when a threaded server or a fully async 
   request path* is used.
   (Even your favourite search engine might not be of much help here)


Based on your suggestions, I haven't arrived at a soln which can make this 
work.

> Just because something is easy to implement doesn't mean it gets added to 
Django

I missed to type in a couple of words which has created quite the spur 
around this. My apologies for that. What I really wanted to key in was this:

> I don't see any reason for why providing a hook seems so difficult *to 
implement*. 

I guess we all agreed on the fact that this would be fairly easy to 
implement. But I never implied/wanted to imply anything about adding this 
to Django just because the implementation was fairly easy.

Also, you quoted in your previous message:
> It also needs to be generally useful and well designed. Your proposal 
doesn't meet the second criterion and I'm skeptical about the first one.

I believe I've fairly stated my use-case. If you're stating that the 
proposal is not `generally useful` then I believe you're implying that 
routing of DB queries based on request metadata like domain, IP, location 
etc., may not be useful to most folks. I'm skeptical about this. 
Also, regd the `well designed` part, I believe we haven't arrived at a 
clean solution based on your recommendations either. So I believe we'll 
have to get this sorted. 

Also, you quoted:
> Four committers rejected your proposal and suggested alternatives

I don't think that apart from your suggestion, the other folks mentioned 
anything concrete along the lines of a work around. Let me know if I'm 
missing something here. 

To summarise, I've been trying to help from my end as well to try and make 
ends meet. I love using Django and I respect a lot of its design 
philosophies. I would like to be a contributor in the long run and do some 
exciting work. Let me knw how we can take this forward in the best possible 
way. 

Regards, 
Aditya N
On Wednesday, June 2, 2021 at 2:02:35 AM UTC+5:30 Aymeric Augustin wrote:

> Hello,
>
> On 1 Jun 2021, at 14:35, N Aditya  wrote:
>
> All I'm looking for is a hook that the transaction APIs can call before 
> deciding on which database to use. I don't see any reason for why providing 
> a hook seems so difficult. 
>
>
> Just because something is easy to implement doesn't mean it gets added to 
> Django. It also needs to be generally useful and well designed. Your 
> proposal doesn't meet the second criterion and I'm skeptical about the 
> first one.
>
> Indeed, the get_connection() API you're proposing doesn't take any 
> argument. As a consequence, the only way to make it return a different 
> value at different times would be to depend on global variables. This is 
> bad software engineering. It doesn't meet Django's standards.
>
> (If you don't know why having functions that return different values based 
> on global variables is a bad idea, you can search for "global state" in 
> your favorite search engine. I'm pretty sure you can find a good blog post 
> explaining why. And yes Django suffers from some legacy in this area. We 
> are trying to avoid taking more debt.)
>
> Four committers rejected your proposal and suggested alternatives. We're 
> trying to help. Pushing the same idea again won't work any better. Please 
> take into account the objections and try to find a desi

Re: Transaction APIs do not consult the DB router to choose DB connection

2021-06-01 Thread N Aditya
Hey Lokesh, 

Just out of curiosity, I'd like to clarify the expected behaviour. 

If a db_for_transaction method is implemented, I believe it would be 
consulted for transaction APIs like atomic etc., Reads and writes which 
happen within that transaction are nevertheless going to consult their 
respective router methods. Is your expectation that the router somehow 
detect that there's an open transaction block somewhere above and scope all 
reads/writes that happen within it to be routed to the DB it was tied to ? 
i.e the DB alias used by the transaction takes precedence over what the 
router methods namely db_for_read/db_for_write may return. 

If this is your expectation, I think it might be hard to bake this into 
core Django. We could receive more views on the same from the community.

Looking at it from another perspective, I believe this can be achieved 
externally if core Django could expose hooks such as transaction.on_enter / 
transaction.on_exit to which callbacks can be hooked which could set 
some state info in thread-locals/contextvars which can later be parsed by 
the db_for_read/db_for_write methods to determine whether to use this info 
to make its decision or fallback to the default behaviour.

Also, something I'd like to clarify is that my use-case for a hook such as 
*db_for_transaction* is different. You could read above to know more about 
this.

Regards,
Aditya N
On Tuesday, June 1, 2021 at 11:34:55 PM UTC+5:30 lokeshw...@gmail.com wrote:

> Hi Everyone,
>
> Our use case is that we have a writer primary database and read-only 
> replica databases. The replicas can sometimes be a bit behind the primary
> We have written a Router like this
>
> class CustomRouter:
> def db_for_read(self, model, **hints):
> return 'reader'
>
> def db_for_write(self, model, **hints):
> return 'default'
>
> def allow_relation(self, obj1, obj2, **hints):
> return None
>
> def allow_migrate(self, db, app_label, model_name=None, **hints):
> return None
>
> So this is all fine for normal queries. But we observed that even within a 
> transaction.atomic() block, the read queries go to a different database, 
> and write queries go to a different database
>
> So I was hoping *db_for_transaction* proposal may improve this behavior.
> On Tuesday, June 1, 2021 at 6:32:44 PM UTC+5:30 f.apo...@gmail.com wrote:
>
>> On Tuesday, June 1, 2021 at 2:35:17 PM UTC+2 gojeta...@gmail.com wrote:
>>
>>> I don't see any reason for why providing a hook seems so difficult. 
>>>
>>
>> It is more code to maintain, needs tests etc and increases complexity. 
>> Just because something is easy on the surface, doesn't mean it will be easy 
>> in the longrun.
>>
>> A simple implementation can be: (From message-3 of this conversation):
>>>
>>>
>>> > #settings.py
>>>
>>> TRANSACTION_DB_SELECTOR = "path_to_some_callable"
>>>
>>> #transaction.py
>>> ...
>>> transaction_db_selector = import_string(settings.TRANSACTION_DB_SELECTOR)
>>> def get_connection():
>>> if transaction_db_selector:
>>> using = transaction_db_selector()
>>> if using is None:
>>> using = DEFAULT_DB_ALIAS
>>> return connections[using]
>>>
>>
>> I do not think that completely ignoring the `using` that was passed in by 
>> the user would be a very good idea (for the case when there is also a 
>> `TRANSACTION_DB_SELECTOR` set).
>>
>> Cheers,
>> Florian
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/4375763e-7e48-4426-bc49-587fdbba12c9n%40googlegroups.com.


Re: Transaction APIs do not consult the DB router to choose DB connection

2021-06-01 Thread N Aditya
Hey Florian, 

I'd like to point out that the code snippet which I wrote is just a scratch 
version of how it might look. I'd first like to reach consensus on the idea 
itself before writing something concrete. 
Obviously it wouldn't be as naive as overriding something that the user 
provided explicitly. 

Regards,
Aditya N

On Tuesday, June 1, 2021 at 6:32:44 PM UTC+5:30 f.apo...@gmail.com wrote:

> On Tuesday, June 1, 2021 at 2:35:17 PM UTC+2 gojeta...@gmail.com wrote:
>
>> I don't see any reason for why providing a hook seems so difficult. 
>>
>
> It is more code to maintain, needs tests etc and increases complexity. 
> Just because something is easy on the surface, doesn't mean it will be easy 
> in the longrun.
>
> A simple implementation can be: (From message-3 of this conversation):
>>
>>
>> > #settings.py
>>
>> TRANSACTION_DB_SELECTOR = "path_to_some_callable"
>>
>> #transaction.py
>> ...
>> transaction_db_selector = import_string(settings.TRANSACTION_DB_SELECTOR)
>> def get_connection():
>> if transaction_db_selector:
>> using = transaction_db_selector()
>> if using is None:
>> using = DEFAULT_DB_ALIAS
>> return connections[using]
>>
>
> I do not think that completely ignoring the `using` that was passed in by 
> the user would be a very good idea (for the case when there is also a 
> `TRANSACTION_DB_SELECTOR` set).
>
> Cheers,
> Florian
>

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/9e56624a-f486-4089-b065-13b77aff47d2n%40googlegroups.com.


Re: Transaction APIs do not consult the DB router to choose DB connection

2021-06-01 Thread N Aditya
Hey, 

I agree to the fact that the request object being passed to the DB layer 
(routers/transaction APIs) may not be pragmatic according to the Loose 
coupling approach. 

Quoting what you said: 
> you can store the current HTTP request as a thread-local variable in your 
application and access it from anywhere, for example from a database router.

>From *The Gap section* in the above message:
> To add further context, in the library that I built, though the routers 
framework currently doesn't receive the request object, I was able to work 
around using contextvars but the *essence is that the hooks were there*. 
I'm unable to do the same for the transaction APIs without monkey patching 
since such *a hook is unavailable*.

Both of us are saying the same thing. 

Setting something in thread-local/contextvars and accessing it via the DB 
router -> Cool. Works fine for all DB queries/migrations.
Setting something in thread-local/contextvars and accessing it via  
-> For transaction APIs (not just atomic but also the other APIs such as 
savepoint, savepoinnt_rollback etc.,) the dash is what I'm looking for. 

All I'm looking for is a hook that the transaction APIs can call before 
deciding on which database to use. I don't see any reason for why providing 
a hook seems so difficult. 
A simple implementation can be: (From message-3 of this conversation):

> #settings.py

TRANSACTION_DB_SELECTOR = "path_to_some_callable"

#transaction.py
...
transaction_db_selector = import_string(settings.TRANSACTION_DB_SELECTOR)
def get_connection():
if transaction_db_selector:
using = transaction_db_selector()
if using is None:
using = DEFAULT_DB_ALIAS
return connections[using]

Let me know your thoughts on this. 




On Tuesday, June 1, 2021 at 5:15:29 PM UTC+5:30 Aymeric Augustin wrote:

> Hello,
>
> The first item in Django's design philosophies is Loose coupling 
> <https://docs.djangoproject.com/en/stable/misc/design-philosophies/#loose-coupling>.
>  
> Per this principle, the database layer shouldn't know about the HTTP layer. 
> This is a strong reason for keeping the HTTP request object away from the 
> database layer (e.g. database routers or the transactions API). This is why 
> "a hook which can be invoked with the request object as its param to make a 
> decision" is extremely unlikely to be accepted.
>
> If you disagree with the consequences of this philosophy, you can store 
> the current HTTP request as a thread-local variable in your application and 
> access it from anywhere, for example from a database router.
>
> Since I'm the original author of the transactions API, I've been thinking 
> about your suggestion for the last few days. However, I'm not seeing what 
> context could meaningfully be passed to `transaction.atomic()` besides the 
> existing `using` parameter. Passing the HTTP request object is not an 
> option.
>
> I would recommend running without persistent database connections 
> (CONN_MAX_AGE = 0) and switching settings.DATABASE["default"] in a 
> middleware at the beginning of every request. Modifying settings at runtime 
> isn't officially supported, but this looks like the closest option to the 
> intent you expressed.
>
> Best regards,
>
> -- 
> Aymeric.
>
>
>
> On 1 Jun 2021, at 12:09, N Aditya  wrote:
>
> Hey all, 
>
> I believe there's a major misunderstanding of what I've been trying to 
> convey. Let me rephrase:
>
> *Problem*: Include the request object as a decision param in the routers 
> framework/transaction APIs based on which a DB can be chosen i.e all 
> queries(reads/writes) and transactions are directed to a specific database 
> without having to explicitly mention it. This could be location based, IP 
> based, domain based or based on any other metadata available in the 
> request. 
>
> *Explanation*: 
> 1. This use-case is an extension of the examples provided in Django docs 
> for multi-database support wherein based on the *model (or) based on the 
> type of operation(read/write) *the database is chosen via the routers 
> framework. I need to achieve the same but this time it wouldn't be based on 
> the model or type of operation, rather it would be based on *some 
> metadata in the request*. 
> 2. The multi-tenancy library which I've built just adds solidity to this 
> use-case. Simply put, it achieves routing based on request headers. 
>
> *The gap*: I believe the essence of the *routers framework* is to *provide 
> a hook* by which ORM queries can be directed to a DB based on some 
> condition *without having to explicitly specify it in every query*. A 
> similar *hook is missing* for all of the transaction APIs. 

Re: Transaction APIs do not consult the DB router to choose DB connection

2021-06-01 Thread N Aditya
Hey all,

I believe there's a major misunderstanding of what I've been trying to
convey. Let me rephrase:

*Problem*: Include the request object as a decision param in the routers
framework/transaction APIs based on which a DB can be chosen i.e all
queries(reads/writes) and transactions are directed to a specific database
without having to explicitly mention it. This could be location based, IP
based, domain based or based on any other metadata available in the
request.

*Explanation*:
1. This use-case is an extension of the examples provided in Django docs
for multi-database support wherein based on the *model (or) based on the
type of operation(read/write) *the database is chosen via the routers
framework. I need to achieve the same but this time it wouldn't be based on
the model or type of operation, rather it would be based on *some metadata
in the request*.
2. The multi-tenancy library which I've built just adds solidity to this
use-case. Simply put, it achieves routing based on request headers.

*The gap*: I believe the essence of the *routers framework* is to *provide
a hook* by which ORM queries can be directed to a DB based on some
condition *without having to explicitly specify it in every query*. A
similar *hook is missing* for all of the transaction APIs. This is the gap
I'm trying to bridge. To add further context, in the library that I built,
though the routers framework currently doesn't receive the request object,
I was able to work around using contextvars but the *essence is that the
hooks were there*. I'm unable to do the same for the transaction APIs
without monkey patching since such *a hook is unavailable*.

> I don't think this is a strong argument by any mean. To me this is
basically saying "it might be useful in the future so lets add it now"
without providing any details about why it might be useful in the first
place.


   1. I thought I made it clear but reinstating anyway. I'd like the
   routers framework to include the request object as part of the arguments
   list to each of its methods so that decisions made can be based on it as
   well. If so, there could be a separate method(hook) for transaction APIs as
   well, like db_for_transaction which can take in the request object to make
   a decision. It could include something else as well in the future.
   (Mentioned in message-1 of this thread. Probably I forgot to mention the
   request object part).
   2. If the above is difficult to achieve, then we could at the least
   expose a hook which can be invoked with the request object as its param to
   make a decision. I believe this should be fairly *simple to implement
   and maintain*. (Mentioned in message-3 of this thread).


If you guys feel that this use-case is still invalid, I'd like to know a
logical explanation apart from maintenance overhead issues and waiting for
more people to come up with the same problem.



Simon, regd the multi-tenancy thing, I've done some extensive research on
the libraries available currently to achieve multi-tenancy (including
yours) in Django and none of them have adopted the isolated DB approach and
declared it production ready. There are multiple libraries using the schema
per tenant approach but the downsides are:

1. It is DB implementation specific i.e only DBs' which are using schemas
(like PostgreSQL) can take advantage of it. If they're using MongoDB or
other data stores, they're out of luck.
2. In a microservices architecture, one may be using a combination of DBs'
for various purposes. For eg: PostgreSQL for OLTP, ElasticSearch for
real-time search etc., In such cases, they're out of luck again.
3.  A lot of B2B SAAS companies have a mandate that their data cannot be
collocated in the same physical space so as to avoid data leaks and
breaches, especially in healthcare systems (HIPAA).

Also, there are a few libraries using the shared table FK approach to
inject relationship filters dynamically but after reading code, I found
that it doesn't cover all intricate/edge cases like:
1. m2m relations between two tenant independent models.
2. Not all methods of the QuerySet have been overridden properly.

Though I agree with your point that all these libraries (including mine)
might not scale well for tenants in the range 50 - 100K, I don't see why it
wouldn't work for tenants in the order of a few 1000s. Also, I'd genuinely
like to know why the DATABASES dictionary or the CACHES dictionary would
run into connection leaks/KeyErrors with such an approach.

Regards,
Aditya N

On Tue, Jun 1, 2021 at 12:04 PM Florian Apolloner 
wrote:

>
>
> On Monday, May 31, 2021 at 12:13:58 PM UTC+2 Adam Johnson wrote:
>
>> I'm also -1 on changing anything in Django right now.
>>
>
> -1 as well, I simply see no way how something like:
>
> ```
> with transaction.atomic():
>  Model1.objects.create()
>  Model2.objects.create()
> ```
>
> will allow for any useful behavior via the rout

Re: Transaction APIs do not consult the DB router to choose DB connection

2021-05-31 Thread N Aditya
Hey Adam/Simon, 

How can we take this forward ? 

Regards,
Aditya N

On Friday, May 28, 2021 at 3:04:14 PM UTC+5:30 N Aditya wrote:

> Hey Adam, 
>
> Also, after giving it a bit of thought, I figured that integrating this 
> logic with the routers framework isn't entirely necessary. 
> So I came up with another perspective to the solution as well which I've 
> illustrated in message-3 of this thread.
>
> Both approaches work for me. 
>
> Let me know your thoughts on the same.
>
> Regards, 
> Aditya N
>
> On Friday, May 28, 2021 at 2:55:02 PM UTC+5:30 N Aditya wrote:
>
>> Hey Adam, 
>>
>> "atomic does already call DB routers" -> Firstly after reading code, I 
>> don't think the transaction APIs consult the routers. Secondly, I think I 
>> already answered it in the initial discussion.
>>
>> FYI from message-1:
>>
>>
>>1. Having a separate method for transaction is good because it need 
>>not be mangled up with the other methods i.e db_for_read / db_for_write 
>> as 
>>they clearly indicate certain operational semantics which aren't tied to 
>> a 
>>transaction since it can have both reads and writes within it. 
>>2. Also, if in the future, there is some special info that can be 
>>delivered to the transaction based on which a decision regd which DB to 
>> use 
>>could be made, then it would be cleanly isolated into its own method.
>>
>> Regards, 
>> Aditya N
>>
>> On Friday, May 28, 2021 at 2:03:16 PM UTC+5:30 Adam Johnson wrote:
>>
>>> Aditya - you didn't answer Simon's first question: "Can you think of 
>>> places where this db_for_transaction hook would differ in any way from 
>>> what db_for_write returns?" I think this is the crux of the issue. 
>>> atomic does already call DB routers - in what case could you need to return 
>>> a different result for atomic() than for a write query?
>>>
>>> On Fri, 28 May 2021 at 09:17, N Aditya  wrote:
>>>
>>>> Hey Simon, 
>>>>
>>>> It would be possible to write a method as you've suggested above. But 
>>>> it entails the following problems:
>>>> 1. There would have to be a wrapper for each of the transaction APIs, 
>>>> not just transaction.atomic since all of them take the using kwarg.
>>>> 2.  I'm trying to build a library that helps achieve multi-tenancy (the 
>>>> isolated DB approach) by dynamically routing DB queries to the appropriate 
>>>> databases at runtime. For more info, check this out -> PyCon India 
>>>> 2021 proposal 
>>>> <https://in.pycon.org/cfp/2021/proposals/django-and-multi-tenancy-the-isolated-database-approach~dR6oO/>.
>>>>  
>>>> If that be the case, for any new code base, I could expose a bunch of 
>>>> methods as suggested above and ask them to incorporate it. But for 
>>>> older/existing code bases which already directly use various transaction 
>>>> APIs, it would be hard to replace all lines with another method. 
>>>>
>>>> By inspecting code, I figured that there is this method named 
>>>> get_connection in the transaction.py module which is being used by all the 
>>>> other transaction APIs for fetching the DB connection. Currently I've 
>>>> patched this method with my own to achieve the functionality mentioned 
>>>> above. Might I suggest an alternate implementation for this method as 
>>>> follows:
>>>>
>>>> #settings.py
>>>>
>>>> TRANSACTION_DB_SELECTOR = "path_to_some_callable"
>>>>
>>>> #transaction.py
>>>> ...
>>>> transaction_db_selector = 
>>>> import_string(settings.TRANSACTION_DB_SELECTOR)
>>>> def get_connection():
>>>> if transaction_db_selector:
>>>> using = transaction_db_selector()
>>>> if using is None:
>>>> using = DEFAULT_DB_ALIAS
>>>> return connections[using]
>>>>
>>>> Let me know your thoughts on this. 
>>>>
>>>> Regards, 
>>>> Aditya N
>>>>
>>>> On Friday, May 28, 2021 at 10:05:26 AM UTC+5:30 charettes wrote:
>>>>
>>>>> Ticket that directed to the mailing list for wider feedback 
>>>>> https://code.djangoproject.com/ticket/32788
>>>>>
>>>>> ---
>>>>>
>>>>> Can you think of pl

Re: Transaction APIs do not consult the DB router to choose DB connection

2021-05-28 Thread N Aditya
Hey Adam, 

Also, after giving it a bit of thought, I figured that integrating this 
logic with the routers framework isn't entirely necessary. 
So I came up with another perspective to the solution as well which I've 
illustrated in message-3 of this thread.

Both approaches work for me. 

Let me know your thoughts on the same.

Regards, 
Aditya N

On Friday, May 28, 2021 at 2:55:02 PM UTC+5:30 N Aditya wrote:

> Hey Adam, 
>
> "atomic does already call DB routers" -> Firstly after reading code, I 
> don't think the transaction APIs consult the routers. Secondly, I think I 
> already answered it in the initial discussion.
>
> FYI from message-1:
>
>
>1. Having a separate method for transaction is good because it need 
>not be mangled up with the other methods i.e db_for_read / db_for_write as 
>they clearly indicate certain operational semantics which aren't tied to a 
>transaction since it can have both reads and writes within it. 
>2. Also, if in the future, there is some special info that can be 
>delivered to the transaction based on which a decision regd which DB to 
> use 
>could be made, then it would be cleanly isolated into its own method.
>
> Regards, 
> Aditya N
>
> On Friday, May 28, 2021 at 2:03:16 PM UTC+5:30 Adam Johnson wrote:
>
>> Aditya - you didn't answer Simon's first question: "Can you think of 
>> places where this db_for_transaction hook would differ in any way from 
>> what db_for_write returns?" I think this is the crux of the issue. 
>> atomic does already call DB routers - in what case could you need to return 
>> a different result for atomic() than for a write query?
>>
>> On Fri, 28 May 2021 at 09:17, N Aditya  wrote:
>>
>>> Hey Simon, 
>>>
>>> It would be possible to write a method as you've suggested above. But it 
>>> entails the following problems:
>>> 1. There would have to be a wrapper for each of the transaction APIs, 
>>> not just transaction.atomic since all of them take the using kwarg.
>>> 2.  I'm trying to build a library that helps achieve multi-tenancy (the 
>>> isolated DB approach) by dynamically routing DB queries to the appropriate 
>>> databases at runtime. For more info, check this out -> PyCon India 2021 
>>> proposal 
>>> <https://in.pycon.org/cfp/2021/proposals/django-and-multi-tenancy-the-isolated-database-approach~dR6oO/>.
>>>  
>>> If that be the case, for any new code base, I could expose a bunch of 
>>> methods as suggested above and ask them to incorporate it. But for 
>>> older/existing code bases which already directly use various transaction 
>>> APIs, it would be hard to replace all lines with another method. 
>>>
>>> By inspecting code, I figured that there is this method named 
>>> get_connection in the transaction.py module which is being used by all the 
>>> other transaction APIs for fetching the DB connection. Currently I've 
>>> patched this method with my own to achieve the functionality mentioned 
>>> above. Might I suggest an alternate implementation for this method as 
>>> follows:
>>>
>>> #settings.py
>>>
>>> TRANSACTION_DB_SELECTOR = "path_to_some_callable"
>>>
>>> #transaction.py
>>> ...
>>> transaction_db_selector = import_string(settings.TRANSACTION_DB_SELECTOR)
>>> def get_connection():
>>> if transaction_db_selector:
>>> using = transaction_db_selector()
>>> if using is None:
>>> using = DEFAULT_DB_ALIAS
>>> return connections[using]
>>>
>>> Let me know your thoughts on this. 
>>>
>>> Regards, 
>>> Aditya N
>>>
>>> On Friday, May 28, 2021 at 10:05:26 AM UTC+5:30 charettes wrote:
>>>
>>>> Ticket that directed to the mailing list for wider feedback 
>>>> https://code.djangoproject.com/ticket/32788
>>>>
>>>> ---
>>>>
>>>> Can you think of places where this db_for_transaction hook would 
>>>> differ in any way from what db_for_write returns? That's what Django 
>>>> uses internally in such instances
>>>>
>>>>1. ​
>>>>
>>>> https://github.com/django/django/blob/0d67481a6664a1e66d875eef59b96ed489060601/django/db/models/base.py#L745-L760
>>>>  
>>>>
>>>> <https://github.com/django/django/blob/0d67481a6664a1e66d875eef59b96ed489060601/django/db/models/base.py#L745-L760>
>>

Re: Transaction APIs do not consult the DB router to choose DB connection

2021-05-28 Thread N Aditya
Hey Adam, 

"atomic does already call DB routers" -> Firstly after reading code, I 
don't think the transaction APIs consult the routers. Secondly, I think I 
already answered it in the initial discussion.

FYI from message-1:


   1. Having a separate method for transaction is good because it need not 
   be mangled up with the other methods i.e db_for_read / db_for_write as they 
   clearly indicate certain operational semantics which aren't tied to a 
   transaction since it can have both reads and writes within it. 
   2. Also, if in the future, there is some special info that can be 
   delivered to the transaction based on which a decision regd which DB to use 
   could be made, then it would be cleanly isolated into its own method.

Regards, 
Aditya N

On Friday, May 28, 2021 at 2:03:16 PM UTC+5:30 Adam Johnson wrote:

> Aditya - you didn't answer Simon's first question: "Can you think of 
> places where this db_for_transaction hook would differ in any way from 
> what db_for_write returns?" I think this is the crux of the issue. atomic 
> does already call DB routers - in what case could you need to return a 
> different result for atomic() than for a write query?
>
> On Fri, 28 May 2021 at 09:17, N Aditya  wrote:
>
>> Hey Simon, 
>>
>> It would be possible to write a method as you've suggested above. But it 
>> entails the following problems:
>> 1. There would have to be a wrapper for each of the transaction APIs, not 
>> just transaction.atomic since all of them take the using kwarg.
>> 2.  I'm trying to build a library that helps achieve multi-tenancy (the 
>> isolated DB approach) by dynamically routing DB queries to the appropriate 
>> databases at runtime. For more info, check this out -> PyCon India 2021 
>> proposal 
>> <https://in.pycon.org/cfp/2021/proposals/django-and-multi-tenancy-the-isolated-database-approach~dR6oO/>.
>>  
>> If that be the case, for any new code base, I could expose a bunch of 
>> methods as suggested above and ask them to incorporate it. But for 
>> older/existing code bases which already directly use various transaction 
>> APIs, it would be hard to replace all lines with another method. 
>>
>> By inspecting code, I figured that there is this method named 
>> get_connection in the transaction.py module which is being used by all the 
>> other transaction APIs for fetching the DB connection. Currently I've 
>> patched this method with my own to achieve the functionality mentioned 
>> above. Might I suggest an alternate implementation for this method as 
>> follows:
>>
>> #settings.py
>>
>> TRANSACTION_DB_SELECTOR = "path_to_some_callable"
>>
>> #transaction.py
>> ...
>> transaction_db_selector = import_string(settings.TRANSACTION_DB_SELECTOR)
>> def get_connection():
>> if transaction_db_selector:
>> using = transaction_db_selector()
>> if using is None:
>> using = DEFAULT_DB_ALIAS
>> return connections[using]
>>
>> Let me know your thoughts on this. 
>>
>> Regards, 
>> Aditya N
>>
>> On Friday, May 28, 2021 at 10:05:26 AM UTC+5:30 charettes wrote:
>>
>>> Ticket that directed to the mailing list for wider feedback 
>>> https://code.djangoproject.com/ticket/32788
>>>
>>> ---
>>>
>>> Can you think of places where this db_for_transaction hook would differ 
>>> in any way from what db_for_write returns? That's what Django uses 
>>> internally in such instances
>>>
>>>1. ​
>>>
>>> https://github.com/django/django/blob/0d67481a6664a1e66d875eef59b96ed489060601/django/db/models/base.py#L745-L760
>>>  
>>>
>>> <https://github.com/django/django/blob/0d67481a6664a1e66d875eef59b96ed489060601/django/db/models/base.py#L745-L760>
>>>  
>>>2. ​
>>>
>>> https://github.com/django/django/blob/0d67481a6664a1e66d875eef59b96ed489060601/django/db/models/base.py#L950
>>>  
>>>
>>> <https://github.com/django/django/blob/0d67481a6664a1e66d875eef59b96ed489060601/django/db/models/base.py#L950>
>>>  
>>>3. ​
>>>
>>> https://github.com/django/django/blob/0d67481a6664a1e66d875eef59b96ed489060601/django/db/models/deletion.py#L400
>>>  
>>>
>>> <https://github.com/django/django/blob/0d67481a6664a1e66d875eef59b96ed489060601/django/db/models/deletion.py#L400>
>>>  
>>>
>>> I get that your asking for a way to avoid explicitly passing 
>>> atomic(using) all o

Re: Transaction APIs do not consult the DB router to choose DB connection

2021-05-28 Thread N Aditya
Hey Simon, 

It would be possible to write a method as you've suggested above. But it 
entails the following problems:
1. There would have to be a wrapper for each of the transaction APIs, not 
just transaction.atomic since all of them take the using kwarg.
2.  I'm trying to build a library that helps achieve multi-tenancy (the 
isolated DB approach) by dynamically routing DB queries to the appropriate 
databases at runtime. For more info, check this out -> PyCon India 2021 
proposal 
.
 
If that be the case, for any new code base, I could expose a bunch of 
methods as suggested above and ask them to incorporate it. But for 
older/existing code bases which already directly use various transaction 
APIs, it would be hard to replace all lines with another method. 

By inspecting code, I figured that there is this method named 
get_connection in the transaction.py module which is being used by all the 
other transaction APIs for fetching the DB connection. Currently I've 
patched this method with my own to achieve the functionality mentioned 
above. Might I suggest an alternate implementation for this method as 
follows:

#settings.py

TRANSACTION_DB_SELECTOR = "path_to_some_callable"

#transaction.py
...
transaction_db_selector = import_string(settings.TRANSACTION_DB_SELECTOR)
def get_connection():
if transaction_db_selector:
using = transaction_db_selector()
if using is None:
using = DEFAULT_DB_ALIAS
return connections[using]

Let me know your thoughts on this. 

Regards, 
Aditya N

On Friday, May 28, 2021 at 10:05:26 AM UTC+5:30 charettes wrote:

> Ticket that directed to the mailing list for wider feedback 
> https://code.djangoproject.com/ticket/32788
>
> ---
>
> Can you think of places where this db_for_transaction hook would differ 
> in any way from what db_for_write returns? That's what Django uses 
> internally in such instances
>
>1. ​
>
> https://github.com/django/django/blob/0d67481a6664a1e66d875eef59b96ed489060601/django/db/models/base.py#L745-L760
>  
>
> 
>  
>2. ​
>
> https://github.com/django/django/blob/0d67481a6664a1e66d875eef59b96ed489060601/django/db/models/base.py#L950
>  
>
> 
>  
>3. ​
>
> https://github.com/django/django/blob/0d67481a6664a1e66d875eef59b96ed489060601/django/db/models/deletion.py#L400
>  
>
> 
>  
>
> I get that your asking for a way to avoid explicitly passing atomic(using) 
> all over the place but I'm having a hard time figuring out in which cases a 
> db_for_transaction hook, which cannot be provided any contextual details 
> like other router methods do, can take an an advised decision without 
> relying on contextual/global state.
>
> Is there a reason you can't replace your wide spread internal uses of 
> atomic by something along these lines
>
> def routed_atomic(savepoint=True, durable=False):
>
> using = get_global_using()
>
> return transaction.atomic(using, savepoint=savepoint, durable=durable)
>
> Cheers,
>
> Simon
> Le jeudi 27 mai 2021 à 12:18:50 UTC-4, gojeta...@gmail.com a écrit :
>
>> From the Django docs, for any ORM query made, the DB alias to use is 
>> determined by the following rules:
>>
>>- Checks if the using keyword is used either as a parameter in the 
>>function call or chained with QuerySet.
>>- Consults the DB routers in order until a match is found.
>>- Falls back to the default router which always returns default as 
>>the alias.
>>
>> Using the router scheme works perfectly for ORM queries. However, when it 
>> comes to using transaction APIs ​(like the transaction.atomic construct), 
>> it becomes mandatory to specify the *using* kwarg explicitly in all of 
>> its publicly exposed APIs if a DB other than the default alias has to be 
>> chosen.
>>
>> To illustrate why this is a problem, consider the following scenario:
>>
>>- A DB router exists such that it directs queries to a specific 
>>database based on a value set in *thread-local* storage by a 
>>middleware.
>>- When ORM queries from within the view are fired, they automatically 
>>get forwarded to the right DB *without explicitly citing* the using 
>> keyword 
>>owing to the router.
>>- But if the transaction.atomic construct is used, the using keyword 
>>would have to be explicitly specified each time. While this might seem 
>>fine, it creates the following problems:
>>   1. For large code bases, it becomes highly unwieldy to make sure 
>>   that the *using* keyword has been mentioned in every transaction 
>> 

Transaction APIs do not consult the DB router to choose DB connection

2021-05-27 Thread N Aditya


>From the Django docs, for any ORM query made, the DB alias to use is 
determined by the following rules:

   - Checks if the using keyword is used either as a parameter in the 
   function call or chained with QuerySet.
   - Consults the DB routers in order until a match is found.
   - Falls back to the default router which always returns default as the 
   alias.

Using the router scheme works perfectly for ORM queries. However, when it 
comes to using transaction APIs ​(like the transaction.atomic construct), 
it becomes mandatory to specify the *using* kwarg explicitly in all of its 
publicly exposed APIs if a DB other than the default alias has to be chosen.

To illustrate why this is a problem, consider the following scenario:

   - A DB router exists such that it directs queries to a specific database 
   based on a value set in *thread-local* storage by a middleware.
   - When ORM queries from within the view are fired, they automatically 
   get forwarded to the right DB *without explicitly citing* the using keyword 
   owing to the router.
   - But if the transaction.atomic construct is used, the using keyword 
   would have to be explicitly specified each time. While this might seem 
   fine, it creates the following problems:
  1. For large code bases, it becomes highly unwieldy to make sure that 
  the *using* keyword has been mentioned in every transaction API call. 
  Also, if one tries to implement the above scheme in an already existing 
  code base, it would be impractical to add the using keyword in all lines 
  calling the transaction APIs.
  2. It doesn't gel well with the the routers framework.
   
A proposed workaround could to be to add a separate method named 
*db_for_transaction* to the routers framework which would be consulted by 
the transaction APIs first, before falling back to using the default DB 
alias.


   1. Having a separate method for transaction is good because it need not 
   be mangled up with the other methods i.e db_for_read / db_for_write as they 
   clearly indicate certain operational semantics which aren't tied to a 
   transaction since it can have both reads and writes within it. 
   2. Also, if in the future, there is some special info that can be 
   delivered to the transaction based on which a decision regd which DB to use 
   could be made, then it would be cleanly isolated into its own method.


If we can finalise on this, I could submit a PR for the same.

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/b62206d5-b001-413c-a58e-e78c08596497n%40googlegroups.com.