Re: Having a MongoDB connector for Django

2017-09-14 Thread Patryk Zawadzki
On Monday, 11 September 2017 05:28:55 UTC+2, Nes Dis wrote:
>
> Thank you all for your very insightful comments. I personally am a big 
> user/contributor to the framework myself and would like to see it thrive 
> and progress with respect to other competing frameworks.
>
> I am sure most are aware of the argument that MongoDB is increasing in 
> popularity. Several members are of the opinion that not supporting this 
> backend (which in my opinion is not too difficult) will not dent Django's 
> popularity. 
>

This is my personal opinion, so take it with a grain of salt, but I believe 
MongoDB is popular among two very different groups of people:

1) those who value the convenience of project setup (i.e. "I can bootstrap 
my project within 15 minutes") and believe that going schema-less saves 
them time

2) those who deal with terabytes upon terabytes of unstructured data

During my career I've been part of both groups.

All projects that fell under group 1 eventually had to migrate to SQL. 
MongoDB is easy to set up and convenient to use for proof-of-concept toys, 
but it was built with a very specific goal in mind: to handle data at 
SCALE. This means there are no transactions (you can't lock data when 
transactions can take hours to complete) and there is no atomicity (all 
updates are distributed and row updates happen in parallel, so you can 
easily end up with partial successes where some rows are updated and some 
raise errors; there's no ghost read protection as all updates are final). 
There is also no way for the database to reject invalid data (there is no 
schema enforcement; the only rule is junk in, junk out). Whatever 
resources you save by initially choosing MongoDB, you pay back tenfold each 
time you encounter a problem that is easily solvable in the ACID RDBMS 
world and that simply does not exist in group 2 projects.
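
To illustrate the partial-success problem, here is a minimal in-memory 
sketch in Python. It does not use MongoDB or any driver; apply_updates, 
add_bonus and the sample documents are hypothetical stand-ins that only 
mimic a multi-document update applied without a surrounding transaction.

```python
def apply_updates(documents, update):
    """Apply `update` to each document independently, with no rollback.

    Returns a list of (index, error) pairs for the documents that failed.
    """
    failures = []
    for index, doc in enumerate(documents):
        try:
            update(doc)
        except Exception as exc:
            # Each document succeeds or fails on its own; earlier
            # successes are not undone.
            failures.append((index, exc))
    return failures


def add_bonus(doc):
    # Raises TypeError for documents whose "balance" is not a number,
    # which nothing prevented from being stored in the first place.
    doc["balance"] = doc["balance"] + 10


accounts = [
    {"name": "a", "balance": 100},
    {"name": "b", "balance": "unknown"},  # junk that no schema rejected
    {"name": "c", "balance": 5},
]

failures = apply_updates(accounts, add_bonus)
```

After this runs, the first and third documents have been updated while the 
second is untouched and reported in failures, so the collection is left 
half-updated. In an ACID database the same batch inside one transaction 
would be rolled back as a whole.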

All projects that fell under group 2 had to build their logic with 
MongoDB's architecture in mind. To take advantage of the scalability you 
need to carefully craft all of your commands. You only select data that you 
know you'll absolutely need (which means that you do actually have multiple 
implicit schemas or "interfaces" certain rows conform to), you maintain a 
fleet of background jobs that recalculate certain denormalizations to keep 
them in approximate sync with data (as updates can and do happen in 
parallel and recalculating aggregate queries on terabytes of data can be 
prohibitively costly), and you do complex updates by sending a JavaScript 
function along with the query instead of fetching millions of rows and 
issuing millions of update commands. There would be no advantage for us to 
be able to use a ModelForm to manage schema-less documents.

Cheers

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/8166d971-5e77-42f5-b5db-27d9df7b8778%40googlegroups.com.


Re: Database "execute hooks" for instrumentation

2017-09-14 Thread Shai Berger
In case you're interested and want to see this in 2.0, please help:

https://code.djangoproject.com/ticket/28595
https://github.com/django/django/pull/9078

On Friday 14 April 2017 02:33:06 Adam Johnson wrote:
> django-perf-rec would love this; it currently monkey-patches
> connection.ops.last_executed_query to listen to all the queries going on.
> 
> On 7 April 2017 at 16:10, Shai Berger wrote:
> > On Friday 07 April 2017 17:47:51 Carl Meyer wrote:
> > > Hi Shai,
> > > 
> > > On 04/07/2017 06:02 AM, Shai Berger wrote:
> > > > This is an idea that came up during the djangocon-europe conference:
> > > > Add the ability to install general instrumentation hooks around the
> > > > database "execute" and "executemany" calls.
> > > > 
> > > > Such hooks would allow all sorts of interesting features. For one,
> > > > they could replace the current special-case allowing
> > > > assertNumQueries & friends to record queries out of debug mode (it's
> > > > an ugly hack, really), but they could also support my imagined use
> > > > case -- a context-manager which could prevent database access during
> > > > execution of some code (I'm thinking mostly of using it around
> > > > "render()" calls and serialization, to make sure all database access
> > > > is being done in the view).
> > > 
> > > Another use-case is for preventing database access during tests unless
> > > specifically requested by the test (e.g. pytest-django does this,
> > > currently via monkeypatching).
> > 
> > Yep. This feels right.
> > 
> > > > My idea for implementation is to keep a thread-local stack of
> > > > context-managers, and have them wrap each call of "execute". We
> > > > could actually even use one such context-manager instead of the
> > > > existing CursorDebugWrapper.
> > > 
> > > Why a new thread-local instead of explicitly per-connection and stored
> > > on the connection?
> > 
> > Sorry for implying that it would be a new thread-local, I just hadn't
> > thought it through when I wrote this. Of course it goes on the (already
> > thread-local) connection.
> > 
> > Shai.
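
The hook mechanism discussed in this thread can be sketched in a few lines
of Python. This is an illustration only, not the implementation in the
linked pull request: InstrumentedConnection, execute_hook, block_queries
and QueryRecorder are hypothetical names, and a plain sqlite3 connection
stands in for Django's database connection. The stack of hooks is stored on
the connection, as suggested in the exchange above.

```python
import sqlite3
from contextlib import contextmanager
from functools import partial


class InstrumentedConnection:
    """Hypothetical wrapper around a DB-API connection (not Django's class)."""

    def __init__(self, raw):
        self.raw = raw
        self.hooks = []  # per-connection stack; first installed is outermost

    @contextmanager
    def execute_hook(self, hook):
        # Install the hook for the duration of the `with` block.
        self.hooks.append(hook)
        try:
            yield
        finally:
            self.hooks.pop()

    def execute(self, sql, params=()):
        call = self.raw.execute
        # Wrap from the innermost hook outwards, so that the hook
        # installed first ends up being called first.
        for hook in reversed(self.hooks):
            call = partial(hook, call)
        return call(sql, params)


def block_queries(execute, sql, params):
    # Never calls `execute`, so no query reaches the database.
    raise RuntimeError("database access is not allowed here: " + sql)


class QueryRecorder:
    def __init__(self):
        self.queries = []

    def __call__(self, execute, sql, params):
        self.queries.append(sql)
        return execute(sql, params)


conn = InstrumentedConnection(sqlite3.connect(":memory:"))

recorder = QueryRecorder()
with conn.execute_hook(recorder):
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.execute("INSERT INTO t VALUES (?)", (1,))

blocked = False
with conn.execute_hook(block_queries):
    try:
        conn.execute("SELECT x FROM t")
    except RuntimeError:
        blocked = True

rows = conn.execute("SELECT x FROM t").fetchall()
```

Each hook receives the next callable in the chain, so it can record the
query, time it, or refuse to run it. The recorder covers the
assertNumQueries-style use case, and block_queries covers the "no database
access during render()" use case.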