Re: general interest in faster bulk_update implementation

2022-10-18 Thread jobe...@gmail.com
Will there *not* be a Django ORM implementation of psycopg3 COPY FROM when 
that lands? And, I guess I'll need to figure out when that lands/would land.

On Tuesday, October 18, 2022 at 11:07:51 AM UTC-4 j.bre...@netzkolchose.de 
wrote:

> > pretty quickly, so if you need testing input (Django 3.2, Postgres) I
> > can offer feedback from what I find.
>
> Yes testing would be awesome, esp. for edge cases (test coverage for 
> default cases is pretty complete for `fast_update` I think).
>
> > Can you tell me more about this statement:
> > > *Note* copy_update will probably never leave the alpha/PoC-state, as
> > psycopg3 brings great COPY support, which does a more secure value
> > conversion and has a very fast C-version.
>
> Well, I created the `copy_update` alternative for postgres just to see 
> the advantage of COPY FROM over UPDATE FROM VALUES. The impl uses 
> psycopg2's COPY interface, which got heavily revamped in psycopg3, 
> including proper value adapters written in C. V2 does not have this yet, 
> therefore I had to create the value encoders in Python, which are less 
> strict about values and still ~3 times slower than the C adapters in 
> psycopg3.
> The message above is meant as a warning that I don't plan to put too 
> much effort into polishing this soon-to-be-outdated implementation.
>
> > Where can I learn more about that COPY statement, and how/where that
> > statement might be integrated with the Django ORM?
>
> Please check the postgres docs 
> (https://www.postgresql.org/docs/current/sql-copy.html), which cover all 
> important low-level details. Furthermore, check the psycopg3 docs (and also 
> the psycopg2 docs, if you want to get your hands on the `copy_update` impl).
> I don't think that driving `bulk_update` by COPY FROM for postgres is a 
> good idea: there are quite a few semantic differences, and it is slower 
> for tiny changesets than UPDATE FROM VALUES, though it starts to shine 
> for changesets >1000 (up to ~4 times faster for a 1M changeset compared 
> to `fast_update` in my tests). Maybe it can be added to the postgres 
> subpackage, if there is demand for it.
>
> Feel free to create issues or to comment on open ones. Important pending 
> issues are:
> - proper duplicate check 
> (https://github.com/netzkolchose/django-fast-update/issues/13)
> - a clear decision on whether to bring back support for F expressions or 
> to keep them out (currently unsupported, as the steps to get this working 
> are very cumbersome)
>
> Cheers,
> Jerch
>
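
The Python-side value encoding mentioned in the quoted reply can be pictured with a minimal sketch. This is not the encoder from `copy_update`: the function names are hypothetical, and only `None`, `bool`, `int` and `str` are handled. It assumes the default COPY text format from the PostgreSQL docs (tab-separated columns, one row per line, `\N` for NULL, backslash escapes for special characters).

```python
def encode_copy_value(value):
    """Encode one Python value for PostgreSQL's COPY text format (sketch)."""
    if value is None:
        return r"\N"  # default NULL marker in text format
    if isinstance(value, bool):  # check bool before int, bool is an int subclass
        return "t" if value else "f"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, str):
        # Escape the backslash first, so the escapes added below
        # are not escaped a second time.
        return (
            value.replace("\\", "\\\\")
            .replace("\t", "\\t")
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
    # Hypothetical strictness choice: refuse anything we cannot encode safely.
    raise TypeError(f"no COPY encoder for {type(value).__name__}")


def encode_copy_rows(rows):
    """Build a COPY text payload: tab-separated columns, one row per line."""
    return "".join(
        "\t".join(encode_copy_value(v) for v in row) + "\n" for row in rows
    )
```

A payload built this way would be fed to a `COPY ... FROM STDIN` statement through the driver's COPY interface. Every additional column type needs its own encoder branch, which is the kind of work the C adapters in psycopg3 take over.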

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-developers/30bdbb05-cc70-4936-8b26-2435fa27ee08n%40googlegroups.com.


Re: general interest in faster bulk_update implementation

2022-10-18 Thread jobe...@gmail.com
Jerch,

I love that you're improving the `bulk_update` performance with your 
package. I am definitely looking to adopt it. I can start working on it 
pretty quickly, so if you need testing input (Django 3.2, Postgres) I can 
offer feedback from what I find.

Can you tell me more about this statement:
> *Note* copy_update will probably never leave the alpha/PoC-state, as 
psycopg3 brings great COPY support, which does a more secure value 
conversion and has a very fast C-version.

Where can I learn more about that COPY statement, and how/where that 
statement might be integrated with the Django ORM?

On Saturday, April 30, 2022 at 3:17:01 PM UTC-4 j.bre...@netzkolchose.de 
wrote:

> Released the second version of fast_update 
> (https://pypi.org/project/django-fast-update/), based on some findings 
> above, e.g. it now should work with all recent db engine versions 
> supported by django (except Oracle).
>
> Would be happy to get some tests/feedback, before moving things closer 
> to django itself.
>
> Cheers,
> Jerch
>
>
> On 29.04.22 at 09:34, Jörg Breitbart wrote:
> > I have found workarounds for older db engines, which make the more 
> > demanding version requirements from above obsolete. Db support with 
> > these workarounds would be:
> > 
> > - SQLite 3.15+ (should work with Python 3.7+ installer, Ubuntu 18 LTS)
> > - MySQL 5.7+ (older versions should work too, not tested)
> > 
> > The workarounds construct the literal values tables from multiple 
> > SELECTs + UNION ALL, which is perfwise worse for sqlite (~40% 
> > slower), but on par for mysql (well, mysql runs into stack issues much 
> > earlier than with a table value constructor (TVC), but this can be 
> > configured by the user).
> > 
> > Downside - this creates 2 more code paths for 2 db engine versions, that 
> > would need to be tested with the test battery. The nuisance can be 
> > removed by a later release, once db version support is dropped for other 
> > reasons.
> > 
> > I also found possible fast update patterns for:
> > - oracle 19c (probably older as well, UNION ALL + correlated update)
> > - oracle 21c (UNION ALL + join update)
> > - SQL Server 2014+ (FROM VALUES pattern)
> > 
> > but these need someone else to test and integrate them, since I have no 
> > development environments for those. So whether they can gain significant 
> > performance remains uncertain until actually adopted.
> > 
>
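
The two ways of building the literal values table discussed in the quoted message (a table value constructor versus SELECTs joined by UNION ALL) can be sketched with the stdlib `sqlite3` module. This is a minimal illustration, not the SQL that `fast_update` generates: the `item` table, the helper names, and the correlated-subquery form of the UPDATE are assumptions made for the example.

```python
import sqlite3


def values_table_sql(n_rows, n_cols, use_union_all=False):
    """Return SQL for a literal table of n_rows x n_cols placeholders."""
    row = ", ".join(["?"] * n_cols)
    if use_union_all:
        # Workaround form: one SELECT per row, glued with UNION ALL.
        return " UNION ALL ".join(f"SELECT {row}" for _ in range(n_rows))
    # Table value constructor form.
    return "VALUES " + ", ".join(f"({row})" for _ in range(n_rows))


def update_names(conn, changes, use_union_all=False):
    """Apply all (id, name) pairs in `changes` with a single UPDATE."""
    literal = values_table_sql(len(changes), 2, use_union_all)
    sql = (
        f"WITH data(id, name) AS ({literal}) "
        "UPDATE item "
        "SET name = (SELECT data.name FROM data WHERE data.id = item.id) "
        "WHERE id IN (SELECT id FROM data)"
    )
    # Flatten the rows to match the placeholder order.
    params = [value for row in changes for value in row]
    conn.execute(sql, params)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO item VALUES (?, ?)", [(1, "a"), (2, "b")])
    update_names(conn, [(1, "x"), (2, "y")], use_union_all=True)
    print(conn.execute("SELECT id, name FROM item ORDER BY id").fetchall())
```

Passing `use_union_all=True` switches to the workaround form; both variants apply the same changes, so the choice only affects which SQL the engine has to parse.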
