Re: [sqlalchemy] jsonb_set in PostgreSQL 9.5+

2017-05-16 Thread Zsolt Ero
Thanks a lot for the explanation, it's all clear now!

On 15 May 2017 at 18:56, mike bayer  wrote:
>
>
> On 05/15/2017 11:56 AM, Zsolt Ero wrote:
>>
>> I might not be understanding something, but for me there are two
>> different concepts here:
>>
>> map_obj = dbsession.query(Map).get(id_)
>>
>> is an object in memory, loaded with a long SELECT statement, allowing
>> us to get and set different attributes and the session / transaction
>> manager commits the auto-detected changes.
>>
>> Whereas with
>>
>> dbsession.query(Map).filter(Map.id == id_).update(
>> {"screenshots": func.jsonb_set(Map.screenshots, '{size}',
>> '"filename.jpg"')}, synchronize_session='fetch')
>>
>> there is no object in memory, what we are writing here is just a nicer
>> syntax for a one line SQL UPDATE query. Even the triggered SELECT
>> statement is just querying for a single .id, which we have anyway.
>
>
> the purpose of synchronize_session is only if you happened to run *both*
> Python statements, so that you have map_obj present as a local variable, and
> wish to expire the now stale value of map_obj.screenshots, so that when you
> next access it, a SELECT is emitted to get the most recent value, e.g. the
> one that's the result of your UPDATE statement.
>
> the "fetch" strategy is actually wasteful here because it runs the query()
> as a SELECT in order to locate the primary keys of the objects that might be
> locally present, but your query is simple enough that this is already
> apparent.  the "fetch" strategy currently doesn't even bother to get the new
> value right now and just expires, I forgot about this. this is why you see
> just the one wasteful SELECT statement.   "fetch" probably should be
> improved to actually fetch and directly update the values for the instances
> it locates, not sure why it wasn't done that way to start.
>
> the "evaluate" strategy does everything in Python, but also won't work
> because it currently expects that the values which were set are also
> evaluatable in Python, also should be improved to at least do a simple
> "expire" for attributes that can't be evaluated in Python.
>
> in this case the only strategy left is sychronize_session=False. However if
> you have map_obj in memory, you'd need to run refresh() on it to get the new
> JSON value if you care about it.
>
>
>
>
>
>
>>
>> Zsolt
>>
>>
>>
>>
>> On 15 May 2017 at 17:29, mike bayer  wrote:
>>>
>>>
>>>
>>> On 05/15/2017 10:54 AM, Zsolt Ero wrote:


 Thanks, it is all clear now. Just out of interest, what is the point
 of synchronize_session='fetch'?
>>>
>>>
>>>
>>> that will do a SELECT and get the new value back and update your ORM
>>> object
>>> in memory.  Set synchronize_session=False if you don't care.
>>>
>>>
>>>

 For me all it does is a simple SELECT maps.id AS maps_id FROM maps
 WHERE maps.id = %(id_1)s

 All I get as a return value is 0: not successful (probably id didn't
 exist), while 1: successful. It is the same behaviour both with
 'fetch' and False.

 Zsolt

 On 15 May 2017 at 16:33, mike bayer  wrote:
>
>
>
>
> On 05/15/2017 10:31 AM, Zsolt Ero wrote:
>>
>>
>>
>> I'm trying to run your example, but it doesn't work:
>>
>> from sqlalchemy import func
>>
>> m = request.dbsession.query(models.Map).get(3)
>> m.screenshots = func.jsonb_set(m.screenshots, '{key}', '"value"')
>> request.dbsession.flush()
>>
>> It ends up in a (psycopg2.ProgrammingError) can't adapt type 'dict'.
>
>
>
>
> jsonb_set(models.Map.screenshots, ...)
>
> because this works against the column, not the value
>
>
>
>
>
>>
>> Also, from the generated SQL it seems to me that it's also doing the
>> full JSONB update from client side, not just inserting a key into the
>> database server side.
>>
>> UPDATE maps SET screenshots=jsonb_set(%(jsonb_set_1)s,
>> %(jsonb_set_2)s, %(jsonb_set_3)s) WHERE maps.id = %(maps_id)s
>> {'maps_id': 3, 'jsonb_set_3': '"value"', 'jsonb_set_2': '{key}',
>> 'jsonb_set_1': {u'small': u'2ad139ee69cdcd9e.jpg', u'full':
>> u'68b3f51491ff1501.jpg'}}
>>
>> On 15 May 2017 at 16:18, Zsolt Ero  wrote:
>>>
>>>
>>>
>>> Thanks for the answer. My use case is the following:
>>>
>>> I have an object (map_obj), which has screenshots in two sizes. I'm
>>> using JSONB columns to store the screenshot filenames.
>>>
>>> Now, the two screenshot sizes are generated in parallel. The code is
>>> like the following:
>>>
>>> map_obj = query(...by id...)
>>> filename = generate_screenshot(size)  # long running screenshot
>>> generation
>>>
>>> try:
>>>dbsession.refresh(map_obj, ['screenshots'])
>>>map_obj.screenshots = 

Re: [sqlalchemy] jsonb_set in PostgreSQL 9.5+

2017-05-15 Thread mike bayer



On 05/15/2017 11:56 AM, Zsolt Ero wrote:

I might not be understanding something, but for me there are two
different concepts here:

map_obj = dbsession.query(Map).get(id_)

is an object in memory, loaded with a long SELECT statement, allowing
us to get and set different attributes and the session / transaction
manager commits the auto-detected changes.

Whereas with

dbsession.query(Map).filter(Map.id == id_).update(
{"screenshots": func.jsonb_set(Map.screenshots, '{size}',
'"filename.jpg"')}, synchronize_session='fetch')

there is no object in memory, what we are writing here is just a nicer
syntax for a one line SQL UPDATE query. Even the triggered SELECT
statement is just querying for a single .id, which we have anyway.


the purpose of synchronize_session is only if you happened to run *both* 
Python statements, so that you have map_obj present as a local variable, 
and wish to expire the now stale value of map_obj.screenshots, so that 
when you next access it, a SELECT is emitted to get the most recent 
value, e.g. the one that's the result of your UPDATE statement.


the "fetch" strategy is actually wasteful here because it runs the 
query() as a SELECT in order to locate the primary keys of the objects 
that might be locally present, but your query is simple enough that this 
is already apparent.  the "fetch" strategy currently doesn't even bother 
to get the new value right now and just expires, I forgot about this. 
this is why you see just the one wasteful SELECT statement.   "fetch" 
probably should be improved to actually fetch and directly update the 
values for the instances it locates, not sure why it wasn't done that 
way to start.


the "evaluate" strategy does everything in Python, but also won't work 
because it currently expects that the values which were set are also 
evaluatable in Python, also should be improved to at least do a simple 
"expire" for attributes that can't be evaluated in Python.


in this case the only strategy left is sychronize_session=False. 
However if you have map_obj in memory, you'd need to run refresh() on it 
to get the new JSON value if you care about it.








Zsolt




On 15 May 2017 at 17:29, mike bayer  wrote:



On 05/15/2017 10:54 AM, Zsolt Ero wrote:


Thanks, it is all clear now. Just out of interest, what is the point
of synchronize_session='fetch'?



that will do a SELECT and get the new value back and update your ORM object
in memory.  Set synchronize_session=False if you don't care.





For me all it does is a simple SELECT maps.id AS maps_id FROM maps
WHERE maps.id = %(id_1)s

All I get as a return value is 0: not successful (probably id didn't
exist), while 1: successful. It is the same behaviour both with
'fetch' and False.

Zsolt

On 15 May 2017 at 16:33, mike bayer  wrote:




On 05/15/2017 10:31 AM, Zsolt Ero wrote:



I'm trying to run your example, but it doesn't work:

from sqlalchemy import func

m = request.dbsession.query(models.Map).get(3)
m.screenshots = func.jsonb_set(m.screenshots, '{key}', '"value"')
request.dbsession.flush()

It ends up in a (psycopg2.ProgrammingError) can't adapt type 'dict'.




jsonb_set(models.Map.screenshots, ...)

because this works against the column, not the value







Also, from the generated SQL it seems to me that it's also doing the
full JSONB update from client side, not just inserting a key into the
database server side.

UPDATE maps SET screenshots=jsonb_set(%(jsonb_set_1)s,
%(jsonb_set_2)s, %(jsonb_set_3)s) WHERE maps.id = %(maps_id)s
{'maps_id': 3, 'jsonb_set_3': '"value"', 'jsonb_set_2': '{key}',
'jsonb_set_1': {u'small': u'2ad139ee69cdcd9e.jpg', u'full':
u'68b3f51491ff1501.jpg'}}

On 15 May 2017 at 16:18, Zsolt Ero  wrote:



Thanks for the answer. My use case is the following:

I have an object (map_obj), which has screenshots in two sizes. I'm
using JSONB columns to store the screenshot filenames.

Now, the two screenshot sizes are generated in parallel. The code is
like the following:

map_obj = query(...by id...)
filename = generate_screenshot(size)  # long running screenshot
generation

try:
   dbsession.refresh(map_obj, ['screenshots'])
   map_obj.screenshots = dict(map_obj.screenshots, **{size:
filename})
except Exception as e:
   logger.warning(...)

It worked well for 99.9% of the cases. The problem is that in the rare
case when both screenshots got rendered within a few milliseconds, one
of the screenshots got lost.

The simple solution was to add lockmode='update' to the refresh, so
this way the refreshes are blocking until the other finishes the
update.

But since this means locking a full row, I was thinking a simple JSONB
insertion would probably be better, since I can avoid locking the row.

Zsolt




On 15 May 2017 at 15:58, mike bayer  wrote:





On 05/15/2017 09:32 AM, Zsolt Ero wrote:




In PostgreSQL 9.5+ it is finally possible to modify a 

Re: [sqlalchemy] jsonb_set in PostgreSQL 9.5+

2017-05-15 Thread Zsolt Ero
I might not be understanding something, but for me there are two
different concepts here:

map_obj = dbsession.query(Map).get(id_)

is an object in memory, loaded with a long SELECT statement, allowing
us to get and set different attributes and the session / transaction
manager commits the auto-detected changes.

Whereas with

dbsession.query(Map).filter(Map.id == id_).update(
   {"screenshots": func.jsonb_set(Map.screenshots, '{size}',
'"filename.jpg"')}, synchronize_session='fetch')

there is no object in memory, what we are writing here is just a nicer
syntax for a one line SQL UPDATE query. Even the triggered SELECT
statement is just querying for a single .id, which we have anyway.

Zsolt




On 15 May 2017 at 17:29, mike bayer  wrote:
>
>
> On 05/15/2017 10:54 AM, Zsolt Ero wrote:
>>
>> Thanks, it is all clear now. Just out of interest, what is the point
>> of synchronize_session='fetch'?
>
>
> that will do a SELECT and get the new value back and update your ORM object
> in memory.  Set synchronize_session=False if you don't care.
>
>
>
>>
>> For me all it does is a simple SELECT maps.id AS maps_id FROM maps
>> WHERE maps.id = %(id_1)s
>>
>> All I get as a return value is 0: not successful (probably id didn't
>> exist), while 1: successful. It is the same behaviour both with
>> 'fetch' and False.
>>
>> Zsolt
>>
>> On 15 May 2017 at 16:33, mike bayer  wrote:
>>>
>>>
>>>
>>> On 05/15/2017 10:31 AM, Zsolt Ero wrote:


 I'm trying to run your example, but it doesn't work:

 from sqlalchemy import func

 m = request.dbsession.query(models.Map).get(3)
 m.screenshots = func.jsonb_set(m.screenshots, '{key}', '"value"')
 request.dbsession.flush()

 It ends up in a (psycopg2.ProgrammingError) can't adapt type 'dict'.
>>>
>>>
>>>
>>> jsonb_set(models.Map.screenshots, ...)
>>>
>>> because this works against the column, not the value
>>>
>>>
>>>
>>>
>>>

 Also, from the generated SQL it seems to me that it's also doing the
 full JSONB update from client side, not just inserting a key into the
 database server side.

 UPDATE maps SET screenshots=jsonb_set(%(jsonb_set_1)s,
 %(jsonb_set_2)s, %(jsonb_set_3)s) WHERE maps.id = %(maps_id)s
 {'maps_id': 3, 'jsonb_set_3': '"value"', 'jsonb_set_2': '{key}',
 'jsonb_set_1': {u'small': u'2ad139ee69cdcd9e.jpg', u'full':
 u'68b3f51491ff1501.jpg'}}

 On 15 May 2017 at 16:18, Zsolt Ero  wrote:
>
>
> Thanks for the answer. My use case is the following:
>
> I have an object (map_obj), which has screenshots in two sizes. I'm
> using JSONB columns to store the screenshot filenames.
>
> Now, the two screenshot sizes are generated in parallel. The code is
> like the following:
>
> map_obj = query(...by id...)
> filename = generate_screenshot(size)  # long running screenshot
> generation
>
> try:
>   dbsession.refresh(map_obj, ['screenshots'])
>   map_obj.screenshots = dict(map_obj.screenshots, **{size:
> filename})
> except Exception as e:
>   logger.warning(...)
>
> It worked well for 99.9% of the cases. The problem is that in the rare
> case when both screenshots got rendered within a few milliseconds, one
> of the screenshots got lost.
>
> The simple solution was to add lockmode='update' to the refresh, so
> this way the refreshes are blocking until the other finishes the
> update.
>
> But since this means locking a full row, I was thinking a simple JSONB
> insertion would probably be better, since I can avoid locking the row.
>
> Zsolt
>
>
>
>
> On 15 May 2017 at 15:58, mike bayer  wrote:
>>
>>
>>
>>
>> On 05/15/2017 09:32 AM, Zsolt Ero wrote:
>>>
>>>
>>>
>>> In PostgreSQL 9.5+ it is finally possible to modify a single key
>>> inside
>>> a
>>> JSONB column. Usage is something like this:
>>>
>>> update maps set screenshots=jsonb_set(screenshots, '{key}',
>>> '"value"')
>>> where id = 10688
>>>
>>> Is it possible to write this query using the ORM somehow? If not,
>>> please
>>> take it as a feature request.
>>
>>
>>
>>
>>
>> You can use that function directly:
>>
>> my_object = session.query(Maps).get(5)
>>
>> my_object.screenshots = func.jsonb_set(my_object.screenshots, '{key}',
>> '"value"')
>>
>> session.flush()
>>
>>
>> as far as "transparent" ORM use of that, like this:
>>
>> my_object.screenshots[key] = "value"
>>
>> right now that is a mutation of the value, and assuming you were using
>> MutableDict to detect this as an ORM change event, the ORM considers
>> "screenshots" to be a single value that would be the target of an
>> UPDATE,
>> meaning 

Re: [sqlalchemy] jsonb_set in PostgreSQL 9.5+

2017-05-15 Thread mike bayer



On 05/15/2017 10:54 AM, Zsolt Ero wrote:

Thanks, it is all clear now. Just out of interest, what is the point
of synchronize_session='fetch'?


that will do a SELECT and get the new value back and update your ORM 
object in memory.  Set synchronize_session=False if you don't care.





For me all it does is a simple SELECT maps.id AS maps_id FROM maps
WHERE maps.id = %(id_1)s

All I get as a return value is 0: not successful (probably id didn't
exist), while 1: successful. It is the same behaviour both with
'fetch' and False.

Zsolt

On 15 May 2017 at 16:33, mike bayer  wrote:



On 05/15/2017 10:31 AM, Zsolt Ero wrote:


I'm trying to run your example, but it doesn't work:

from sqlalchemy import func

m = request.dbsession.query(models.Map).get(3)
m.screenshots = func.jsonb_set(m.screenshots, '{key}', '"value"')
request.dbsession.flush()

It ends up in a (psycopg2.ProgrammingError) can't adapt type 'dict'.



jsonb_set(models.Map.screenshots, ...)

because this works against the column, not the value







Also, from the generated SQL it seems to me that it's also doing the
full JSONB update from client side, not just inserting a key into the
database server side.

UPDATE maps SET screenshots=jsonb_set(%(jsonb_set_1)s,
%(jsonb_set_2)s, %(jsonb_set_3)s) WHERE maps.id = %(maps_id)s
{'maps_id': 3, 'jsonb_set_3': '"value"', 'jsonb_set_2': '{key}',
'jsonb_set_1': {u'small': u'2ad139ee69cdcd9e.jpg', u'full':
u'68b3f51491ff1501.jpg'}}

On 15 May 2017 at 16:18, Zsolt Ero  wrote:


Thanks for the answer. My use case is the following:

I have an object (map_obj), which has screenshots in two sizes. I'm
using JSONB columns to store the screenshot filenames.

Now, the two screenshot sizes are generated in parallel. The code is
like the following:

map_obj = query(...by id...)
filename = generate_screenshot(size)  # long running screenshot
generation

try:
  dbsession.refresh(map_obj, ['screenshots'])
  map_obj.screenshots = dict(map_obj.screenshots, **{size: filename})
except Exception as e:
  logger.warning(...)

It worked well for 99.9% of the cases. The problem is that in the rare
case when both screenshots got rendered within a few milliseconds, one
of the screenshots got lost.

The simple solution was to add lockmode='update' to the refresh, so
this way the refreshes are blocking until the other finishes the
update.

But since this means locking a full row, I was thinking a simple JSONB
insertion would probably be better, since I can avoid locking the row.

Zsolt




On 15 May 2017 at 15:58, mike bayer  wrote:




On 05/15/2017 09:32 AM, Zsolt Ero wrote:



In PostgreSQL 9.5+ it is finally possible to modify a single key inside
a
JSONB column. Usage is something like this:

update maps set screenshots=jsonb_set(screenshots, '{key}', '"value"')
where id = 10688

Is it possible to write this query using the ORM somehow? If not,
please
take it as a feature request.





You can use that function directly:

my_object = session.query(Maps).get(5)

my_object.screenshots = func.jsonb_set(my_object.screenshots, '{key}',
'"value"')

session.flush()


as far as "transparent" ORM use of that, like this:

my_object.screenshots[key] = "value"

right now that is a mutation of the value, and assuming you were using
MutableDict to detect this as an ORM change event, the ORM considers
"screenshots" to be a single value that would be the target of an
UPDATE,
meaning the whole JSON dictionary is passed into the UPDATE. There is no
infrastructure for the ORM to automatically turn certain column updates
into
finely-detailed SQL function calls.   I can imagine that there might be
some
event-based way to make this happen transparently within the flush,
however,
but I'd need to spend some time poking around to work out how that might
work.


I'm not familiar with what the advantage to jsonb_set() would be and I
can
only guess it's some kind of performance advantage.   I'd be curious to
see
under what scenarios being able to set one element of the JSON vs.
UPDATEing
the whole thing is a performance advantage significant compared to the
usual
overhead of the ORM flush process; that is, Postgresql is really fast,
and
for this optimization to be significant, you probably need to be calling
the
Core function directly anyway rather than going through the whole ORM
flush
process.   But this is all based on my assumption as to what your goal
of
using this function is.







--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and
Verifiable Example. See http://stackoverflow.com/help/mcve for a full
description.
---
You received this message because you are subscribed to the Google
Groups
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send
an
email to sqlalchemy+unsubscr...@googlegroups.com

Re: [sqlalchemy] jsonb_set in PostgreSQL 9.5+

2017-05-15 Thread Jonathan Vanasco

On Monday, May 15, 2017 at 9:58:57 AM UTC-4, Mike Bayer wrote:
>
> I'd be curious to see under what scenarios being able to set one element 
> of the JSON 
> vs. UPDATEing the whole thing is a performance advantage significant 
> compared to the usual overhead of the ORM flush process; that is, 
> Postgresql is really fast, and for this optimization to be significant, 
> you probably need to be calling the Core function directly anyway rather 
> than going through the whole ORM flush process.  


I did a bunch of tests on this a while back, but in regards to the 
very-similar HTSORE column and some tests on JSONB.

The big takeaways--

* after a certain amount of data is in the column, the most significant 
issue is bandwidth and timing from the payload transfer.
* there is a decent performance update if you're in a sweet spot where the 
column payload is TOASTable and that's the only update.  in that instance, 
postgres just updates the toast table  -- otherwise it does the standard 
routine of "mark the old row for deletion, copy the row and update it as 
the new row".  toasting a jsonb column has been tweaked a lot, the last 
time I checked it had to be "just right" -- big enough to toast, but small 
enough to fit in a single toast column.  

tldr, it won't noticeably affect performance for most situations.  

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


Re: [sqlalchemy] jsonb_set in PostgreSQL 9.5+

2017-05-15 Thread Zsolt Ero
Thanks, it is all clear now. Just out of interest, what is the point
of synchronize_session='fetch'?

For me all it does is a simple SELECT maps.id AS maps_id FROM maps
WHERE maps.id = %(id_1)s

All I get as a return value is 0: not successful (probably id didn't
exist), while 1: successful. It is the same behaviour both with
'fetch' and False.

Zsolt

On 15 May 2017 at 16:33, mike bayer  wrote:
>
>
> On 05/15/2017 10:31 AM, Zsolt Ero wrote:
>>
>> I'm trying to run your example, but it doesn't work:
>>
>> from sqlalchemy import func
>>
>> m = request.dbsession.query(models.Map).get(3)
>> m.screenshots = func.jsonb_set(m.screenshots, '{key}', '"value"')
>> request.dbsession.flush()
>>
>> It ends up in a (psycopg2.ProgrammingError) can't adapt type 'dict'.
>
>
> jsonb_set(models.Map.screenshots, ...)
>
> because this works against the column, not the value
>
>
>
>
>
>>
>> Also, from the generated SQL it seems to me that it's also doing the
>> full JSONB update from client side, not just inserting a key into the
>> database server side.
>>
>> UPDATE maps SET screenshots=jsonb_set(%(jsonb_set_1)s,
>> %(jsonb_set_2)s, %(jsonb_set_3)s) WHERE maps.id = %(maps_id)s
>> {'maps_id': 3, 'jsonb_set_3': '"value"', 'jsonb_set_2': '{key}',
>> 'jsonb_set_1': {u'small': u'2ad139ee69cdcd9e.jpg', u'full':
>> u'68b3f51491ff1501.jpg'}}
>>
>> On 15 May 2017 at 16:18, Zsolt Ero  wrote:
>>>
>>> Thanks for the answer. My use case is the following:
>>>
>>> I have an object (map_obj), which has screenshots in two sizes. I'm
>>> using JSONB columns to store the screenshot filenames.
>>>
>>> Now, the two screenshot sizes are generated in parallel. The code is
>>> like the following:
>>>
>>> map_obj = query(...by id...)
>>> filename = generate_screenshot(size)  # long running screenshot
>>> generation
>>>
>>> try:
>>>  dbsession.refresh(map_obj, ['screenshots'])
>>>  map_obj.screenshots = dict(map_obj.screenshots, **{size: filename})
>>> except Exception as e:
>>>  logger.warning(...)
>>>
>>> It worked well for 99.9% of the cases. The problem is that in the rare
>>> case when both screenshots got rendered within a few milliseconds, one
>>> of the screenshots got lost.
>>>
>>> The simple solution was to add lockmode='update' to the refresh, so
>>> this way the refreshes are blocking until the other finishes the
>>> update.
>>>
>>> But since this means locking a full row, I was thinking a simple JSONB
>>> insertion would probably be better, since I can avoid locking the row.
>>>
>>> Zsolt
>>>
>>>
>>>
>>>
>>> On 15 May 2017 at 15:58, mike bayer  wrote:



 On 05/15/2017 09:32 AM, Zsolt Ero wrote:
>
>
> In PostgreSQL 9.5+ it is finally possible to modify a single key inside
> a
> JSONB column. Usage is something like this:
>
> update maps set screenshots=jsonb_set(screenshots, '{key}', '"value"')
> where id = 10688
>
> Is it possible to write this query using the ORM somehow? If not,
> please
> take it as a feature request.




 You can use that function directly:

 my_object = session.query(Maps).get(5)

 my_object.screenshots = func.jsonb_set(my_object.screenshots, '{key}',
 '"value"')

 session.flush()


 as far as "transparent" ORM use of that, like this:

 my_object.screenshots[key] = "value"

 right now that is a mutation of the value, and assuming you were using
 MutableDict to detect this as an ORM change event, the ORM considers
 "screenshots" to be a single value that would be the target of an
 UPDATE,
 meaning the whole JSON dictionary is passed into the UPDATE. There is no
 infrastructure for the ORM to automatically turn certain column updates
 into
 finely-detailed SQL function calls.   I can imagine that there might be
 some
 event-based way to make this happen transparently within the flush,
 however,
 but I'd need to spend some time poking around to work out how that might
 work.


 I'm not familiar with what the advantage to jsonb_set() would be and I
 can
 only guess it's some kind of performance advantage.   I'd be curious to
 see
 under what scenarios being able to set one element of the JSON vs.
 UPDATEing
 the whole thing is a performance advantage significant compared to the
 usual
 overhead of the ORM flush process; that is, Postgresql is really fast,
 and
 for this optimization to be significant, you probably need to be calling
 the
 Core function directly anyway rather than going through the whole ORM
 flush
 process.   But this is all based on my assumption as to what your goal
 of
 using this function is.



>
>
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>

Re: [sqlalchemy] jsonb_set in PostgreSQL 9.5+

2017-05-15 Thread mike bayer



On 05/15/2017 10:31 AM, Zsolt Ero wrote:

I'm trying to run your example, but it doesn't work:

from sqlalchemy import func

m = request.dbsession.query(models.Map).get(3)
m.screenshots = func.jsonb_set(m.screenshots, '{key}', '"value"')
request.dbsession.flush()

It ends up in a (psycopg2.ProgrammingError) can't adapt type 'dict'.


jsonb_set(models.Map.screenshots, ...)

because this works against the column, not the value






Also, from the generated SQL it seems to me that it's also doing the
full JSONB update from client side, not just inserting a key into the
database server side.

UPDATE maps SET screenshots=jsonb_set(%(jsonb_set_1)s,
%(jsonb_set_2)s, %(jsonb_set_3)s) WHERE maps.id = %(maps_id)s
{'maps_id': 3, 'jsonb_set_3': '"value"', 'jsonb_set_2': '{key}',
'jsonb_set_1': {u'small': u'2ad139ee69cdcd9e.jpg', u'full':
u'68b3f51491ff1501.jpg'}}

On 15 May 2017 at 16:18, Zsolt Ero  wrote:

Thanks for the answer. My use case is the following:

I have an object (map_obj), which has screenshots in two sizes. I'm
using JSONB columns to store the screenshot filenames.

Now, the two screenshot sizes are generated in parallel. The code is
like the following:

map_obj = query(...by id...)
filename = generate_screenshot(size)  # long running screenshot generation

try:
 dbsession.refresh(map_obj, ['screenshots'])
 map_obj.screenshots = dict(map_obj.screenshots, **{size: filename})
except Exception as e:
 logger.warning(...)

It worked well for 99.9% of the cases. The problem is that in the rare
case when both screenshots got rendered within a few milliseconds, one
of the screenshots got lost.

The simple solution was to add lockmode='update' to the refresh, so
this way the refreshes are blocking until the other finishes the
update.

But since this means locking a full row, I was thinking a simple JSONB
insertion would probably be better, since I can avoid locking the row.

Zsolt




On 15 May 2017 at 15:58, mike bayer  wrote:



On 05/15/2017 09:32 AM, Zsolt Ero wrote:


In PostgreSQL 9.5+ it is finally possible to modify a single key inside a
JSONB column. Usage is something like this:

update maps set screenshots=jsonb_set(screenshots, '{key}', '"value"')
where id = 10688

Is it possible to write this query using the ORM somehow? If not, please
take it as a feature request.




You can use that function directly:

my_object = session.query(Maps).get(5)

my_object.screenshots = func.jsonb_set(my_object.screenshots, '{key}',
'"value"')

session.flush()


as far as "transparent" ORM use of that, like this:

my_object.screenshots[key] = "value"

right now that is a mutation of the value, and assuming you were using
MutableDict to detect this as an ORM change event, the ORM considers
"screenshots" to be a single value that would be the target of an UPDATE,
meaning the whole JSON dictionary is passed into the UPDATE. There is no
infrastructure for the ORM to automatically turn certain column updates into
finely-detailed SQL function calls.   I can imagine that there might be some
event-based way to make this happen transparently within the flush, however,
but I'd need to spend some time poking around to work out how that might
work.


I'm not familiar with what the advantage to jsonb_set() would be and I can
only guess it's some kind of performance advantage.   I'd be curious to see
under what scenarios being able to set one element of the JSON vs. UPDATEing
the whole thing is a performance advantage significant compared to the usual
overhead of the ORM flush process; that is, Postgresql is really fast, and
for this optimization to be significant, you probably need to be calling the
Core function directly anyway rather than going through the whole ORM flush
process.   But this is all based on my assumption as to what your goal of
using this function is.







--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and
Verifiable Example. See http://stackoverflow.com/help/mcve for a full
description.
---
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to sqlalchemy+unsubscr...@googlegroups.com
.
To post to this group, send email to sqlalchemy@googlegroups.com
.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.



--
SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and
Verifiable Example.  See  http://stackoverflow.com/help/mcve for a full
description.
--- You received this message because you are subscribed to a topic in the
Google Groups "sqlalchemy" group.

Re: [sqlalchemy] jsonb_set in PostgreSQL 9.5+

2017-05-15 Thread mike bayer



On 05/15/2017 10:18 AM, Zsolt Ero wrote:

Thanks for the answer. My use case is the following:

I have an object (map_obj), which has screenshots in two sizes. I'm
using JSONB columns to store the screenshot filenames.

Now, the two screenshot sizes are generated in parallel. The code is
like the following:

map_obj = query(...by id...)
filename = generate_screenshot(size)  # long running screenshot generation

try:
 dbsession.refresh(map_obj, ['screenshots'])
 map_obj.screenshots = dict(map_obj.screenshots, **{size: filename})
except Exception as e:
 logger.warning(...)

It worked well for 99.9% of the cases. The problem is that in the rare
case when both screenshots got rendered within a few milliseconds, one
of the screenshots got lost.

The simple solution was to add lockmode='update' to the refresh, so
this way the refreshes are blocking until the other finishes the
update.

But since this means locking a full row, I was thinking a simple JSONB
insertion would probably be better, since I can avoid locking the row.



OK since you're looking to get around a race and do an "atomic" update, 
I'd recommend running UPDATE straight, with ORM you can get this with 
query.update()


session.query(YourClass).filter(YourClass.id == 
whatever).update({"screenshots": func.jsonb_set(YourClass.screenshots, 
"key", "value")}, synchronize_session='fetch')


that will also refetch the current value of the row





Zsolt




On 15 May 2017 at 15:58, mike bayer  wrote:



On 05/15/2017 09:32 AM, Zsolt Ero wrote:


In PostgreSQL 9.5+ it is finally possible to modify a single key inside a
JSONB column. Usage is something like this:

update maps set screenshots=jsonb_set(screenshots, '{key}', '"value"')
where id = 10688

Is it possible to write this query using the ORM somehow? If not, please
take it as a feature request.




You can use that function directly:

my_object = session.query(Maps).get(5)

my_object.screenshots = func.jsonb_set(my_object.screenshots, '{key}',
'"value"')

session.flush()


as far as "transparent" ORM use of that, like this:

my_object.screenshots[key] = "value"

right now that is a mutation of the value, and assuming you were using
MutableDict to detect this as an ORM change event, the ORM considers
"screenshots" to be a single value that would be the target of an UPDATE,
meaning the whole JSON dictionary is passed into the UPDATE. There is no
infrastructure for the ORM to automatically turn certain column updates into
finely-detailed SQL function calls.   I can imagine that there might be some
event-based way to make this happen transparently within the flush, however,
but I'd need to spend some time poking around to work out how that might
work.


I'm not familiar with what the advantage to jsonb_set() would be and I can
only guess it's some kind of performance advantage.   I'd be curious to see
under what scenarios being able to set one element of the JSON vs. UPDATEing
the whole thing is a performance advantage significant compared to the usual
overhead of the ORM flush process; that is, Postgresql is really fast, and
for this optimization to be significant, you probably need to be calling the
Core function directly anyway rather than going through the whole ORM flush
process.   But this is all based on my assumption as to what your goal of
using this function is.







--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and
Verifiable Example. See http://stackoverflow.com/help/mcve for a full
description.
---
You received this message because you are subscribed to the Google Groups
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to sqlalchemy+unsubscr...@googlegroups.com
.
To post to this group, send email to sqlalchemy@googlegroups.com
.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.



--
SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and
Verifiable Example.  See  http://stackoverflow.com/help/mcve for a full
description.
--- You received this message because you are subscribed to a topic in the
Google Groups "sqlalchemy" group.
To unsubscribe from this topic, visit
https://groups.google.com/d/topic/sqlalchemy/hjjIyEC8KHQ/unsubscribe.
To unsubscribe from this group and all its topics, send an email to
sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.




--
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper



Re: [sqlalchemy] jsonb_set in PostgreSQL 9.5+

2017-05-15 Thread Zsolt Ero
I'm trying to run your example, but it doesn't work:

from sqlalchemy import func

m = request.dbsession.query(models.Map).get(3)
m.screenshots = func.jsonb_set(m.screenshots, '{key}', '"value"')
request.dbsession.flush()

It ends up in a (psycopg2.ProgrammingError) can't adapt type 'dict'.

Also, from the generated SQL it seems to me that it's also doing the
full JSONB update from client side, not just inserting a key into the
database server side.

UPDATE maps SET screenshots=jsonb_set(%(jsonb_set_1)s,
%(jsonb_set_2)s, %(jsonb_set_3)s) WHERE maps.id = %(maps_id)s
{'maps_id': 3, 'jsonb_set_3': '"value"', 'jsonb_set_2': '{key}',
'jsonb_set_1': {u'small': u'2ad139ee69cdcd9e.jpg', u'full':
u'68b3f51491ff1501.jpg'}}

On 15 May 2017 at 16:18, Zsolt Ero  wrote:
> Thanks for the answer. My use case is the following:
>
> I have an object (map_obj), which has screenshots in two sizes. I'm
> using JSONB columns to store the screenshot filenames.
>
> Now, the two screenshot sizes are generated in parallel. The code is
> like the following:
>
> map_obj = query(...by id...)
> filename = generate_screenshot(size)  # long running screenshot generation
>
> try:
> dbsession.refresh(map_obj, ['screenshots'])
> map_obj.screenshots = dict(map_obj.screenshots, **{size: filename})
> except Exception as e:
> logger.warning(...)
>
> It worked well for 99.9% of the cases. The problem is that in the rare
> case when both screenshots got rendered within a few milliseconds, one
> of the screenshots got lost.
>
> The simple solution was to add lockmode='update' to the refresh, so
> this way the refreshes are blocking until the other finishes the
> update.
>
> But since this means locking a full row, I was thinking a simple JSONB
> insertion would probably be better, since I can avoid locking the row.
>
> Zsolt
>
>
>
>
> On 15 May 2017 at 15:58, mike bayer  wrote:
>>
>>
>> On 05/15/2017 09:32 AM, Zsolt Ero wrote:
>>>
>>> In PostgreSQL 9.5+ it is finally possible to modify a single key inside a
>>> JSONB column. Usage is something like this:
>>>
>>> update maps set screenshots=jsonb_set(screenshots, '{key}', '"value"')
>>> where id = 10688
>>>
>>> Is it possible to write this query using the ORM somehow? If not, please
>>> take it as a feature request.
>>
>>
>>
>> You can use that function directly:
>>
>> my_object = session.query(Maps).get(5)
>>
>> my_object.screenshots = func.jsonb_set(my_object.screenshots, '{key}',
>> '"value"')
>>
>> session.flush()
>>
>>
>> as far as "transparent" ORM use of that, like this:
>>
>> my_object.screenshots[key] = "value"
>>
>> right now that is a mutation of the value, and assuming you were using
>> MutableDict to detect this as an ORM change event, the ORM considers
>> "screenshots" to be a single value that would be the target of an UPDATE,
>> meaning the whole JSON dictionary is passed into the UPDATE. There is no
>> infrastructure for the ORM to automatically turn certain column updates into
>> finely-detailed SQL function calls.   I can imagine that there might be some
>> event-based way to make this happen transparently within the flush, however,
>> but I'd need to spend some time poking around to work out how that might
>> work.
>>
>>
>> I'm not familiar with what the advantage to jsonb_set() would be and I can
>> only guess it's some kind of performance advantage.   I'd be curious to see
>> under what scenarios being able to set one element of the JSON vs. UPDATEing
>> the whole thing is a performance advantage significant compared to the usual
>> overhead of the ORM flush process; that is, Postgresql is really fast, and
>> for this optimization to be significant, you probably need to be calling the
>> Core function directly anyway rather than going through the whole ORM flush
>> process.   But this is all based on my assumption as to what your goal of
>> using this function is.
>>
>>
>>
>>>
>>>
>>>
>>> --
>>> SQLAlchemy -
>>> The Python SQL Toolkit and Object Relational Mapper
>>>
>>> http://www.sqlalchemy.org/
>>>
>>> To post example code, please provide an MCVE: Minimal, Complete, and
>>> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
>>> description.
>>> ---
>>> You received this message because you are subscribed to the Google Groups
>>> "sqlalchemy" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an
>>> email to sqlalchemy+unsubscr...@googlegroups.com
>>> .
>>> To post to this group, send email to sqlalchemy@googlegroups.com
>>> .
>>> Visit this group at https://groups.google.com/group/sqlalchemy.
>>> For more options, visit https://groups.google.com/d/optout.
>>
>>
>> --
>> SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper
>>
>> http://www.sqlalchemy.org/
>>
>> To post example code, please provide an MCVE: Minimal, Complete, and
>> Verifiable Example.  See  

Re: [sqlalchemy] jsonb_set in PostgreSQL 9.5+

2017-05-15 Thread Zsolt Ero
Thanks for the answer. My use case is the following:

I have an object (map_obj), which has screenshots in two sizes. I'm
using JSONB columns to store the screenshot filenames.

Now, the two screenshot sizes are generated in parallel. The code is
like the following:

map_obj = query(...by id...)
filename = generate_screenshot(size)  # long running screenshot generation

try:
dbsession.refresh(map_obj, ['screenshots'])
map_obj.screenshots = dict(map_obj.screenshots, **{size: filename})
except Exception as e:
logger.warning(...)

It worked well for 99.9% of the cases. The problem is that in the rare
case when both screenshots got rendered within a few milliseconds, one
of the screenshots got lost.

The simple solution was to add lockmode='update' to the refresh, so
this way the refreshes are blocking until the other finishes the
update.

But since this means locking a full row, I was thinking a simple JSONB
insertion would probably be better, since I can avoid locking the row.

Zsolt




On 15 May 2017 at 15:58, mike bayer  wrote:
>
>
> On 05/15/2017 09:32 AM, Zsolt Ero wrote:
>>
>> In PostgreSQL 9.5+ it is finally possible to modify a single key inside a
>> JSONB column. Usage is something like this:
>>
>> update maps set screenshots=jsonb_set(screenshots, '{key}', '"value"')
>> where id = 10688
>>
>> Is it possible to write this query using the ORM somehow? If not, please
>> take it as a feature request.
>
>
>
> You can use that function directly:
>
> my_object = session.query(Maps).get(5)
>
> my_object.screenshots = func.jsonb_set(my_object.screenshots, '{key}',
> '"value"')
>
> session.flush()
>
>
> as far as "transparent" ORM use of that, like this:
>
> my_object.screenshots[key] = "value"
>
> right now that is a mutation of the value, and assuming you were using
> MutableDict to detect this as an ORM change event, the ORM considers
> "screenshots" to be a single value that would be the target of an UPDATE,
> meaning the whole JSON dictionary is passed into the UPDATE. There is no
> infrastructure for the ORM to automatically turn certain column updates into
> finely-detailed SQL function calls.   I can imagine that there might be some
> event-based way to make this happen transparently within the flush, however,
> but I'd need to spend some time poking around to work out how that might
> work.
>
>
> I'm not familiar with what the advantage to jsonb_set() would be and I can
> only guess it's some kind of performance advantage.   I'd be curious to see
> under what scenarios being able to set one element of the JSON vs. UPDATEing
> the whole thing is a performance advantage significant compared to the usual
> overhead of the ORM flush process; that is, Postgresql is really fast, and
> for this optimization to be significant, you probably need to be calling the
> Core function directly anyway rather than going through the whole ORM flush
> process.   But this is all based on my assumption as to what your goal of
> using this function is.
>
>
>
>>
>>
>>
>> --
>> SQLAlchemy -
>> The Python SQL Toolkit and Object Relational Mapper
>>
>> http://www.sqlalchemy.org/
>>
>> To post example code, please provide an MCVE: Minimal, Complete, and
>> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
>> description.
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "sqlalchemy" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to sqlalchemy+unsubscr...@googlegroups.com
>> .
>> To post to this group, send email to sqlalchemy@googlegroups.com
>> .
>> Visit this group at https://groups.google.com/group/sqlalchemy.
>> For more options, visit https://groups.google.com/d/optout.
>
>
> --
> SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example.  See  http://stackoverflow.com/help/mcve for a full
> description.
> --- You received this message because you are subscribed to a topic in the
> Google Groups "sqlalchemy" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/sqlalchemy/hjjIyEC8KHQ/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> sqlalchemy+unsubscr...@googlegroups.com.
> To post to this group, send email to sqlalchemy@googlegroups.com.
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" 

Re: [sqlalchemy] jsonb_set in PostgreSQL 9.5+

2017-05-15 Thread mike bayer



On 05/15/2017 09:32 AM, Zsolt Ero wrote:
In PostgreSQL 9.5+ it is finally possible to modify a single key inside 
a JSONB column. Usage is something like this:


update maps set screenshots=jsonb_set(screenshots, '{key}', '"value"') 
where id = 10688


Is it possible to write this query using the ORM somehow? If not, please 
take it as a feature request.



You can use that function directly:

my_object = session.query(Maps).get(5)

my_object.screenshots = func.jsonb_set(my_object.screenshots, '{key}', 
'"value"')


session.flush()


as far as "transparent" ORM use of that, like this:

my_object.screenshots[key] = "value"

right now that is a mutation of the value, and assuming you were using 
MutableDict to detect this as an ORM change event, the ORM considers 
"screenshots" to be a single value that would be the target of an 
UPDATE, meaning the whole JSON dictionary is passed into the UPDATE. 
There is no infrastructure for the ORM to automatically turn certain 
column updates into finely-detailed SQL function calls.   I can imagine 
that there might be some event-based way to make this happen 
transparently within the flush, however, but I'd need to spend some time 
poking around to work out how that might work.



I'm not familiar with what the advantage to jsonb_set() would be and I 
can only guess it's some kind of performance advantage.   I'd be curious 
to see under what scenarios being able to set one element of the JSON 
vs. UPDATEing the whole thing is a performance advantage significant 
compared to the usual overhead of the ORM flush process; that is, 
Postgresql is really fast, and for this optimization to be significant, 
you probably need to be calling the Core function directly anyway rather 
than going through the whole ORM flush process.   But this is all based 
on my assumption as to what your goal of using this function is.








--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and 
Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
description.

---
You received this message because you are subscribed to the Google 
Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send 
an email to sqlalchemy+unsubscr...@googlegroups.com 
.
To post to this group, send email to sqlalchemy@googlegroups.com 
.

Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


--
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper


http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.

To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.


[sqlalchemy] jsonb_set in PostgreSQL 9.5+

2017-05-15 Thread Zsolt Ero
In PostgreSQL 9.5+ it is finally possible to modify a single key inside a 
JSONB column. Usage is something like this:

update maps set screenshots=jsonb_set(screenshots, '{key}', '"value"') 
where id = 10688

Is it possible to write this query using the ORM somehow? If not, please 
take it as a feature request.



-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.