Re: [GENERAL] performance considerations of jsonb vs separate rows

2017-06-19 Thread Merlin Moncure
On Mon, Jun 19, 2017 at 2:29 PM, Rob Nikander  wrote:
> I'm wondering about the tradeoffs, specifically: is it possible to update
> one piece of a jsonb value without having to rewrite the entire field? There
> are cases where that data field was getting pretty big (500kb). Would you
> expect any obvious performance differences between these two options?

You are basically asking, what are the relative
advantages/disadvantages of document model vs data store?  This is a
complicated discussion.  Here are some tradeoffs:

*) Document is always read/written in bulk.  Row data reads/writes are
more discrete (but generally postgres always reads/writes 8kb
minimum!)

*) for documents transaction tracking is for the entire document.
This is more efficient for storage but can have very serious
consequences if sub-portions of the document are updated under heavy
concurrency.

*) Documents are a pain if the structure changes in such a way so as
to require invalidation of all of them.

*) Documents can be a *real* pain if the data relationships change in
some fundamental way.   This is a pain with traditional tables as
well, but relational type models tend to be the most flexible vs other
approaches.

Basically there is a reason why SQL and relational type systems won
the 'data wars' of the 1970's and 1980's.   There are downsides to the
basic approach (especially performance due to joining) but the
simplicity and elegance of being able to model just about any problem
tends to compensate certain performance disadvantages.

Document style storage tends to move the database model out of the
database and into the application (which is a very poor tradeoff IMO)
and fall over when some of the initial assumptions with respect to the
document modeling discrete business units break down; you end up
storing the same information over and over in different documents
which causes all kinds of problems.  They do tend to work well in low-
or no- update applications however.

merlin


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] performance considerations of jsonb vs separate rows

2017-06-19 Thread Dmitry Dolgov
> On 19 June 2017 at 21:29, Rob Nikander  wrote:
>
> I'm wondering about the tradeoffs, specifically: is it possible to update
one piece of a jsonb value without having to rewrite the entire field?
There are cases where that data field was getting pretty big (500kb). Would
you expect any obvious performance differences between these two options?

Unfortunately no, an entire jsonb field has to be written back even if
you've touched only one key.
>From my own benchmarks it looks like you'll scarcely notice this (e.g. in
comparison with MongoDB) only if you work
with small enough documents (about 2kb), and everything above this limit
more or less seriously hit the performance.
You can take a look at this presentation [1] from Oleg Bartunov, it
contains results of some benchmarks (from slide 44).

[1]: http://www.sai.msu.su/~megera/postgres/talks/jsonb-pgconf.us-2017.pdf


Re: [GENERAL] performance considerations of jsonb vs separate rows

2017-06-19 Thread Andreas Kretschmer
Am 19. Juni 2017 21:29:40 MESZ schrieb Rob Nikander :

>
>I'm wondering about the tradeoffs, specifically: is it possible to
>update one piece of a jsonb value without having to rewrite the entire
>field? 


Updates in PostgreSQL are always Delete & Insert. So the answer is no. For 
session-data please consider unlogged tables.


Regards, Andreas


-- 
2ndQuadrant - The PostgreSQL Support Company


-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[GENERAL] performance considerations of jsonb vs separate rows

2017-06-19 Thread Rob Nikander
Hi,

I’ve got a web app where I want to store user’s session data. The schema in 
this data changes a lot so it may be useful here to store the session 
properties in either a jsonb column, or in multiple rows. Something like:

  session_id | data 
  100  { a: 1, bar: 2 ...
  101  { a: 3, baz: 123 …

or

  session_id | name | value
  100  a  1
  100  bar2
  101  baz123
  101  a  3
  ... 

The app currently does something like option 1, but on an older pre-jsonb 
version of postgres, so the field is just text. I’m hoping to upgrade Postgres 
soon so jsonb is an option.

I'm wondering about the tradeoffs, specifically: is it possible to update one 
piece of a jsonb value without having to rewrite the entire field? There are 
cases where that data field was getting pretty big (500kb). Would you expect 
any obvious performance differences between these two options?

Yes, I’ll need to build performance tests myself, but that’s a lot of work to 
get two realistic situations with millions of rows, so I’m wondering about 
guesses or common knowledge on this.

thanks,
Rob